Opened 7 years ago
Closed 7 years ago
#16662 closed enhancement (fixed)
OA for n=1046,1059,2164,3992,3994
Reported by: | ncohen | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.4 |
Component: | combinatorial designs | Keywords: | |
Cc: | vdelecroix | Merged in: | |
Authors: | Nathann Cohen | Reviewers: | Vincent Delecroix |
Report Upstream: | N/A | Work issues: | |
Branch: | 7a73e74 (Commits, GitHub, GitLab) | Commit: | 7a73e74549a080ac7ac139241521a871d667cd45 |
Dependencies: | #16604 | Stopgaps: |
Description
As the title says ! New MOLS from another paper.
Nathann
Change History (45)
comment:1 Changed 7 years ago by
- Branch set to u/ncohen/16662
- Status changed from new to needs_review
comment:2 Changed 7 years ago by
- Commit set to 6e47439f639b269679ee5640d4c497ec73ded499
comment:3 Changed 7 years ago by
- Commit changed from 6e47439f639b269679ee5640d4c497ec73ded499 to 355ac2a741154e5fca99ec1d5137c5be0b9d4377
comment:4 Changed 7 years ago by
- Milestone changed from sage-6.3 to sage-6.4
comment:5 follow-up: ↓ 6 Changed 7 years ago by
- Status changed from needs_review to needs_info
Hello
1) I corrected typos in the doc, added a check, added a test, modify few sentences at u/vdelecroix/16662
2) I do not understand why he following is not valid
sage: k=10; n=2; m=79; a=b=c=1 sage: for kk,nn in ((k,m),(k,m+1),(k,m+2),(k,a),(k,b),(k,c)): ....: assert designs.orthogonal_array(k,n,existence=True) sage: OA=thwart_lemma_3_5(10,2,79,1,1,1) Traceback (most recent call last): ... 913 if existence: 914 return Unknown --> 915 raise NotImplementedError("I don't know how to build an OA({},{})!".format(k,n)) 916 917 if check: NotImplementedError: I don't know how to build an OA(10,69)!
Are a,b,c allowed to be 1? In the examples of the article a,b,c are never 1 (but d might be).
3) I found two examples which yield to error I do not understand from the specifications. The first one is related to the point 2) I guess
sage: OA=thwart_lemma_3_5(10,2,89,1,1,1) Traceback (most recent call last): ... 456 for B in master_design: 457 # The missing entries belong to the last n_trunc columns --> 458 assert all(x is not None for x in B[:k]) 459 n_in_truncated = n_trunc-B.count(None) 460 AssertionError:
For the other one, the existence of an OA(9,21) should not be necessary for the construction
sage: OA = thwart_lemma_3_5(9,9,22,8,8,8,complement=True) Traceback (most recent call last): ... 913 if existence: 914 return Unknown --> 915 raise NotImplementedError("I don't know how to build an OA({},{})!".format(k,n)) 916 917 if check: NotImplementedError: I don't know how to build an OA(9,21)!
4) I found several new values for which the thwart construction works. But I first would like to understand first why 3) fail.
Vincent
comment:6 in reply to: ↑ 5 Changed 7 years ago by
Hello !
1) I corrected typos in the doc, added a check, added a test, modify few sentences at u/vdelecroix/16662
Is it me or is the git server slow ? It takes MINUTES to download a branch O_o
2) I do not understand why he following is not valid
How, not "valid" ? The input is valid but in order to build the new designs you need some sub-designs, and among the subdesigns there is a OA(10,69)
that Sage does not know how to build. Soo well, the construction can't be applied because you need one of the required designs... What's wrong with that ? This is what the find_
functions usually checks !
Are a,b,c allowed to be 1? In the examples of the article a,b,c are never 1 (but d might be).
I don't think that there is anything wrong with =1 values.
3) I found two examples which yield to error I do not understand from the specifications. The first one is related to the point 2) I guess
No, it is just that there exists no OA(13,2)
. The code has to build an OA(k+3,n)
(or OA(k+4,n)
when d
is defined) and it does not exist if k is too large. I missed that one, probably because that's the kind of thing that I filter in the find_
function usually.
For the other one, the existence of an OA(9,21) should not be necessary for the construction
The exception I added for the previous one is also triggered here.
4) I found several new values for which the thwart construction works. But I first would like to understand first why 3) fail.
Nathann
comment:7 Changed 7 years ago by
- Commit changed from 355ac2a741154e5fca99ec1d5137c5be0b9d4377 to 8fcaed066b7f110af10c6a471cbebcad370bd312
comment:8 Changed 7 years ago by
- Status changed from needs_info to needs_review
comment:9 Changed 7 years ago by
Hi,
You add a check at the begining but there is no associated specification in the doc...
Vincent
comment:10 Changed 7 years ago by
Very bad... I was not able to build the potentially new OA(12,3362)
sage: OA=thwart_lemma_3_5(12,16,207,13,13,13,11,complement=True) Traceback (most recent call last): ... .../incidence_structures.pyc in blocks(self, copy) 520 if self._point_to_index is None: 521 from copy import deepcopy --> 522 return deepcopy(self._blocks) 523 else: 524 return [[self._points[i] for i in b] for b in self._blocks] ... MemoryError:
comment:11 Changed 7 years ago by
Well you can't build a OA(2^300+1,2^300)
either... :-P
This deepcopy is a crime.
Nathann
comment:12 follow-up: ↓ 13 Changed 7 years ago by
I said that because this one is smaller than the largest you add to the database: an OA(12,3994).
I will create a ticket to remove all deepcopy around (and in that case I think it is more like a toto.blocks(copy=False)
which is needed).
Vincent
comment:13 in reply to: ↑ 12 Changed 7 years ago by
Yoooooooo !
I said that because this one is smaller than the largest you add to the database: an OA(12,3994).
I will create a ticket to remove all deepcopy around (and in that case I think it is more like a
toto.blocks(copy=False)
which is needed).
Please don't, I fixed that in at least two needs_review
tickets already ^^;
Nathann
comment:14 follow-up: ↓ 15 Changed 7 years ago by
All right...
I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at u/vdelecroix/16662
. And by the way, I wrote a function which compute the set of new values you get by using this construction... should I add it somewhere? It is not exactly a find_XXX
function, it go the other way around: play with parameters m,n,a,b,c,d and see if the m*n+a+b+c+d you obtain is already available.
Vincent
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 7 years ago by
Yo !
I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at
u/vdelecroix/16662
.
Oh. Are they the only other MOLS that we can build ? If there are more it may be worth writing a find function after all...
Nathann
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 7 years ago by
Replying to ncohen:
Yo !
I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at
u/vdelecroix/16662
.Oh. Are they the only other MOLS that we can build ? If there are more it may be worth writing a find function after all...
Seems to be the only one. If you want to write this find function that's also an option. But I am not sure you will ever get the OA(10,1046)
in less than a minute. A possible alternative is to write a function that checks that:
- those constructions are not useless (i.e. there are no other way to get them)
- there is nothing more to build with these thwarts (this is pretty much what I wrote)
Vincent
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 18 Changed 7 years ago by
Yo !
Seems to be the only one. If you want to write this find function that's also an option.
I tried several times and it really was complicated. And required a lot of copy/paste.
But I am not sure you will ever get the
OA(10,1046)
in less than a minute
Well, you need to fill the cache at first, like for all constructions.
A possible alternative is to write a function that checks that:
- those constructions are not useless (i.e. there are no other way to get them)
Why do you care so much about not having useless constructions ? It is good to have several ways to generate the same design. Some paths may be better than others, some paths may result in designs with different properties...
- there is nothing more to build with these thwarts (this is pretty much what I wrote)
The problem with that is that it is true with the current OA that Sage can build. Wait a bit and new one will appear that may change that.
Nathann
comment:18 in reply to: ↑ 17 ; follow-up: ↓ 19 Changed 7 years ago by
Replying to ncohen:
Yo !
Seems to be the only one. If you want to write this find function that's also an option.
I tried several times and it really was complicated. And required a lot of copy/paste.
So let us not do that.
A possible alternative is to write a function that checks that:
- those constructions are not useless (i.e. there are no other way to get them)
Why do you care so much about not having useless constructions ? It is good to have several ways to generate the same design. Some paths may be better than others, some paths may result in designs with different properties...
All right. Then there are thousands of thwarts construction that you may want to consider...
- there is nothing more to build with these thwarts (this is pretty much what I wrote)
The problem with that is that it is true with the current OA that Sage can build. Wait a bit and new one will appear that may change that.
This is cool. If a ticket creates new OA that allows more of this construction, then it will create an error in a doctest and the database will be automatically updated. Nevertheless, filling the cache up to 3000 is too long for a doctest. So I can keep my file under my pillow and run it from time to time.
Vincent
comment:19 in reply to: ↑ 18 Changed 7 years ago by
Yo !
So let us not do that.
Hmmmmmmmmm. I know it will haunt me :-P
All right. Then there are thousands of thwarts construction that you may want to consider...
Well, for the moment I try to fill the table until we stick to the Handbook.
This is cool. If a ticket creates new OA that allows more of this construction, then it will create an error in a doctest and the database will be automatically updated. Nevertheless, filling the cache up to 3000 is too long for a doctest. So I can keep my file under my pillow and run it from time to time.
Under a pillow is no place to store code. You can't wash away the stains of a memory leak.
Nathann
comment:20 Changed 7 years ago by
All right, I wrote the find_thwart_lemma_3_5
... it works at least for the parameters that were found before. It is at u/vdelecroix/16662
.
Vincent
comment:21 Changed 7 years ago by
Could you add some comments ? So many loops and tests, even an internal caching system and no comments is a bit tough.
Also, I don't think you should implement caching systems like that in a find_
function. If we need such caching (and we do) it should be implemented where it belongs, with a dedicated function.
I had in mind something like that:
designs.orthogonal_arrays.build(k,n) designs.orthogonal_arrays.build_information(k,n) designs.orthogonal_arrays.is_available(k,n) designs.orthogonal_arrays.exists(k,n) designs.orthogonal_arrays.max_k(n) designs.orthogonal_arrays.all_n(k, min=None, max=None)
Though I like this interface, it means quite some amount of copy/paste in the code. For TD and MOLS. Maybe there is a trick.
Nathann
comment:22 follow-up: ↓ 24 Changed 7 years ago by
Hello,
It is not only caching. With a more naive approach you would have
for a in xrange(n): if not orthogonal_arrays(k,a,existence=True): continue for b in xrange(a+1): if not orthogonal_arrays(k,b,existence=True): continue for c in xrange(b+1): if not orthogonal_arrays(k,c,existence=True): continue
About your caching system, what would be the difference between is_available
and exists
? And instead of all_n
I would prefer nrange
or something more pythonic.
I will write some doc but I won't change the code.
Vincent
comment:23 Changed 7 years ago by
It is at u/vdelecroix/16662
.
Vincent
comment:24 in reply to: ↑ 22 Changed 7 years ago by
Yo !
It is not only caching. With a more naive approach you would have
What you do there is the work of the "all_n" function I wound like to write. I would like to replace the
for n in range(N-2): if is_prime(n)
with
for n in prime_powers(N-2): ...
But it is not possible because you use n to build the cache.
About your caching system, what would be the difference between
is_available
andexists
? And instead ofall_n
I would prefernrange
or something more pythonic.
is_available
would not trigger non-existence tests. But perhaps that is less useful since your rewrote the is_sum_of_squares
thing.
Nathann
comment:25 Changed 7 years ago by
With the find function
sage: %time designs.orthogonal_array(None,1000,existence=1) CPU times: user 24.6 s, sys: 52 ms, total: 24.6 s Wall time: 24.6 s 9
Without
sage: %time designs.orthogonal_array(None,1000,existence=1) CPU times: user 14.5 s, sys: 32 ms, total: 14.5 s Wall time: 14.5 s 9
comment:26 follow-up: ↓ 28 Changed 7 years ago by
Yo !
It seems that the list "good" is very very often useless. I wanted to know how often it was used and added a "print n" right before the loop "for a in good"
@@ -999,6 +999,7 @@ def find_thwart_lemma_3_5(k,N): orthogonal_array(k,m+2,existence=True)): continue + print "A",n for a in good: if a > N-n*m: break @@ -1023,7 +1024,7 @@ def find_thwart_lemma_3_5(k,N): orthogonal_array(k,m+2,existence=True) and orthogonal_array(k,m+3,existence=True)): continue - + print "B",n for a in good: if a > N-n*m: break
Guess how many times they appear when you type designs.orthogonal_array(None,1000,existence=1)
: never.
Which means that for 1000 at least computing all OA(k,n,existence=True)
for n<N is all for naught (but takes 10s).
Nathann
comment:27 Changed 7 years ago by
By the way, shouldn't range(N-3)
be range(N-2)
? The last element is not included in the list, so m=a=b=c=1 is not tested.
Nathann
comment:28 in reply to: ↑ 26 Changed 7 years ago by
Hello,
Moving the good
part would change nothing to the timings. You already have the following for the m
loop which is the cost of everything
if not (orthogonal_array(k,m+0,existence=True) and orthogonal_array(k,m+1,existence=True) and orthogonal_array(k,m+2,existence=True)): continue
If you feel like this function is too costly, we can remove it from the find_recursive_constructions
and just store the interesting values:
if (k,n) not in ((10,1046), (10,1048), (10,1059), (11,1524), (11,2164), (12,3362), (12,3992), (12,3994)): return False
About your question, you were right and we can even go up to n=N-1
.
Tell me what you think about the first question and I will add a commit.
Vincent
comment:29 Changed 7 years ago by
Commit a u/vdelecroix/16662
.
- I removed the good
- I cut the loop as much as possible
Vincent
comment:30 follow-up: ↓ 32 Changed 7 years ago by
Helloooooooo Vincent !
Okay... So let's do that:
1) You saved some time since the first version, and although it is still slower than needed it's not that awful that we should keep this construction out of those that are automatically tested
2) Those triple loops are really awful things, so we will have to do something about it
3) I am thinking of a data structure that would be useful to us, and whose purpose is to store things like "all n such that there exists a OA(k,n)" or even "all m such that there exists a OA(k,m), OA(k,m+1), OA(k,m+2)".
What it stores: a set of integers defined by a boolean function What it is meant to answer: give the list of integers between x and y such that the boolean function is satisfied.
Of course we want to minimize the number of boolean function queries. Even though it takes spaces, I am thinking of something like that:
An array which associates to (any) integer n:
a) if f(n) is computed: the smallest integer n'>=n such that f(n') is True or has not been computed yet.
b) if f(n) is not computed yet: None
This way, if you want to get all solutions to f(n) is True
between x and y, here is what you do:
current=x while current<=y: if array[current] is None: array[current] = f(current) elif array[current] == current: yield current current+=1 else: new_current = array[current] # if array[a]==b and array[b]==c then let's define array[a]=c if array[new_current] is not None: array[current] = array[new_current] current = new_current
This may be cool if we ever implement the all_n
or range_n
function.
And we could use it in conjunction with a real function to solve the partition problem, i.e. "try to find integers from a set S whose sum is equal or lequal to C".
Anyway.
1) I merged all your commits into one, as some were undoing the previous ones
2) I added a small commit on top of it. In particular some additional constraint on k that removes fake '+' in the n<100 area :-D
Your code is good otherwise.
Nathann
comment:31 Changed 7 years ago by
- Commit changed from 8fcaed066b7f110af10c6a471cbebcad370bd312 to 2f936c57507cbd475f860a76b94ed44882d7de7b
comment:32 in reply to: ↑ 30 ; follow-up: ↓ 33 Changed 7 years ago by
- Status changed from needs_review to needs_info
Hello,
Replying to ncohen:
3) I am thinking of a data structure that would be useful to us, and whose purpose is to store things like "all n such that there exists a OA(k,n)" or even "all m such that there exists a OA(k,m), OA(k,m+1), OA(k,m+2)".
+1
What it stores: a set of integers defined by a boolean function What it is meant to answer: give the list of integers between x and y such that the boolean function is satisfied.
Of course we want to minimize the number of boolean function queries. Even though it takes spaces, I am thinking of something like that:
An array which associates to (any) integer n:
a) if f(n) is computed: the smallest integer n'>=n such that f(n') is True or has not been computed yet.
b) if f(n) is not computed yet: None
[...]
This may be cool if we ever implement the
all_n
orrange_n
function.
And:
- what is the next value (like next_prime)
- what is the previous value (like previous_prime)
The problem with your approach is that you store a lot of data. It is perhaps not as bad as what I imagine.
Questions:
1) why do you stop n
at N-3
in the first loop? I think that n=N-1, m=1, a=1, b=0, c=0
is a valid input:
sage: OA = thwart_lemma_3_5(3,13,1,1,0,0) sage: is_orthogonal_array(OA,3,14,2) True
So the bound should be N-1
. Perhaps the particular case of m=1
is taken care by another construction, in that case, this must be documented (and the upper bound for n updated accordingly).
2) the check d <= n
is not necessary because of the lower bound on c
.
I edit my commit at u/vdelecroix/16662
to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.
Vincent
comment:33 in reply to: ↑ 32 ; follow-up: ↓ 34 Changed 7 years ago by
Yooooooo !
And:
- what is the next value (like next_prime)
That's easy to do. It would be the result of all_n(min=current_value,max=None).next()
- what is the previous value (like previous_prime)
This would be harder, but I'm not sure it is needed either O_o
The problem with your approach is that you store a lot of data. It is perhaps not as bad as what I imagine.
it bothered me at first but we are thinking of k lists (and k<50 usually) or size around 2000 at most. How bad can that be ? If it were stored in C it would represent 500ko :-P
1) why do you stop
n
atN-3
in the first loop? I think thatn=N-1, m=1, a=1, b=0, c=0
is a valid input:
Argg... Because I thought we could assume a,c,b>0
, I did some changes then removed them, but I forgot one. If d=0 and c=0 I think that what the code does is roughly equivalent to Wilson's construction wth 2 truncated columns. But well, you are right, I will undo that !
2) the check
d <= n
is not necessary because of the lower bound onc
.I edit my commit at
u/vdelecroix/16662
to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.
I just downloaded your branch but did not see anything new. I just merged it with the beta0 but I will add a commit in a second !
Nathann
comment:34 in reply to: ↑ 33 ; follow-up: ↓ 35 Changed 7 years ago by
Replying to ncohen:
2) the check
d <= n
is not necessary because of the lower bound onc
.I edit my commit at
u/vdelecroix/16662
to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.I just downloaded your branch but did not see anything new. I just merged it with the beta0 but I will add a commit in a second !
I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.
Vincent
comment:35 in reply to: ↑ 34 ; follow-up: ↓ 36 Changed 7 years ago by
Yo !
I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.
Oh, I see. I think that this N-2 is correct after all, however. If n=N-2, then it means that there are at most two points in the last 4 columns, i.e. that at most two columns are used. And because all pairs of points are equivalent, this is already handled by the wilson construction with two truncated columns.
I had written some code which I hoped would improve the performances of this function, but well... It produced no difference whatsoever. And I don't know why yet, so let's get this in.
Perhaps the cached "all_n" function will change something :-/
Nathann
comment:36 in reply to: ↑ 35 Changed 7 years ago by
Replying to ncohen:
Yo !
I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.
Oh, I see. I think that this N-2 is correct after all, however. If n=N-2, then it means that there are at most two points in the last 4 columns, i.e. that at most two columns are used. And because all pairs of points are equivalent, this is already handled by the wilson construction with two truncated columns.
I believe it but please write it in a comment!
I had written some code which I hoped would improve the performances of this function, but well... It produced no difference whatsoever. And I don't know why yet, so let's get this in.
Perhaps the cached "all_n" function will change something
:-/
I hope it will.
Vincent
comment:37 Changed 7 years ago by
- Commit changed from 2f936c57507cbd475f860a76b94ed44882d7de7b to a9b594f3991f67cf7d2c82a91d8c332c47c35ba7
comment:38 Changed 7 years ago by
- Status changed from needs_info to positive_review
Great great... and of lot of rebase for the follow-up tickets!
Vincent
comment:39 Changed 7 years ago by
Thanks !
Nathann
comment:40 Changed 7 years ago by
- Status changed from positive_review to needs_work
There is a (simple) conflict with #16604.
Vincent
comment:41 Changed 7 years ago by
see at u/vdelecroix/16662
.
Vincent
comment:42 Changed 7 years ago by
- Commit changed from a9b594f3991f67cf7d2c82a91d8c332c47c35ba7 to 7a73e74549a080ac7ac139241521a871d667cd45
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
3cf3983 | trac #16604: Reviewing the review
|
d612056 | trac #16604: on the doc again
|
bff470e | trac #16604: small check of dimensions
|
6c21904 | trac #16797: correct a row/column inversion
|
48b6902 | trac #16604: merge #16797
|
579e75b | trac #16604: input check + doctest
|
e1a83d0 | trac #16604: Optional check flag
|
05c6915 | trac #16604: Variable rename and list->set
|
24c4f7f | trac #16604: Merge with updated #16797
|
7a73e74 | trac #16662: Merge with updated #16604
|
comment:43 Changed 7 years ago by
- Status changed from needs_work to positive_review
comment:44 Changed 7 years ago by
- Reviewers set to Vincent Delecroix
comment:45 Changed 7 years ago by
- Branch changed from u/ncohen/16662 to 7a73e74549a080ac7ac139241521a871d667cd45
- Resolution set to fixed
- Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
trac #16604: new OA for n=112,160,224,514,796
trac #16604: OA for n=640, reviewer's remarks and silly mistake
trac #16604: OA(15,896)
trac #16604: OA(16,208)
trac #16604: OA(16,176)
trac #16604: Now without copy and paste
trac #16604: OA(20,352)
trac #16604: OA(20,416)
trac #16604: OA(20,544)
trac #16662: OA for n=1046,1059,2164,3992,3994