Opened 5 years ago

Closed 4 years ago

#16662 closed enhancement (fixed)

OA for n=1046,1059,2164,3992,3994

Reported by: ncohen Owned by:
Priority: major Milestone: sage-6.4
Component: combinatorial designs Keywords:
Cc: vdelecroix Merged in:
Authors: Nathann Cohen Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: 7a73e74 (Commits) Commit: 7a73e74549a080ac7ac139241521a871d667cd45
Dependencies: #16604 Stopgaps:

Description

As the title says ! New MOLS from another paper.

Nathann

Change History (45)

comment:1 Changed 5 years ago by ncohen

  • Branch set to u/ncohen/16662
  • Status changed from new to needs_review

comment:2 Changed 5 years ago by git

  • Commit set to 6e47439f639b269679ee5640d4c497ec73ded499

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

00654fbtrac #16604: new OA for n=112,160,224,514,796
ba6609dtrac #16604: OA for n=640, reviewer's remarks and silly mistake
c218148trac #16604: OA(15,896)
6074f4btrac #16604: OA(16,208)
35cbadctrac #16604: OA(16,176)
bfcc6f5trac #16604: Now without copy and paste
326ece4trac #16604: OA(20,352)
a621220trac #16604: OA(20,416)
54fc1e1trac #16604: OA(20,544)
6e47439trac #16662: OA for n=1046,1059,2164,3992,3994

comment:3 Changed 4 years ago by git

  • Commit changed from 6e47439f639b269679ee5640d4c497ec73ded499 to 355ac2a741154e5fca99ec1d5137c5be0b9d4377

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

0232b73trac #16604: OA(20,544)
e5f428dtrac #16604: Merged with 6.3.beta6
355ac2atrac #16662: OA for n=1046,1059,2164,3992,3994

comment:4 Changed 4 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:5 follow-up: Changed 4 years ago by vdelecroix

  • Status changed from needs_review to needs_info

Hello

1) I corrected typos in the doc, added a check, added a test, modify few sentences at u/vdelecroix/16662

2) I do not understand why he following is not valid

sage: k=10; n=2; m=79; a=b=c=1
sage: for kk,nn in ((k,m),(k,m+1),(k,m+2),(k,a),(k,b),(k,c)):
....:     assert designs.orthogonal_array(k,n,existence=True)
sage: OA=thwart_lemma_3_5(10,2,79,1,1,1)
Traceback (most recent call last):
...
    913         if existence:
    914             return Unknown
--> 915         raise NotImplementedError("I don't know how to build an OA({},{})!".format(k,n))
    916 
    917     if check:

NotImplementedError: I don't know how to build an OA(10,69)!

Are a,b,c allowed to be 1? In the examples of the article a,b,c are never 1 (but d might be).

3) I found two examples which yield to error I do not understand from the specifications. The first one is related to the point 2) I guess

sage: OA=thwart_lemma_3_5(10,2,89,1,1,1)
Traceback (most recent call last):
...
    456     for B in master_design:
    457         # The missing entries belong to the last n_trunc columns
--> 458         assert all(x is not None for x in B[:k])
    459         n_in_truncated = n_trunc-B.count(None)
    460 

AssertionError: 

For the other one, the existence of an OA(9,21) should not be necessary for the construction

sage: OA = thwart_lemma_3_5(9,9,22,8,8,8,complement=True)
Traceback (most recent call last):
...
    913         if existence:
    914             return Unknown
--> 915         raise NotImplementedError("I don't know how to build an OA({},{})!".format(k,n))
    916 
    917     if check:

NotImplementedError: I don't know how to build an OA(9,21)!

4) I found several new values for which the thwart construction works. But I first would like to understand first why 3) fail.

Vincent

Last edited 4 years ago by vdelecroix (previous) (diff)

comment:6 in reply to: ↑ 5 Changed 4 years ago by ncohen

Hello !

1) I corrected typos in the doc, added a check, added a test, modify few sentences at u/vdelecroix/16662

Is it me or is the git server slow ? It takes MINUTES to download a branch O_o

2) I do not understand why he following is not valid

How, not "valid" ? The input is valid but in order to build the new designs you need some sub-designs, and among the subdesigns there is a OA(10,69) that Sage does not know how to build. Soo well, the construction can't be applied because you need one of the required designs... What's wrong with that ? This is what the find_ functions usually checks !

Are a,b,c allowed to be 1? In the examples of the article a,b,c are never 1 (but d might be).

I don't think that there is anything wrong with =1 values.

3) I found two examples which yield to error I do not understand from the specifications. The first one is related to the point 2) I guess

No, it is just that there exists no OA(13,2). The code has to build an OA(k+3,n) (or OA(k+4,n) when d is defined) and it does not exist if k is too large. I missed that one, probably because that's the kind of thing that I filter in the find_ function usually.

For the other one, the existence of an OA(9,21) should not be necessary for the construction

The exception I added for the previous one is also triggered here.

4) I found several new values for which the thwart construction works. But I first would like to understand first why 3) fail.

Nathann

comment:7 Changed 4 years ago by git

  • Commit changed from 355ac2a741154e5fca99ec1d5137c5be0b9d4377 to 8fcaed066b7f110af10c6a471cbebcad370bd312

Branch pushed to git repo; I updated commit sha1. New commits:

053945dtrac #16662: merge 6.3
166b525trac #16662: doc + test
8fcaed0trac #16662: New assertion

comment:8 Changed 4 years ago by ncohen

  • Status changed from needs_info to needs_review

comment:9 Changed 4 years ago by vdelecroix

Hi,

You add a check at the begining but there is no associated specification in the doc...

Vincent

comment:10 Changed 4 years ago by vdelecroix

Very bad... I was not able to build the potentially new OA(12,3362)

sage: OA=thwart_lemma_3_5(12,16,207,13,13,13,11,complement=True)
Traceback (most recent call last):
...
.../incidence_structures.pyc in blocks(self, copy)
    520             if self._point_to_index is None:
    521                 from copy import deepcopy
--> 522                 return deepcopy(self._blocks)
    523             else:
    524                 return [[self._points[i] for i in b] for b in self._blocks]
...
MemoryError:

comment:11 Changed 4 years ago by ncohen

Well you can't build a OA(2^300+1,2^300) either... :-P

This deepcopy is a crime.

Nathann

comment:12 follow-up: Changed 4 years ago by vdelecroix

I said that because this one is smaller than the largest you add to the database: an OA(12,3994).

I will create a ticket to remove all deepcopy around (and in that case I think it is more like a toto.blocks(copy=False) which is needed).

Vincent

comment:13 in reply to: ↑ 12 Changed 4 years ago by ncohen

Yoooooooo !

I said that because this one is smaller than the largest you add to the database: an OA(12,3994).

I will create a ticket to remove all deepcopy around (and in that case I think it is more like a toto.blocks(copy=False) which is needed).

Please don't, I fixed that in at least two needs_review tickets already ^^;

Nathann

comment:14 follow-up: Changed 4 years ago by vdelecroix

All right...

I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at u/vdelecroix/16662. And by the way, I wrote a function which compute the set of new values you get by using this construction... should I add it somewhere? It is not exactly a find_XXX function, it go the other way around: play with parameters m,n,a,b,c,d and see if the m*n+a+b+c+d you obtain is already available.

Vincent

comment:15 in reply to: ↑ 14 ; follow-up: Changed 4 years ago by ncohen

Yo !

I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at u/vdelecroix/16662.

Oh. Are they the only other MOLS that we can build ? If there are more it may be worth writing a find function after all...

Nathann

comment:16 in reply to: ↑ 15 ; follow-up: Changed 4 years ago by vdelecroix

Replying to ncohen:

Yo !

I put constructions for OA(10,1048), OA(11,1524) and OA(12,3362) at u/vdelecroix/16662.

Oh. Are they the only other MOLS that we can build ? If there are more it may be worth writing a find function after all...

Seems to be the only one. If you want to write this find function that's also an option. But I am not sure you will ever get the OA(10,1046) in less than a minute. A possible alternative is to write a function that checks that:

  • those constructions are not useless (i.e. there are no other way to get them)
  • there is nothing more to build with these thwarts (this is pretty much what I wrote)

Vincent

comment:17 in reply to: ↑ 16 ; follow-up: Changed 4 years ago by ncohen

Yo !

Seems to be the only one. If you want to write this find function that's also an option.

I tried several times and it really was complicated. And required a lot of copy/paste.

But I am not sure you will ever get the OA(10,1046) in less than a minute

Well, you need to fill the cache at first, like for all constructions.

A possible alternative is to write a function that checks that:

  • those constructions are not useless (i.e. there are no other way to get them)

Why do you care so much about not having useless constructions ? It is good to have several ways to generate the same design. Some paths may be better than others, some paths may result in designs with different properties...

  • there is nothing more to build with these thwarts (this is pretty much what I wrote)

The problem with that is that it is true with the current OA that Sage can build. Wait a bit and new one will appear that may change that.

Nathann

comment:18 in reply to: ↑ 17 ; follow-up: Changed 4 years ago by vdelecroix

Replying to ncohen:

Yo !

Seems to be the only one. If you want to write this find function that's also an option.

I tried several times and it really was complicated. And required a lot of copy/paste.

So let us not do that.

A possible alternative is to write a function that checks that:

  • those constructions are not useless (i.e. there are no other way to get them)

Why do you care so much about not having useless constructions ? It is good to have several ways to generate the same design. Some paths may be better than others, some paths may result in designs with different properties...

All right. Then there are thousands of thwarts construction that you may want to consider...

  • there is nothing more to build with these thwarts (this is pretty much what I wrote)

The problem with that is that it is true with the current OA that Sage can build. Wait a bit and new one will appear that may change that.

This is cool. If a ticket creates new OA that allows more of this construction, then it will create an error in a doctest and the database will be automatically updated. Nevertheless, filling the cache up to 3000 is too long for a doctest. So I can keep my file under my pillow and run it from time to time.

Vincent

comment:19 in reply to: ↑ 18 Changed 4 years ago by ncohen

Yo !

So let us not do that.

Hmmmmmmmmm. I know it will haunt me :-P

All right. Then there are thousands of thwarts construction that you may want to consider...

Well, for the moment I try to fill the table until we stick to the Handbook.

This is cool. If a ticket creates new OA that allows more of this construction, then it will create an error in a doctest and the database will be automatically updated. Nevertheless, filling the cache up to 3000 is too long for a doctest. So I can keep my file under my pillow and run it from time to time.

Under a pillow is no place to store code. You can't wash away the stains of a memory leak.

Nathann

comment:20 Changed 4 years ago by vdelecroix

All right, I wrote the find_thwart_lemma_3_5... it works at least for the parameters that were found before. It is at u/vdelecroix/16662.

Vincent

Last edited 4 years ago by vdelecroix (previous) (diff)

comment:21 Changed 4 years ago by ncohen

Could you add some comments ? So many loops and tests, even an internal caching system and no comments is a bit tough.

Also, I don't think you should implement caching systems like that in a find_ function. If we need such caching (and we do) it should be implemented where it belongs, with a dedicated function.

I had in mind something like that:

designs.orthogonal_arrays.build(k,n)
designs.orthogonal_arrays.build_information(k,n)
designs.orthogonal_arrays.is_available(k,n)
designs.orthogonal_arrays.exists(k,n)
designs.orthogonal_arrays.max_k(n)
designs.orthogonal_arrays.all_n(k, min=None, max=None)

Though I like this interface, it means quite some amount of copy/paste in the code. For TD and MOLS. Maybe there is a trick.

Nathann

comment:22 follow-up: Changed 4 years ago by vdelecroix

Hello,

It is not only caching. With a more naive approach you would have

for a in xrange(n):
    if not orthogonal_arrays(k,a,existence=True):
        continue
    for b in xrange(a+1):
        if not orthogonal_arrays(k,b,existence=True):
            continue
        for c in xrange(b+1):
            if not orthogonal_arrays(k,c,existence=True):
                continue

About your caching system, what would be the difference between is_available and exists? And instead of all_n I would prefer nrange or something more pythonic.

I will write some doc but I won't change the code.

Vincent

comment:23 Changed 4 years ago by vdelecroix

It is at u/vdelecroix/16662.

Vincent

comment:24 in reply to: ↑ 22 Changed 4 years ago by ncohen

Yo !

It is not only caching. With a more naive approach you would have

What you do there is the work of the "all_n" function I wound like to write. I would like to replace the

for n in range(N-2):
    if is_prime(n)

with

for n in prime_powers(N-2):
   ...

But it is not possible because you use n to build the cache.

About your caching system, what would be the difference between is_available and exists? And instead of all_n I would prefer nrange or something more pythonic.

is_available would not trigger non-existence tests. But perhaps that is less useful since your rewrote the is_sum_of_squares thing.

Nathann

comment:25 Changed 4 years ago by ncohen

With the find function

sage: %time designs.orthogonal_array(None,1000,existence=1)
CPU times: user 24.6 s, sys: 52 ms, total: 24.6 s
Wall time: 24.6 s
9

Without

sage: %time designs.orthogonal_array(None,1000,existence=1)
CPU times: user 14.5 s, sys: 32 ms, total: 14.5 s
Wall time: 14.5 s
9

comment:26 follow-up: Changed 4 years ago by ncohen

Yo !

It seems that the list "good" is very very often useless. I wanted to know how often it was used and added a "print n" right before the loop "for a in good"

@@ -999,6 +999,7 @@ def find_thwart_lemma_3_5(k,N):
                     orthogonal_array(k,m+2,existence=True)):
                 continue

+            print "A",n
             for a in good:
                 if a > N-n*m:
                     break
@@ -1023,7 +1024,7 @@ def find_thwart_lemma_3_5(k,N):
                     orthogonal_array(k,m+2,existence=True) and
                     orthogonal_array(k,m+3,existence=True)):
                 continue
-
+            print "B",n
             for a in good:
                 if a > N-n*m:
                     break         

Guess how many times they appear when you type designs.orthogonal_array(None,1000,existence=1): never.

Which means that for 1000 at least computing all OA(k,n,existence=True) for n<N is all for naught (but takes 10s).

Nathann

comment:27 Changed 4 years ago by ncohen

By the way, shouldn't range(N-3) be range(N-2) ? The last element is not included in the list, so m=a=b=c=1 is not tested.

Nathann

comment:28 in reply to: ↑ 26 Changed 4 years ago by vdelecroix

Hello,

Moving the good part would change nothing to the timings. You already have the following for the m loop which is the cost of everything

            if not (orthogonal_array(k,m+0,existence=True) and
                    orthogonal_array(k,m+1,existence=True) and
                    orthogonal_array(k,m+2,existence=True)):
                continue

If you feel like this function is too costly, we can remove it from the find_recursive_constructions and just store the interesting values:

if (k,n) not in  ((10,1046), (10,1048), (10,1059), (11,1524),
                  (11,2164), (12,3362), (12,3992),  (12,3994)):
    return False

About your question, you were right and we can even go up to n=N-1.

Tell me what you think about the first question and I will add a commit.

Vincent

comment:29 Changed 4 years ago by vdelecroix

Commit a u/vdelecroix/16662.

  • I removed the good
  • I cut the loop as much as possible

Vincent

comment:30 follow-up: Changed 4 years ago by ncohen

Helloooooooo Vincent !

Okay... So let's do that:

1) You saved some time since the first version, and although it is still slower than needed it's not that awful that we should keep this construction out of those that are automatically tested

2) Those triple loops are really awful things, so we will have to do something about it

3) I am thinking of a data structure that would be useful to us, and whose purpose is to store things like "all n such that there exists a OA(k,n)" or even "all m such that there exists a OA(k,m), OA(k,m+1), OA(k,m+2)".

What it stores: a set of integers defined by a boolean function What it is meant to answer: give the list of integers between x and y such that the boolean function is satisfied.

Of course we want to minimize the number of boolean function queries. Even though it takes spaces, I am thinking of something like that:

An array which associates to (any) integer n:

a) if f(n) is computed: the smallest integer n'>=n such that f(n') is True or has not been computed yet.

b) if f(n) is not computed yet: None

This way, if you want to get all solutions to f(n) is True between x and y, here is what you do:

current=x
while current<=y:
    if array[current] is None:
        array[current] = f(current)
    elif array[current] == current:
        yield current
        current+=1
    else:
        new_current = array[current]
        # if array[a]==b and array[b]==c then let's define array[a]=c
        if array[new_current] is not None: 
            array[current] = array[new_current]
        current = new_current

This may be cool if we ever implement the all_n or range_n function.

And we could use it in conjunction with a real function to solve the partition problem, i.e. "try to find integers from a set S whose sum is equal or lequal to C".

Anyway.

1) I merged all your commits into one, as some were undoing the previous ones

2) I added a small commit on top of it. In particular some additional constraint on k that removes fake '+' in the n<100 area :-D

Your code is good otherwise.

Nathann

comment:31 Changed 4 years ago by git

  • Commit changed from 8fcaed066b7f110af10c6a471cbebcad370bd312 to 2f936c57507cbd475f860a76b94ed44882d7de7b

Branch pushed to git repo; I updated commit sha1. New commits:

adf5105trac #16662: find_thwart_lemma_3_5
2f936c5trac #16662: Review

comment:32 in reply to: ↑ 30 ; follow-up: Changed 4 years ago by vdelecroix

  • Status changed from needs_review to needs_info

Hello,

Replying to ncohen:

3) I am thinking of a data structure that would be useful to us, and whose purpose is to store things like "all n such that there exists a OA(k,n)" or even "all m such that there exists a OA(k,m), OA(k,m+1), OA(k,m+2)".

+1

What it stores: a set of integers defined by a boolean function What it is meant to answer: give the list of integers between x and y such that the boolean function is satisfied.

Of course we want to minimize the number of boolean function queries. Even though it takes spaces, I am thinking of something like that:

An array which associates to (any) integer n:

a) if f(n) is computed: the smallest integer n'>=n such that f(n') is True or has not been computed yet.

b) if f(n) is not computed yet: None

[...]

This may be cool if we ever implement the all_n or range_n function.

And:

  • what is the next value (like next_prime)
  • what is the previous value (like previous_prime)

The problem with your approach is that you store a lot of data. It is perhaps not as bad as what I imagine.

Questions:

1) why do you stop n at N-3 in the first loop? I think that n=N-1, m=1, a=1, b=0, c=0 is a valid input:

sage: OA = thwart_lemma_3_5(3,13,1,1,0,0)
sage: is_orthogonal_array(OA,3,14,2)
True

So the bound should be N-1. Perhaps the particular case of m=1 is taken care by another construction, in that case, this must be documented (and the upper bound for n updated accordingly).

2) the check d <= n is not necessary because of the lower bound on c.

I edit my commit at u/vdelecroix/16662 to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.

Vincent

comment:33 in reply to: ↑ 32 ; follow-up: Changed 4 years ago by ncohen

Yooooooo !

And:

  • what is the next value (like next_prime)

That's easy to do. It would be the result of all_n(min=current_value,max=None).next()

  • what is the previous value (like previous_prime)

This would be harder, but I'm not sure it is needed either O_o

The problem with your approach is that you store a lot of data. It is perhaps not as bad as what I imagine.

it bothered me at first but we are thinking of k lists (and k<50 usually) or size around 2000 at most. How bad can that be ? If it were stored in C it would represent 500ko :-P

1) why do you stop n at N-3 in the first loop? I think that n=N-1, m=1, a=1, b=0, c=0 is a valid input:

Argg... Because I thought we could assume a,c,b>0, I did some changes then removed them, but I forgot one. If d=0 and c=0 I think that what the code does is roughly equivalent to Wilson's construction wth 2 truncated columns. But well, you are right, I will undo that !

2) the check d <= n is not necessary because of the lower bound on c.

I edit my commit at u/vdelecroix/16662 to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.

I just downloaded your branch but did not see anything new. I just merged it with the beta0 but I will add a commit in a second !

Nathann

comment:34 in reply to: ↑ 33 ; follow-up: Changed 4 years ago by vdelecroix

Replying to ncohen:

2) the check d <= n is not necessary because of the lower bound on c.

I edit my commit at u/vdelecroix/16662 to simplify the code with respect to 2). As soon as 1) is solve, I would be happy to set this to positive review.

I just downloaded your branch but did not see anything new. I just merged it with the beta0 but I will add a commit in a second !

I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.

Vincent

comment:35 in reply to: ↑ 34 ; follow-up: Changed 4 years ago by ncohen

Yo !

I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.

Oh, I see. I think that this N-2 is correct after all, however. If n=N-2, then it means that there are at most two points in the last 4 columns, i.e. that at most two columns are used. And because all pairs of points are equivalent, this is already handled by the wilson construction with two truncated columns.

I had written some code which I hoped would improve the performances of this function, but well... It produced no difference whatsoever. And I don't know why yet, so let's get this in.

Perhaps the cached "all_n" function will change something :-/

Nathann

comment:36 in reply to: ↑ 35 Changed 4 years ago by vdelecroix

Replying to ncohen:

Yo !

I edited my commit. I did not add a commit. My last commit in my branch is 68a2d99 and not 2f936c5.

Oh, I see. I think that this N-2 is correct after all, however. If n=N-2, then it means that there are at most two points in the last 4 columns, i.e. that at most two columns are used. And because all pairs of points are equivalent, this is already handled by the wilson construction with two truncated columns.

I believe it but please write it in a comment!

I had written some code which I hoped would improve the performances of this function, but well... It produced no difference whatsoever. And I don't know why yet, so let's get this in.

Perhaps the cached "all_n" function will change something :-/

I hope it will.

Vincent

comment:37 Changed 4 years ago by git

  • Commit changed from 2f936c57507cbd475f860a76b94ed44882d7de7b to a9b594f3991f67cf7d2c82a91d8c332c47c35ba7

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

2a64fa2trac #16662: find_thwart_lemma_3_5
68a2d99trac #16662: Review
7e0480etrac #16662: Merged with 6.4.beta0
a9b594ftrac #16662: A comment about n<N-2

comment:38 Changed 4 years ago by vdelecroix

  • Status changed from needs_info to positive_review

Great great... and of lot of rebase for the follow-up tickets!

Vincent

comment:39 Changed 4 years ago by ncohen

Thanks !

Nathann

comment:40 Changed 4 years ago by vdelecroix

  • Status changed from positive_review to needs_work

There is a (simple) conflict with #16604.

Vincent

comment:41 Changed 4 years ago by vdelecroix

see at u/vdelecroix/16662.

Vincent

comment:42 Changed 4 years ago by git

  • Commit changed from a9b594f3991f67cf7d2c82a91d8c332c47c35ba7 to 7a73e74549a080ac7ac139241521a871d667cd45

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

3cf3983trac #16604: Reviewing the review
d612056trac #16604: on the doc again
bff470etrac #16604: small check of dimensions
6c21904trac #16797: correct a row/column inversion
48b6902trac #16604: merge #16797
579e75btrac #16604: input check + doctest
e1a83d0trac #16604: Optional check flag
05c6915trac #16604: Variable rename and list->set
24c4f7ftrac #16604: Merge with updated #16797
7a73e74trac #16662: Merge with updated #16604

comment:43 Changed 4 years ago by ncohen

  • Status changed from needs_work to positive_review

comment:44 Changed 4 years ago by vdelecroix

  • Reviewers set to Vincent Delecroix

comment:45 Changed 4 years ago by vbraun

  • Branch changed from u/ncohen/16662 to 7a73e74549a080ac7ac139241521a871d667cd45
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.