Opened 3 years ago
Closed 3 years ago
#25125 closed enhancement (fixed)
dancing links: find all solutions in parallel
Reported by:  slabbe  Owned by:  

Priority:  major  Milestone:  sage8.3 
Component:  combinatorics  Keywords:  thursdaysbdx 
Cc:  Merged in:  
Authors:  Sébastien Labbé  Reviewers:  Julian Rüth, Vincent Delecroix, Vincent Klein 
Report Upstream:  N/A  Work issues:  
Branch:  c985179 (Commits, GitHub, GitLab)  Commit:  c9851791ab1b04561be9e7cd18e4f706d44b8545 
Dependencies:  Stopgaps: 
Description
Add all_solutions
method so that one can do:
sage: S = Subsets(range(4)) sage: rows = map(list, S) sage: dlx = dlx_solver(rows) sage: dlx Dancing links solver for 4 columns and 16 rows sage: dlx.number_of_solutions() 15 sage: sorted([sorted(s) for s in dlx.all_solutions(ncpus=8)]) [[1, 2, 3, 4], [1, 2, 10], [1, 3, 9], [1, 4, 8], [1, 14], [2, 3, 7], [2, 4, 6], [2, 13], [3, 4, 5], [3, 12], [4, 11], [5, 10], [6, 9], [7, 8], [15]]
Change History (47)
comment:1 Changed 3 years ago by
 Branch set to u/slabbe/25125
 Commit set to c4aa5950b532924fe9952f7b2ec8ecc42cef45be
comment:2 Changed 3 years ago by
 Status changed from new to needs_review
comment:3 Changed 3 years ago by
spliting > splitting
add a test / example for parallel?
comment:4 Changed 3 years ago by
 Commit changed from c4aa5950b532924fe9952f7b2ec8ecc42cef45be to 71b86832e62cec0b6fb7ba26e75514ee64adabfc
Branch pushed to git repo; I updated commit sha1. New commits:
71b8683  25125: fix doc

comment:5 followup: ↓ 9 Changed 3 years ago by
Since tests are often run in parallel, I know Volker does not like when doctests use more than one cpus. Instead I added a sentence explaining how to use ncpus
argument.
reneeds_review.
comment:6 Changed 3 years ago by
I think you forgot to set the "Author" on this ticket.
comment:7 Changed 3 years ago by
 Reviewers set to Julian Rüth
 Status changed from needs_review to needs_info
I think it's a bit weird that the ncpus example does not actually set ncpus>1. It's understandable from your comment on the ticket why you did that but it's not understandable from looking at the code. I think it would be nice to add a comment to the docstring.
Why do you set ncpus=1
as a default? Shouldn't the default just be None
? During doctests that is going to be "2" which seems to be Ok for doctesting, see ncpus.py
.
comment:8 Changed 3 years ago by
comment:9 in reply to: ↑ 5 Changed 3 years ago by
Replying to slabbe:
I know Volker does not like when doctests use more than one cpus
For reference, here is the previous discussion on this aspect: https://trac.sagemath.org/ticket/24424#comment:3
At #24424, in the end, I wrote ncpus=2
in the doctests.
comment:10 Changed 3 years ago by
 Commit changed from 71b86832e62cec0b6fb7ba26e75514ee64adabfc to 30738114df9b404c0aa2ec0b2e475f3ade6f8f62
Branch pushed to git repo; I updated commit sha1. New commits:
3073811  25125: setting ncpus=None as default

comment:11 Changed 3 years ago by
 Status changed from needs_info to needs_review
comment:12 Changed 3 years ago by
If parallelity is what people usually want in #24424 as well, then None
should be the default there as well I think.
comment:13 Changed 3 years ago by
 Status changed from needs_review to needs_work
The docstrings do not really work anymore now. It's not explained what None
means. Also the example about parallel computation is not really different from the "nonparallel" one that is parallel as well.
comment:14 Changed 3 years ago by
Ok, thanks for your comment. I will rework on this in a few days. I need to prepare for a travel.
comment:15 Changed 3 years ago by
 Commit changed from 30738114df9b404c0aa2ec0b2e475f3ade6f8f62 to 0a86ab2331766b2c6c99a541730afb193c0ac7c3
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
9448bfd  25125: add all_solutions method to dancing links

f6257b8  25125: fix doc

876bfd3  25125: setting ncpus=None as default

606c156  25125: improved sentences before some doctests

0a86ab2  rewriting number_of_solutions method like one_solution and all_solutions

comment:16 Changed 3 years ago by
 Status changed from needs_work to needs_review
I rebased the branch on top of rc3.
I improved the sentences before some doctests to avoid problems mentionned by saraedum.
I also rewrote number_of_solutions
method to be like one_solution
and all_solutions
methods. I deleted the _number_of_solutions_iterator
method. After changing ncpus=1
to ncpus=None
, there was an infinite loop because _number_of_solutions_iterator
was calling number_of_solutions
which was calling _number_of_solutions_iterator
again because ncpus
was not set to 1 by default. I think the code makes more sense like it is now.
Needs review.
comment:17 Changed 3 years ago by
 Keywords thursdaysbdx added
comment:18 followup: ↓ 24 Changed 3 years ago by
Why
sage: x = dlx_solver(rows)  sage: x.number_of_solutions() + sage: x.number_of_solutions(ncpus=1) 3  sage: x.number_of_solutions() + sage: x.number_of_solutions(ncpus=1) 0
If the behavior with and without ncpus=1
are different, this should be specified.
comment:19 Changed 3 years ago by
Why would you split stuff here
+ return sum(val for ((args, kwds), val) in nb_sol(indices))
can be
+ return sum(x[1] for x in nb_sol(indices))
comment:20 Changed 3 years ago by
 Commit changed from 0a86ab2331766b2c6c99a541730afb193c0ac7c3 to f20e6030bc8b4bdee274fb093e802a180f414d59
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
a7dec68  25125: add all_solutions method to dancing links

0afeffa  25125: fix doc

3d255c8  25125: setting ncpus=None as default

7c27fc8  25125: improved sentences before some doctests

78c50e3  rewriting number_of_solutions method like one_solution and all_solutions

f20e603  25125: allow reinitialization of dlx solver

comment:21 Changed 3 years ago by
I found a way to reinitialize the dlx solver.
The branch was rebased on rc4.
Needs review.
comment:22 Changed 3 years ago by
18 still applies.
comment:23 Changed 3 years ago by
 Commit changed from f20e6030bc8b4bdee274fb093e802a180f414d59 to 22306233cc4b7eeb34aef812fbff856e312d1ea9
Branch pushed to git repo; I updated commit sha1. New commits:
2230623  25125: reinitialize the dlx solver when needed

comment:24 in reply to: ↑ 18 Changed 3 years ago by
Replying to vdelecroix:
If the behavior with and without
ncpus=1
are different, this should be specified.
The behavior is not different anymore. Reinitialization is performed automatically when needed.
comment:25 followup: ↓ 26 Changed 3 years ago by
 Milestone changed from sage8.2 to sage8.3
 Reviewers changed from Julian Rüth to Julian Rüth, Vincent Delecroix
 Status changed from needs_review to needs_work
Replying to slabbe:
Replying to vdelecroix:
If the behavior with and without
ncpus=1
are different, this should be specified.The behavior is not different anymore. Reinitialization is performed automatically when needed.
Then why modifying the test by specifying ncpus=1
? You should not remove doctests and it would be better anyway to test with ncpus=1
, ncpus=2
and ncpus=None
.
comment:26 in reply to: ↑ 25 Changed 3 years ago by
 Status changed from needs_work to needs_review
Replying to vdelecroix:
Then why modifying the test by specifying
ncpus=1
? You should not remove doctests and it would be better anyway to test withncpus=1
,ncpus=2
andncpus=None
.
Because before the input was ncpus=1
:
 def number_of_solutions(self, ncpus=1, column=None): + def number_of_solutions(self, ncpus=None, column=None):
Therefore, the previous doctest with no input was testing the case when ncpus=1
. Since the new default value for ncpus is None
, if I want to *avoid* removing a doctest, I need to specifically set ncpus=1
in the doctest.
If ncpus
is not equal to 1
, then the parallel code applies which uses restrict
which temporarily creates new dlx solvers which are used only once before being deleted. So to me, there is no need to test with ncpus=1
, ncpus=2
and ncpus=None
.
Thus, I rechange the status to needs review. Tell me if you insist on this aspect and if you want me to add those doctests.
comment:28 Changed 3 years ago by
Thanks to both of you for the review. I am very happy with the state it has reached now.
comment:29 Changed 3 years ago by
 Branch changed from u/slabbe/25125 to 22306233cc4b7eeb34aef812fbff856e312d1ea9
 Resolution set to fixed
 Status changed from positive_review to closed
comment:30 Changed 3 years ago by
 Commit 22306233cc4b7eeb34aef812fbff856e312d1ea9 deleted
 Resolution fixed deleted
 Status changed from closed to new
I'm getting random failures, especially on OSX
sage t long src/sage/games/sudoku.py ********************************************************************** File "src/sage/games/sudoku.py", line 549, in sage.games.sudoku.Sudoku.solve Failed example: next(h.solve(algorithm='dlx')) Exception raised: Traceback (most recent call last): File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 562, in _run self.compile_and_execute(example, compiler, test.globs) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 972, in compile_and_execute exec(compiled, globs) File "<doctest sage.games.sudoku.Sudoku.solve[3]>", line 1, in <module> next(h.solve(algorithm='dlx')) StopIteration ********************************************************************** 1 item had failures: 1 of 14 in sage.games.sudoku.Sudoku.solve [96 tests, 1 failure, 2.38 s]  sage t long src/sage/games/sudoku.py # 1 doctest failed 
or
sage t long src/sage/games/sudoku.py ********************************************************************** File "src/sage/games/sudoku.py", line 863, in sage.games.sudoku.Sudoku.dlx.make_row Failed example: len(list(h.solve(algorithm='dlx'))) # indirect doctest Expected: 5 Got: 0 ********************************************************************** 1 item had failures: 1 of 3 in sage.games.sudoku.Sudoku.dlx.make_row [96 tests, 1 failure, 2.40 s]  sage t long src/sage/games/sudoku.py # 1 doctest failed 
Try runnning tests in a loop, e.g.
for i in `seq 0 1000` ; do ./sage t long src/sage/games/sudoku.py ; done
comment:31 Changed 3 years ago by
 Branch changed from 22306233cc4b7eeb34aef812fbff856e312d1ea9 to u/slabbe/25125
 Commit set to d13e0ec76680814329f26aa3860c54065168e469
 Status changed from new to needs_review
On Ubuntu, I can not reproduce the issue.
The methods I am adding/modifying to the dancing links class are not used by the sudoku solver which uses directly the search
and get_solution
methods. The only thing I changed that may have affected the sudoku thing is the way the __init__
is done which is now done with the reinitalize
method. Maybe OSX does not like the line
self._x = dancing_links()
that was in the reinitialize
which was called by the __init__
? So, I added a new commit which does
self._x = dancing_links()
only for the reinitialization and avoids it for __init__
as it was before. Can somebody with OSX test whether this fixes the issue?
New commits:
b4fdeb5  25125: add all_solutions method to dancing links

5ddc189  25125: fix doc

0842d46  25125: setting ncpus=None as default

11b3894  25125: improved sentences before some doctests

0f320ea  rewriting number_of_solutions method like one_solution and all_solutions

2fce4fe  25125: allow reinitialization of dlx solver

a5c4667  25125: reinitialize the dlx solver when needed

d13e0ec  25125: Calling constructor dancing_links() only in reinitialize not in __init__

comment:32 Changed 3 years ago by
 Reviewers changed from Julian Rüth, Vincent Delecroix to Julian Rüth, Vincent Delecroix, Vincent Klein
Tests for i in `seq 0 1000` ; do ./sage t long src/sage/games/sudoku.py ; done
has been successfully passed on OSX (High Sierra) with d13e0ec
.
comment:33 Changed 3 years ago by
And are you able to reproduce the problem mentionned by Volker with commit a5c4667 ?
comment:34 Changed 3 years ago by
Yes ! I should have mentioned that.
comment:35 Changed 3 years ago by
 Status changed from needs_review to positive_review
d13e0ec fix the problem.
comment:36 Changed 3 years ago by
Thanks for your help testing on OSX.
Sébastien
comment:37 Changed 3 years ago by
 Status changed from positive_review to needs_work
sage t long src/sage/combinat/matrices/dancing_links.pyx ********************************************************************** File "src/sage/combinat/matrices/dancing_links.pyx", line 683, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.all_solutions Failed example: [sorted(s) for s in S] Expected: [[0, 1], [2, 3], [4, 5]] Got: [[0, 1], [4, 5], [2, 3]] ********************************************************************** 1 item had failures: 1 of 16 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.all_solutions [215 tests, 1 failure, 7.16 s]
comment:38 Changed 3 years ago by
comment:39 Changed 3 years ago by
 Commit changed from d13e0ec76680814329f26aa3860c54065168e469 to 25da49e6e04ca7fed46442d27f66d3dc297362ab
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
9a158e6  25125: add all_solutions method to dancing links

89b5901  25125: fix doc

c54a64d  25125: setting ncpus=None as default

cf7b435  25125: improved sentences before some doctests

004126a  rewriting number_of_solutions method like one_solution and all_solutions

d18e090  25125: allow reinitialization of dlx solver

c05e73b  25125: reinitialize the dlx solver when needed

10f885d  25125: Calling constructor dancing_links() only in reinitialize not in __init__

25da49e  25125: sorted lists in doctests

comment:40 Changed 3 years ago by
 Status changed from needs_work to needs_review
Sorry Volker. I was sure that I put sorted everywhere needed, but I forgot one.
Rebased on 8.3.beta2.
I added a commit which added the sorted
where the doctest problem occurs. I also added two other sorted
elsewhere.
Needs review.
comment:41 Changed 3 years ago by
sage t a
works on ubuntu and OSX ( except for ext_rep.py and external.py killed due to abort signal, which is not related to this ticket).
comment:42 Changed 3 years ago by
Random errors on dancing_links.pyx on OSX.
sage admin$ for i in `seq 0 50` ; do ./sage t long src/sage/combinat/matrices/dancing_links.pyx ; done ... sage t long warnlong 46.8 src/sage/combinat/matrices/dancing_links.pyx [215 tests, 5.41 s]  All tests passed!  Total time for all tests: 5.5 seconds cpu time: 4.8 seconds cumulative wall time: 5.4 seconds macbookprosierra02:sage admin$ for i in `seq 0 50` ; do ./sage t long src/sage/combinat/matrices/dancing_links.pyx ; done Running doctests with ID 201805291130559317aab4. Git branch: t/25125/25125 Using optional=gfortran,gmpy2,mpir,python2,sage Doctesting 1 file. sage t long warnlong 46.8 src/sage/combinat/matrices/dancing_links.pyx ********************************************************************** File "src/sage/combinat/matrices/dancing_links.pyx", line 603, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution Failed example: sorted(d.one_solution()) Expected: [0, 1] Got: [2, 3] ********************************************************************** 1 item had failures: 1 of 17 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution [215 tests, 1 failure, 5.46 s]  sage t long warnlong 46.8 src/sage/combinat/matrices/dancing_links.pyx # 1 doctest failed  Total time for all tests: 5.6 seconds cpu time: 4.9 seconds cumulative wall time: 5.5 seconds Running doctests with ID 20180529113104c04446f2. Git branch: t/25125/25125 Using optional=gfortran,gmpy2,mpir,python2,sage Doctesting 1 file. sage t long warnlong 46.8 src/sage/combinat/matrices/dancing_links.pyx ********************************************************************** File "src/sage/combinat/matrices/dancing_links.pyx", line 634, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution Failed example: sorted(dlx.one_solution()) Expected: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] Got: [2, 3, 4, 5, 6, 7, 9, 10, 11, 18] ********************************************************************** 1 item had failures: 1 of 17 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution [215 tests, 1 failure, 5.42 s]  sage t long warnlong 46.8 src/sage/combinat/matrices/dancing_links.pyx # 1 doctest failed
comment:43 Changed 3 years ago by
 Status changed from needs_review to needs_work
comment:44 Changed 3 years ago by
 Commit changed from 25da49e6e04ca7fed46442d27f66d3dc297362ab to c9851791ab1b04561be9e7cd18e4f706d44b8545
Branch pushed to git repo; I updated commit sha1. New commits:
c985179  25125: sorting every list even those of size one in doctests

comment:45 Changed 3 years ago by
 Status changed from needs_work to needs_review
Thanks for your carefull review Vincent K. We avoided a nuisance to Volker.
The failures you found come from the fact that we change the ncpus=1
to ncpus=None
in the one_solution
method which made deterministic doctests become not deterministic. If the doctest is run with more than one cpu, than the first solution found depend on the speed of the computation made on each cpus.
I made the doctests for one_solution
independent of the first solution found.
I also added a bunch of sorted
(even for list of size one!) to prevents other similar problem.
Hopefully, these were the last non deterministic doctests to be fixed.
Needs review.
comment:46 Changed 3 years ago by
 Status changed from needs_review to positive_review
All tests passed (with several iteration of sage t ...dancing_links.pyx
) both on unbuntu and osx.
comment:47 Changed 3 years ago by
 Branch changed from u/slabbe/25125 to c9851791ab1b04561be9e7cd18e4f706d44b8545
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
25125: add all_solutions method to dancing links