Opened 6 months ago

Closed 5 months ago

#25125 closed enhancement (fixed)

dancing links: find all solutions in parallel

Reported by: slabbe Owned by:
Priority: major Milestone: sage-8.3
Component: combinatorics Keywords: thursdaysbdx
Cc: Merged in:
Authors: Sébastien Labbé Reviewers: Julian Rüth, Vincent Delecroix, Vincent Klein
Report Upstream: N/A Work issues:
Branch: c985179 (Commits) Commit: c9851791ab1b04561be9e7cd18e4f706d44b8545
Dependencies: Stopgaps:

Description

Add all_solutions method so that one can do:

sage: S = Subsets(range(4))
sage: rows = map(list, S)
sage: dlx = dlx_solver(rows)
sage: dlx
Dancing links solver for 4 columns and 16 rows
sage: dlx.number_of_solutions()
15
sage: sorted([sorted(s) for s in dlx.all_solutions(ncpus=8)])
[[1, 2, 3, 4],
 [1, 2, 10],
 [1, 3, 9],
 [1, 4, 8],
 [1, 14],
 [2, 3, 7],
 [2, 4, 6],
 [2, 13],
 [3, 4, 5],
 [3, 12],
 [4, 11],
 [5, 10],
 [6, 9],
 [7, 8],
 [15]]

Change History (47)

comment:1 Changed 6 months ago by slabbe

  • Branch set to u/slabbe/25125
  • Commit set to c4aa5950b532924fe9952f7b2ec8ecc42cef45be

New commits:

c4aa59525125: add all_solutions method to dancing links

comment:2 Changed 6 months ago by slabbe

  • Status changed from new to needs_review

comment:3 Changed 6 months ago by mantepse

spliting -> splitting

add a test / example for parallel?

comment:4 Changed 6 months ago by git

  • Commit changed from c4aa5950b532924fe9952f7b2ec8ecc42cef45be to 71b86832e62cec0b6fb7ba26e75514ee64adabfc

Branch pushed to git repo; I updated commit sha1. New commits:

71b868325125: fix doc

comment:5 follow-up: Changed 6 months ago by slabbe

Since tests are often run in parallel, I know Volker does not like when doctests use more than one cpus. Instead I added a sentence explaining how to use ncpus argument.

re-needs_review.

comment:6 Changed 6 months ago by saraedum

I think you forgot to set the "Author" on this ticket.

comment:7 Changed 6 months ago by saraedum

  • Reviewers set to Julian Rüth
  • Status changed from needs_review to needs_info

I think it's a bit weird that the ncpus example does not actually set ncpus>1. It's understandable from your comment on the ticket why you did that but it's not understandable from looking at the code. I think it would be nice to add a comment to the docstring.

Why do you set ncpus=1 as a default? Shouldn't the default just be None? During doctests that is going to be "2" which seems to be Ok for doctesting, see ncpus.py.

comment:8 Changed 6 months ago by slabbe

  • Authors set to Sébastien Labbé

comment:9 in reply to: ↑ 5 Changed 6 months ago by slabbe

Replying to slabbe:

I know Volker does not like when doctests use more than one cpus

For reference, here is the previous discussion on this aspect: https://trac.sagemath.org/ticket/24424#comment:3

At #24424, in the end, I wrote ncpus=2 in the doctests.

comment:10 Changed 6 months ago by git

  • Commit changed from 71b86832e62cec0b6fb7ba26e75514ee64adabfc to 30738114df9b404c0aa2ec0b2e475f3ade6f8f62

Branch pushed to git repo; I updated commit sha1. New commits:

307381125125: setting ncpus=None as default

comment:11 Changed 6 months ago by slabbe

  • Status changed from needs_info to needs_review

comment:12 Changed 6 months ago by saraedum

If parallelity is what people usually want in #24424 as well, then None should be the default there as well I think.

comment:13 Changed 6 months ago by saraedum

  • Status changed from needs_review to needs_work

The docstrings do not really work anymore now. It's not explained what None means. Also the example about parallel computation is not really different from the "non-parallel" one that is parallel as well.

comment:14 Changed 6 months ago by slabbe

Ok, thanks for your comment. I will rework on this in a few days. I need to prepare for a travel.

comment:15 Changed 6 months ago by git

  • Commit changed from 30738114df9b404c0aa2ec0b2e475f3ade6f8f62 to 0a86ab2331766b2c6c99a541730afb193c0ac7c3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

9448bfd25125: add all_solutions method to dancing links
f6257b825125: fix doc
876bfd325125: setting ncpus=None as default
606c15625125: improved sentences before some doctests
0a86ab2rewriting number_of_solutions method like one_solution and all_solutions

comment:16 Changed 6 months ago by slabbe

  • Status changed from needs_work to needs_review

I rebased the branch on top of rc3.

I improved the sentences before some doctests to avoid problems mentionned by saraedum.

I also rewrote number_of_solutions method to be like one_solution and all_solutions methods. I deleted the _number_of_solutions_iterator method. After changing ncpus=1 to ncpus=None, there was an infinite loop because _number_of_solutions_iterator was calling number_of_solutions which was calling _number_of_solutions_iterator again because ncpus was not set to 1 by default. I think the code makes more sense like it is now.

Needs review.

comment:17 Changed 6 months ago by slabbe

  • Keywords thursdaysbdx added

comment:18 follow-up: Changed 6 months ago by vdelecroix

Why

             sage: x = dlx_solver(rows)
-            sage: x.number_of_solutions()
+            sage: x.number_of_solutions(ncpus=1)
             3
-            sage: x.number_of_solutions()
+            sage: x.number_of_solutions(ncpus=1)
             0

If the behavior with and without ncpus=1 are different, this should be specified.

comment:19 Changed 6 months ago by vdelecroix

Why would you split stuff here

+ return sum(val for ((args, kwds), val) in nb_sol(indices))

can be

+ return sum(x[1] for x in nb_sol(indices))

comment:20 Changed 6 months ago by git

  • Commit changed from 0a86ab2331766b2c6c99a541730afb193c0ac7c3 to f20e6030bc8b4bdee274fb093e802a180f414d59

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a7dec6825125: add all_solutions method to dancing links
0afeffa25125: fix doc
3d255c825125: setting ncpus=None as default
7c27fc825125: improved sentences before some doctests
78c50e3rewriting number_of_solutions method like one_solution and all_solutions
f20e60325125: allow reinitialization of dlx solver

comment:21 Changed 6 months ago by slabbe

I found a way to reinitialize the dlx solver.

The branch was rebased on rc4.

Needs review.

comment:22 Changed 6 months ago by vdelecroix

18 still applies.

comment:23 Changed 6 months ago by git

  • Commit changed from f20e6030bc8b4bdee274fb093e802a180f414d59 to 22306233cc4b7eeb34aef812fbff856e312d1ea9

Branch pushed to git repo; I updated commit sha1. New commits:

223062325125: reinitialize the dlx solver when needed

comment:24 in reply to: ↑ 18 Changed 6 months ago by slabbe

Replying to vdelecroix:

If the behavior with and without ncpus=1 are different, this should be specified.

The behavior is not different anymore. Reinitialization is performed automatically when needed.

comment:25 follow-up: Changed 6 months ago by vdelecroix

  • Milestone changed from sage-8.2 to sage-8.3
  • Reviewers changed from Julian Rüth to Julian Rüth, Vincent Delecroix
  • Status changed from needs_review to needs_work

Replying to slabbe:

Replying to vdelecroix:

If the behavior with and without ncpus=1 are different, this should be specified.

The behavior is not different anymore. Reinitialization is performed automatically when needed.

Then why modifying the test by specifying ncpus=1? You should not remove doctests and it would be better anyway to test with ncpus=1, ncpus=2 and ncpus=None.

comment:26 in reply to: ↑ 25 Changed 6 months ago by slabbe

  • Status changed from needs_work to needs_review

Replying to vdelecroix:

Then why modifying the test by specifying ncpus=1? You should not remove doctests and it would be better anyway to test with ncpus=1, ncpus=2 and ncpus=None.

Because before the input was ncpus=1 :

-    def number_of_solutions(self, ncpus=1, column=None):
+    def number_of_solutions(self, ncpus=None, column=None):

Therefore, the previous doctest with no input was testing the case when ncpus=1. Since the new default value for ncpus is None, if I want to *avoid* removing a doctest, I need to specifically set ncpus=1 in the doctest.

If ncpus is not equal to 1, then the parallel code applies which uses restrict which temporarily creates new dlx solvers which are used only once before being deleted. So to me, there is no need to test with ncpus=1, ncpus=2 and ncpus=None.

Thus, I rechange the status to needs review. Tell me if you insist on this aspect and if you want me to add those doctests.

comment:27 Changed 6 months ago by vdelecroix

  • Status changed from needs_review to positive_review

Ho I see.

comment:28 Changed 6 months ago by slabbe

Thanks to both of you for the review. I am very happy with the state it has reached now.

comment:29 Changed 5 months ago by vbraun

  • Branch changed from u/slabbe/25125 to 22306233cc4b7eeb34aef812fbff856e312d1ea9
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:30 Changed 5 months ago by vbraun

  • Commit 22306233cc4b7eeb34aef812fbff856e312d1ea9 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

I'm getting random failures, especially on OSX

sage -t --long src/sage/games/sudoku.py
**********************************************************************
File "src/sage/games/sudoku.py", line 549, in sage.games.sudoku.Sudoku.solve
Failed example:
    next(h.solve(algorithm='dlx'))
Exception raised:
    Traceback (most recent call last):
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 562, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 972, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.games.sudoku.Sudoku.solve[3]>", line 1, in <module>
        next(h.solve(algorithm='dlx'))
    StopIteration
**********************************************************************
1 item had failures:
   1 of  14 in sage.games.sudoku.Sudoku.solve
    [96 tests, 1 failure, 2.38 s]
----------------------------------------------------------------------
sage -t --long src/sage/games/sudoku.py  # 1 doctest failed
----------------------------------------------------------------------

or

sage -t --long src/sage/games/sudoku.py
**********************************************************************
File "src/sage/games/sudoku.py", line 863, in sage.games.sudoku.Sudoku.dlx.make_row
Failed example:
    len(list(h.solve(algorithm='dlx')))  # indirect doctest
Expected:
    5
Got:
    0
**********************************************************************
1 item had failures:
   1 of   3 in sage.games.sudoku.Sudoku.dlx.make_row
    [96 tests, 1 failure, 2.40 s]
----------------------------------------------------------------------
sage -t --long src/sage/games/sudoku.py  # 1 doctest failed
----------------------------------------------------------------------

Try runnning tests in a loop, e.g.

for i in `seq 0 1000` ; do ./sage -t --long src/sage/games/sudoku.py ; done

comment:31 Changed 5 months ago by slabbe

  • Branch changed from 22306233cc4b7eeb34aef812fbff856e312d1ea9 to u/slabbe/25125
  • Commit set to d13e0ec76680814329f26aa3860c54065168e469
  • Status changed from new to needs_review

On Ubuntu, I can not reproduce the issue.

The methods I am adding/modifying to the dancing links class are not used by the sudoku solver which uses directly the search and get_solution methods. The only thing I changed that may have affected the sudoku thing is the way the __init__ is done which is now done with the reinitalize method. Maybe OSX does not like the line

self._x = dancing_links()

that was in the reinitialize which was called by the __init__. So, I added a new commit which does

self._x = dancing_links()

only for the reinitialization and avoids it for __init__ as it was before. Can somebody with OSX test whether this fixes the issue?


New commits:

b4fdeb525125: add all_solutions method to dancing links
5ddc18925125: fix doc
0842d4625125: setting ncpus=None as default
11b389425125: improved sentences before some doctests
0f320earewriting number_of_solutions method like one_solution and all_solutions
2fce4fe25125: allow reinitialization of dlx solver
a5c466725125: reinitialize the dlx solver when needed
d13e0ec25125: Calling constructor dancing_links() only in reinitialize not in __init__
Version 0, edited 5 months ago by slabbe (next)

comment:32 Changed 5 months ago by vklein

  • Reviewers changed from Julian Rüth, Vincent Delecroix to Julian Rüth, Vincent Delecroix, Vincent Klein

Tests for i in `seq 0 1000` ; do ./sage -t --long src/sage/games/sudoku.py ; done has been successfully passed on OSX (High Sierra) with ​d13e0ec.

comment:33 Changed 5 months ago by slabbe

And are you able to reproduce the problem mentionned by Volker with commit a5c4667 ?

comment:34 Changed 5 months ago by vklein

Yes ! I should have mentioned that.

Last edited 5 months ago by vklein (previous) (diff)

comment:35 Changed 5 months ago by vklein

  • Status changed from needs_review to positive_review

​d13e0ec fix the problem.

comment:36 Changed 5 months ago by slabbe

Thanks for your help testing on OSX.

Sébastien

comment:37 Changed 5 months ago by vbraun

  • Status changed from positive_review to needs_work
sage -t --long src/sage/combinat/matrices/dancing_links.pyx
**********************************************************************
File "src/sage/combinat/matrices/dancing_links.pyx", line 683, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.all_solutions
Failed example:
    [sorted(s) for s in S]
Expected:
    [[0, 1], [2, 3], [4, 5]]
Got:
    [[0, 1], [4, 5], [2, 3]]
**********************************************************************
1 item had failures:
   1 of  16 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.all_solutions
    [215 tests, 1 failure, 7.16 s]

comment:38 Changed 5 months ago by vbraun

Last edited 5 months ago by vbraun (previous) (diff)

comment:39 Changed 5 months ago by git

  • Commit changed from d13e0ec76680814329f26aa3860c54065168e469 to 25da49e6e04ca7fed46442d27f66d3dc297362ab

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

9a158e625125: add all_solutions method to dancing links
89b590125125: fix doc
c54a64d25125: setting ncpus=None as default
cf7b43525125: improved sentences before some doctests
004126arewriting number_of_solutions method like one_solution and all_solutions
d18e09025125: allow reinitialization of dlx solver
c05e73b25125: reinitialize the dlx solver when needed
10f885d25125: Calling constructor dancing_links() only in reinitialize not in __init__
25da49e25125: sorted lists in doctests

comment:40 Changed 5 months ago by slabbe

  • Status changed from needs_work to needs_review

Sorry Volker. I was sure that I put sorted everywhere needed, but I forgot one.

Rebased on 8.3.beta2.

I added a commit which added the sorted where the doctest problem occurs. I also added two other sorted elsewhere.

Needs review.

comment:41 Changed 5 months ago by vklein

sage -t -a works on ubuntu and OSX ( except for ext_rep.py and external.py killed due to abort signal, which is not related to this ticket).

comment:42 Changed 5 months ago by vklein

Random errors on dancing_links.pyx on OSX.

sage admin$ for i in `seq 0 50` ; do ./sage -t --long src/sage/combinat/matrices/dancing_links.pyx ; done
...
sage -t --long --warn-long 46.8 src/sage/combinat/matrices/dancing_links.pyx
    [215 tests, 5.41 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 5.5 seconds
    cpu time: 4.8 seconds
    cumulative wall time: 5.4 seconds
macbookpro-sierra-02:sage admin$ for i in `seq 0 50` ; do ./sage -t --long src/sage/combinat/matrices/dancing_links.pyx ; done
Running doctests with ID 2018-05-29-11-30-55-9317aab4.
Git branch: t/25125/25125
Using --optional=gfortran,gmpy2,mpir,python2,sage
Doctesting 1 file.
sage -t --long --warn-long 46.8 src/sage/combinat/matrices/dancing_links.pyx
**********************************************************************
File "src/sage/combinat/matrices/dancing_links.pyx", line 603, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution
Failed example:
    sorted(d.one_solution())
Expected:
    [0, 1]
Got:
    [2, 3]
**********************************************************************
1 item had failures:
   1 of  17 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution
    [215 tests, 1 failure, 5.46 s]
----------------------------------------------------------------------
sage -t --long --warn-long 46.8 src/sage/combinat/matrices/dancing_links.pyx  # 1 doctest failed
----------------------------------------------------------------------
Total time for all tests: 5.6 seconds
    cpu time: 4.9 seconds
    cumulative wall time: 5.5 seconds
Running doctests with ID 2018-05-29-11-31-04-c04446f2.
Git branch: t/25125/25125
Using --optional=gfortran,gmpy2,mpir,python2,sage
Doctesting 1 file.
sage -t --long --warn-long 46.8 src/sage/combinat/matrices/dancing_links.pyx
**********************************************************************
File "src/sage/combinat/matrices/dancing_links.pyx", line 634, in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution
Failed example:
    sorted(dlx.one_solution())
Expected:
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Got:
    [2, 3, 4, 5, 6, 7, 9, 10, 11, 18]
**********************************************************************
1 item had failures:
   1 of  17 in sage.combinat.matrices.dancing_links.dancing_linksWrapper.one_solution
    [215 tests, 1 failure, 5.42 s]
----------------------------------------------------------------------
sage -t --long --warn-long 46.8 src/sage/combinat/matrices/dancing_links.pyx  # 1 doctest failed

comment:43 Changed 5 months ago by vklein

  • Status changed from needs_review to needs_work

comment:44 Changed 5 months ago by git

  • Commit changed from 25da49e6e04ca7fed46442d27f66d3dc297362ab to c9851791ab1b04561be9e7cd18e4f706d44b8545

Branch pushed to git repo; I updated commit sha1. New commits:

c98517925125: sorting every list even those of size one in doctests

comment:45 Changed 5 months ago by slabbe

  • Status changed from needs_work to needs_review

Thanks for your carefull review Vincent K. We avoided a nuisance to Volker.

The failures you found come from the fact that we change the ncpus=1 to ncpus=None in the one_solution method which made deterministic doctests become not deterministic. If the doctest is run with more than one cpu, than the first solution found depend on the speed of the computation made on each cpus.

I made the doctests for one_solution independent of the first solution found.

I also added a bunch of sorted (even for list of size one!) to prevents other similar problem.

Hopefully, these were the last non deterministic doctests to be fixed.

Needs review.

comment:46 Changed 5 months ago by vklein

  • Status changed from needs_review to positive_review

All tests passed (with several iteration of sage -t ...dancing_links.pyx) both on unbuntu and osx.

comment:47 Changed 5 months ago by vbraun

  • Branch changed from u/slabbe/25125 to c9851791ab1b04561be9e7cd18e4f706d44b8545
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.