Opened 9 years ago

Closed 5 years ago

Last modified 5 years ago

#6637 closed enhancement (fixed)

standardize the interface to TransitiveIdeal and friends

Reported by: nthiery Owned by: mhansen
Priority: major Milestone: sage-6.3
Component: combinatorics Keywords: backtrack, enumerated set, transitive closure, days57
Cc: sage-combinat, hivert Merged in:
Authors: Sébastien Labbé Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 3191690 (Commits) Commit:
Dependencies: #14052 Stopgaps:

Description (last modified by slabbe)

  1. Implement a single entry point for recursively enumerated sets:
     RecursivelyEnumeratedSet(seeds, successors, structure=..., enumeration=...)

where structure takes values in the set [None, 'forest', 'graded', 'symmetric'] and enumeration takes values in the set [None, 'depth', 'breadth', 'naive'].

  1. Deprecate TransitiveIdeal, TransitiveIdealGraded and SearchForest as entry point.
  1. TransitiveIdeal(succ, seeds) keeps the same behavior as before and is now the same as RecursivelyEnumeratedSet(seeds, succ, structure=None, enumeration='naive').
  1. TransitiveIdealGraded(succ, seeds, max_depth) keeps the same behavior as before and is now the same as RecursivelyEnumeratedSet(seeds, succ, structure=None, enumeration='breadth', max_depth=max_depth).

Remarks:

  1. For now the code of SearchForest is still in sage/combinat/backtrack.py. It should be moved in sage/sets/recursively_enumerated_set.pyx. See #16351.
  1. TransitiveIdeal and TransitiveIealGraded are used in the code of sage/combinat, sage/categories and sage/groups at least. These should be updated to use RecursivelyEnumeratedSet for speed improvements and also to avoid issues explained in C below. See #16352.
  1. Note that there were some issues with TransitiveIdeal and TransitiveIdealGraded, namely:
  • Enumeration of TransitiveIdeal is claimed to be depth first search in the top level file backtrack.py, but in fact, it is neither breadth first neither depth first. It is what I call a naive search.
  • Enumeration of TransitiveIdealGraded is indeed breadth first as claimed but it does not make use of the graded hypothesis at all because it remembers every generated elements.

See my status report at SageDays57 for more info.

Attachments (1)

trac_6637_recursive_set-sl.patch (15.8 KB) - added by slabbe 6 years ago.

Download all attachments as: .zip

Change History (62)

comment:1 Changed 6 years ago by slabbe

  • Report Upstream set to N/A

Here are some discussions related to this ticket:

Last edited 6 years ago by slabbe (previous) (diff)

comment:2 Changed 6 years ago by slabbe

  • Summary changed from Follow up on #6000: standardize the interface to TransitiveIdeal and friends to standardize the interface to TransitiveIdeal and friends

comment:3 Changed 6 years ago by slabbe

  • Description modified (diff)

comment:4 Changed 6 years ago by slabbe

In a tentative to find good choices for name and so on, here is how I see this.

  • Let X be a set.
  • Let R be a binary relation on X, that is R is a subset of the cartesian product of X times X.
  • Denote by xRy if x is R-related to y.

Now

  • Let seeds be a subset of X.
  • Let succ be a callable python object X -> 2^X such that xRy if and only if y is in succ(x).

We are interested in the subset S of X that can be generated from the seeds using the succ recursively. More precisely in the set

S = {y : x = x1 R x2 R x3 R ... R xn = y and x in seeds}.

Moreover, we are interested in the enumeration of S itself and we consider depth-first and breadth-first as different and both usefull.

Such a relation G = (X,R) can be seen as a directed graph. I think this remark is useful as it may provide some vocabulary. Indeed the set S is the connected components of the generators in the digraph G.

I see some different cases :

1. We do not know anything more about the relation. We need to save in memory all the known objects to avoid duplicates. This is what is currently done in TransitiveIdeal (depth-first search) and (curiously) in TransitiveIdealGraded (breadth-first search) also.

2. The directed graph S is a forest with given seeds. Equivalently, one may say that S do not contain cycle (oriented or not). This is what is currently done in SearchForest.

3. The relation is graded. By graded here I mean what I thought TransitiveIdealGraded was doing until I look more carefully at the doc and the code. More seriously, by graded I mean the following : for all (x1 in seeds) and (y1 in seeds),

if (x1 R x2 R ... R xn) and (y1 R y2 R ... R ym) and (xn=ym), then (n=m).

The relation is graded if all path from the origin to an element have the same length. In this case, we only need to save in memory the current level.

4. The relation is symmetric. If the relation is symmetric, we only need to keep in memory the last two level of depth. This is what I needed and coded this week. And this is why I started to look more carefully at the code in sage/combinat/backtrack.py...

That is it for now!

Last edited 5 years ago by slabbe (previous) (diff)

comment:5 follow-ups: Changed 6 years ago by nthiery

I totally agree with the analysis!

I don't know yet what would be the best name for the argument provided by the user to describe the relation. Behind the scene we are definitely modelling relation. But what the user provide is not the relation but a function that computes the (out) neighbors for this relation. If at the end of the day we choose "TransitiveClosure?" as name for the main entry point, then "neighbors" would be consistent. If we go for "RecursiveSet?" (or RecursiveEnumeratedSet? or variant thereof) then "operators" would be consistent.

Cheers,

Nicolas

comment:6 in reply to: ↑ 5 Changed 6 years ago by slabbe

I think relation would not be too bad for the name of the successor keyword and should be considered as well. In some sense, a binary relation is a function f: X -> 2^X and a function is a relation such that |f(x)|=1 for all x.

Possibilities for successor keyword :

  • succ : suitable for non symmetric relation, currently used in TransitiveIdeal and TransitiveIdealGraded
  • successors
  • operators
  • neighbors : suitable vocabulary for symmetric relation
  • children : suitable vocabulary for non symmetric relation, currently used in SearchForest.
  • relatives : suitable vocabulary for symmetric relation

Possibilities for generators keyword :

  • generators
  • gens
  • roots
  • seeds
Last edited 5 years ago by slabbe (previous) (diff)

comment:7 in reply to: ↑ 5 Changed 6 years ago by slabbe

If we go for RecursiveSet (or RecursiveEnumeratedSet or variant thereof)

I like RecursiveSet. Maybe RecursiveEnumeratedSet is more related to what we do but is also longer.

Some links:

Last edited 5 years ago by slabbe (previous) (diff)

comment:8 in reply to: ↑ 5 Changed 6 years ago by slabbe

I don't know yet what would be the best name for the argument provided by the user to describe the relation.

For the above 4 cases, I would suggest arguments like the following :

RecursiveSet(seeds, succ)
RecursiveSet(seeds, succ, structure="forest")
RecursiveSet(seeds, succ, structure="graded")
RecursiveSet(seeds, succ, structure="symmetric")
Last edited 5 years ago by slabbe (previous) (diff)

Changed 6 years ago by slabbe

comment:9 Changed 6 years ago by slabbe

I just added a patch. It implements the RecursiveSet_symmetric class and creates factory called RecursiveSet. For now, RecursiveSet returns either an instance of TransitiveIdeal, SearchForest or RecursiveSet_symmetric. I started an empty class RecursiveSet_graded. See examples inside the docstring of the class RecursiveSet.

It is not ready for review, but comments are welcome to help me continue this work.

Actually, my questions are :

  • How should I merge RecursiveSet with TransitiveIdeal and SearchForest?
  • Do we like this interface?

comment:10 Changed 6 years ago by slabbe

  • Dependencies set to #14052

comment:11 Changed 5 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:12 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:13 Changed 5 years ago by slabbe

  • Branch set to u/slabbe/6637
  • Commit set to c2861f04a7250c3c72438adc50ccad8a43c22e54

New commits:

c2861f0Some fixes to TransitiveIdeal and TransitiveIdealGraded, added RecursiveSet factory

comment:14 Changed 5 years ago by git

  • Commit changed from c2861f04a7250c3c72438adc50ccad8a43c22e54 to 5924b4637237d77d81a3577b2a6d86c23f96a86a

Branch pushed to git repo; I updated commit sha1. New commits:

5924b46Cleaning RecursiveSet and subclasses

comment:15 Changed 5 years ago by slabbe

  • Cc hivert added

comment:16 Changed 5 years ago by git

  • Commit changed from 5924b4637237d77d81a3577b2a6d86c23f96a86a to 29aebab4b5bc94f8bfc35c994edb35d90d70557b

Branch pushed to git repo; I updated commit sha1. New commits:

29aebabnow 100% coverage, seeds instead of gens

comment:17 Changed 5 years ago by git

  • Commit changed from 29aebab4b5bc94f8bfc35c994edb35d90d70557b to b363c547908004c908baff117a53ec2279cc6f1b

Branch pushed to git repo; I updated commit sha1. New commits:

b363c54Cleaning documentation and reverting some changes to TransitiveIdeals

comment:18 Changed 5 years ago by git

  • Commit changed from b363c547908004c908baff117a53ec2279cc6f1b to b8108ae30f45fb6dedadda77b53e30f5bc5e37f6

Branch pushed to git repo; I updated commit sha1. New commits:

b8108aeimplemented depth search with stack and breadth search with queue

comment:19 Changed 5 years ago by git

  • Commit changed from b8108ae30f45fb6dedadda77b53e30f5bc5e37f6 to 9de2cf9ba5f22b83ab14827651f430292e727684

Branch pushed to git repo; I updated commit sha1. New commits:

9de2cf9moving RecursivelyEnumerableSet to sage/sets

comment:20 Changed 5 years ago by git

  • Commit changed from 9de2cf9ba5f22b83ab14827651f430292e727684 to 5d634535fcee7fcba413170826b03011373505d1

Branch pushed to git repo; I updated commit sha1. New commits:

5d63453doc fixes + adding Parent and category

comment:21 Changed 5 years ago by slabbe

  • Description modified (diff)
  • Status changed from new to needs_review

comment:22 Changed 5 years ago by git

  • Commit changed from 5d634535fcee7fcba413170826b03011373505d1 to 4934fa2967e5bb521167d34c15e3152b40f30e63

Branch pushed to git repo; I updated commit sha1. New commits:

4934fa2deprecate import of TransitiveIdeal + level method for RecursivelyEnumeratedSet

comment:23 Changed 5 years ago by git

  • Commit changed from 4934fa2967e5bb521167d34c15e3152b40f30e63 to 94a4d7aaa77bfc48894d17d5242d324c3e798307

Branch pushed to git repo; I updated commit sha1. New commits:

94a4d7alevel -> graded component, algorithm -> enumeration, more docs, deprecated SearchForest also

comment:24 Changed 5 years ago by git

  • Commit changed from 94a4d7aaa77bfc48894d17d5242d324c3e798307 to 52ce4a357d72762caf3339fe0282a0364e15a842

Branch pushed to git repo; I updated commit sha1. New commits:

52ce4a3succ -> successors, fixed doctests in backtrack.py

comment:25 Changed 5 years ago by slabbe

I think I will stop now. The next thing to do would be to move code from sage/combinat/backtrack.py to sage/sets/recursively_enumerated_set.py more precisely, the SearchForest code. But since it is mostly about moving code (does not change any functionality), I suggest to do it in a another ticket and review/merge this ticket now.

Needs review!

Sébastien

comment:26 Changed 5 years ago by slabbe

  • Keywords days57 added

comment:27 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:28 follow-up: Changed 5 years ago by chapoton

  • Status changed from needs_review to needs_work

your commits remove completely combinat/all.py ....

comment:29 in reply to: ↑ 28 Changed 5 years ago by slabbe

Replying to chapoton:

your commits remove completely combinat/all.py ....

I see. It is strange because I can't see which commit did that... I will investigate.

comment:30 Changed 5 years ago by tscrim

I believe that's an error with the trac plugin (I've seen that before).

Can I make a feature request for this ticket, could we also cythonize this for speed (or at least make the new file a .pyx file)?

comment:31 Changed 5 years ago by slabbe

Indeed, there is some gain. I did one example:

Python:

sage: f = lambda a: [a-1,a+1]
sage: C = RecursivelyEnumeratedSet([10, 15], f, structure='symmetric')
sage: it = iter(C)
sage: %time L = [next(it) for _ in xrange(10^6)]
CPU times: user 5.82 s, sys: 239 ms, total: 6.06 s
Wall time: 6.07 s

Cython:

sage: f = lambda a: [a-1,a+1]
sage: C = RecursivelyEnumeratedSet([10, 15], f, structure='symmetric')
sage: it = iter(C)
sage: %time L = [next(it) for _ in xrange(10^6)]
CPU times: user 4.47 s, sys: 408 ms, total: 4.88 s
Wall time: 4.89 s

comment:32 Changed 5 years ago by git

  • Commit changed from 52ce4a357d72762caf3339fe0282a0364e15a842 to 42262457c67606ba2861e9ba7b5cb6f20450f09d

Branch pushed to git repo; I updated commit sha1. New commits:

b5d53aeMerge branch 'master' into t/6637
4226245#6637: recursively_enumerated_set ffrom .py to .pyx

comment:33 Changed 5 years ago by slabbe

  • Description modified (diff)

comment:34 Changed 5 years ago by slabbe

  • Description modified (diff)

I keep the status of the ticket to needs work because I realized that some doctest were broken in the sage library. I am working on it.

comment:35 Changed 5 years ago by git

  • Commit changed from 42262457c67606ba2861e9ba7b5cb6f20450f09d to 766a1b0d7a09c8d9597308592db9543145e61751

Branch pushed to git repo; I updated commit sha1. New commits:

e174ae0Trac #6637: Adding a __contains__ method to RecursivelyenumeratedSet
766a1b0Trac #6637: fix doctest in root_system/plot.py

comment:36 Changed 5 years ago by slabbe

  • Description modified (diff)
  • Status changed from needs_work to needs_review

comment:37 Changed 5 years ago by tscrim

  • Authors set to Sébastien Labbé
  • Branch changed from u/slabbe/6637 to public/ticket/6637
  • Commit changed from 766a1b0d7a09c8d9597308592db9543145e61751 to d8b942bf45c05f4df5fe093b0b63ac63e7127db6
  • Reviewers set to Travis Scrimshaw

Some more speedup by doing some more cythonization. Before:

sage: f = lambda a: [a-1,a+1]
sage: C = RecursivelyEnumeratedSet([10, 15], f, structure='symmetric')
sage: it = iter(C)
sage: %time L = [next(it) for _ in xrange(10^6)]
CPU times: user 9.68 s, sys: 147 ms, total: 9.83 s
Wall time: 9.81 s

With my commit:

sage: f = lambda a: [a-1,a+1]
sage: C = RecursivelyEnumeratedSet([10, 15], f, structure='symmetric')
sage: it = iter(C)
sage: %time L = [next(it) for _ in xrange(10^6)]
CPU times: user 8.02 s, sys: 103 ms, total: 8.13 s
Wall time: 8.15 s

I'm sure I haven't done the best cythonization job on this, but it works and all tests pass. If you're happy with my changes, then positive review.


New commits:

0d592ceMerge branch 'u/slabbe/6637' of trac.sagemath.org:sage into public/ticket/6637
f200525More cythonization.
72f5bf6Merge branch 'u/slabbe/6637' of trac.sagemath.org:sage into public/ticket/6637
df40815Some tweaks in the cythonization.
e68a15fFixed pickling issue.
d8b942bSome other potential cython speedups.

comment:38 Changed 5 years ago by slabbe

  • Description modified (diff)

comment:39 Changed 5 years ago by slabbe

  • Description modified (diff)

comment:40 Changed 5 years ago by slabbe

I do gain one more second with your improvements. Great!

sage: sage: f = lambda a: [a-1,a+1]                                         
sage: sage: C = RecursivelyEnumeratedSet([10, 15], f, structure='symmetric')
sage: sage: it = iter(C)                                                    
sage: sage: %time L = [next(it) for _ in xrange(10^6)]                      
CPU times: user 3.49 s, sys: 246 ms, total: 3.74 s                          
Wall time: 3.79 s   

I do not like factory function in general and was happy to use for the first time the metaclass stuff. But apparently, it is not as efficient? Did you check if only removing the metaclass stuff was giving a speedup?

Last edited 5 years ago by slabbe (previous) (diff)

comment:41 Changed 5 years ago by git

  • Commit changed from d8b942bf45c05f4df5fe093b0b63ac63e7127db6 to 3191690431cfe1614444004863b15a331941ad00

Branch pushed to git repo; I updated commit sha1. New commits:

3191690Trac #6637: More deprecation doc in backtrack.py

comment:42 Changed 5 years ago by slabbe

Travis, did you check that the doc was building fine? I am not able to, I get :

$ sage -docbuild reference/structure html
   [structure] WARNING: intersphinx inventory '/Users/slabbe/Applications/sage-git/src/doc/output/html/en/reference/quivers/objects.inv' not fetchable due to <type 'exceptions.IOError'>: [Errno 2] No such file or directory: '/Users/slabbe/Applications/sage-git/src/doc/output/html/en/reference/quivers/objects.inv'
   Error building the documentation.

   Note: incremental documentation builds sometimes cause spurious
   error messages. To be certain that these are real errors, run
   "make doc-clean" first and try again.
   Traceback (most recent call last):
     File "/Users/slabbe/Applications/sage-git/src/doc/common/builder.py", line 1477, in <module>
         getattr(get_builder(name), type)()
           File "/Users/slabbe/Applications/sage-git/src/doc/common/builder.py", line 699, in _wrapper
               getattr(DocBuilder, build_type)(self, *args, **kwds)
                 File "/Users/slabbe/Applications/sage-git/src/doc/common/builder.py", line 94, in f
                     execfile(sys.argv[0])
                       File "/Users/slabbe/Applications/sage-git/src/doc/common/custom-sphinx-build.py", line 210, in <module>
                           raise OSError(ERROR_MESSAGE)
                           OSError: [structure] WARNING: intersphinx inventory '/Users/slabbe/Applications/sage-git/src/doc/output/html/en/reference/quivers/objects.inv' not fetchable due to <type 'exceptions.IOError'>: [Errno 2] No such file or directory: '/Users/slabbe/Applications/sage-git/src/doc/output/html/en/reference/quivers/objects.inv'

Once, I may confirm the docs does build fine. I will set this to positive review.

comment:43 Changed 5 years ago by slabbe

  • Status changed from needs_review to positive_review

make doc-clean fixed the error. Doc builds fine.

comment:44 Changed 5 years ago by nthiery

  • Status changed from positive_review to needs_info

Sebastien wants to double check the Metaclass thingy.

comment:45 follow-up: Changed 5 years ago by tscrim

As I recall, metaclasses are not supported in extension classes by Cython. The metaclass should not change the speed since it's only called/used upon object creation.

comment:46 follow-up: Changed 5 years ago by tscrim

Also FTR, I liked your usage of the metaclass (and I can't check the doc until I get my docbuilder to actually work again... :/ ).

comment:47 in reply to: ↑ 46 Changed 5 years ago by slabbe

If you agree Travis, I will try to put the metaclass use back in. Also, maybe Florent can say a word about it. He coached me on how to implement it.

comment:48 in reply to: ↑ 45 Changed 5 years ago by slabbe

Replying to tscrim:

As I recall, metaclasses are not supported in extension classes by Cython. The metaclass should not change the speed since it's only called/used upon object creation.

Ok, now I see what you mean. When the class is cdef, then the __classcall_private__ do not get called instead of the __init__:

sage: f = lambda a: [a-1,a+1]
sage: RecursivelyEnumeratedSet([0], f)
A recursively enumerated set (depth first search)
sage: RecursivelyEnumeratedSet([0], f, structure='symmetric')
Traceback (most recent call last):
...
TypeError: __init__() got an unexpected keyword argument 'structure'

I also saw in the doc that: "For a Cython class (not cdef since they doesn’t allows metaclasses)"

comment:49 Changed 5 years ago by slabbe

I pushed on my branch a commit using metaclass. In the end, I had to create a factory def outside of the class...

comment:50 Changed 5 years ago by git

  • Commit changed from 3191690431cfe1614444004863b15a331941ad00 to dd72bfc1227605362e9baf847a623ddd6bd297f9

Branch pushed to git repo; I updated commit sha1. New commits:

34c9db2Removed _generic from base class and added _factory to function.
dd72bfcMerge branch 'public/ticket/6637' of trac.sagemath.org:sage into public/ticket/6637

comment:51 follow-up: Changed 5 years ago by tscrim

Looking at your commit, I don't see any benefit in using the metaclass. However I'm not opposed to the renaming, so I've just pushed that.

comment:52 in reply to: ↑ 51 Changed 5 years ago by slabbe

Replying to tscrim:

Looking at your commit, I don't see any benefit in using the metaclass.

Yes exactly. I needed to play with it to understand what you meant.

However I'm not opposed to the renaming, so I've just pushed that.

Well, the renaming was just an easy way for me to check that sage -t recursively_enumerated_set.pyx was ok after I realised that __classcall_private__ was ignored by cdef class. It was not a suggestion, but it would not be the first renamed function:

sage: import_statements(sum)
from sage.misc.functional import symbolic_sum

So, we go with commit ​dd72bfc instead of ​3191690 ?

Last edited 5 years ago by slabbe (previous) (diff)

comment:53 follow-up: Changed 5 years ago by tscrim

  • Status changed from needs_info to needs_review

If that's okay with you.

comment:54 in reply to: ↑ 53 Changed 5 years ago by slabbe

Replying to tscrim:

If that's okay with you.

The more I think about it, the less I like it. I think ​dd72bfc can be confusing for someone looking at the file for the first time. Until that person looks at the file sage/sets/all.py he will not understand how the __init__ handles the structure argument. And the key will always be hidden in some other file sage/sets/all.py. I suggest we go with your initial factory function. More precisely with commit ​3191690. Do you agree?

If so, I do not know what should we do then (git question). Should we update the commit field? Should we reset the HEAD of the branch? Should we create a new branch pointing to the commit?

comment:55 follow-up: Changed 5 years ago by tscrim

Good point. What we'll do is create a new branch which just points to the old commit (afterwards we can delete our old branches). Do you want to do this or should I?

Last edited 5 years ago by tscrim (previous) (diff)

comment:56 in reply to: ↑ 55 Changed 5 years ago by slabbe

Let me try.

Last edited 5 years ago by slabbe (previous) (diff)

comment:57 Changed 5 years ago by slabbe

  • Branch changed from public/ticket/6637 to public/ticket/6637a
  • Commit changed from dd72bfc1227605362e9baf847a623ddd6bd297f9 to 3191690431cfe1614444004863b15a331941ad00

The following did the trick:

git checkout 3191690 -b t/6637a

comment:58 Changed 5 years ago by tscrim

  • Status changed from needs_review to positive_review

Then let's get this in. Thanks Sébastien.

comment:59 Changed 5 years ago by slabbe

Thanks for the review and cython improvements!

comment:60 Changed 5 years ago by vbraun

  • Branch changed from public/ticket/6637a to 3191690431cfe1614444004863b15a331941ad00
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:61 Changed 5 years ago by slabbe

  • Commit 3191690431cfe1614444004863b15a331941ad00 deleted
  • Description modified (diff)
Note: See TracTickets for help on using tickets.