Opened 6 years ago
Closed 5 years ago
#18109 closed defect (fixed)
Restructure IntegerListLex code
Reported by: | vdelecroix | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.10 |
Component: | combinatorics | Keywords: | |
Cc: | aschilling, jdemeyer, nthiery, ncohen, bgillespie | Merged in: | |
Authors: | Jeroen Demeyer | Reviewers: | Anne Schilling, Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | 65025ab (Commits) | Commit: | 65025ab68743a1dc6c30eb73b6db4e4283f49d4a |
Dependencies: | #15525 | Stopgaps: |
Description (last modified by )
Split up IntegerListsLex
in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
This ticket will not change anything to the public interface of IntegerListsLex
.
Change History (100)
comment:1 follow-up: ↓ 2 Changed 6 years ago by
comment:2 in reply to: ↑ 1 ; follow-up: ↓ 3 Changed 6 years ago by
We want to keep the parent to model the set itself, ask questions like cardinality or building the polyhedron, do constructions on top of it (e.g. use it as indexing set for a vector space), etc.
The 'Lex' there seems a bit too much for the mathematical object that you want to represent. You describe things that could be a method of an 'IntegerLists?' object (or more specifically methods of 'Compositions' or 'Partitions').
- Being able to specify an element constructor is a useful feature as well. What we need to discuss here is whether we want to switch to using lists (or tuples!) by default.
There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in IntegerListsLex
an __iter__
that wraps every element returned by that iterator with <whatever you need>.
Nathann
comment:3 in reply to: ↑ 2 Changed 6 years ago by
Replying to ncohen:
There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in
IntegerListsLex
an__iter__
that wraps every element returned by that iterator with <whatever you need>.
That's also what I have in mind: a low-level class designed to be clean and fast implemented in Cython without overhead. And then a class on top of that which can implement whatever extra Python features that you want.
comment:4 Changed 6 years ago by
- Summary changed from IntegerListLex better not be a parent to Restructure IntegerListLex code
comment:5 Changed 6 years ago by
- Branch set to u/jdemeyer/ticket/18109
comment:6 Changed 6 years ago by
- Commit set to 8d5ca559f1b4a376a84d88a995a3fde6bfb74a6f
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
8d5ca55 | Restructure IntegerListsLex code
|
comment:7 Changed 6 years ago by
- Description modified (diff)
comment:8 Changed 6 years ago by
- Description modified (diff)
comment:9 follow-ups: ↓ 10 ↓ 25 Changed 6 years ago by
Hi Jeroen,
As a matter of fact, if you isolate the commit that just move a file, the diff looks much nicer. I am not able to get something reasonable with the options -B
, -C
, -M
or -D
. Do you know what to do?
Vincent
comment:10 in reply to: ↑ 9 Changed 6 years ago by
Replying to vdelecroix:
Hi Jeroen,
As a matter of fact, if you isolate the commit that just move a file, the diff looks much nicer. I am not able to get something reasonable with the options
-B
,-C
,-M
or-D
. Do you know what to do?
Sorry no. I don't know how to nicely show diffs which split up a file in two (interestingly, hg
has better support for this!)
comment:11 Changed 6 years ago by
- Dependencies set to #18181
comment:12 Changed 6 years ago by
- Commit changed from 8d5ca559f1b4a376a84d88a995a3fde6bfb74a6f to d8fd440915b0438aecf7f1f5d8d5eb2d5f366ddd
comment:13 Changed 6 years ago by
- Commit changed from d8fd440915b0438aecf7f1f5d8d5eb2d5f366ddd to 9fbbd3c6c2900b5d6f326846522338774751cfe8
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
9fbbd3c | Restructure IntegerListsLex code
|
comment:14 follow-up: ↓ 15 Changed 6 years ago by
A comment on the design... Should we really support +Infinity in the iterator? I would go for unsigned int
variables and UINT_MAX
as a synonmyous for Infinity. Of course it makes sense for the higher classes (e.g. to give a proper answer to .cardinality()
).
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 6 years ago by
Replying to vdelecroix:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for
unsigned int
variables andUINT_MAX
as a synonmyous for Infinity.
And not support IntegerLists(10^100)
?
comment:16 in reply to: ↑ 15 ; follow-ups: ↓ 17 ↓ 18 Changed 6 years ago by
Replying to jdemeyer:
Replying to vdelecroix:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for
unsigned int
variables andUINT_MAX
as a synonmyous for Infinity.And not support
IntegerLists(10^100)
?
I said for the iterator. Not for the main class. I would not bother if iter(IntegerLists(10^100))
just failed. It should be very fast for small entries. I guess that one option would be to use Nathann strategy in #18137 with fused Cython type (here unsigned int
and mpz_t
). But I remember that it was nearly impossible to make it work as attributes of an extension class.
comment:17 in reply to: ↑ 16 Changed 6 years ago by
Replying to vdelecroix:
But I remember that it was nearly impossible to make it work as attributes of an extension class.
For attributes of an extension class, no. I guess you could have two classes (one for some C type and one for mpz_t
) on top of a common base class. But I have never done this.
comment:18 in reply to: ↑ 16 ; follow-up: ↓ 22 Changed 6 years ago by
Replying to vdelecroix:
Replying to jdemeyer:
Replying to vdelecroix:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for
unsigned int
variables andUINT_MAX
as a synonmyous for Infinity.And not support
IntegerLists(10^100)
?I said for the iterator. Not for the main class. I would not bother if
iter(IntegerLists(10^100))
just failed. It should be very fast for small entries. I guess that one option would be to use Nathann strategy in #18137 with fused Cython type (hereunsigned int
andmpz_t
).
I think that we really should support list(IntegerLists(10^100, length=1))
because in Sage, we always support large integers if possible.
In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.
comment:19 Changed 6 years ago by
- Dependencies changed from #18181 to #18181, #18184
comment:20 Changed 6 years ago by
- Commit changed from 9fbbd3c6c2900b5d6f326846522338774751cfe8 to e39566a935403eb750d444d6b236f43084caa8be
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e39566a | Restructure IntegerListsLex code
|
comment:21 Changed 6 years ago by
- Commit changed from e39566a935403eb750d444d6b236f43084caa8be to cda4b75087577ff9583411799f8832ccf2e72866
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
cda4b75 | Restructure IntegerListsLex code
|
comment:22 in reply to: ↑ 18 Changed 6 years ago by
Replying to jdemeyer:
I think that we really should support
list(IntegerLists(10^100, length=1))
because in Sage, we always support large integers if possible.
That's part of why I am thinking of C++; then we can just have a templated iterator, and depending on the input we can choose one instantiation or the other.
In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.
Sounds reasonable indeed.
New commits:
cda4b75 | Restructure IntegerListsLex code
|
comment:23 Changed 6 years ago by
- Commit changed from cda4b75087577ff9583411799f8832ccf2e72866 to 7f8c40a07f55d18097ca53007ca3dfd29439551e
comment:24 Changed 6 years ago by
- Commit changed from 7f8c40a07f55d18097ca53007ca3dfd29439551e to 2b88b6416c0a6853b2fff59d7bd2ecb2d56f5a5c
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
2b88b64 | Restructure IntegerListsLex code
|
comment:25 in reply to: ↑ 9 Changed 6 years ago by
Replying to vdelecroix:
I am not able to get something reasonable with the options
-B
,-C
,-M
or-D
. Do you know what to do?
Actually, the following will show everything as copied from integer_list.py
:
git show -D -B -C01
comment:26 Changed 6 years ago by
- Commit changed from 2b88b6416c0a6853b2fff59d7bd2ecb2d56f5a5c to e67e71c7437af2daf3c6653d882e0fafb6e292b8
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e67e71c | Restructure IntegerListsLex code
|
comment:27 Changed 6 years ago by
This now passes all doctests except for the pickle jar.
comment:28 Changed 6 years ago by
- Commit changed from e67e71c7437af2daf3c6653d882e0fafb6e292b8 to 8c587d34a7b3332baaeb97b3197f465eebf0c982
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
8c587d3 | Restructure IntegerListsLex code
|
comment:29 Changed 6 years ago by
- Description modified (diff)
Passes all doctests and documentation builds.
comment:30 Changed 6 years ago by
- Commit changed from 8c587d34a7b3332baaeb97b3197f465eebf0c982 to 2be6ad090d51f5eb033cde42a1831ab7d89092ae
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
2be6ad0 | Restructure IntegerListsLex code
|
comment:31 follow-up: ↓ 32 Changed 6 years ago by
- Status changed from new to needs_review
comment:32 in reply to: ↑ 31 ; follow-ups: ↓ 33 ↓ 34 Changed 6 years ago by
Hi Jeroen!
Since you seem to have cythonized the code already, could you add some timings compared to the old code?
Anne
PS: I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.
comment:33 in reply to: ↑ 32 Changed 6 years ago by
Replying to aschilling:
Since you seem to have cythonized the code already, could you add some timings compared to the old code?
My goal certainly was not to gain speed, but I could check...
I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.
I have no idea really. A lot of the code I write for Sage is Cython and it hasn't bothered me.
On the other hand, I on purpose did not Cythonize IntegerListsLexIter
(it's in a Cython source file, but doesn't use any Cython features). So if it makes your life easier, just move that one class to a Python file.
comment:34 in reply to: ↑ 32 Changed 6 years ago by
Replying to aschilling:
you seem to have cythonized the code already
Depends what you mean. I moved the files to a Cython source file and I created extension types instead of Python classes. But I didn't optimize the loops for example.
comment:35 follow-up: ↓ 36 Changed 6 years ago by
There is a small but significant speed-up (again: this is without really trying to speed up the code).
Before:
sage: timeit('list(IntegerListsLex(14, min_part=1))') 5 loops, best of 3: 1.18 s per loop
After:
sage: timeit('list(IntegerListsLex(14, min_part=1))') 5 loops, best of 3: 986 ms per loop
Before:
sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))') 5 loops, best of 3: 1.11 s per loop
After:
sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))') 5 loops, best of 3: 852 ms per loop
comment:36 in reply to: ↑ 35 Changed 6 years ago by
Nice that there is a slight speedup without even trying hard!
FYI, I rebased this branch on top of sage-6.7.beta0 and everything looks clean.
I noticed that in nn.y
in sage.combinat.integer_lists
it currently says
.. WARNING:: this function is likely to disappear in :trac:`17927`.
Briefly looking at 17927
this seems no longer the case.
comment:37 follow-up: ↓ 38 Changed 6 years ago by
- Commit changed from 2be6ad090d51f5eb033cde42a1831ab7d89092ae to 218808287254c4b462eeed12801310ebe4d3d124
Branch pushed to git repo; I updated commit sha1. New commits:
2188082 | Remove references to #17927
|
comment:38 in reply to: ↑ 37 ; follow-ups: ↓ 39 ↓ 40 Changed 6 years ago by
In principle this branch looks good to me; I guess it needs to be rebased to the latest development version, however.
Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?
comment:39 in reply to: ↑ 38 Changed 6 years ago by
Replying to aschilling:
Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?
Sure. The only risk is that errors in Envelope
will make both implementations wrong.
comment:40 in reply to: ↑ 38 ; follow-up: ↓ 41 Changed 6 years ago by
Replying to aschilling:
In principle this branch looks good to me; I guess it needs to be rebased to the latest development version, however.
Despite the red link on Trac, it actually merges cleanly with 6.7.beta1
comment:41 in reply to: ↑ 40 Changed 6 years ago by
Hi Jeroen,
Could you please rebase this patch on the latest development version, so I can do the final review? Nicolas never made a comment, so I assume he is ok with it!
Best,
Anne
comment:42 follow-up: ↓ 43 Changed 6 years ago by
- Commit changed from 218808287254c4b462eeed12801310ebe4d3d124 to 9582a7a5248f609a2d2de02497b4b1d36d9dd96e
Branch pushed to git repo; I updated commit sha1. New commits:
9582a7a | Merge tag '6.7.beta3' into t/18109/ticket/18109
|
comment:43 in reply to: ↑ 42 ; follow-up: ↓ 62 Changed 6 years ago by
I went through the code and the restructuring looks good to me. All tests pass and the documentation builds. One question I have though is about the comparison functions in base.py
and lists.py
. The tests look almost identical. So which one is actually used for IntegerListsLex
?
comment:44 follow-ups: ↓ 45 ↓ 56 ↓ 63 Changed 6 years ago by
Hi Jeroen,
Sorry, I got sidetracked reviewing other tickets. Thanks Anne for the reminder, and for checking the code and the documentation!
I have just been through the code myself (not checking the fine details though).
Altogether, I believe it's correct. Cythonizing the envelope function and making the iterator into a Cython class that depends on little of the Sage infrastructure was in our long term plans, so that's great (I would have done it after the algorithm refactorisation, but since it's done, let's move forward). Generalizing a base IntegerLists? class is useful as well.
On the other hand, is it absolutely necessary to have this separation
between frontend and backend classes? It introduces some complexity
and in fact some duplication (e.g. in the tests). In particular, I am
worried that it might introduce complexity in the construction of new
special case classes based on IntegerListsLex
, when the primary goal
is to make this as trivial as possible. I'd need to actually implement
such new classes to qualify this worry.
If it's just a question of avoiding calls to the element constructor,
could not this be done just locally in __iter__
(returning directly
an IntegerListsLexIter
object, or mapping the constructor on it)?
If it's a question of making the iterator code independent of
Parent
, well, it already is: it only uses the IntegerListsLex
object as a data holder for the constraints, right? As long as we
don't have a specific use case (backed up by benchmarks if speed is
the motivation), there may not be much point in maintaining additional
code for this.
At the very least, there should be some entry point (typically in the
module description) listing all the classes, and clarifying their
respective roles. Maybe slightly more systematic class names could
help; say: IntegerLists
, IntegerListsLex
, IntegerListsBackend
,
IntegerListsLexBackend
.
Some small suggestions:
- some of the cdef attributes could be declared as
int's. E.g.
Envelope.min_length
.
- maybe we could use the occasion to rename the attribute
IntegerListsLexIter.parent
especially if it's not a parent anymore.
- Another small complication is having to handle pickling by hand each
time. At least, would there be a way to implement
XXX.__reduce__
to return(XXX, *)
rather than having to implement anunpickle_XXX
function each time?
Cheers,
Nicolas
comment:45 in reply to: ↑ 44 ; follow-up: ↓ 46 Changed 6 years ago by
Replying to nthiery:
On the other hand, is it absolutely necessary to have this separation between frontend and backend classes? If it's a question of making the iterator code independent of
Parent
It's a question of making the backend class independent of Parent
yes. I think it's good to have a fast simple Cython backend and a proper Sage Parent
front-end.
Some small suggestions:
- some of the cdef attributes could be declared as int's. E.g.
Envelope.min_length
.
I consider that out of the scope of this ticket. These are micro-optimizations which can be done after we have a clearer view of the final design.
- maybe we could use the occasion to rename the attribute
IntegerListsLexIter.parent
especially if it's not a parent anymore.
Sure, fine for me.
- Another small complication is having to handle pickling by hand each time. At least, would there be a way to implement
XXX.__reduce__
to return(XXX, *)
rather than having to implement anunpickle_XXX
function each time?
It's certainly annoying, but that's because Cython cannot automatically (un)pickle extension types. I have to check if it can be simplified.
comment:46 in reply to: ↑ 45 ; follow-ups: ↓ 47 ↓ 48 Changed 6 years ago by
Hi Jeroen,
Replying to jdemeyer:
It's a question of making the backend class independent of
Parent
yes. I think it's good to have a fast simple Cython backend and a proper SageParent
front-end.
I agree its good in theory. But if it makes things more complicated, then it's only good in practice if we have an actual use case.
A main point being: unless we create millions of small IntegerListsLex? objects, I don't see why having it be a (facade) parent makes things slower.
I consider that out of the scope of this ticket. These are micro-optimizations which can be done after we have a clearer view of the final design.
Fair enough.
It's certainly annoying, but that's because Cython cannot automatically (un)pickle extension types. I have to check if it can be simplified.
Thanks!
Cheers,
Nicolas
PS: I forgot to word my appreciation for the hard work!
comment:47 in reply to: ↑ 46 ; follow-up: ↓ 49 Changed 6 years ago by
Replying to nthiery:
Hi Jeroen,
Replying to jdemeyer:
It's a question of making the backend class independent of
Parent
yes. I think it's good to have a fast simple Cython backend and a proper SageParent
front-end.I agree its good in theory. But if it makes things more complicated, then it's only good in practice if we have an actual use case.
A main point being: unless we create millions of small IntegerListsLex? objects, I don't see why having it be a (facade) parent makes things slower.
A use case is in MatrixSpace.__iter__
and I also have others intensive usage for building so-called square tiled surfaces (if you care, look at u/vdelecroix/flat_surfaces-6.5
). I would be grateful if the iterator could be accessible without initializing any parent.
Vincent
comment:48 in reply to: ↑ 46 Changed 6 years ago by
A main point being: unless we create millions of small IntegerListsLex? objects, I don't see why having it be a (facade) parent makes things slower.
Wouldn't this happen if somebody wants, for all n
between 10 and 1000, one instance of a partition of n
satisfying <a specific set of constraints> ?
There is a (trivial) case at least which makes sense to me, i.e. for a fixed k
find some partition of n
into integers between floor(n/k)
and ceil(n/k)
. I agree that this can be done manually without this iterator, but I expect that it can happen with more complicated set of constraints too.
Nathann
comment:49 in reply to: ↑ 47 Changed 6 years ago by
Replying to vdelecroix:
A use case is in
MatrixSpace.__iter__
and I also have others intensive usage for building so-called square tiled surfaces (if you care, look atu/vdelecroix/flat_surfaces-6.5
). I would be grateful if the iterator could be accessible without initializing any parent.
Thanks for the feedback.
Do you mind running a concrete benchmark in one of those situations, comparing a facade parent IntegerListLex
and a non-parent IntegerListsLex
as provided by this ticket? Then we coud evaluate on solid ground what the overhead actually is, and whether it's worth the additional complexity?
comment:50 Changed 6 years ago by
sage: class A: ....: pass ....: sage: class B(Parent): ....: pass ....: sage: timeit("A()") 625 loops, best of 3: 413 ns per loop sage: timeit("B()") 625 loops, best of 3: 13.2 µs per loop
comment:51 follow-up: ↓ 52 Changed 6 years ago by
That's not a fair benchmark, since that's Python, not Cython. The real benchmark is this:
sage: cython("cdef class A(object): pass") sage: cython("from sage.structure.parent cimport Parent\ncdef class B(Parent): pass") sage: timeit("A()", number=10^4, repeat=20) 10000 loops, best of 20: 68.9 ns per loop sage: timeit("B()", number=10^4, repeat=20) 10000 loops, best of 20: 14.5 µs per loop
comment:52 in reply to: ↑ 51 ; follow-up: ↓ 53 Changed 6 years ago by
Replying to jdemeyer:
That's not a fair benchmark, since that's Python, not Cython. The real benchmark is this:
sage: cython("cdef class A(object): pass") sage: cython("from sage.structure.parent cimport Parent\ncdef class B(Parent): pass") sage: timeit("A()", number=10^4, repeat=20) 10000 loops, best of 20: 68.9 ns per loop sage: timeit("B()", number=10^4, repeat=20) 10000 loops, best of 20: 14.5 µs per loop
Thanks Vincent and Jeroen. For fairness, we should also initialize the category. Here it is:
sage: %timeit B() The slowest run took 4.13 times longer than the fastest. This could mean that an intermediate result is being cached 100000 loops, best of 3: 13.3 µs per loop sage: %timeit B(category=C) The slowest run took 7.14 times longer than the fastest. This could mean that an intermediate result is being cached 10000 loops, best of 3: 21.7 µs per loop
Those benchmarks teach us that constructing a Parent
costs 200 times
more than an empty object. Not quite surprising, but it's a start. Now
what I really had in mind was a benchmark in a real use case, to
estimate what's the relative overhead in a typical situation.
Let's take something really trivial:
sage: sage: %timeit IntegerListsLex(n=1, max_length=1).an_element() The slowest run took 6.67 times longer than the fastest. This could mean that an intermediate result is being cached 10000 loops, best of 3: 113 µs per loop
Now we start to have some concrete ground to start discussions: what
we are speaking about is an overhead of 13% in the case of the most
trivial operation. There probably is some room for improvements in the
initialization of IntegerListsLex
objects; but not so much either:
there is some unavoidable option parsing and initialization work to
do. Similarly, there may be room for improvement in the initialization
of a Parent
(I am surprised that initializing the category is
costly, since for a Cython parent it does nothing but set an
attribute). Altogether, the 13% has to be taken with a grain of salt
at this point.
For additional insight, I believe we would be need to have some typical concrete use case, like the one Vincent has for matrices or flat surfaces. And see, then, what's the actual overhead. With that, we will be able to take an informed decision; not just out of fear that "parents are slow".
Vincent, could you pickup such a situation and make a benchmark?
In general I would like to follow this rule of thumb for our enumerated sets:
- On one hand, implement super optimized standalone iterators with bare bone API, whenever there is a proven need for them.
- On the other hand, when modeling an enumerated set, implement it as
a
Parent
s with nice input parsing and a full fledged API; and typically using the above iterators under the hood.
- Avoid half measures with iterables that are not
Parent
.
Here the situation is a bit delicate since we are in between the shores: the iterator needs some non trivial parsing and internal data. Which calls for an intermediate class handling the parsing and the data storage; is this class meant to model the enumerated set itself or not?
All that being said, just make your call. I am fine with additional complexity when there is a proven and worthwhile need for it. And when someone is willing to pay for the maintenance overhead ...
Cheers,
Nicolas
comment:53 in reply to: ↑ 52 ; follow-up: ↓ 54 Changed 6 years ago by
Replying to nthiery:
In general I would like to follow this rule of thumb for our enumerated sets:
- On one hand, implement super optimized standalone iterators with bare bone API, whenever there is a proven need for them.
- On the other hand, when modeling an enumerated set, implement it as a
Parent
s with nice input parsing and a full fledged API; and typically using the above iterators under the hood.
- Avoid half measures with iterables that are not
Parent
.
Why do you think that the current implementation is a "half measure"? I would say it fits your "super optimized standalone iterators with bare bone API" (except that it's currently not yet super optimized, see #18055 and #18056).
comment:54 in reply to: ↑ 53 ; follow-up: ↓ 55 Changed 6 years ago by
Good morning Jeroen,
Replying to jdemeyer:
Why do you think that the current implementation is a "half measure"? I would say it fits your "super optimized standalone iterators with bare bone API" (except that it's currently not yet super optimized, see #18055 and #18056).
Yes, I am very happy with the Cython iterator (and its upcoming
optimizations) and the Parent enumerated sets (IntegerLists*
). What
I am wondering about are the intermediate classes
(IntegerLists*Impl
). What's their precise role? Do we really need
them?
- Are there situations where an iterator is not enough (we really need to model the enumerated set itself), yet we can't afford the overhead of having a Parent?
And / or
- Are the *Impl just utility classes that are mandatory to construct the iterator?
Maybe we should have a short video conference at some point; it would be much easier to discuss!
comment:55 in reply to: ↑ 54 Changed 6 years ago by
Replying to nthiery:
Yes, I am very happy with the Cython iterator (and its upcoming optimizations) and the Parent enumerated sets (
IntegerLists*
). What I am wondering about are the intermediate classes (IntegerLists*Impl
). What's their precise role?
Well, I can see two main use cases:
- if you need to iterate twice over the same set.
- to check containment.
In the future, point counting could be added as an extra algorithm in the Impl
class, not in the iterator.
comment:56 in reply to: ↑ 44 ; follow-up: ↓ 57 Changed 6 years ago by
Replying to nthiery:
- At least, would there be a way to implement
XXX.__reduce__
to return(XXX, *)
rather than having to implement anunpickle_XXX
function each time?
That's not quite possible because of https://groups.google.com/forum/#!topic/sage-devel/limDRGuep34
comment:57 in reply to: ↑ 56 Changed 6 years ago by
That's not quite possible because of https://groups.google.com/forum/#!topic/sage-devel/limDRGuep34
... which was also solved with an 'unpickle' function T_T
comment:58 Changed 6 years ago by
Actually, pickling can be implemented better using __getstate__
and __setstate__
.
comment:59 Changed 6 years ago by
- Commit changed from 9582a7a5248f609a2d2de02497b4b1d36d9dd96e to c9408574da27b31030a8d2eb5c800dc838ae4132
Branch pushed to git repo; I updated commit sha1. New commits:
c940857 | Simplify pickling for IntegerListsImpl
|
comment:60 Changed 6 years ago by
For Envelope
, this is harder since __init__
really does some non-trivial stuff like the sign
handling.
comment:61 Changed 6 years ago by
- Commit changed from c9408574da27b31030a8d2eb5c800dc838ae4132 to d46541fb29ca682163ba62e66e48fc8128f2c825
Branch pushed to git repo; I updated commit sha1. New commits:
d46541f | Fix __richcmp__ doctests
|
comment:62 in reply to: ↑ 43 Changed 6 years ago by
Replying to aschilling:
One question I have though is about the comparison functions in
base.py
andlists.py
. The tests look almost identical.
Right, I fixed the tests.
comment:63 in reply to: ↑ 44 ; follow-up: ↓ 65 Changed 6 years ago by
Replying to nthiery:
Maybe slightly more systematic class names could help; say:
IntegerLists
,IntegerListsLex
,IntegerListsBackend
,IntegerListsLexBackend
.
Do you find ...Backend
more clear than ...Impl
? If you care a lot, I can change it, but it smells like bike-shedding.
comment:64 Changed 6 years ago by
- Commit changed from d46541fb29ca682163ba62e66e48fc8128f2c825 to 5f7a883c07534a0d7861dd4dc46e0b45ae4ff1ed
Branch pushed to git repo; I updated commit sha1. New commits:
5f7a883 | Rename attribute "_parent" to "impl"
|
comment:65 in reply to: ↑ 63 ; follow-up: ↓ 66 Changed 6 years ago by
Replying to jdemeyer:
Replying to nthiery:
Maybe slightly more systematic class names could help; say:
IntegerLists
,IntegerListsLex
,IntegerListsBackend
,IntegerListsLexBackend
.Do you find
...Backend
more clear than...Impl
? If you care a lot, I can change it, but it smells like bike-shedding.
I have a preference for Backend
since we use this vocable elsewhere in Sage (e.g. in the graph library), but don't really care otherwise; you choose. My comment was more about having a consistent naming scheme. It could be enough to e.g. rename IntegerListsImpl_invlex
to IntegerListsLexImpl
.
Thanks by the way for investigating how to better do the pickling!
comment:66 in reply to: ↑ 65 ; follow-up: ↓ 70 Changed 6 years ago by
Replying to nthiery:
Replying to jdemeyer:
Replying to nthiery:
Maybe slightly more systematic class names could help; say:
IntegerLists
,IntegerListsLex
,IntegerListsBackend
,IntegerListsLexBackend
.Do you find
...Backend
more clear than...Impl
? If you care a lot, I can change it, but it smells like bike-shedding.I have a preference for
Backend
since we use this vocable elsewhere in Sage (e.g. in the graph library), but don't really care otherwise; you choose. My comment was more about having a consistent naming scheme. It could be enough to e.g. renameIntegerListsImpl_invlex
toIntegerListsLexImpl
.
Well, I actually prefer IntegerListsImpl_invlex
over IntegerListsLexImpl
.
I intentionally do not want IntegerListsImpl_invlex
to be associated too much with IntegerListsLex
: people should not think that there must be a 1-to-1 correspondence between IntegerLists
classes and IntegerListsImpl
classes.
comment:67 Changed 6 years ago by
After replacing Impl
by Backend
:
Error compiling Cython file: ------------------------------------------------------------ ... IntegerListsBackend.__init__(self, *args, **kwds) self.check = check if self.min_part < 0: raise NotBackendementedError("strictly negative min_part") ^ ------------------------------------------------------------ sage/combinat/integer_lists/invlex.pyx:848:40: undeclared name not builtin: NotBackendementedError
comment:68 Changed 6 years ago by
What is your strategy, defining NotBackendementedError
?
comment:69 Changed 6 years ago by
- Commit changed from 5f7a883c07534a0d7861dd4dc46e0b45ae4ff1ed to 434a15236a27b2d6aef0ab95877aa396e86b1cf8
Branch pushed to git repo; I updated commit sha1. New commits:
434a152 | Rename Impl -> Backend
|
comment:70 in reply to: ↑ 66 ; follow-up: ↓ 71 Changed 6 years ago by
Replying to jdemeyer:
Well, I actually prefer
IntegerListsImpl_invlex
overIntegerListsLexImpl
.I intentionally do not want
IntegerListsImpl_invlex
to be associated too much withIntegerListsLex
: people should not think that there must be a 1-to-1 correspondence betweenIntegerLists
classes andIntegerListsImpl
classes.
I see your point. Still I find useful to give the reader a hint that, in this case, there is such a correspondence. When reading the code, I had to dig around to make sure I was getting it right.
So I'd rather have a consistent naming scheme, and a note to the developers, in the overview documentation of the module, that there need not be in general such a 1-to-1 correspondence.
Cheers,
Nicolas
PS: fun error message :-)
comment:71 in reply to: ↑ 70 Changed 6 years ago by
Replying to nthiery:
I see your point. Still I find useful to give the reader a hint that, in this case, there is such a correspondence.
There is not, even in this case. Note that IntegerListsLex
is just IntegerLists
with a classcall method and a default backend. I can easily create an IntegerLists
instance with a IntegerListsBackend_invlex
backend or an IntegerListsLex
instance with a different backend.
comment:72 Changed 6 years ago by
- Commit changed from 434a15236a27b2d6aef0ab95877aa396e86b1cf8 to 6ae2e2264bf769ca276413d2c1d73d597a93cbbb
Branch pushed to git repo; I updated commit sha1. New commits:
6ae2e22 | Some doc improvement
|
comment:73 Changed 5 years ago by
- Status changed from needs_review to needs_work
needs rebase, does not apply
comment:74 Changed 5 years ago by
Do you have an intention to review the ticket once I rebase it? I'm just asking since there is no point in rebasing if it just continues to bitrot.
comment:75 Changed 5 years ago by
Thanks for asking. The review is on my todo list for the last week of August.
comment:76 Changed 5 years ago by
- Commit changed from 6ae2e2264bf769ca276413d2c1d73d597a93cbbb to 35e2e50d85c85d6a71674d78f41ce2ba52f4c286
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
35e2e50 | Restructure IntegerListsLex code
|
comment:77 Changed 5 years ago by
- Dependencies #18181, #18184 deleted
Rebased and squashed, testing now...
comment:78 Changed 5 years ago by
- Commit changed from 35e2e50d85c85d6a71674d78f41ce2ba52f4c286 to 3ebbb659f034dd17610a6afba03d307ffc7b483e
Branch pushed to git repo; I updated commit sha1. New commits:
3ebbb65 | Workaround for Trac #18192
|
comment:79 Changed 5 years ago by
- Commit changed from 3ebbb659f034dd17610a6afba03d307ffc7b483e to 88a100a762bad34400aca66c52d22255737072a7
Branch pushed to git repo; I updated commit sha1. New commits:
88a100a | Fix one more import of IntegerListsLex
|
comment:80 Changed 5 years ago by
- Status changed from needs_work to needs_review
comment:82 Changed 5 years ago by
Do you actually plan to review this ticket? If you do, then I'll rebase immediately.
comment:83 Changed 5 years ago by
I will review this ticket once rebased.
comment:84 Changed 5 years ago by
- Commit changed from 88a100a762bad34400aca66c52d22255737072a7 to 6fcd10a8384e07df3d3302e9ae1e7c6f6f99217e
Branch pushed to git repo; I updated commit sha1. New commits:
6fcd10a | Merge tag '6.9' into t/18109/ticket/18109
|
comment:85 follow-up: ↓ 87 Changed 5 years ago by
- Branch changed from u/jdemeyer/ticket/18109 to u/tscrim/refactor_integer_lists_lex-18109
- Commit changed from 6fcd10a8384e07df3d3302e9ae1e7c6f6f99217e to 2e5e9691da0c2003d7eb3ddcf6c5c5010281f4dc
- Dependencies set to #15525
- Reviewers set to Travis Scrimshaw
I made some doc tweaks while we are moving everything around. Everything works and since we aren't actually changing the functionality, I'm not looking at that closely. In short, it looks goods.
In writing this comment, I've convinced myself that having a frontend class for each backend will likely be a good thing in practice as it can be used to distinguish different iteration orders without having to look at the backend (which may not be python visible at the end of the day). It also makes it easier to define string representations and natural extensions to subclasses.
In particular, #15525, which we will probably need to tweak some of those changes to support this new framework (and at least correct the imports, which is why I listed it as a dependency).
New commits:
2e5e969 | Some minor reviewer tweaks and doc changes.
|
comment:86 follow-up: ↓ 88 Changed 5 years ago by
Two details:
an set
should bea set
- Why this change? I find the old version more readable.
- good_sum = (nu >= p.min_sum and nu <= p.max_sum) - good_length = (l >= p.min_length and l <= p.max_length) - no_trailing_zeros = (l <= max(p.min_length,0) or mu[-1] != 0) - return good_sum and good_length and no_trailing_zeros + return (nu >= p.min_sum and nu <= p.max_sum # Good sum + and l >= p.min_length and l <= p.max_length # Good length + and (l <= max(p.min_length,0) or mu[-1] != 0)) # No trailing zeros
If this is for performance reasons, you can instead cdef bint
the variables good_sum
, good_length
and no_trailing_zeros
.
comment:87 in reply to: ↑ 85 ; follow-up: ↓ 89 Changed 5 years ago by
Replying to tscrim:
In writing this comment, I've convinced myself that having a frontend class for each backend will likely be a good thing in practice as it can be used to distinguish different iteration orders without having to look at the backend
But why would you need to "distinguish different iteration orders" in the first place?
In the few cases (maybe invlex is the only one?) where the iteration algorithm really matters, I understand that you want a separate front-end. However, it should be possible for a backend to iterate in an arbitrary order.
I really want to see the front-end and back-end as orthogonal: I can see use cases for different front-ends with the same back-end and use cases for different back-ends for the same front-end.
comment:88 in reply to: ↑ 86 Changed 5 years ago by
Replying to jdemeyer:
Two details:
- Why this change? I find the old version more readable.
- good_sum = (nu >= p.min_sum and nu <= p.max_sum) - good_length = (l >= p.min_length and l <= p.max_length) - no_trailing_zeros = (l <= max(p.min_length,0) or mu[-1] != 0) - return good_sum and good_length and no_trailing_zeros + return (nu >= p.min_sum and nu <= p.max_sum # Good sum + and l >= p.min_length and l <= p.max_length # Good length + and (l <= max(p.min_length,0) or mu[-1] != 0)) # No trailing zerosIf this is for performance reasons, you can instead
cdef bint
the variablesgood_sum
,good_length
andno_trailing_zeros
.
It was for performance reasonings, but not the reason you're thinking of: short circuiting.
comment:89 in reply to: ↑ 87 ; follow-up: ↓ 90 Changed 5 years ago by
- Reviewers changed from Travis Scrimshaw to Anne Schilling, Travis Scrimshaw
Replying to jdemeyer:
Replying to tscrim:
In writing this comment, I've convinced myself that having a frontend class for each backend will likely be a good thing in practice as it can be used to distinguish different iteration orders without having to look at the backend
But why would you need to "distinguish different iteration orders" in the first place?
In the few cases (maybe invlex is the only one?) where the iteration algorithm really matters, I understand that you want a separate front-end. However, it should be possible for a backend to iterate in an arbitrary order.
I don't need this flexibility, but it comes as an added bonus. Right now we can check backends explicitly, but this allows us to separate that off into a completely separate package and still retain a differentiation.
I really want to see the front-end and back-end as orthogonal: I can see use cases for different front-ends with the same back-end and use cases for different back-ends for the same front-end.
I can see both use cases as well, but chances are there will be a coupling between base classes and backend. However I don't think it is really useful to continue this discussion because we both have our reasons for liking it. We just need to take care of the additions from #15525 (and the doc tweak above). Do you want me to do that (since that was my ticket) or will you?
comment:90 in reply to: ↑ 89 ; follow-up: ↓ 92 Changed 5 years ago by
comment:91 Changed 5 years ago by
- Commit changed from 2e5e9691da0c2003d7eb3ddcf6c5c5010281f4dc to 8b96bb5cca7f3a76a35b75e99406b7cb1c4cb0e5
Branch pushed to git repo; I updated commit sha1. New commits:
7982d3a | Iniital fix and added regular partitions.
|
4143c96 | Merge branch 'develop' into public/combinat/partitions_constraints-15525
|
1adb368 | Added deprecation to IntegerListLex global_options arg. Fixed doctests.
|
c477d6c | Merge branch 'public/combinat/partitions_constraints-15525' of trac.sagemath.org:sage into public/combinat/partitions_constraints-15525
|
03dcda9 | Merge branch 'public/combinat/partitions_constraints-15525' of trac.sagemath.org:sage into public/combinat/partitions_constraints-15525
|
82f4fd5 | Merge branch 'public/combinat/partitions_constraints-15525' of trac.sagemath.org:sage into public/combinat/partitions_constraints-15525
|
0cc28a0 | Fixing some things.
|
1ec481d | Addressing Kevin's comments.
|
7760c71 | Merge branch 'public/combinat/partitions_constraints-15525' of trac.sagemath.org:sage into u/tscrim/refactor_integer_lists_lex-18109
|
8b96bb5 | Adding changes from #15525.
|
comment:92 in reply to: ↑ 90 Changed 5 years ago by
comment:93 Changed 5 years ago by
- Status changed from needs_review to positive_review
comment:94 Changed 5 years ago by
- Status changed from positive_review to needs_work
Doctests fail, try with the next beta
comment:95 Changed 5 years ago by
- Branch changed from u/tscrim/refactor_integer_lists_lex-18109 to u/jdemeyer/refactor_integer_lists_lex-18109
comment:96 Changed 5 years ago by
- Commit changed from 8b96bb5cca7f3a76a35b75e99406b7cb1c4cb0e5 to 824498345a881a9ebc21be694c3522d8687e275e
I'm testing this now...
New commits:
8244983 | Merge tag '6.10.beta0' into t/18109/refactor_integer_lists_lex-18109
|
comment:97 Changed 5 years ago by
- Commit changed from 824498345a881a9ebc21be694c3522d8687e275e to 65025ab68743a1dc6c30eb73b6db4e4283f49d4a
Branch pushed to git repo; I updated commit sha1. New commits:
65025ab | Fix IntegerListsLex import
|
comment:98 Changed 5 years ago by
- Status changed from needs_work to needs_review
comment:100 Changed 5 years ago by
- Branch changed from u/jdemeyer/refactor_integer_lists_lex-18109 to 65025ab68743a1dc6c30eb73b6db4e4283f49d4a
- Resolution set to fixed
- Status changed from positive_review to closed
IntegerListsLex
*iterator* as a standalone tool that depends on nothing butPython
/Cython
". In fact this could go as far as making it a standalone library in e.g. C++.__classcall__
we need to wait until the end of the deprecation period. To removeglobal_options
we need to wait for the subclasses using it to be refactored to not impose this burden onIntegerListsLex
.