Opened 5 years ago
Closed 5 years ago
#19319 closed enhancement (fixed)
iterator over products on diagonals a la Cantor
Reported by:  dkrenn  Owned by:  

Priority:  major  Milestone:  sage6.9 
Component:  misc  Keywords:  
Cc:  behackl, cheuberg, ncohen  Merged in:  
Authors:  Vincent Delecroix  Reviewers:  Daniel Krenn, Clemens Heuberger 
Report Upstream:  N/A  Work issues:  
Branch:  96c0366 (Commits)  Commit:  96c03668100f8fc42e60ae3f7d7a715d257f41cb 
Dependencies:  Stopgaps: 
Description (last modified by )
Add an iterator, which goes over the elements of a cartesian product of (possible infinite) inputs along the diagonals like the Cantor pairing function.
This was initially factored out from #19048.
Change History (44)
comment:1 Changed 5 years ago by
 Branch set to u/dkrenn/product_cantor_pairing
comment:2 Changed 5 years ago by
 Commit set to 17229c620b4aff47161b725de2ebb4f890cfc2af
 Status changed from new to needs_review
comment:3 Changed 5 years ago by
O_o
comment:4 followup: ↓ 7 Changed 5 years ago by
Yeah, I also believed that it should be shorter.
Do you really need this Kantor enumeration, or perhaps you don't care for as long as the product of two infinite iterators works correctly?
comment:5 followups: ↓ 9 ↓ 11 Changed 5 years ago by
If you don't care, I just gave it a try:
def product(A,B): B = iter(B) lb = [] b_done = False for a in A: try: if not b_done: lb.append(B.next()) except StopIteration: b_done = True for b in lb: yield (a,b)
Does the job.
Nathann
comment:6 followup: ↓ 20 Changed 5 years ago by
About your 5 copy/paste: if you want to link every function to all others, I think the best is to have an index of methods at the head of the module and add a link toward the module.
By the way, if you feel like reviewing some doc patch: #19061
Nathann
comment:7 in reply to: ↑ 4 ; followup: ↓ 8 Changed 5 years ago by
Replying to ncohen:
Yeah, I also believed that it should be shorter.
:)
Do you really need this Kantor enumeration, or perhaps you don't care for as long as the product of two infinite iterators works correctly?
I care (at least a bit)...so I would leave it as it is.
comment:8 in reply to: ↑ 7 Changed 5 years ago by
I care (at least a bit)...so I would leave it as it is.
Could you say why? I do not see the point of having a complicated code in Sage if ten lines provide the feature.
comment:9 in reply to: ↑ 5 ; followup: ↓ 10 Changed 5 years ago by
Replying to ncohen:
If you don't care, I just gave it a try:
def product(A,B): B = iter(B) lb = [] b_done = False for a in A: try: if not b_done: lb.append(B.next()) except StopIteration: b_done = True for b in lb: yield (a,b)
Your code does not give all pairs...
comment:10 in reply to: ↑ 9 Changed 5 years ago by
Your code does not give all pairs...
Yeah, only works for infinite sets.
comment:11 in reply to: ↑ 5 Changed 5 years ago by
Replying to ncohen:
If you don't care, I just gave it a try:
def product(A,B): B = iter(B) lb = [] b_done = False for a in A: try: if not b_done: lb.append(B.next()) except StopIteration: b_done = True for b in lb: yield (a,b)
sage: list(product(srange(2), ['a', 'b', 'c'])) [(0, 'a'), (1, 'a'), (1, 'b')]
comment:12 followup: ↓ 13 Changed 5 years ago by
Yeah true, it's really bad. Soooooo I need two lists. Wanted to avoid that.
The question remains, though: is there a reason to insist on Kantor ordering, or would any other order do the job?
comment:13 in reply to: ↑ 12 Changed 5 years ago by
Replying to ncohen:
The question remains, though: is there a reason to insist on Kantor ordering, or would any other order do the job?
In #19048 it does not really (except from aesthetic aspects) matter (but it has to deal with the finite case as well and it would be nice if then all pairs would occur). But I use the same code in some other Sage projects as well (not public (yet)), where it matters: There I need an order my the sums of the indices, which is Cantor. Thus having this tool in SageMath would be great.
comment:14 Changed 5 years ago by
Okay. Longer than the first attempt, though
def product(A,B): A = iter(A) B = iter(B) list_a = [] list_b = [] a_nonempty = b_nonempty = True while a_nonempty or b_nonempty: if a_nonempty: try: list_a.append(A.next()) except StopIteration: a_nonempty = False else: for b in list_b: yield (list_a[1],b) if b_nonempty: try: list_b.append(B.next()) except StopIteration: b_nonempty = False else: for a in list_a: yield (a, list_b[1])
comment:15 Changed 5 years ago by
ARgggg.. Fails too >_<
comment:16 Changed 5 years ago by
Oh no, false alarm. Seems ok.
comment:17 followup: ↓ 19 Changed 5 years ago by
A few comments on Daniel's code:
 Does not terminate if
A
is empty andB
is infinite, thus I'd simply try to yield A[0], B[0] initially and then starts
at1
. try:
except:
block initer_as_list
can be removed
comment:18 Changed 5 years ago by
 Commit changed from 17229c620b4aff47161b725de2ebb4f890cfc2af to 70185c3a13f573cf1eacbec2eb2122082e6c375f
comment:19 in reply to: ↑ 17 Changed 5 years ago by
Replying to cheuberg:
A few comments on Daniel's code:
 Does not terminate if
A
is empty andB
is infinite, thus I'd simply try to yield A[0], B[0] initially and then starts
at1
.try:
except:
block initer_as_list
can be removed
Both changed.
comment:20 in reply to: ↑ 6 ; followup: ↓ 22 Changed 5 years ago by
Replying to ncohen:
About your 5 copy/paste: if you want to link every function to all others, I think the best is to have an index of methods at the head of the module and add a link toward the module.
I think having SEEALSO blocks with links pointing directly to related functions/classes is a very useful thing (There have been some discussions on sagedevel about this.); it makes it easy for the user to find functions without a long search (Finding the "correct" function is definitly a problem for Sage users have...)
comment:21 followup: ↓ 23 Changed 5 years ago by
I have an alternative implementation (pushed as u/cheuberg/product_cantor_pairing
):
from itertools import count from sage.rings.infinity import infinity A = iter(A) B = iter(B) max_A = infinity max_B = infinity cache_A = [] cache_B = [] for s in count(): if s <= max_A: try: cache_A.append(next(A)) except StopIteration: max_A = s1 if s <= max_B: try: cache_B.append(next(B)) except StopIteration: max_B = s1 if s > max_A + max_B or max_A < 0 or max_B < 0: return for i in range(max(0, smax_B), min(s, max_A)+1): yield cache_A[i], cache_B[si]
comment:22 in reply to: ↑ 20 ; followup: ↓ 25 Changed 5 years ago by
I think having SEEALSO blocks with links pointing directly to related functions/classes is a very useful thing (There have been some discussions on sagedevel about this.); it makes it easy for the user to find functions without a long search (Finding the "correct" function is definitly a problem for Sage users have...)
You got me wrong: I did not ask you to remove the seealso block. I proposed to have a
SEEALSO:: For related functions, refer to the module's documentation
with a link of course.
Linking n functions to each other requires n*(n1) different links. If a function is moved, if a function is added, that requires much more work. Additionally, it possible that somebody *adding* a function may forget to update the others, which will not happen if we have an index of function at the head of the module.
Nathann
comment:23 in reply to: ↑ 21 Changed 5 years ago by
I have an alternative implementation (pushed as
u/cheuberg/product_cantor_pairing
):
+1 for the abscence of private class.
Nathann
comment:24 Changed 5 years ago by
 Branch changed from u/dkrenn/product_cantor_pairing to u/cheuberg/product_cantor_pairing
 Commit changed from 70185c3a13f573cf1eacbec2eb2122082e6c375f to d4dec2be9048e1b8fc9de246fe5c698d3676a13a
New commits:
d4dec2b  Trac #19319: alternative implementation

comment:25 in reply to: ↑ 22 Changed 5 years ago by
Replying to ncohen:
You got me wrong: I did not ask you to remove the seealso block. I proposed to have a
SEEALSO:: For related functions, refer to the module's documentationwith a link of course.
I understood.
Linking n functions to each other requires n*(n1) different links. If a function is moved, if a function is added, that requires much more work. Additionally, it possible that somebody *adding* a function may forget to update the others, which will not happen if we have an index of function at the head of the module.
Having these links directly in the SEEALSO
block of the function helps the user...and this ranks higher to me than what a developer has to add when moving something around (and of course there is warnlinks
which alerts when something is not where it should be). When adding a new function it is fine that not all SEEALSO
blocks contain it; maybe only the close related ones. In sage.misc.mrange
there are just a few functions and all those should IMHO be linked. In larger files I would *not* link all in all, but only build "cliques" of very related functions (but this depends on the functions, of course).
comment:26 Changed 5 years ago by
 Branch changed from u/cheuberg/product_cantor_pairing to u/dkrenn/product_cantor_pairing
comment:27 followup: ↓ 28 Changed 5 years ago by
 Commit changed from d4dec2be9048e1b8fc9de246fe5c698d3676a13a to fde8e6d09f41645ec4faf9d071cf60c7b35ef9e3
 Reviewers set to Daniel Krenn
Alternative implementation looks good to be (and all tests pass).
New commits:
fde8e6d  minor changes to code: spacings, PEP8, remove comment

comment:28 in reply to: ↑ 27 Changed 5 years ago by
comment:29 followup: ↓ 30 Changed 5 years ago by
What is the point of having it only for pairs? What if I want to iterate ZZ x ZZ x ZZ
? Using towers of iterators is definitely not a solution (hard to write/use and slow).
comment:30 in reply to: ↑ 29 ; followup: ↓ 31 Changed 5 years ago by
Replying to vdelecroix:
What is the point of having it only for pairs? What if I want to iterate
ZZ x ZZ x ZZ
? Using towers of iterators is definitely not a solution (hard to write/use and slow).
Short answer: iterating over pairs is what we need right now.
Iteration over the product of n iterators will, as far as I can see, always need some recursive algorithm. I do not know whether avoiding iterators internally would make that much of a difference with respect of speed; IMHO it will not make it more readable.
What about replacing product_cantor_pairing(A, B)
by product_cantor_pairing([A, B])
and implementing products of >= 3
iterators by a recursive call to product_cantor_pairing
(and flattening the results) for the time being?
comment:31 in reply to: ↑ 30 ; followup: ↓ 32 Changed 5 years ago by
Replying to cheuberg:
Replying to vdelecroix:
What is the point of having it only for pairs? What if I want to iterate
ZZ x ZZ x ZZ
? Using towers of iterators is definitely not a solution (hard to write/use and slow).Short answer: iterating over pairs is what we need right now.
I perfectly understand.
Iteration over the product of n iterators will, as far as I can see, always need some recursive algorithm. I do not know whether avoiding iterators internally would make that much of a difference with respect of speed; IMHO it will not make it more readable.
Nope. If you look at the code of product
in Python there is no recursion involved. It 70 lines long, with a lot of spaces and written in pure C.
What about replacing
product_cantor_pairing(A, B)
byproduct_cantor_pairing([A, B])
and implementing products of>= 3
iterators by a recursive call toproduct_cantor_pairing
(and flattening the results) for the time being?
Is that the canonical order you want for a triple?
comment:32 in reply to: ↑ 31 Changed 5 years ago by
Replying to vdelecroix:
Replying to cheuberg:
Iteration over the product of n iterators will, as far as I can see, always need some recursive algorithm. I do not know whether avoiding iterators internally would make that much of a difference with respect of speed; IMHO it will not make it more readable.
Nope. If you look at the code of
product
in Python there is no recursion involved. It 70 lines long, with a lot of spaces and written in pure C.
for infinite iterators, I suppose that things get somewhat trickier.
What about replacing
product_cantor_pairing(A, B)
byproduct_cantor_pairing([A, B])
and implementing products of>= 3
iterators by a recursive call toproduct_cantor_pairing
(and flattening the results) for the time being?Is that the canonical order you want for a triple?
Well, I am not really eager for ordering triples anyway.
But I imagine that having
product_cantor_pairing([A, B, C])
returning the elements in the same order as
product_cantor_pairing([product_cantor_pairing([A, B]), C])
(or the other way around, i.e., product_cantor_pairing([A, product_cantor_pairing([B, C])])
) would make sense.
comment:33 followup: ↓ 39 Changed 5 years ago by
 Branch changed from u/dkrenn/product_cantor_pairing to u/vdelecroix/19319
 Commit changed from fde8e6d09f41645ec4faf9d071cf60c7b35ef9e3 to c655f9ffb0ad9f0bfdaf712a4becb2e9eb9c1a70
26 lines and no recursion.
Vincent
New commits:
c655f9f  Trac 19319: Cantor iteration of cartesian products

comment:34 Changed 5 years ago by
 Commit changed from c655f9ffb0ad9f0bfdaf712a4becb2e9eb9c1a70 to 4a52a845469078b5418937e8927d473e8b3d2ec6
Branch pushed to git repo; I updated commit sha1. New commits:
4a52a84  Trac 19319: fix doctests

comment:35 Changed 5 years ago by
 Description modified (diff)
 Summary changed from iterator over pairs on diagonals a la Cantor pairing to iterator over products on diagonals a la Cantor
comment:36 followup: ↓ 40 Changed 5 years ago by
Perhaps consider using sage.misc.lazy_list.lazy_list
as a cache? That should save you making explicit appends. You could even get slightly better memory use for cartesian powers (a common case) by sharing caches.
comment:37 Changed 5 years ago by
 Branch changed from u/vdelecroix/19319 to u/cheuberg/19319
comment:38 Changed 5 years ago by
 Commit changed from 4a52a845469078b5418937e8927d473e8b3d2ec6 to 1fee7226728e5ec09ffdeb1acdb59c2aad1dc7d5
comment:39 in reply to: ↑ 33 Changed 5 years ago by
 Reviewers changed from Daniel Krenn to Daniel Krenn, Clemens Heuberger
Replying to vdelecroix:
26 lines and no recursion.
thanks  however, using IntegerListsLex
amounts to a little cheating with respect to the length of this routine ;)
.
I added three minor commits.
Any particular reason that you return lists instead of tuples?
comment:40 in reply to: ↑ 36 ; followup: ↓ 41 Changed 5 years ago by
Replying to cheuberg:
Replying to vdelecroix:
26 lines and no recursion.
thanks  however, using
IntegerListsLex
amounts to a little cheating with respect to the length of this routine;)
.
Sure. But the moto is "not reinvent the wheel".
Any particular reason that you return lists instead of tuples?
Good question. itertools
used to output tuples. But sometimes you want to play with the result (like appending some more elements). For me it is fine either way.
Replying to nbruin:
Perhaps consider using
sage.misc.lazy_list.lazy_list
as a cache? That should save you making explicit appends. You could even get slightly better memory use for cartesian powers (a common case) by sharing caches.
Using lazy list you would also need to catch IndexError
or so until we use the Cython API I would not go with it.
I will add a trivial commit for handling sharing of slices (very good idea!).
Vincent
comment:41 in reply to: ↑ 40 Changed 5 years ago by
Replying to vdelecroix:
Replying to cheuberg:
Any particular reason that you return lists instead of tuples?
Good question.
itertools
used to output tuples. But sometimes you want to play with the result (like appending some more elements). For me it is fine either way.
I'd actually prefer tuples, for consistency with itertools
and the other methods in this module.
comment:42 Changed 5 years ago by
 Branch changed from u/cheuberg/19319 to u/vdelecroix/19319
 Commit changed from 1fee7226728e5ec09ffdeb1acdb59c2aad1dc7d5 to 96c03668100f8fc42e60ae3f7d7a715d257f41cb
Hi Clemens,
I added a commit on top of yours for returning tuples and accepting a repeat
argument (with the very same specification as for itertools.product
).
Vincent
New commits:
96c0366  Trac 19319: return tuples + repeat argument

comment:43 Changed 5 years ago by
 Status changed from needs_review to positive_review
Code is fine, doctests pass, documentation builds. Thanks.
comment:44 Changed 5 years ago by
 Branch changed from u/vdelecroix/19319 to 96c03668100f8fc42e60ae3f7d7a715d257f41cb
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
copy code from #19048
rename to product_cantor_pairing
fix import in doctest
improve docstring
doctest with infinite iterator inputs
add seealso blocks
extend AUTHROS