Opened 5 years ago

Last modified 5 years ago

#19895 needs_info enhancement

extend lazy lists: various improvements and generalizations, new sublists

Reported by: dkrenn
Owned by:
Priority: major
Milestone: sage-7.0
Component: misc
Keywords:
Cc: mantepse
Merged in:
Authors: Daniel Krenn
Reviewers:
Report Upstream: N/A
Work issues:
Branch: u/dkrenn/extend_lazy_lists
Commit: 654351b51babe7956b272a521e1b49e1f2118843
Dependencies: #16137, #21164
Stopgaps:

Description (last modified by dkrenn)

This ticket extends sage.misc.lazy_list by the following:

  • Sublists dropwhile and takewhile (they behave like their counterparts in itertools, but share the cache of the list they are derived from).
  • Simplify the code in __repr__ and allow customization.
  • Create sublists/slices of derived instances (e.g. if a class Z inherits from lazy_list_generic, then a slice of an instance of Z can now be an instance of Z as well).
  • Improve documentation
  • Various smaller code improvements and simplifications.
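
For reference, the itertools semantics that the first item refers to can be seen in a few lines. This uses only the standard itertools functions, not the methods added on the branch; the predicate is_small and the sample data are made up for illustration.

```python
from itertools import count, dropwhile, islice, takewhile


def is_small(n):
    # Hypothetical predicate, chosen only for this illustration.
    return n < 3


data = [0, 1, 2, 5, 1, 7]

# takewhile keeps the leading run of elements that satisfy the predicate.
head = list(takewhile(is_small, data))

# dropwhile skips that leading run and yields everything after it,
# including later elements that satisfy the predicate again.
tail = list(dropwhile(is_small, data))

# Both are lazy, so they also work on infinite input.
first_after = list(islice(dropwhile(is_small, count()), 4))
```

Here head is [0, 1, 2], tail is [5, 1, 7] and first_after is [3, 4, 5, 6]. What itertools does not provide is a cache shared with the original list, which is the point of the sublists proposed here.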

Note: This is also meant as a preparation for (infinite) homogenous sequences #19896.

Change History (14)

comment:1 Changed 5 years ago by dkrenn

  • Branch set to u/dkrenn/extend_lazy_lists

comment:2 Changed 5 years ago by dkrenn

  • Commit set to 422f2334a00dc0b40d3c9db27b034ab830350d94
  • Dependencies set to #16137
  • Description modified (diff)

Last 10 new commits:

30223ec Merge branch 't/16137/16137' into u/dakrenn/extend_lazy_lists
7eb5af9 changes due to move of module
a2cf2dc method properties
e9603d3 method change_class
9c2f77b use and test change_class
f897b12 update pxd description of parameters
8c1261d prepare fitting for dropwhile and takewhile
089d9d7 rewrite dropwhile, takewhile
ff1c04e add methods to directly call dropwhile, takewhile
422f233 100% coverage, write docstrings and extend existing

comment:3 Changed 5 years ago by dkrenn

  • Summary changed from extend lazy lists: various improvments and generalizations, new sublists to extend lazy lists: various improvements and generalizations, new sublists

comment:4 Changed 5 years ago by mantepse

  • Cc mantepse added

comment:5 Changed 5 years ago by git

  • Commit changed from 422f2334a00dc0b40d3c9db27b034ab830350d94 to fb86dbf1f6e80a32196f822ac6b9a826c4c8a4a9

Branch pushed to git repo; I updated commit sha1. New commits:

72d23b3 pass additional keyword arguments (for inherting class) to instantiation
a879daa small code simplification
6e30ee7 make cls_kwds work correctly
a02045e remove unnecessary import
fb86dbf extend a doctest: test name attribute

comment:6 Changed 5 years ago by git

  • Commit changed from fb86dbf1f6e80a32196f822ac6b9a826c4c8a4a9 to 3b63c4792714b2e57dda38b6d9a69e2cf663ba94

Branch pushed to git repo; I updated commit sha1. New commits:

1136a64 fit start on iteration
8e4a562 add a doctest in dropwhile
a544d6d correct nasty bug in takewhile (complete rewrite of method)
3b63c47 simplify __repr__ code

comment:7 Changed 5 years ago by dkrenn

  • Status changed from new to needs_review

Ticket is ready for reviewing.

At the top there is

#empty_lazy_list = lazy_list_generic(initial_values=[],
#                                    start=0, stop=0, step=1)  # ... does not work

which is commented out because it does not work. The following workaround is used instead:

empty_lazy_list = lazy_list_generic.__new__(lazy_list_generic)
empty_lazy_list.start = 0
empty_lazy_list.stop = 0
empty_lazy_list.step = 1
empty_lazy_list.cache = []
...

It simply sets all the attributes manually. Uncommenting the direct call instead gives

Traceback (most recent call last):
...
  File "/local/dakrenn/sage/7.0.beta3/local/lib/python2.7/site-packages/sage/combinat/words/morphism.py", line 144, in <module>
    from sage.misc.lazy_list import lazy_list
ImportError: cannot import name lazy_list

Any ideas? (I am happy with any solution, including simply deleting the commented-out lines.)
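
As an illustration of the workaround pattern only: the sketch below allocates an instance with __new__, so that __init__ never runs, and then sets the state by hand. LazyListSketch and its init_calls counter are hypothetical pure-Python stand-ins; the real lazy_list_generic is a Cython class, and this sketch does not try to explain the ImportError above.

```python
class LazyListSketch:
    """Hypothetical stand-in for lazy_list_generic."""

    init_calls = 0  # counts how often __init__ has run

    def __init__(self, initial_values, start=0, stop=None, step=1):
        type(self).init_calls += 1
        self.cache = list(initial_values)
        self.start = start
        self.stop = stop
        self.step = step


# Allocate the object without calling __init__ ...
empty = LazyListSketch.__new__(LazyListSketch)

# ... and fill in every attribute manually, as in the workaround.
empty.start = 0
empty.stop = 0
empty.step = 1
empty.cache = []
```

After this, empty is a regular instance of the class, and LazyListSketch.init_calls is still 0 because the constructor logic was bypassed.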

comment:8 follow-up: Changed 5 years ago by vdelecroix

  • Status changed from needs_review to needs_info

Hello,

This does not look like an improvement to me. Lazy lists aimed to be simple. You are introducing nine new attributes. If you want a fancy_list, just inherit.

On the other hand, you can make only one object for dropwhile/takewhile. It is not good to multiply the number of classes in this file. I am already not happy that we have 4. I am also not happy with the fact that start might change. Could you make this computation in the constructor of dropwhile and avoid complicating the code of the generic list?

What is the use case of dropwhile/takewhile that is not already covered by itertools.dropwhile or itertools.takewhile?

comment:9 in reply to: ↑ 8 Changed 5 years ago by dkrenn

Hello Vincent,

Replying to vdelecroix:

This does not look like an improvement to me.

Not so good ;) So we need to discuss...

Lazy lists aimed to be simple.

Ok. What exactly does simple mean for you? I ask because it makes a difference whether simple means only a few methods and little code, or something else. What I think: lazy lists are meant as flexible data structures to be used in various other classes (words, sequences, species, lazy power series, ...). Thus there should be some flexibility and freedom.

You are introducing nine new attributes. If you want a fancy_list, just inherit.

I am afraid it is not that easy. The new formatting attributes should provide an easy way to change the appearance of the representation string. For words (no idea if there is a plan to use this), this could be word: 10111010100.... For sequences, the existing Sequence class for finite sequences offers a way to add a newline after each comma in the formatting; this should be possible with HomogenousSequences as well, which are in fact lazy lists. Special sequences (and words as well) might have a name: fibonacci sequence: 0, 1, 1, 2, 3, 5, ....

However, I can think of alternative ways to implement these formatting features. What comes to my mind right now is a formatting function to which these attributes can be passed. Thus they don't have to be stored in the class, and one can change them by overriding _repr_, while still not having to rewrite nearly identical code every time. The only formatting attribute left in the class would then be name.

Since I want slicing in inherited classes to work correctly (i.e. a slice of a HomogenousSequence should again be a HomogenousSequence and not a plain lazy_list_generic), the attribute cls together with the corresponding keywords is needed.

Concerning a class fancy_list: I do not see the point of having another class that has the same technical functionality as lazy_list_generic and only offers more possibilities to work with (such as changing its appearance if needed).

On the other hand, you can make only one object for dropwhile/takewhile.

What is the disadvantage of having two classes? One class per feature (and those two are distinct features, although their names do not reflect this) is usually a good design choice.

It is not good to multiply the number of classes in this file. I am already not happy that we have 4.

Can you tell me why you are unhappy with 4 classes? I only see the advantage that separate features go into separate classes, which makes the code easier to read and understand. (Indeed, I needed a lot of time to understand the current generic class, which can track a master lazy list, do slicing and all the other basic stuff at once; however, I am now fine with this design choice.)

I am also not happy with the fact that start might change. Could you make this computation in the constructor of dropwhile and avoid complicating the code of the generic list?

stop is already allowed to change, so why should start not be allowed to change? Indeed, I thought about doing the computations in the constructor, but I believe this is not a desired feature, because then something already happens even if no element is accessed. The main advantage of iterator-like lazy lists is that elements are considered only when needed and not before.
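
The difference between fixing start in the constructor and fixing it on first access can be sketched as follows. DropWhileSketch is a hypothetical class written for this illustration, not the one on the branch; it only supports non-negative integer indices and assumes the input eventually fails the predicate.

```python
from itertools import count


class DropWhileSketch:
    """Hypothetical lazy view whose start is determined on first access."""

    def __init__(self, predicate, iterable):
        self.predicate = predicate
        self.iterator = iter(iterable)
        self.cache = []
        self.start = None    # unknown until an element is requested
        self.consumed = 0    # number of elements pulled from the input

    def _next_value(self):
        self.consumed += 1
        return next(self.iterator)

    def _fit_start(self):
        # Skip the leading elements only when somebody actually asks.
        if self.start is None:
            value = self._next_value()
            while self.predicate(value):
                value = self._next_value()
            self.cache.append(value)
            self.start = self.consumed - 1

    def __getitem__(self, index):
        self._fit_start()
        while len(self.cache) <= index:
            self.cache.append(self._next_value())
        return self.cache[index]


view = DropWhileSketch(lambda n: n < 1000, count())
consumed_before = view.consumed   # nothing has been evaluated yet
first = view[0]                   # now the first 1000 elements are skipped
```

Creating the view costs nothing (consumed_before is 0); only the access view[0] evaluates the predicate and sets start to 1000. Doing the same work in the constructor would evaluate those 1000 elements even if the view were never used.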

What is the use case of dropwhile/takewhile that is not already covered by itertools.dropwhile or itertools.takewhile?

Sharing caches (and other information). As I understand it, that is one of the main points of having lazy lists.
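
A minimal sketch of what sharing a cache means, under the assumption that sublists are views with an offset into one common store. SharedCache and View are hypothetical names invented for this illustration and do not reflect the layout of lazy_list_generic.

```python
from itertools import count


class SharedCache:
    """Hypothetical cache that pulls from an iterator only on demand."""

    def __init__(self, iterable):
        self.iterator = iter(iterable)
        self.values = []
        self.pulled = 0  # how many elements were taken from the iterator

    def get(self, index):
        while len(self.values) <= index:
            self.values.append(next(self.iterator))
            self.pulled += 1
        return self.values[index]


class View:
    """Hypothetical sublist: an offset into a cache shared with other views."""

    def __init__(self, cache, start):
        self.cache = cache
        self.start = start

    def __getitem__(self, index):
        return self.cache.get(self.start + index)


squares = SharedCache(n * n for n in count())
a = View(squares, 0)
b = View(squares, 3)

x = b[2]                      # forces elements 0..5 into the shared cache
pulled_after_x = squares.pulled
y = a[4]                      # already cached, the generator is not advanced
```

Here x is 25 and y is 16, and squares.pulled stays at 6 after the second access. Two independent itertools iterators over the same one-shot generator cannot do this without an extra buffering step.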

Looking forward to your answers/comments.

Best, Daniel

comment:10 follow-up: Changed 5 years ago by vdelecroix

Hi Daniel,

Sorry for the long delay.

Simple for me means: few attributes and few methods. No room for customization. Changing the appearance of a lazy list is certainly useful but adding +5 attributes to all lazy lists is a big waste. However, I fully agree that a better representation code would be useful. What about writing a customizable method:

def str(self, prefix, start, max_nb_elements, separator, ...):

That would overload the class but not the objects. Hence a memory footprint close to zero.
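
A rough sketch of this idea, with several assumptions: the method name format_str, the opening, closing and more parameters, and all default values are invented here. Only prefix, max_nb_elements and separator come from the signature above, and its start parameter is left out because its meaning is not spelled out.

```python
class LazyListSketch:
    """Hypothetical stand-in; only the formatting part is sketched."""

    def __init__(self, values):
        self.values = list(values)

    def format_str(self, prefix="lazy list ", max_nb_elements=5,
                   separator=", ", opening="[", closing="]", more="..."):
        # All options are arguments, so instances store nothing extra.
        shown = [repr(v) for v in self.values[:max_nb_elements]]
        if len(self.values) > max_nb_elements:
            shown.append(more)
        return prefix + opening + separator.join(shown) + closing

    def __repr__(self):
        return self.format_str()


class WordSketch(LazyListSketch):
    """Hypothetical subclass that only changes the appearance."""

    def __repr__(self):
        return self.format_str(prefix="word: ", separator="",
                               opening="", closing="")


plain = LazyListSketch(range(7))
word = WordSketch([1, 0, 1, 1, 1, 0, 1, 0])
```

repr(plain) gives 'lazy list [0, 1, 2, 3, 4, ...]' and repr(word) gives 'word: 10111...', while each instance still stores nothing but its values.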

I am against the multiplication of classes since this can go on forever. I would be happier if each class actually had a concrete use case (i.e. was used in at least 2 other classes). Perhaps you have some in mind for dropwhile?

Best, Vincent

comment:11 Changed 5 years ago by git

  • Commit changed from 3b63c4792714b2e57dda38b6d9a69e2cf663ba94 to 654351b51babe7956b272a521e1b49e1f2118843

Branch pushed to git repo; I updated commit sha1. New commits:

59b32fc Merge commit 'f7960c1038c1498eca8995a3e39e7c3a4f76e99e' of trac.sagemath.org:sage into t/19895/extend_lazy_lists
23b114d Merge tag '7.1.rc0' into t/19895/extend_lazy_lists
654351b fixup after merge

comment:12 Changed 5 years ago by dkrenn

Merged in latest development version 7.1.rc0.

comment:13 in reply to: ↑ 10 Changed 5 years ago by dkrenn

Hello Vincent,

Replying to vdelecroix:

Simple for me means: few attributes and few methods. No room for customization. Changing the appearance of a lazy list is certainly useful but adding +5 attributes to all lazy lists is a big waste. However, I fully agree that a better representation code would be useful. What about writing a customizable method:

def str(self, prefix, start, max_nb_elements, separator, ...):

That would overload the class but not the objects. Hence a memory footprint close to zero.

I (now) agree that adding a lot of attributes is a waste, and I agree that a customizable method fits better. I've refactored this part of the code into #21164. (Note that I also included a couple of small bugfixes in #21164, as they were fixed on the fly in the branch attached to this ticket.)

comment:14 Changed 5 years ago by dkrenn

  • Dependencies changed from #16137 to #16137, #21164