Opened 7 years ago
Last modified 5 years ago
#15673 needs_work enhancement
major improvements to lazy power series
Reported by: | mhansen | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-7.0 |
Component: | combinatorics | Keywords: | LazyPowerSeries,, days57 |
Cc: | mantepse, vdelecroix | Merged in: | |
Authors: | Mike Hansen | Reviewers: | Ralf Stephan, Matthieu Dien |
Report Upstream: | N/A | Work issues: | |
Branch: | u/mantepse/15673 (Commits, GitHub, GitLab) | Commit: | aa8fd83522eaeb583da0caa5c6628ddd1ed36b04 |
Dependencies: | Stopgaps: |
Change History (52)
comment:1 Changed 7 years ago by
- Branch set to u/mhansen/species_stream
comment:2 Changed 7 years ago by
- Commit set to a696708e181c85bdfa355eaa3a253d4d2679c00b
comment:3 Changed 7 years ago by
- Commit changed from a696708e181c85bdfa355eaa3a253d4d2679c00b to f4157f7f221f244b1faa2a0d07ddd8cd85bb7007
Branch pushed to git repo; I updated commit sha1. New commits:
f4157f7 | Remove dead code in GenericCombinatorialSpecies._series_helper
|
comment:4 Changed 7 years ago by
Hi Mike! Thank you for uploading the new code!
I have one immediate question: if I understand correctly, you chose to introduce a new class for every operation on streams, eg. SumStream?, ProductStream?, TailStream?, DerivativeStream?, etc.
Could you explain why...? I would have thought that these things fit more natural into various generating series classes, but I'm sure you have thought quite a lot about this...
Thanks again,
Martin
comment:5 follow-up: ↓ 6 Changed 7 years ago by
The main idea is to have the streams know about their (approximate) order. That way we don't have to worry about initializing generators with the correct approximate order. We can also do some improvements w.r.t. caching if the streams are a bit more structured.
comment:6 in reply to: ↑ 5 Changed 7 years ago by
Thanks for the explanation!
I am currently compiling sage to try your code. In case you have enough time: I recently reviewed ticket #14846, which is based on the old code... Unfortunately, it is still not in the develop branch, so maybe it's better to wait until it is.
comment:7 Changed 7 years ago by
- Description modified (diff)
comment:8 Changed 7 years ago by
Mike, Are you sure that #12659 is related?
comment:10 follow-up: ↓ 11 Changed 7 years ago by
I'm slowly digging through the new code. I cannot yet decide whether I like it or not, because I really care about it, so I'd like to understand it first :-)
Questions so far:
1) would it be feasible (and not too much work) to put the new code for streams and formal power series into a new directory (or perhaps even into two, a separate one for streams)? After all, it's an accident that it's in the species directory. Species can do without formal power series (in principle), and streams have other uses.
2) I think I mostly like new_stream. However, from a user's perspective I would rather expect to create streams with
sage: s=Stream([1,2,3,4]); t=Stream(lambda l: len(l))
and moreover, I'd expect s[4]
to give an error. That is, possibly we should have also finite streams. To create a stream which is constant, we could have something like Stream(constantly="bla")
, and streams would have to support concatenation.
If we don't, then this behaviour has to be documented very LOUDLY. (I do see that s=Stream([1,2,3,4,0])
is very convenient.)
Minor details:
StreamFromFunc
should beStreamFromFunction
and it might be nice if it wouldn't inherit fromListCachedStream
, i.e., support for streams where only some coefficients are computed.
number_computed
maybe should belength_of_cache
.
3) I don't understand the difference between SeriesStream
and a formal power series yet.
Would it be possible to use the category framework to export operations depending on what the base_ring
is? In particular, as pointed out in #14846, one should be careful with providing certain default implementations. Perhaps related: ideally I would like to be able to say,
sage: h = SymmetricFunctions(QQ).h() sage: t = SetSpecies().cycle_index_series(); h(t)
to do change of basis.
Minor details:
- I would replace aorder everywhere with order_approximation
- I somehow dislike the idea of
children
, but cannot really say why.
- the doc for
IntegralStream
mentions derivative instead of integral.
comment:11 in reply to: ↑ 10 ; follow-up: ↓ 13 Changed 7 years ago by
Replying to mantepse:
I'm slowly digging through the new code. I cannot yet decide whether I like it or not, because I really care about it, so I'd like to understand it first :-)
Questions so far:
1) would it be feasible (and not too much work) to put the new code for streams and formal power series into a new directory (or perhaps even into two, a separate one for streams)? After all, it's an accident that it's in the species directory. Species can do without formal power series (in principle), and streams have other uses.
Sure -- I'm not sure the best place for streams to go at the moment.
2) I think I mostly like new_stream. However, from a user's perspective I would rather expect to create streams with
sage: s=Stream([1,2,3,4]); t=Stream(lambda l: len(l))and moreover, I'd expect
s[4]
to give an error. That is, possibly we should have also finite streams. To create a stream which is constant, we could have something likeStream(constantly="bla")
, and streams would have to support concatenation.
You can basically do that with the "OldStreamBehavior?" class.
If we don't, then this behaviour has to be documented very LOUDLY. (I do see that
s=Stream([1,2,3,4,0])
is very convenient.)
Agreed.
Minor details:
StreamFromFunc
should beStreamFromFunction
and it might be nice if it wouldn't inherit fromListCachedStream
, i.e., support for streams where only some coefficients are computed.
The current semantics of StreamFromFunc
basically require that it be a "ListCachedStream?". We can add a MappedStream
which applies a function which takes in a coefficient and produces a new one.
number_computed
maybe should belength_of_cache
.3) I don't understand the difference between
SeriesStream
and a formal power series yet.
The distinction is pretty fine and in fact they could probably be merged. The main thing would be just be dealing with the class situation as if you had class ProductSeries(LazyPowerSeries)
which implemented the things in ProductStream
, then you'd also have to deal with "OrdinaryProductSeries
", IsotypeProductSeries
", and "CycleIndexProductSeries
". They could be created on the fly say in ._new()
, but I wasn't sure if that added complexity was worth it.
For now, the best way to think of a difference between a SeriesStream
and a LazyPowerSeries
(and friends) is that the latter adds some sort of semantics / meaning to the list of coefficients (for example, .count()
.
Would it be possible to use the category framework to export operations depending on what the
base_ring
is? In particular, as pointed out in #14846, one should be careful with providing certain default implementations. Perhaps related: ideally I would like to be able to say,sage: h = SymmetricFunctions(QQ).h() sage: t = SetSpecies().cycle_index_series(); h(t)to do change of basis.
Since many of the (internal) operations need to work with the power-sum basis, you'd mainly just wanting to be converting the coefficients "at the edges". Or would you want them to always convert to the powersum basis when they needed to?
Minor details:
- I would replace aorder everywhere with order_approximation
That's fine -- I mainly kept aorder for backwards compatibility, but we don't need that on the new streams.
- I somehow dislike the idea of
children
, but cannot really say why.
In terms of the name? Or in terms of being able to access "sub-streams"?
- the doc for
IntegralStream
mentions derivative instead of integral.
Will fix.
comment:12 Changed 7 years ago by
- Commit changed from f4157f7f221f244b1faa2a0d07ddd8cd85bb7007 to 927ff83b9f363f2d15db4b638a9439b041dd6479
Branch pushed to git repo; I updated commit sha1. New commits:
927ff83 | #15673: Minor changes from Martin Rubey's coments
|
comment:13 in reply to: ↑ 11 Changed 7 years ago by
Replying to mhansen:
2) I think I mostly like new_stream. However, from a user's perspective I would rather expect to create streams with
sage: s=Stream([1,2,3,4]); t=Stream(lambda l: len(l))and moreover, I'd expect
s[4]
to give an error. That is, possibly we should have also finite streams. To create a stream which is constant, we could have something likeStream(constantly="bla")
, and streams would have to support concatenation.You can basically do that with the "OldStreamBehavior?" class.
Well, we certainly do not want to keep two implementations for the same thing, do we? To put it bluntly, I'm not sure whether this design decision is a good one. Let's make the decisions clear:
A) do we want to create streams using Stream(thing_to_concert_into_a_stream
or StreamFromList
, StreamFromIterator
, etc.
I am sure I want the first behaviour. Is there any sage object that you create using the second model at all?
B) do we want finite streams?
I'm not so sure here. However, if we don't we agree on LOUD documentation.
The current semantics of
StreamFromFunc
basically require that it be a "ListCachedStream?". We can add aMappedStream
which applies a function which takes in a coefficient and produces a new one.
Yes, it may be good to have both. But then the semantics should be precisely the other way round.
3) I don't understand the difference between
SeriesStream
and a formal power series yet.
[...] For now, the best way to think of a difference between a
SeriesStream
and aLazyPowerSeries
(and friends) is that the latter adds some sort of semantics / meaning to the list of coefficients (for example,.count()
.
I dislike at least the choice for the names here. LazyPowerSeries
should be an element of the algebra of formal power series over a ring, I'd say. I think we should eliminate .count
entirely. If we don't, then this should go into ExponentialGeneratingSeries
, OrdinaryGeneratingSeries
, DirichletGeneratingSeries
, etc. CycleIndexSeries
is special and I need to think before having an opinion :-)
Would it be possible to use the category framework to export operations depending on what the
base_ring
is? In particular, as pointed out in #14846, one should be careful with providing certain default implementations.
It would be great if you could give your opinion on that, I don't know the category framework well enough. In Axiom/FriCAS/Aldor it was extremely useful especially for formal power series. NB, Ralf implemented the lazy series in aldor from scratch only because he wanted to be independent from Axiom/FriCAS.
Need to run. All the best,
Martin
New commits:
927ff83 | #15673: Minor changes from Martin Rubey's coments
|
comment:14 Changed 7 years ago by
Hi Mike,
one minor further issue: we still have
sage: from sage.combinat.species.series_order import unk sage: min(1,unk) Unknown series order sage: min(unk,1) 1
While I was unable to produce an actual bug using this, it might be possible. I think we should have a simple linear order unk < 1 < 2 < ... < inf
. Do you know how to do this?
comment:15 Changed 7 years ago by
Yes, that's still an issue when using Integers as series orders due to the way Python works. One way to do it would be to make sure that ".aorder" is an element of a special ordered set so that we can make sure the current comparison is being used.
comment:16 Changed 7 years ago by
(Off topic) Mike, there's something I do not understand about trac: in the change history here on the ticket I see two identical commits:
927ff83 #15673: Minor changes from Martin Rubey's coments
should I expect this?
comment:17 Changed 7 years ago by
- Milestone changed from sage-6.1 to sage-6.2
comment:18 follow-up: ↓ 19 Changed 7 years ago by
- Reviewers set to Ralf Stephan
The branch does not merge cleanly into 6.2.beta2 but it's a one line change. Tests within species are good. Docs build fine.
I also think that stream is another word for infinite sequence, and so a finite stream implementation already exists in structure.sequence, which is also fast (there are however problems in that Sequence doesn't have a parent, see #15852).
comment:19 in reply to: ↑ 18 Changed 7 years ago by
Replying to rws:
The branch does not merge cleanly into 6.2.beta2 but it's a one line change. Tests within species are good. Docs build fine.
I also think that stream is another word for infinite sequence, and so a finite stream implementation already exists in structure.sequence, which is also fast (there are however problems in that Sequence doesn't have a parent, see #15852).
Note that there is also sage.misc.lazy_list
which is a cython implementation of a lazy list and there are several interesting implementations of lazy words in sage.combinat.words
...
comment:20 follow-up: ↓ 23 Changed 7 years ago by
- Branch changed from u/mhansen/species_stream to public/ticket/15673
- Commit changed from 927ff83b9f363f2d15db4b638a9439b041dd6479 to 8ff4acc7625dd8d374a12e0170b3b7436b6bca92
I have rebased on 6.2.beta6 : here is a git branch
Is this considered to be "needs review" ?
New commits:
8ff4acc | Merge branch 'u/mhansen/species_stream' of trac.sagemath.org:sage into 15673
|
comment:21 Changed 7 years ago by
- Keywords days57 added
- Reviewers changed from Ralf Stephan to Ralf Stephan, Matthieu Dien
comment:22 Changed 7 years ago by
Hello,
I can help to improve and review this ticket. I also explore the way to use it to deal with multivariate lazy series
comment:23 in reply to: ↑ 20 Changed 7 years ago by
Replying to chapoton:
I have rebased on 6.2.beta6 : here is a git branch
Is this considered to be "needs review" ?
No, I'd say it needs work. At least the proposed way to create streams doesn't look right from a user's perspective. However, it's a huge improvement over the old code!
comment:24 Changed 7 years ago by
I updated tests and documentation
comment:25 Changed 7 years ago by
- Commit changed from 8ff4acc7625dd8d374a12e0170b3b7436b6bca92 to 5d3dfc2605797b4d4b70f5e5b7c5b9ceb8e27b58
Branch pushed to git repo; I updated commit sha1. New commits:
5d3dfc2 | add tests and improve documentation
|
comment:26 Changed 7 years ago by
#16107 depends on this.
comment:27 Changed 7 years ago by
Hi Mike!
The is_constant protocol seems somewhat convoluted; from profiling, Matthieu is telling me that it's taking a non trivial portion of the time. Could it be possibly simplified? In particular, could we just get rid of is_constant, and handle everything with a cached method "get_constant_position" which would return infinity when appropriate?
Cheers,
Nicolas
comment:28 follow-up: ↓ 29 Changed 7 years ago by
I just had a very brief look at lazy lists. Is there any reason not to use them? There may be some work involved to allow for recursively defined lists, I'm not sure.
comment:29 in reply to: ↑ 28 Changed 7 years ago by
Replying to mantepse:
I just had a very brief look at lazy lists. Is there any reason not to use them? There may be some work involved to allow for recursively defined lists, I'm not sure.
I'm working on it, I should propose an implementation soon
comment:30 Changed 7 years ago by
We opened a new ticket about lazy_list
(#16137) with the idea to replace the sage.combinat.species.new_stream
backend (slower than lazy_list
).
comment:31 Changed 7 years ago by
- Milestone changed from sage-6.2 to sage-6.3
comment:32 Changed 7 years ago by
- Milestone changed from sage-6.3 to sage-6.4
comment:33 Changed 6 years ago by
- Commit changed from 5d3dfc2605797b4d4b70f5e5b7c5b9ceb8e27b58 to 8e004ea424678354b242afd43f7bdbe17501e1f0
comment:34 Changed 6 years ago by
- Status changed from new to needs_review
comment:35 Changed 6 years ago by
Is there any intention to pursue #16137? I'm not sure, but it seems to me that my comments 11 and 13 above were not taken into account so far (in fact, 13 was not answered at all), and the design decisions taken here seem to contradict the ideas of #16137. This is partially about terminology: from a user's perspective, I wouldn't expect a stream to carry any semantics except that it is a possibly infinite sequence of things. In particular, it should work also for things other than numbers and polynomials. However, then things like IntegralStream
doesn't make sense, this is confusing formal power series with streams.
That said, I tried to read the code, and found that I'm completely lost. I have no way to adapt it as to provide an alternative implementation. Thus, in case you think my concerns are of minor importance, please go ahead!
comment:36 Changed 6 years ago by
- Commit changed from 8e004ea424678354b242afd43f7bdbe17501e1f0 to 5478a2de727c686fd1d892ea8186d54b8d839daa
comment:37 Changed 6 years ago by
- Status changed from needs_review to needs_work
This has rotten. Too bad. Needs rebase, and not an easy one.
comment:38 Changed 5 years ago by
Hello,
#16137 is in the process of being reviewed. Note that it will just be a fast and flexible data structures for (finite or infinite) list like objects. Features:
- it is very lazy (zero computation at instanciation)
- it is a tiny Cython wrapper above list and is hence very fast to access the cached values
- definition can be recursive and involve other lazy lists
- there is no such thing as "ultimately constant lazy list" or "compute with no cache lazy list"
- they are currently immutable (which allows to share memory for slices)
Vincent
comment:39 Changed 5 years ago by
- Cc vdelecroix added
comment:40 Changed 5 years ago by
I looked again at the code by Mike Hansen, I believe I understand it a little bit better than 2 years ago. I try to give a summary, and hope to get some feedback.
The code in this ticket consists of three parts: streams (with similar intentions to lazy_list
), power series, and the adaptation of the species code.
The code for streams (only roughly 500 lines) is in new_stream.py
. It introduces different classes depending on how the stream is to be produced: StreamFromIterator
, StreamFromList
, StreamFromFunction
(unused), and an "abstract base class" ListCachedStream
.
The code for series is mainly in series_stream.py
and series.py
. The main class is SeriesStream
(which is a misnomer, I'd say). Its main purpose is to set up the protocol that enables recursive definitions, by keeping track of the so called approximate order of a series, which is a lower bound on the exponent of the monomials in the series with non-vanishing coefficient.
For each operation on series (binary, like addition, multiplication, composition; unary, like derivative, etc.) introduces a new class, where the above mentioned order and the actual terms are computed.
Finally, the classes LazyPowerSeriesRing
and LazyPowerSeries
in series.py
put this into the algebraic framework.
There are many things I do not completely understand. For example, for the species code yet another class, SpeciesSeriesStream
, is introduced, rather than just using LazyPowerSeries
. I have no idea why this is so complicated. (It wasn't complicated in the original aldor code, and it's somehow hard to believe that python would need so much more code.)
Assuming that the code is OK I guess it's best if I rebase it and see what happens.
comment:41 Changed 5 years ago by
So far I have rebased the files new_stream.py
, series.py
and generating_series.py
. I guess it's best to push to a new branch, right?
One thing I'd like to note here is that many tests in generating_series.py
depend on species, which I haven't rebased yet, and thus fail :-(
comment:42 Changed 5 years ago by
- Branch changed from public/ticket/15673 to u/mantepse/15673
- Commit 5478a2de727c686fd1d892ea8186d54b8d839daa deleted
comment:43 Changed 5 years ago by
- Commit set to aa8fd83522eaeb583da0caa5c6628ddd1ed36b04
Branch pushed to git repo; I updated commit sha1. New commits:
aa8fd83 | rebase Mike Hansen's code
|
comment:44 Changed 5 years ago by
I have now completed the rebase. All tests pass. It would be great if someone more versed in python than me could look at the design...
comment:45 Changed 5 years ago by
Well done for the rebase!
Some personal feelings about the code
- There are public method that should not be there. For example
Stream.compute
. The documentation mentionsDo not use this method
. In such case, the method should be private, i.e. starting with an underscore like_compute
. Might apply as well toSeriesStream.refine_order
,SeriesStream.compute_order
,RecursiveStream.not_defined_check
, ...
- The
check_constant_decorator
is really ugly. I would implement it directly in the__getitem__
.
- You should seriously consider replacing
new_stream.py
withlazy_list.pyx
which is currently about 5x faster for__getitem__
and provide more or less the same functionalitysage: from sage.combinat.species.new_stream import StreamFromFunction sage: h = lambda l: 1 if len(l) < 2 else l[-1] + l[-2] sage: s = StreamFromFunction(h) sage: from sage.misc.lazy_list import lazy_list sage: def h2(l): l.append(l[-1]+l[-2]) sage: s2 = lazy_list(initial_values=[1,1],update_function=h2)
Thensage: %timeit s[0] 1000000 loops, best of 3: 942 ns per loop sage: %timeit s2[0] 10000000 loops, best of 3: 148 ns per loop
Though there are missing features. For example, currentlazy_list
do not care about being periodic or constant after a certain point. But I would be happy to implement needed stuff from this side.
This series code would be a good test case to see whether
lazy_list
are as flexible as they claim to be.
- In many situations it would be much faster to use Newton's method rather than an intricated tree of streams.
comment:46 Changed 5 years ago by
Thanks for your comments!
ad 1. and 3.: yes, that's what I want to do. I'm currently trying to figure out whether I should let SeriesStream
inherit from lazy_list
. I guess that Stream
, ListCachedStream
, etc. should be eliminated entirely.
ad 4.: I guess that you are referring to the examples in the species code - I do not think there is any recursive definition used in library code. Of course Newton would be faster. The point of the recursive facility is only to create a species easily, for example to check the first few terms of the cycle index series. In case one needs more speed, one will have to "really" implement the species.
ad 2.: unfortunately, I do not really know what it does. I'll try to figure out.
comment:47 follow-up: ↓ 48 Changed 5 years ago by
Hello,
Thanks to update and improve this code! ;-)
I have a stupid question: the lazy thing is a trick to compute efficiently the coefficient of the product of series or some other operations on series, right?
About the code, why does LazyPowerSeries
only inherit of AlgebraElement
and not of Stream
? It seems not so python/object to define a lazy series as a capsule of a stream so I want to understand.
All I read in the code look like this
class LazyPowerSeries(...): .... def get_order(self): return self._stream.get_order() ....
so it seems better to inherit of Stream
.
Jean-Baptiste Priez
comment:48 in reply to: ↑ 47 Changed 5 years ago by
Replying to elixyre:
I have a stupid question: the lazy thing is a trick to compute efficiently the coefficient of the product of series or some other operations on series, right?
No. The laziness is for having potentially infinite sequences.
About the code, why does
LazyPowerSeries
only inherit ofAlgebraElement
and not ofStream
?
Unless I misunderstand python's inheritance, the point is that a LazyPowerSeries
should not have the methods of it's stream of coefficients.
My current plan is to remove Stream
entirely, rename SeriesStream
to CoefficientStream
and make the latter into a ring module, which uses the new lazy_list
, or, once that is in place, Daniel Krenn's InfiniteSequence
.
comment:49 follow-up: ↓ 50 Changed 5 years ago by
- Milestone changed from sage-6.4 to sage-7.0
There is a bad INPUT::
that should be replaced by INPUT:
and doc does not build, see patchbot report
comment:50 in reply to: ↑ 49 ; follow-up: ↓ 51 Changed 5 years ago by
Replying to chapoton:
There is a bad
INPUT::
that should be replaced byINPUT:
and doc does not build, see patchbot report
Thanks, I changed this locally. However, I will not take care of the documentation at the moment, because I'm currently (really!) working on the design.
Right now I think that I want a class CoefficientStream
to model the algebraic structure of the sequence of coefficients of a formal power series with coefficients in a ring, i.e., inheriting from ModuleElement
.
I am not yet sure whether the computation of the approximate order really belongs here: after all, we want to recursively specify formal power series, not coefficient streams. The design decision is not completely clear, because formal power series and coefficient streams are almost the same thing - except that most operations feel more natural on the level of formal power series.
Suggestions and comments very welcome!
comment:51 in reply to: ↑ 50 ; follow-up: ↓ 52 Changed 5 years ago by
Replying to mantepse:
I'm currently (really!) working on the design. I am not yet sure whether the computation of the approximate order really belongs here: after all, we want to recursively specify formal power series, not coefficient streams. The design decision is not completely clear, because formal power series and coefficient streams are almost the same thing - except that most operations feel more natural on the level of formal power series.
Suggestions and comments very welcome!
I will be happy to discuss code design with you.
I have some code (about species) on the branch u/elixyre/species
(not associated to any ticket). I was trying to redesign the species with the goal to use the current Permutations
, Partitions
, etc ... I don't used the design Parent-Element, I was to lazy to update this code but I can do it easily...
Whatever I recode my own Series
on this branch (as a simple exercice to understand operations on series). There is a lot of really simple code that I assume to be a draft of a good (code) design. My code was thought to be simple to improve and understand.
(Specially I have a nice simple tool to compute the valuation.)
Do you have some branch where you try your own design that I could pull?
comment:52 in reply to: ↑ 51 Changed 5 years ago by
I will be happy to discuss code design with you.
Great!
I have some code (about species) on the branch
u/elixyre/species
(not associated to any ticket). I was trying to redesign the species with the goal to use the currentPermutations
,Partitions
, etc ... I don't used the design Parent-Element, I was to lazy to update this code but I can do it easily...
Is this in the species2 directory?
Whatever I recode my own
Series
on this branch (as a simple exercice to understand operations on series). There is a lot of really simple code that I assume to be a draft of a good (code) design. My code was thought to be simple to improve and understand.(Specially I have a nice simple tool to compute the valuation.)
So, very likely it makes sense to start from your branch?
Do you have some branch where you try your own design that I could pull?
No, because I'm doing this with pencil and paper...
Eventually we could talk, too, I have webcam + whiteboard.
One thing that should be done (maybe not on this ticket) is to deprecate the code in
sage.combinat.species.stream
.
Last 10 new commits:
Simplifications to new_stream.py
Add documentation / doctests
Remove TODO in cycle_species.py
#12648: Species do not support renaming
Fix bug in SeriesStream.is_constant()
#15248: Convert LazyPowerSeriesRing (and friends) to new coercion model
Remove dependence on sage.combinat.species.stream
Add doctests to new_stream.py
Add doctests to generating_series.py
More work on doctests for series_stream.py