Opened 14 months ago
Closed 13 months ago
#31760 closed enhancement (fixed)
Speedup WordMorphism.growing_letters
Reported by: | gh-mrejmon | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-9.4 |
Component: | combinatorics | Keywords: | words |
Cc: | Merged in: | ||
Authors: | Martin Rejmon | Reviewers: | Vincent Delecroix, Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | 35f617b (Commits, GitHub, GitLab) | Commit: | 35f617b7204db754e702356c7f5901df8797fa69 |
Dependencies: | Stopgaps: |
Description
This patch changes the implementation of WordMorphism.growing_letters to be less sophisticated but faster.
sage: m = WordMorphism('a->aa,b->bc,c->def,d->fe,e->df,f->gh,g->hh,h->') sage: %timeit m.growing_letters() 24.7 µs ± 74.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
vs.
sage: %timeit m.growing_letters() 7.89 ms ± 84 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
It also adds a method for returning "immortal" letters (letter a is mortal for a morphism s if there exists an integer k>0 such that sk(a) = ε).
Change History (29)
comment:1 Changed 14 months ago by
- Status changed from new to needs_review
comment:2 follow-up: ↓ 5 Changed 14 months ago by
comment:3 Changed 14 months ago by
- Reviewers set to Vincent Delecroix
comment:4 Changed 14 months ago by
- Commit changed from bbf59a23ae600c8b1e834320d606b5d8c3ded66e to bd67035c3230c6ae7e6ee9bc641f938075c82199
Branch pushed to git repo; I updated commit sha1. New commits:
bd67035 | 31760: Don't modify self
|
comment:5 in reply to: ↑ 2 Changed 14 months ago by
Replying to vdelecroix:
You are not allowed to mutate self in
+ self._morph, new_morph = new_morph, self._morph + result = self.immortal_letters() + self._morph = new_morphThe code might be interrupted during the second line (an alaram, a keyboard interrupt, etc). In such situation you would end up with corrupted data.
Fixed.
comment:6 Changed 14 months ago by
- Branch changed from u/gh-mrejmon/speedup_growing to public/combinat/growing_letters_speedup-31760
- Commit changed from bd67035c3230c6ae7e6ee9bc641f938075c82199 to 24f3bf96117bbeed8864977081094a9bfd641d53
- Reviewers changed from Vincent Delecroix to Vincent Delecroix, Travis Scrimshaw
I made a few trivial tweaks to the code to be a bit easier to read and follow our coding conventions. If my changes are good, then positive review unless Vincent has any more comments.
New commits:
61c4f42 | Merge branch 'u/gh-mrejmon/speedup_growing' of git://trac.sagemath.org/sage into public/combinat/growing_letters_speedup-31760
|
24f3bf9 | Small Python tweaks to immortal_letters().
|
comment:7 Changed 14 months ago by
- Commit changed from 24f3bf96117bbeed8864977081094a9bfd641d53 to ce00ad67f08e0e6a72957a9e4c578adfe9fa1390
Branch pushed to git repo; I updated commit sha1. New commits:
ce00ad6 | improve growing letters, immortal letters
|
comment:8 Changed 14 months ago by
I hope my last commit simplifies a bit the code. The ticket can be set to positive review if nobody thinks otherwise.
comment:9 Changed 14 months ago by
- Commit changed from ce00ad67f08e0e6a72957a9e4c578adfe9fa1390 to a6fa7c6325413de0e1f8a7dbeb90799378c9e9e4
Branch pushed to git repo; I updated commit sha1. New commits:
a6fa7c6 | do not build custom domain/codomain
|
comment:10 Changed 14 months ago by
It is a pity that a6fa7c6 makes it faster in practice...
comment:11 Changed 14 months ago by
- Commit changed from a6fa7c6325413de0e1f8a7dbeb90799378c9e9e4 to 263890855e11b5bf43435e8bc24b3f0666f2b5fc
Branch pushed to git repo; I updated commit sha1. New commits:
2638908 | 31760: Add one set() call
|
comment:12 Changed 14 months ago by
Thanks for improving the code, I did a last small change.
comment:13 Changed 14 months ago by
- Commit changed from 263890855e11b5bf43435e8bc24b3f0666f2b5fc to 5aab984e33a165b1a0e6b5a259c8b0c388bf2372
Branch pushed to git repo; I updated commit sha1. New commits:
5aab984 | 31760: Replace set with list in docs
|
comment:14 follow-up: ↓ 15 Changed 14 months ago by
On second thought, what do you think about replacing:
if not self.is_endomorphism():
with something like:
if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):
so that one doesn't have to manually specify the codomain in cases like this:
WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()
comment:15 in reply to: ↑ 14 ; follow-ups: ↓ 16 ↓ 19 Changed 14 months ago by
Replying to gh-mrejmon:
On second thought, what do you think about replacing:
if not self.is_endomorphism():with something like:
if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):so that one doesn't have to manually specify the codomain in cases like this:
WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()
Indeed. Is there a name for functions f: A -> B
with B
included in A
?
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 14 months ago by
Replying to vdelecroix:
Indeed. Is there a name for functions
f: A -> B
withB
included inA
?
I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of A
.
Also, doctest failures from the patchbot:
********************************************************************** File "src/sage/combinat/words/morphism.py", line 1981, in sage.combinat.words.morphism.WordMorphism.periodic_points Failed example: for p in f.periodic_points(): print("{} , {}".format(len(p), p[0])) Expected: 1 , ababaaababaaabaabaababaaababaaabaabaabab... 1 , baaabaabaababaaabaababaaabaababaaababaaa... Got: 1 , baaabaabaababaaabaababaaabaababaaababaaa... 1 , ababaaababaaabaabaababaaababaaabaabaabab... ********************************************************************** File "src/sage/combinat/words/morphism.py", line 1987, in sage.combinat.words.morphism.WordMorphism.periodic_points Failed example: for p in f.periodic_points(): print("{} , {}".format(len(p), p[0])) Expected: 2 , aababaaaababaababbabaababaababbabaababaa... Got: 2 , babbabaababaababbabbabbabaababaababbabaa... **********************************************************************
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 21 Changed 14 months ago by
Replying to tscrim:
Replying to vdelecroix:
Indeed. Is there a name for functions
f: A -> B
withB
included inA
?I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of
A
.
It is a bit like calling a matrix with more rows than column a square matrix :)
sage: WordMorphism('a->a,b->b,c->a').incidence_matrix() [1 0 1] [0 1 0]
Also, doctest failures from the patchbot:
I will see why this is not consistent.
comment:18 Changed 14 months ago by
- Commit changed from 5aab984e33a165b1a0e6b5a259c8b0c388bf2372 to 14001685024278a7917290111cf1ca2a3629ef57
Branch pushed to git repo; I updated commit sha1. New commits:
1400168 | fix ordering in get_cycles
|
comment:19 in reply to: ↑ 15 ; follow-up: ↓ 22 Changed 14 months ago by
Replying to vdelecroix:
Replying to gh-mrejmon:
On second thought, what do you think about replacing:
if not self.is_endomorphism():with something like:
if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):so that one doesn't have to manually specify the codomain in cases like this:
WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()Indeed. Is there a name for functions
f: A -> B
withB
included inA
?
Alternatively, we can make WordMorphism
construct so that if the codomain is included in the domain then the default is to set codomain=domain
.
comment:20 Changed 14 months ago by
- Commit changed from 14001685024278a7917290111cf1ca2a3629ef57 to 35f617b7204db754e702356c7f5901df8797fa69
Branch pushed to git repo; I updated commit sha1. New commits:
35f617b | clarify get_cycles
|
comment:21 in reply to: ↑ 17 Changed 14 months ago by
Replying to vdelecroix:
Replying to tscrim:
Replying to vdelecroix:
Indeed. Is there a name for functions
f: A -> B
withB
included inA
?I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of
A
.It is a bit like calling a matrix with more rows than column a square matrix :)
sage: WordMorphism('a->a,b->b,c->a').incidence_matrix() [1 0 1] [0 1 0]
I agree fundamentally with your point, but not a generic example. I would argue that if there is a canonical way of embedding the image, I think we can allow the abuse. ;)
comment:22 in reply to: ↑ 19 ; follow-ups: ↓ 23 ↓ 24 Changed 14 months ago by
Replying to vdelecroix:
Alternatively, we can make
WordMorphism
construct so that if the codomain is included in the domain then the default is to setcodomain=domain
.
-1 on this. This would break the composition f: A -> B
and g: B -> C
if A < B
but C
is some other unrelated set of words that is not defined on all of A
.
comment:23 in reply to: ↑ 22 Changed 14 months ago by
Replying to tscrim:
Replying to vdelecroix:
Alternatively, we can make
WordMorphism
construct so that if the codomain is included in the domain then the default is to setcodomain=domain
.-1 on this. This would break the composition
f: A -> B
andg: B -> C
ifA < B
butC
is some other unrelated set of words that is not defined on all ofA
.
I don't think it would in practice due to the way __mul__
is implemented:
def __mul__(self, other): r""" Returns the morphism ``self``\*``other``. ... """ return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain())
and for example:
sage: f = WordMorphism('a->a,b->a', codomain=FiniteWords('ab')) sage: g = WordMorphism('a->0') sage: g * f WordMorphism: a->0, b->0
All in all, I don't mind either of these approaches. I also noticed that in the docs of is_endomorphism
there already is:
Returns ``True`` if the codomain is a subset of the domain.
And hence that is_endormophism
did work with subsets in some point, but this was changed in #8920 due to not being mathematically accurate.
comment:24 in reply to: ↑ 22 Changed 14 months ago by
Replying to tscrim:
Replying to vdelecroix:
Alternatively, we can make
WordMorphism
construct so that if the codomain is included in the domain then the default is to setcodomain=domain
.-1 on this. This would break the composition
f: A -> B
andg: B -> C
ifA < B
butC
is some other unrelated set of words that is not defined on all ofA
.
Maybe I was unclear. What I suggested is to change
sage: WordMorphism('a->a,b->a').codomain() Finite words over {'a'}
to
sage: WordMorphism('a->a,b->a').codomain() Finite words over {'a', 'b'}
The keyword codomain
will not disappear and still allows you to build any kind of morphism you want
sage: WordMorphism('a->a,b->a', codomain=FiniteWords('a')).codomain() Finite words over {'a'}
comment:25 Changed 14 months ago by
This discussion is not exactly in the scope of the ticket and I suggest that we conclude here and talk about what is an endomorphism elsewhere. What do you think?
comment:26 follow-up: ↓ 27 Changed 14 months ago by
I think it is still somewhat relevant to the ticket at hand, so we should see if we can come to a consensus.
That is what I understood your proposal to be, but I am thinking of the following situation:
sage: I = WordMorphism('a->a,b->b') # The identity map sage: f = WordMorphism('a->x') sage: I.domain() Finite words over {'a', 'b'} sage: I.codomain() Finite words over {'a', 'b'} sage: f.domain() Finite words over {'a'} sage: f.codomain() Finite words over {'x'} sage: f * I # This should not be a valid composition --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-7-dc18bb6a87ea> in <module> ----> 1 f * I ~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in __mul__(self, other) 976 WordMorphism: 977 """ --> 978 return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain()) 979 980 def __pow__(self, exp): ~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in <genexpr>(.0) 976 WordMorphism: 977 """ --> 978 return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain()) 979 980 def __pow__(self, exp): ~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in __call__(self, w, order, datatype) 794 im = C() 795 for a in w: --> 796 im += self._morph[a] 797 if datatype is not None: 798 return C(im, datatype=datatype) KeyError: 'b'
Now this is failing, as it should, but not in a meaningful way IMO. The error message is somewhat cryptic, It all works without explicit consideration of the domain/codomain matching because of how it is coded, but it feels a bit fragile to me.
comment:27 in reply to: ↑ 26 Changed 14 months ago by
Replying to tscrim:
I think it is still somewhat relevant to the ticket at hand, so we should see if we can come to a consensus.
That is what I understood your proposal to be, but I am thinking of the following situation:
SNIP
Now this is failing, as it should, but not in a meaningful way IMO. The error message is somewhat cryptic, It all works without explicit consideration of the domain/codomain matching because of how it is coded, but it feels a bit fragile to me.
It looks fragile to me as well but also unrelated to this ticket which is about improving growing_letters
. comment:14 which is the starting point of the discussion about is_endomorphism
was complaining about the fact that growing_letters
(and many other methods) remains valid under the more general assumption that you can iterate the morphism. I opened #31797 for the latter.
comment:28 Changed 14 months ago by
- Status changed from needs_review to positive_review
Alright, then let's set this to a positive review. Thanks.
comment:29 Changed 13 months ago by
- Branch changed from public/combinat/growing_letters_speedup-31760 to 35f617b7204db754e702356c7f5901df8797fa69
- Resolution set to fixed
- Status changed from positive_review to closed
You are not allowed to mutate self in
The code might be interrupted during the second line (an alaram, a keyboard interrupt, etc). In such situation you would end up with corrupted data.