#31760 closed enhancement (fixed)

Speedup WordMorphism.growing_letters

Reported by: gh-mrejmon Owned by:
Priority: major Milestone: sage-9.4
Component: combinatorics Keywords: words
Cc: Merged in:
Authors: Martin Rejmon Reviewers: Vincent Delecroix, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 35f617b (Commits, GitHub, GitLab) Commit: 35f617b7204db754e702356c7f5901df8797fa69
Dependencies: Stopgaps:

Status badges

Description

This patch changes the implementation of WordMorphism.growing_letters to be less sophisticated but faster.

sage: m = WordMorphism('a->aa,b->bc,c->def,d->fe,e->df,f->gh,g->hh,h->')
sage: %timeit m.growing_letters()
24.7 µs ± 74.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

vs.

sage: %timeit m.growing_letters()
7.89 ms ± 84 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

It also adds a method for returning "immortal" letters (letter a is mortal for a morphism s if there exists an integer k>0 such that sk(a) = ε).

Change History (29)

comment:1 Changed 14 months ago by gh-mrejmon

  • Status changed from new to needs_review

comment:2 follow-up: Changed 14 months ago by vdelecroix

You are not allowed to mutate self in

+        self._morph, new_morph = new_morph, self._morph
+        result = self.immortal_letters()
+        self._morph = new_morph

The code might be interrupted during the second line (an alaram, a keyboard interrupt, etc). In such situation you would end up with corrupted data.

comment:3 Changed 14 months ago by vdelecroix

  • Reviewers set to Vincent Delecroix

comment:4 Changed 14 months ago by git

  • Commit changed from bbf59a23ae600c8b1e834320d606b5d8c3ded66e to bd67035c3230c6ae7e6ee9bc641f938075c82199

Branch pushed to git repo; I updated commit sha1. New commits:

bd6703531760: Don't modify self

comment:5 in reply to: ↑ 2 Changed 14 months ago by gh-mrejmon

Replying to vdelecroix:

You are not allowed to mutate self in

+        self._morph, new_morph = new_morph, self._morph
+        result = self.immortal_letters()
+        self._morph = new_morph

The code might be interrupted during the second line (an alaram, a keyboard interrupt, etc). In such situation you would end up with corrupted data.

Fixed.

comment:6 Changed 14 months ago by tscrim

  • Branch changed from u/gh-mrejmon/speedup_growing to public/combinat/growing_letters_speedup-31760
  • Commit changed from bd67035c3230c6ae7e6ee9bc641f938075c82199 to 24f3bf96117bbeed8864977081094a9bfd641d53
  • Reviewers changed from Vincent Delecroix to Vincent Delecroix, Travis Scrimshaw

I made a few trivial tweaks to the code to be a bit easier to read and follow our coding conventions. If my changes are good, then positive review unless Vincent has any more comments.


New commits:

61c4f42Merge branch 'u/gh-mrejmon/speedup_growing' of git://trac.sagemath.org/sage into public/combinat/growing_letters_speedup-31760
24f3bf9Small Python tweaks to immortal_letters().

comment:7 Changed 14 months ago by git

  • Commit changed from 24f3bf96117bbeed8864977081094a9bfd641d53 to ce00ad67f08e0e6a72957a9e4c578adfe9fa1390

Branch pushed to git repo; I updated commit sha1. New commits:

ce00ad6improve growing letters, immortal letters

comment:8 Changed 14 months ago by vdelecroix

I hope my last commit simplifies a bit the code. The ticket can be set to positive review if nobody thinks otherwise.

comment:9 Changed 14 months ago by git

  • Commit changed from ce00ad67f08e0e6a72957a9e4c578adfe9fa1390 to a6fa7c6325413de0e1f8a7dbeb90799378c9e9e4

Branch pushed to git repo; I updated commit sha1. New commits:

a6fa7c6do not build custom domain/codomain

comment:10 Changed 14 months ago by vdelecroix

It is a pity that a6fa7c6 makes it faster in practice...

comment:11 Changed 14 months ago by git

  • Commit changed from a6fa7c6325413de0e1f8a7dbeb90799378c9e9e4 to 263890855e11b5bf43435e8bc24b3f0666f2b5fc

Branch pushed to git repo; I updated commit sha1. New commits:

263890831760: Add one set() call

comment:12 Changed 14 months ago by gh-mrejmon

Thanks for improving the code, I did a last small change.

comment:13 Changed 14 months ago by git

  • Commit changed from 263890855e11b5bf43435e8bc24b3f0666f2b5fc to 5aab984e33a165b1a0e6b5a259c8b0c388bf2372

Branch pushed to git repo; I updated commit sha1. New commits:

5aab98431760: Replace set with list in docs

comment:14 follow-up: Changed 14 months ago by gh-mrejmon

On second thought, what do you think about replacing:

if not self.is_endomorphism():

with something like:

if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):

so that one doesn't have to manually specify the codomain in cases like this:

WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()

comment:15 in reply to: ↑ 14 ; follow-ups: Changed 14 months ago by vdelecroix

Replying to gh-mrejmon:

On second thought, what do you think about replacing:

if not self.is_endomorphism():

with something like:

if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):

so that one doesn't have to manually specify the codomain in cases like this:

WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()

Indeed. Is there a name for functions f: A -> B with B included in A?

comment:16 in reply to: ↑ 15 ; follow-up: Changed 14 months ago by tscrim

Replying to vdelecroix:

Indeed. Is there a name for functions f: A -> B with B included in A?

I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of A.

Also, doctest failures from the patchbot:

**********************************************************************
File "src/sage/combinat/words/morphism.py", line 1981, in sage.combinat.words.morphism.WordMorphism.periodic_points
Failed example:
    for p in f.periodic_points():
        print("{} , {}".format(len(p), p[0]))
Expected:
    1 , ababaaababaaabaabaababaaababaaabaabaabab...
    1 , baaabaabaababaaabaababaaabaababaaababaaa...
Got:
    1 , baaabaabaababaaabaababaaabaababaaababaaa...
    1 , ababaaababaaabaabaababaaababaaabaabaabab...
**********************************************************************
File "src/sage/combinat/words/morphism.py", line 1987, in sage.combinat.words.morphism.WordMorphism.periodic_points
Failed example:
    for p in f.periodic_points():
        print("{} , {}".format(len(p), p[0]))
Expected:
    2 , aababaaaababaababbabaababaababbabaababaa...
Got:
    2 , babbabaababaababbabbabbabaababaababbabaa...
**********************************************************************

comment:17 in reply to: ↑ 16 ; follow-up: Changed 14 months ago by vdelecroix

Replying to tscrim:

Replying to vdelecroix:

Indeed. Is there a name for functions f: A -> B with B included in A?

I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of A.

It is a bit like calling a matrix with more rows than column a square matrix :)

sage: WordMorphism('a->a,b->b,c->a').incidence_matrix()
[1 0 1]
[0 1 0]

Also, doctest failures from the patchbot:

I will see why this is not consistent.

comment:18 Changed 14 months ago by git

  • Commit changed from 5aab984e33a165b1a0e6b5a259c8b0c388bf2372 to 14001685024278a7917290111cf1ca2a3629ef57

Branch pushed to git repo; I updated commit sha1. New commits:

1400168fix ordering in get_cycles

comment:19 in reply to: ↑ 15 ; follow-up: Changed 14 months ago by vdelecroix

Replying to vdelecroix:

Replying to gh-mrejmon:

On second thought, what do you think about replacing:

if not self.is_endomorphism():

with something like:

if not set(self.codomain().alphabet()).issubset(self.domain().alphabet()):

so that one doesn't have to manually specify the codomain in cases like this:

WordMorphism('a->b,b->c,c->d,d->c', codomain=Words('abcd')).growing_letters()

Indeed. Is there a name for functions f: A -> B with B included in A?

Alternatively, we can make WordMorphism construct so that if the codomain is included in the domain then the default is to set codomain=domain.

comment:20 Changed 14 months ago by git

  • Commit changed from 14001685024278a7917290111cf1ca2a3629ef57 to 35f617b7204db754e702356c7f5901df8797fa69

Branch pushed to git repo; I updated commit sha1. New commits:

35f617bclarify get_cycles

comment:21 in reply to: ↑ 17 Changed 14 months ago by tscrim

Replying to vdelecroix:

Replying to tscrim:

Replying to vdelecroix:

Indeed. Is there a name for functions f: A -> B with B included in A?

I would just call it an endomorphism as the only difference is a promise that the image is smaller than all of A.

It is a bit like calling a matrix with more rows than column a square matrix :)

sage: WordMorphism('a->a,b->b,c->a').incidence_matrix()
[1 0 1]
[0 1 0]

I agree fundamentally with your point, but not a generic example. I would argue that if there is a canonical way of embedding the image, I think we can allow the abuse. ;)

comment:22 in reply to: ↑ 19 ; follow-ups: Changed 14 months ago by tscrim

Replying to vdelecroix:

Alternatively, we can make WordMorphism construct so that if the codomain is included in the domain then the default is to set codomain=domain.

-1 on this. This would break the composition f: A -> B and g: B -> C if A < B but C is some other unrelated set of words that is not defined on all of A.

comment:23 in reply to: ↑ 22 Changed 14 months ago by gh-mrejmon

Replying to tscrim:

Replying to vdelecroix:

Alternatively, we can make WordMorphism construct so that if the codomain is included in the domain then the default is to set codomain=domain.

-1 on this. This would break the composition f: A -> B and g: B -> C if A < B but C is some other unrelated set of words that is not defined on all of A.

I don't think it would in practice due to the way __mul__ is implemented:

def __mul__(self, other):
    r"""
    Returns the morphism ``self``\*``other``.
    ...
    """
return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain())

and for example:

sage: f = WordMorphism('a->a,b->a', codomain=FiniteWords('ab'))
sage: g = WordMorphism('a->0')
sage: g * f
WordMorphism: a->0, b->0

All in all, I don't mind either of these approaches. I also noticed that in the docs of is_endomorphism there already is:

Returns ``True`` if the codomain is a subset of the domain.

And hence that is_endormophism did work with subsets in some point, but this was changed in #8920 due to not being mathematically accurate.

comment:24 in reply to: ↑ 22 Changed 14 months ago by vdelecroix

Replying to tscrim:

Replying to vdelecroix:

Alternatively, we can make WordMorphism construct so that if the codomain is included in the domain then the default is to set codomain=domain.

-1 on this. This would break the composition f: A -> B and g: B -> C if A < B but C is some other unrelated set of words that is not defined on all of A.

Maybe I was unclear. What I suggested is to change

sage: WordMorphism('a->a,b->a').codomain()
Finite words over {'a'}

to

sage: WordMorphism('a->a,b->a').codomain()
Finite words over {'a', 'b'}

The keyword codomain will not disappear and still allows you to build any kind of morphism you want

sage: WordMorphism('a->a,b->a', codomain=FiniteWords('a')).codomain()
Finite words over {'a'}

comment:25 Changed 14 months ago by vdelecroix

This discussion is not exactly in the scope of the ticket and I suggest that we conclude here and talk about what is an endomorphism elsewhere. What do you think?

comment:26 follow-up: Changed 14 months ago by tscrim

I think it is still somewhat relevant to the ticket at hand, so we should see if we can come to a consensus.

That is what I understood your proposal to be, but I am thinking of the following situation:

sage: I = WordMorphism('a->a,b->b')  # The identity map
sage: f = WordMorphism('a->x')                                                            
sage: I.domain()
Finite words over {'a', 'b'}
sage: I.codomain()
Finite words over {'a', 'b'}
sage: f.domain()
Finite words over {'a'}
sage: f.codomain()
Finite words over {'x'}
sage: f * I  # This should not be a valid composition
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-7-dc18bb6a87ea> in <module>
----> 1 f * I

~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in __mul__(self, other)
    976             WordMorphism:
    977         """
--> 978         return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain())
    979 
    980     def __pow__(self, exp):

~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in <genexpr>(.0)
    976             WordMorphism:
    977         """
--> 978         return WordMorphism(dict((key, self(w)) for key, w in other._morph.items()), codomain=self.codomain())
    979 
    980     def __pow__(self, exp):

~/sage-build/local/lib/python3.9/site-packages/sage/combinat/words/morphism.py in __call__(self, w, order, datatype)
    794                 im = C()
    795                 for a in w:
--> 796                     im += self._morph[a]
    797                 if datatype is not None:
    798                     return C(im, datatype=datatype)

KeyError: 'b'

Now this is failing, as it should, but not in a meaningful way IMO. The error message is somewhat cryptic, It all works without explicit consideration of the domain/codomain matching because of how it is coded, but it feels a bit fragile to me.

comment:27 in reply to: ↑ 26 Changed 14 months ago by vdelecroix

Replying to tscrim:

I think it is still somewhat relevant to the ticket at hand, so we should see if we can come to a consensus.

That is what I understood your proposal to be, but I am thinking of the following situation:

SNIP

Now this is failing, as it should, but not in a meaningful way IMO. The error message is somewhat cryptic, It all works without explicit consideration of the domain/codomain matching because of how it is coded, but it feels a bit fragile to me.

It looks fragile to me as well but also unrelated to this ticket which is about improving growing_letters. comment:14 which is the starting point of the discussion about is_endomorphism was complaining about the fact that growing_letters (and many other methods) remains valid under the more general assumption that you can iterate the morphism. I opened #31797 for the latter.

comment:28 Changed 14 months ago by tscrim

  • Status changed from needs_review to positive_review

Alright, then let's set this to a positive review. Thanks.

comment:29 Changed 13 months ago by vbraun

  • Branch changed from public/combinat/growing_letters_speedup-31760 to 35f617b7204db754e702356c7f5901df8797fa69
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.