#25526 closed enhancement (fixed)

Factor iterator in suffix tree of word

Reported by: evandomme Owned by:
Priority: major Milestone: sage-8.3
Component: combinatorics Keywords: thursdaysbdx
Cc: vdelecroix, slabbe Merged in:
Authors: Vincent Delecroix, Élise Vandomme Reviewers: Sébastien Labbé
Report Upstream: N/A Work issues:
Branch: d7bcfdf (Commits) Commit: d7bcfdf912a443172e2a3a94adeaf0cd9c978b69
Dependencies: Stopgaps:

Description (last modified by slabbe)

We improve the algorithm computing the factor iterator in the implicit suffix tree of a word.

BEFORE:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
Processus arrêté

AFTER:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
CPU times: user 14 s, sys: 504 ms, total: 14.5 s
Wall time: 14.5 s
sage: len(L)
24337601

Change History (10)

comment:1 Changed 18 months ago by evandomme

  • Branch set to u/evandomme/factor_iterator_in_suffix_tree_of_word

comment:2 Changed 18 months ago by slabbe

  • Commit set to d7bcfdf912a443172e2a3a94adeaf0cd9c978b69

Does the branch needs review?


New commits:

d7bcfdfBetter implmentation of the factor iterator in suffix tree of words

comment:3 follow-up: Changed 18 months ago by slabbe

  • Status changed from new to needs_review

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

comment:4 Changed 18 months ago by slabbe

  • Status changed from needs_review to needs_work

comment:5 in reply to: ↑ 3 ; follow-up: Changed 18 months ago by vdelecroix

Replying to slabbe:

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

No. The reason is that there is no guarantee that iterating through a dictionary via iteritems you always get the same order (you can check that in the doctests there are sorted everywhere). But before this ticket, it was clearly not a DFS

sage: for w in Word("cacao").suffix_tree().factor_iterator(): print w
c
a
ac
aca
ao
acao
o
ca
cac
caca
cao
cacao

With this ticket it is

sage: for w in Word("cacao").suffix_tree().factor_iterator(): print w
a
ao
ac
aca
acao
o
c
ca
cao
cac
caca
cacao

comment:6 in reply to: ↑ 5 Changed 18 months ago by slabbe

Replying to vdelecroix:

Replying to slabbe:

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

No.

I was asking in the description of the ticket not in a doctest.

comment:7 Changed 18 months ago by slabbe

Before this ticket, this just eats all the memory and CPU:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
Processus arrêté

With the branch of this ticket, I get:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
CPU times: user 14 s, sys: 504 ms, total: 14.5 s
Wall time: 14.5 s
sage: len(L)
24337601

So, it is definitely an improvement.

comment:8 Changed 18 months ago by slabbe

  • Reviewers set to Sébastien Labbé
  • Status changed from needs_work to positive_review

comment:9 Changed 18 months ago by slabbe

  • Description modified (diff)

comment:10 Changed 18 months ago by vbraun

  • Branch changed from u/evandomme/factor_iterator_in_suffix_tree_of_word to d7bcfdf912a443172e2a3a94adeaf0cd9c978b69
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.