Opened 12 years ago

Closed 12 years ago

Last modified 5 years ago

#7227 closed enhancement (fixed)

[with patch] Improving factor complexity of words functions

Reported by: slabbe Owned by: slabbe
Priority: major Milestone: sage-4.2.1
Component: combinatorics Keywords: factor complexity
Cc: vdelecroix Merged in: sage-4.2.1.alpha0
Authors: Sébastien Labbé Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description

Improving the word complexity functions by

  • caching palindromic_lacunas_study
  • caching suffix_tree
  • caching suffix_trie
  • allowing factor_set to take an integer as input
  • adding rauzy_graph function for finite word

Attachments (3)

trac_7227_word_factor_complexity-sl.patch (7.5 KB) - added by slabbe 12 years ago.
trac_7227_edge_labels-sl.patch (1.7 KB) - added by slabbe 12 years ago.
trac_7227_add_edge-sl.patch (2.0 KB) - added by slabbe 12 years ago.
Applies over the precedent 2 patches.

Download all attachments as: .zip

Change History (14)

Changed 12 years ago by slabbe

comment:1 Changed 12 years ago by slabbe

  • Status changed from new to needs_review
  • Summary changed from Improving factor complexity of words functions to [with patch, needs review] Improving factor complexity of words functions

Here are some examples of the improvements made by the patch :

BEFORE:

sage: t = words.ThueMorseWord()
sage: w = t[:10000]
sage: time _ = [w.number_of_factors(i) for i in range(20)]
CPU times: user 4.19 s, sys: 0.00 s, total: 4.19 s
Wall time: 4.19 s
sage: time _ = [w.number_of_factors(i) for i in range(50)]
CPU times: user 10.28 s, sys: 0.00 s, total: 10.28 s
Wall time: 10.28 s

AFTER:

sage: t = words.ThueMorseWord()
sage: w = t[:10000]
sage: time _ = [w.number_of_factors(i) for i in range(20)]
CPU times: user 0.30 s, sys: 0.00 s, total: 0.30 s
Wall time: 0.30 s
sage: time _ = [w.number_of_factors(i) for i in range(50)]
CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s
Wall time: 0.02 s
sage: time _ = [w.number_of_factors(i) for i in range(100)]
CPU times: user 0.06 s, sys: 0.00 s, total: 0.06 s
Wall time: 0.06 s
sage: time _ = [w.number_of_factors(i) for i in range(1000)]
CPU times: user 4.90 s, sys: 0.00 s, total: 4.90 s
Wall time: 4.90 s
sage: time _ = [w.number_of_factors(i) for i in range(1001)]
CPU times: user 4.85 s, sys: 0.00 s, total: 4.85 s
Wall time: 4.85 s
sage: time _ = [w.number_of_factors(i) for i in range(2000)]
CPU times: user 27.64 s, sys: 0.00 s, total: 27.64 s
Wall time: 27.64 s

I should also add some Rauzy graphs examples and some timing improvements on palindrome complexity as well.

Changed 12 years ago by slabbe

comment:2 Changed 12 years ago by slabbe

  • Cc vdelecroix added

I added a patch that improves the rauzy graph function (adds the labels to the edges).

Still needs review!

comment:3 follow-up: Changed 12 years ago by vdelecroix

  • Reviewers set to vdelecroix
  • Summary changed from [with patch, needs review] Improving factor complexity of words functions to [with patch] Improving factor complexity of words functions

rauzy_graph : why don't you use DiGraph? method for the creation of edges ?

The rest is OK.

comment:4 Changed 12 years ago by vdelecroix

  • Summary changed from [with patch] Improving factor complexity of words functions to [with patch, needs review] Improving factor complexity of words functions

comment:5 in reply to: ↑ 3 ; follow-up: Changed 12 years ago by slabbe

Replying to vdelecroix:

rauzy_graph : why don't you use DiGraph? method for the creation of edges ?

You mean the add_edge method? I don't know. Is it faster?

comment:6 in reply to: ↑ 5 Changed 12 years ago by vdelecroix

Replying to slabbe:

Replying to vdelecroix:

rauzy_graph : why don't you use DiGraph? method for the creation of edges ?

You mean the add_edge method? I don't know. Is it faster?

At least it is not slower and I find it clearer:

sage: timeit('G = DiGraph(loops=True)\nfor i in range(200):\n  for j in range(200):\n    d.add_edge(i,j)')
5 loops, best of 3: 248 ms per loop
sage: timeit('d = {}\nfor i in range(200):\n  d[i]=[]\n  for j in range(200):\n    d[i].append(j)\nG=DiGraph(d,loops=True)')
5 loops, best of 3: 266 ms per loop

Changed 12 years ago by slabbe

Applies over the precedent 2 patches.

comment:7 Changed 12 years ago by slabbe

Thanks for the suggestion Vincent. I did the changes. I also changed the behavior for n=0 to agree with what is currently done in digraphs.DeBruijn() (see #7246).

Needs review!

comment:8 follow-up: Changed 12 years ago by vdelecroix

  • Status changed from needs_review to positive_review
  • Summary changed from [with patch, needs review] Improving factor complexity of words functions to [with patch] Improving factor complexity of words functions

Positive review.

Remark for future: the graph rendering is quite bad because word renders "word: xxxx"

comment:9 in reply to: ↑ 8 Changed 12 years ago by slabbe

Remark for future: the graph rendering is quite bad because word renders "word: xxxx"

One way to avoid the word: identifier is to set it empty using

sage: WordOptions(identifier='')
sage: Word(range(10))
0123456789

but it affects not only the vertices of the Rauzy graph but every single print of a word which might not be exactly what you want...

comment:10 Changed 12 years ago by mhansen

  • Authors set to Sebastien Labbe
  • Merged in set to sage-4.2.1.alpha0
  • Resolution set to fixed
  • Reviewers changed from vdelecroix to Vincent Delecroix
  • Status changed from positive_review to closed

comment:11 Changed 5 years ago by chapoton

  • Authors changed from Sebastien Labbe to Sébastien Labbé
  • Report Upstream set to N/A
Note: See TracTickets for help on using tickets.