Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#7414 closed defect (fixed)

improve {from,to}_inversion_vector and to_lehmer_code

Reported by: ylchapuy Owned by: mhansen
Priority: major Milestone: sage-4.2.1
Component: combinatorics Keywords: permutations, inversion vector, lehmer code
Cc: sage-combinat Merged in: sage-4.2.1.rc0
Authors: Yann Laigle-Chapuy Reviewers: Florent Hivert
Report Upstream: Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description

The method to_inversion_vector can be greatly improved by using a divide-and-conquer strategy.

Attachments (2)

trac_7414-inversion_vector.patch (8.6 KB) - added by ylchapuy 12 years ago.
trac_7414-from_inversion_vector.patch (1.6 KB) - added by ylchapuy 12 years ago.

Download all attachments as: .zip

Change History (11)

comment:1 Changed 12 years ago by ylchapuy

  • Status changed from new to needs_review

for the record,

before:

sage: p= Permutations(1000).random_element()
sage: timeit('p.to_inversion_vector()')
5 loops, best of 3: 2.08 s per loop
sage: iv = p.to_inversion_vector()
sage: timeit('sage.combinat.permutation.from_inversion_vector(iv)')
25 loops, best of 3: 9.57 ms per loop

after:

sage: p= Permutations(1000).random_element()
sage: timeit('p.to_inversion_vector()')
25 loops, best of 3: 14.7 ms per loop
sage: iv = p.to_inversion_vector()
sage: timeit('sage.combinat.permutation.from_inversion_vector(iv)')
625 loops, best of 3: 1.47 ms per loop

comment:2 Changed 12 years ago by hivert

  • Keywords permutations inversion vector lehmer code added
  • Reviewers set to Florent Hivert
  • Status changed from needs_review to needs_work
  • Work issues set to Large speed regression for small entries

Yes, this is very good for large permutaions ! but is is much slower on small permutations, where I will use it :-) Sorry for this...

Before:

625 loops, best of 3: 16.4 µs per loop
625 loops, best of 3: 19.2 µs per loop
625 loops, best of 3: 33.3 µs per loop
625 loops, best of 3: 87.4 µs per loop
625 loops, best of 3: 356 µs per loop
125 loops, best of 3: 2.04 ms per loop
25 loops, best of 3: 14.2 ms per loop
5 loops, best of 3: 117 ms per loop

after:

625 loops, best of 3: 18.1 µs per loop
625 loops, best of 3: 19.9 µs per loop
625 loops, best of 3: 51.2 µs per loop
625 loops, best of 3: 166 µs per loop
625 loops, best of 3: 794 µs per loop
125 loops, best of 3: 4.86 ms per loop
25 loops, best of 3: 33.2 ms per loop
5 loops, best of 3: 271 ms per loop

I suggest you to reinstate the former implementation and to change from one to the other depending on the size of the permutations. I wrote the same in MuPAD, the cut-of point where around 18.

Moreover, since the Lehmer code is the inversion vector of the inverse, you can speed up it for large n. Also, if you would take the chance to write the definition of the lehmer code (c_i = the number of j > i s.t. s(j) < s(i)) and to put a link beetween those two methods, then I would be extremely happy to put a positive review.

Sorry to give you more work.

Cheers,

Florent

comment:3 Changed 12 years ago by hivert

And finally, it would be perfect if you add a note on the complexity of the algorithm.

Cheers,

Florent

Changed 12 years ago by ylchapuy

comment:4 Changed 12 years ago by ylchapuy

  • Status changed from needs_work to needs_review
  • Summary changed from improve {from,to}_inversion_vector to improve {from,to}_inversion_vector and to_lehmer_code
  • Work issues Large speed regression for small entries deleted

I did my best to keep small permutations fast. Here are the new timings.

sage: for k in [0,1,2,3,4,5,6,7]:
    L=Permutations(k).list()
    print k
    timeit('[len(p._to_inversion_vector_orig()) for p in L]')
    timeit('[len(p._to_inversion_vector_small()) for p in L]')
    timeit('[len(p.to_inversion_vector()) for p in L]')
....:     
0
625 loops, best of 3: 2.35 µs per loop
625 loops, best of 3: 3.86 µs per loop
625 loops, best of 3: 1.43 µs per loop
1
625 loops, best of 3: 3.23 µs per loop
625 loops, best of 3: 4.98 µs per loop
625 loops, best of 3: 1.54 µs per loop
2
625 loops, best of 3: 7.69 µs per loop
625 loops, best of 3: 12.2 µs per loop
625 loops, best of 3: 3.13 µs per loop
3
625 loops, best of 3: 29.6 µs per loop
625 loops, best of 3: 38 µs per loop
625 loops, best of 3: 11.2 µs per loop
4
625 loops, best of 3: 152 µs per loop
625 loops, best of 3: 171 µs per loop
625 loops, best of 3: 197 µs per loop
5
625 loops, best of 3: 957 µs per loop
625 loops, best of 3: 961 µs per loop
625 loops, best of 3: 1.09 ms per loop
6
125 loops, best of 3: 7.14 ms per loop
125 loops, best of 3: 6.39 ms per loop
125 loops, best of 3: 7.12 ms per loop
7
5 loops, best of 3: 64.4 ms per loop
5 loops, best of 3: 51.1 ms per loop
5 loops, best of 3: 55.5 ms per loop

Timings for big permutations are also quite improved thanks to an improved base case.

sage: p= Permutations(1000).random_element()
sage: timeit('p.to_inversion_vector()')
125 loops, best of 3: 7.03 ms per loop

As you suggested, I also improved the to_lehmer_code method. Here is the comparison, first for small sizes,

before:

sage: for k in [0,1,2,3,4,5,6]:
....:         L=Permutations(k).list()
....:     timeit('[len(p.to_lehmer_code()) for p in L]')
....: 
625 loops, best of 3: 4.06 µs per loop
625 loops, best of 3: 5.86 µs per loop
625 loops, best of 3: 13.8 µs per loop
625 loops, best of 3: 51.2 µs per loop
625 loops, best of 3: 248 µs per loop
625 loops, best of 3: 1.55 ms per loop
25 loops, best of 3: 11.4 ms per loop

after:

sage: for k in [0,1,2,3,4,5,6]:
....:         L=Permutations(k).list()
....:     timeit('[len(p.to_lehmer_code()) for p in L]')
....: 
625 loops, best of 3: 2.5 µs per loop
625 loops, best of 3: 3.81 µs per loop
625 loops, best of 3: 9.44 µs per loop
625 loops, best of 3: 32 µs per loop
625 loops, best of 3: 150 µs per loop
625 loops, best of 3: 880 µs per loop
125 loops, best of 3: 5.89 ms per loop

and for big sizes,

before:

sage: for k in [100,300,600,1000]:
....:         L=[Permutation(sample(xrange(1,k+1), k)) for _ in xrange(10)]
....:     timeit('[len(p.to_lehmer_code()) for p in L]')
....: 
25 loops, best of 3: 20.2 ms per loop
5 loops, best of 3: 174 ms per loop
5 loops, best of 3: 704 ms per loop
5 loops, best of 3: 1.94 s per loop

after

sage: for k in [100,300,600,1000]:
....:         L=[Permutation(sample(xrange(1,k+1), k)) for _ in xrange(10)]
....:     timeit('[len(p.to_lehmer_code()) for p in L]')
....: 
125 loops, best of 3: 1.89 ms per loop
25 loops, best of 3: 11.2 ms per loop
25 loops, best of 3: 37.4 ms per loop
5 loops, best of 3: 69.1 ms per loop

comment:5 Changed 12 years ago by ylchapuy

NB: The tests are long, but they should be much faster of applying #7415 which improves random_element

Changed 12 years ago by ylchapuy

comment:6 Changed 12 years ago by ylchapuy

Sorry, I forgot the from_* methods in the first patch. Please apply both.

Cheers,

Yann

comment:7 Changed 12 years ago by hivert

  • Status changed from needs_review to positive_review

Good work ! Positive review.

Some remarks:

  • as you commented in the code the hardcoded handling of the n=0,1,2,3 case is a little bit overkill :-)
  • we should'nt spent too much time in optimizing very finely those function since at some point, we will probably change the datastructure for permutations to a fastest one (plain python list or tuple or Cython object or even C int[]).

Cheers,

Florent

comment:8 Changed 12 years ago by mhansen

  • Merged in set to sage-4.2.1.rc0
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:9 Changed 12 years ago by mvngu

  • Milestone changed from sage-combinat to sage-4.2.1
Note: See TracTickets for help on using tickets.