Ticket #12029 (closed enhancement: fixed)
Fast conversion of ClonableIntArray to list
| Reported by: | SimonKing | Owned by: | sage-combinat |
|---|---|---|---|
| Priority: | major | Milestone: | sage-4.8 |
| Component: | combinatorics | Keywords: | |
| Cc: | sage-combinat, nborie | Work issues: | |
| Report Upstream: | N/A | Reviewers: | Florent Hivert, Simon King |
| Authors: | Simon King, Florent Hivert | Merged in: | sage-4.8.alpha2 |
| Dependencies: | Stopgaps: |
Description (last modified by SimonKing) (diff)
I think the following is too slow:
sage: from sage.structure.list_clone import IncreasingIntArrays
sage: I = IncreasingIntArrays()(range(1000))
sage: timeit("L = list(I)", number=10000)
10000 loops, best of 3: 41.8 µs per loop
My patch adds a method .list() (I hope this is the fastest way of converting a C-int array into a Python list - Cython experts are welcome to find something better), and it adds an __iter__() method that relies on the .list() method.
Note that I tried to have an __iter__ method that does not call list() but works on the C-array (this is now possible with the new Cython version), but it turned out to be not faster.
Here are the timings with my patch
sage: from sage.structure.list_clone import IncreasingIntArrays
sage: I = IncreasingIntArrays()(range(1000))
sage: timeit("L = I.list()", number=10000)
10000 loops, best of 3: 19.4 µs per loop
sage: timeit("L = list(I)", number=10000)
10000 loops, best of 3: 32.9 µs per loop
Apply
Attachments
Change History
comment:2 in reply to: ↑ 1 ; follow-up: ↓ 3 Changed 19 months ago by SimonKing
- Status changed from new to needs_review
Replying to hivert:
- The .list() method is not a Python convention, it is a Sage-Combinat (and maybe Sage) convention. Isn't it ?
Yes. But there are different places in Sage where .list() is used, and I think if there is a faster way to return a list than via an iterator then it should at least be offered.
- Since the new Cython allows it. Did you try to define an iterator (using yield) without relying on the list ? It could be faster.
As I stated in the ticket description: I did try, and it was not faster. To be concrete: Here is the code that I tested.
def __iter__(self): """ Iterate over the items of self. ... """ cdef int i for i from 0<=i<self._len: yield self._list[i]
And here is the benchmark:
sage: from sage.structure.list_clone import IncreasingIntArrays
sage: I = IncreasingIntArrays()(range(1000))
sage: timeit("L = list(I)", number=10000)
10000 loops, best of 3: 34.3 µs per loop
- Also I'm not sure using append to build the list is a good idea (due to possible reallocation).
Appending to a list - if the list is cdefined - is a very quick operation in Cython.
Si you know the list length from the beginning, why not allocation the list and filling it with a loop ?
Can one allocate a Python list without filling it?
comment:3 in reply to: ↑ 2 ; follow-up: ↓ 4 Changed 19 months ago by hivert
Replying to SimonKing:
- Since the new Cython allows it. Did you try to define an iterator (using yield) without relying on the list ? It could be faster.
As I stated in the ticket description: I did try, and it was not faster. To be concrete: Here is the code that I tested.
Sorry I read it to fast !!!
- Also I'm not sure using append to build the list is a good idea (due to possible reallocation).
Appending to a list - if the list is cdefined - is a very quick operation in Cython.
Si you know the list length from the beginning, why not allocation the list and filling it with a loop ?
Can one allocate a Python list without filling it?
From the C-API you can do it. Of course you must fill it (at least with None(s)) before returning it to python. See http://docs.python.org/c-api/list.html and in particular the note after PyList_New. I don't know if it is doable directly in Cython without calling the C-API.
Florent
comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 19 months ago by hivert
Replying to hivert:
Can one allocate a Python list without filling it?
From the C-API you can do it. Of course you must fill it (at least with None(s)) before returning it to python. See http://docs.python.org/c-api/list.html and in particular the note after PyList_New. I don't know if it is doable directly in Cython without calling the C-API.
Hi Simon !
Sorry for replying to myself but I should have said from the very beginning that I'm not a Cython expert ! I'm currently trying the PyList_New version.
Florent
comment:5 in reply to: ↑ 4 ; follow-up: ↓ 6 Changed 19 months ago by SimonKing
Replying to hivert:
I'm currently trying the PyList_New version.
So did I, but currently I am getting segfaults. I guess that is because the references produced by PyList_SetItem are borrowed.
Best regards,
Simon
comment:7 Changed 19 months ago by hivert
Hi Simon,
If correct with respect to memory management, I uploaded a patch which when applied over yours gains some more speed:
This is before my patch:
sage: from sage.structure.list_clone import IncreasingIntArrays
sage: I = IncreasingIntArrays()(range(1000))
sage: timeit("L = I.list()", number=10000)
10000 loops, best of 3: 18.1 µs per loop
This is after my patch:
sage: from sage.structure.list_clone import IncreasingIntArrays
sage: I = IncreasingIntArrays()(range(1000))
sage: timeit("L = I.list()", number=10000)
10000 loops, best of 3: 14 µs per loop
So this is more 20 percents faster... After having a Cython expert check the reference counting stuff, this is probably worth including.
What do you think ?
Cheers,
Florent
comment:8 follow-up: ↓ 9 Changed 19 months ago by SimonKing
While you were posting here, I was asking on sage-devel, posting a code that was very similar to yours. Only difference: I did not use o = PyInt_FromLong(self._list[i]).
Interesting that this is so fast. I need to catch some sleep now, however...
comment:9 in reply to: ↑ 8 ; follow-ups: ↓ 10 ↓ 11 Changed 19 months ago by hivert
Replying to SimonKing:
While you were posting here, I was asking on sage-devel, posting a code that was very similar to yours. Only difference: I did not use o = PyInt_FromLong(self._list[i]).
I just got confirmation (on cython-users) that INCREF is correct where I put it.
Interesting that this is so fast. I need to catch some sleep now, however...
Me too ! Sorry for not being around Paris just when you come.
Cheers,
Florent
comment:10 in reply to: ↑ 9 Changed 19 months ago by SimonKing
Replying to hivert:
I just got confirmation (on cython-users) that INCREF is correct where I put it.
I was putting the INCREF in precisely the same location as you did. Only difference to the code that I tried (getting segfaults): You did PyInt_FromLong before that. And I don't see why that conversion prevents the segfault.
Cheers,
Simon
comment:11 in reply to: ↑ 9 Changed 19 months ago by hivert
Interesting that this is so fast. I need to catch some sleep now, however...
Me too ! Sorry for not being around Paris just when you come.
Just before I go to bed, Mark Florisson noticed on cython-users that
return [self._list[i] for i in range(self._len)]
Should be as fast as the former append version.
Cheers,
Florent
comment:12 Changed 19 months ago by hivert
Hi Simon,
Looking at the C code I figured out that replacing
cdef list L = PyList_New(self._len)
by
cdef list L = <list> PyList_New(self._len)
allows to gain a few more speed:
sage: %timeit I.list() 625 loops, best of 3: 13 µs per loop
So one less microsecond. Can you fold my patch and review it. I'm Ok with your changes.
Cheers,
Florent
Changed 19 months ago by SimonKing
-
attachment
trac12029_clonable_int_array_2_list.patch
added
Iteration over ClonableIntArray?, including Florent's speed-up
comment:13 follow-up: ↓ 14 Changed 19 months ago by SimonKing
- Status changed from needs_review to positive_review
- Reviewers set to Florent Hivert, Simon King
- Description modified (diff)
- Authors changed from Simon King to Simon King, Florent Hivert
I folded Florent's patch and posted the combined patch under the original name of my patch. I hope you don't mind.
I can confirm that using the C-Api gives an additional speed-up. The new doctest from Florent's patch tests against a case that was critical during development (it would result in a segfault without Florent's PyInt_FromLong).
All doctests pass, even when using the new Cython version. Therefore, I can give Florent's patch a positive review. He gave my patch a positive review as well. So, that's done.
Apply trac12029_clonable_int_array_2_list.patch
comment:14 in reply to: ↑ 13 Changed 19 months ago by hivert
Dear Simon,
All doctests pass, even when using the new Cython version. Therefore, I can give Florent's patch a positive review. He gave my patch a positive review as well. So, that's done.
Excellent ! I'm glad to have you as one of my patch co-author, if this wasn't already the case. Thanks for the review.
For the record This thread on cython-users indicate that the choosen solution seems to be the correct one. They are starting discussing optimizing list comprehension in Cython to handle our case.
Cheers,
Florent
comment:15 Changed 18 months ago by jdemeyer
- Status changed from positive_review to closed
- Resolution set to fixed
- Merged in set to sage-4.8.alpha2

A few remarks: