Changes between Version 4 and Version 8 of Ticket #13352


Ignore:
Timestamp:
08/18/13 15:08:25 (8 years ago)
Author:
dcoudert
Comment:

Hello,

Since I have now understood how to use __builtin_popcountl (not obvious when you don't know), I have uploaded a new version of the patch.

before:

sage: B = Bitset('00'*10000+'1')
sage: len(B)
1
sage: %timeit len(B)
1000000 loops, best of 3: 268 ns per loop
sage: B = Bitset('10'*10000)
sage: len(B)
10000
sage: %timeit len(B)
1000 loops, best of 3: 245 us per loop
sage: B = Bitset('10'*1000000)
sage: len(B)
1000000
sage: %timeit len(B)
10 loops, best of 3: 24.1 ms per loop

after:

sage: B = Bitset('00'*10000+'1')
sage: len(B)
1
sage: %timeit len(B)
1000000 loops, best of 3: 275 ns per loop
sage: B = Bitset('10'*10000)
sage: len(B)
10000
sage: %timeit len(B)
100000 loops, best of 3: 2.52 us per loop
sage: B = Bitset('10'*1000000)
sage: len(B)
1000000
sage: %timeit len(B)
1000 loops, best of 3: 236 us per loop

I'm wondering if I should add a wrapper in case the __builtin_popcountl method is not available. This is not difficult, but I don't know which environnement variables should be tested.

Thanks.

Legend:

Unmodified
Added
Removed
Modified
  • Ticket #13352

    • Property Cc ncohen added
    • Property Milestone changed from sage-5.11 to sage-5.12
  • Ticket #13352 – Description

    v4 v8  
    1 This patch improves the running time of the {{{bitset_len}}} method by using fast methods for counting bits in 32 and 64 bits integers (popcount). It
    2 - adds file {{{bitcount.pxi}}} to {{{sage/misc/}}}. This file contains the functions for counting bits in 32 or 64 bits integers
    3 - adds corresponding tests to file {{{misc_c.pyx}}}
    4 - modifies the {{{bitset_len}}} function accordingly
     1This patch improves the running time of the {{{bitset_len}}} method by using {{{__builtin_popcount()}}}.
    52
    63Before:
    74{{{
    8 sage: x = FrozenBitset('10'*1003002); len(x)
    9 1003002
    10 sage: %timeit len(x)
    11 125 loops, best of 3: 5.27 ms per loop
    12 sage: x = FrozenBitset('10'*10705); len(x)
    13 10705
    14 sage: %timeit len(x)
    15 625 loops, best of 3: 56.3 µs per loop
     5sage: B = Bitset('10'*10000)
     6sage: len(B)
     710000
     8sage: %timeit len(B)
     91000 loops, best of 3: 245 us per loop
    1610}}}
    1711
    1812After:
    1913{{{
    20 sage: x = FrozenBitset('10'*1003002); len(x)
    21 1003002
    22 sage: %timeit len(x)
    23 625 loops, best of 3: 865 µs per loop
    24 sage: x = FrozenBitset('10'*10705); len(x)
    25 10705
    26 sage: %timeit len(x)
    27 625 loops, best of 3: 9.39 µs per loop
     14sage: B = Bitset('10'*10000)
     15sage: len(B)
     1610000
     17sage: %timeit len(B)
     18100000 loops, best of 3: 2.52 us per loop
    2819}}}
    29 
    30 The ``popcount_32`` method is the same than the function used in patch #12371.
    31 
    32 The alternative would be to use functions {{{__builtin_popcount()}}} and {{{__builtin_popcountll()}}}. They are extremely fast but they required to add flag ``-mpopcnt`` or ``-msse4.2`` to the gcc compiler. Can we do that? how?
    3320
    3421
    3522APPLY:
    3623
    37 * [attachment:trac_13352_bitset_len.patch]
     24* [attachment:trac_13352_bitset_len_v2.patch]
    3825