Changes between Initial Version and Version 3 of Ticket #13352

08/09/12 15:00:43 (9 years ago)

Replying to leif:

I'd rather "implement" the functions in C (where you can use the C preprocessor and __builtin_popcount() if available) and write a thin Cython wrapper.

GCC e.g. chooses efficient machine implementations (probably a single instruction), depending on the (configured or specified) target and perhaps also further compiler flags.

As reported in!topic/sage-devel/GnudTUwbGG4, I tried to implement the function in C with __builtin_popcount(). I was able to compile successfully, but then sage was not working anymore. See attached patch trac_13352-bitcount-in-c.patch.

Note that GMP has functions like mpz_popcount() for mpz_ts as well.

It could be another solution, but I have not been able yet to find how to access the functions used for int behind?

Not sure what the parallel refers to ... ;-)

In fact, the 4 lines n = __COUNT32__(n, 0) can be executed in parallel and gcc is sometimes able to optimize the sequence of instructions. For instance, all methods have similar running times on my macbook air, but parallel methods are slower on my old PC (system still in 32-bits...).


  • Ticket #13352

    • Property Status changed from new to needs_review
  • Ticket #13352 – Description

    initial v3  
    3232The alternative would be to use functions {{{__builtin_popcount()}}} and {{{__builtin_popcountll()}}}. They are extremely fast but they required to add flag ``-mpopcnt`` or ``-msse4.2`` to the gcc compiler. Can we do that? how?
     37* trac_13352_bitset_len.patch