Opened 8 years ago

Closed 7 years ago

#14704 closed enhancement (wontfix)

Use GMP's mpz_t for bitsets

Reported by: Stefan Owned by: jason
Priority: minor Milestone: sage-duplicate/invalid/wontfix
Component: misc Keywords:
Cc: jason, dcoudert, ncohen, robertwb Merged in:
Authors: Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

It was suggested on #14668 that GMP has a type mpz_t that is very much like Sage's bitset_t.

The goal of this ticket: attempt to use mpz_t as base data structure for bitsets and see if it results in a speedup.

Change History (16)

comment:1 Changed 8 years ago by Stefan

  • Type changed from PLEASE CHANGE to enhancement

comment:2 Changed 8 years ago by leif

Note that (AFAIK, haven't looked at it recently) the implementation of the usual operations is pretty dumb (i.e., generic and not using any machine-specific features), so you may get even worse performance (compared to "direct" implementations in C, say).

comment:3 follow-up: Changed 8 years ago by jason

Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 8 years ago by leif

Replying to jason:

Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.

Maybe I missed something w.r.t. GMP, but MPIR appears to have asm implementations for a couple of operations.

comment:5 in reply to: ↑ 4 Changed 8 years ago by leif

Replying to leif:

Replying to jason:

Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.

Maybe I missed something w.r.t. GMP, but MPIR appears to have asm implementations for a couple of operations.

P.S.:

(GMP does also have at least some.)

Which doesn't mean that using them is guaranteed to be faster in all cases; there still remains some overhead and a level of indirection.

For small or fixed-size bitsets for example, using a custom C implementation (with macros or inline functions) can still be faster. (And it could further be sped up by using inline assembly or compiler built-ins.)

comment:6 Changed 8 years ago by dcoudert

I agree that for small bitsets, one can possibly produce a very fast implementation in C. But then you should implement a "small-size-bitset" type with all dedicated and optimized methods. I'm not sure it is worth the effort.

Since GMP / MPIR (I don't know which one is the best) are already used by sage and are reasonably fast, It is interesting to use them for bitsets or at least to try. I assume they use calls to low level / build in methods.

comment:7 Changed 8 years ago by jason

  • Cc robertwb added

CCing Robert Bradshaw, since he's an expert in such things.

comment:8 Changed 8 years ago by leif

Well, the main disadvantage of using GMP's / MPIR's functions is that the compiler has no chance to inline them / do further optimizations depending on the context, and that you get another level of boxing (mpz_t points to a structure containing a pointer to the actual data / the limbs).

So whether it's worth heavily depends on the application / use case.

But we could of course (probably dynamically) use different implementations, whichever fits best in the specific case.

comment:9 follow-up: Changed 8 years ago by robertwb

The only way to know for sure is to code something up and compare performance, until then all discussion is rather hypothetical (and I could see things going both ways, but probably would place my bet on mpz_t.)

Note that mpz_t could be a member of the bitset struct (if indeed another struct is needed) which would avoid one level of indirection. It may be worth looking at using the mpn structures directly for this low-level behavior as well.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 8 years ago by leif

Replying to robertwb:

The only way to know for sure is to code something up and compare performance, until then all discussion is rather hypothetical (and I could see things going both ways, but probably would place my bet on mpz_t.)

Well, you do not only need different implementations, but also typical use cases to make comparisons meaningful.


Note that mpz_t could be a member of the bitset struct (if indeed another struct is needed) which would avoid one level of indirection. It may be worth looking at using the mpn structures directly for this low-level behavior as well.

I actually did not think of not making it part of some bitset structure. (But the pointer to the limbs remains.)

Note that GMP operates on limbs, so the (logical) "size" of a plain mpz_t bitset would always be a multiple of BITS_PER_LIMB. (And we certainly don't need the "sign" of a bitset; GMP's treatment may even disturb.)

It would probably make sense to (optionally) keep track of the least and largest element contained in a set, and perhaps its cardinality as well, depending on the application.

comment:11 in reply to: ↑ 10 Changed 8 years ago by leif

Replying to leif:

Note that GMP operates on limbs, so the (logical) "size" of a plain mpz_t bitset would always be a multiple of BITS_PER_LIMB. (And we certainly don't need the "sign" of a bitset; GMP's treatment may even disturb.)

Just to elaborate a bit the subtle issues one has to take care of:

#include <gmp.h>

int main(void)
{
  mpz_t bs;
  int i;

  mpz_init2(bs,(mp_bitcnt_t)(1UL<<16)); // 2^16 *bits*
  gmp_printf("empty bitset:\n"
   "  size=%d alloc=%d value=%Zd popcnt=%lu\n",
    bs[0]._mp_size,
    bs[0]._mp_alloc,
    bs,
    (unsigned long)mpz_popcount(bs));
    
  mpz_com(bs,bs); // complement
  gmp_printf("empty bitset complemented:\n"
     "  size=%d alloc=%d value=... popcnt=%lu\n",
      bs[0]._mp_size,
      bs[0]._mp_alloc,
      /* bs, */
      (unsigned long)mpz_popcount(bs));

  // restore logical size:
  bs[0]._mp_size=1024; // 64-bit *limbs*; 2^16=64*1024
  gmp_printf("with (logical) size \"restored\":\n"
    "  size=%d alloc=%d value=... popcnt=%lu\n",
    bs[0]._mp_size,
    bs[0]._mp_alloc,
    /* bs, */
    (unsigned long)mpz_popcount(bs));

  // (construct) complement (of empty bitset) "manually":
  for(i=0;i<1024;i++)
    bs[0]._mp_d[i]=~0UL;
  gmp_printf("empty bitset manually complemented:\n"
    "  size=%d alloc=%d value=... popcnt=%lu\n",
    bs[0]._mp_size,
    bs[0]._mp_alloc,
    /* bs, */
    (unsigned long)mpz_popcount(bs));

  mpz_clear(bs);
  return 0;
}

gives

empty bitset:
  size=0 alloc=1024 value=0 popcnt=0
empty bitset complemented:
  size=-1 alloc=1024 value=... popcnt=18446744073709551615
with (logical) size "restored":
  size=1024 alloc=1024 value=... popcnt=1
empty bitset manually complemented:
  size=1024 alloc=1024 value=... popcnt=65536

(Note that I've "hardcoded" 64-bit limbs / assumption is that sizeof(long)==8.)

comment:12 Changed 8 years ago by jason

Note that the current bitset code also deals with limbs...

comment:13 Changed 7 years ago by jdemeyer

Alternatively: keep the bitset_t type but use some mpn functions in the implementation, see #13352 for a proof-of-concept with the mpn_popcount() function. I think that's a much better alternative than mpz_t because

  1. mpz_t has a sign which we don't need.
  2. mpz_t does not have a "size" (number of bits) which we do need.
  3. mpz_t is used to represent numbers which have the most significant bit set to 1. It cannot really deal with a long bitstring consisting of only zeros for example.

Proposal: close this ticket as "wontfix".

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:14 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
  • Reviewers set to Jeroen Demeyer
  • Status changed from new to needs_review

comment:15 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:16 Changed 7 years ago by vbraun

  • Resolution set to wontfix
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.