Opened 8 years ago
Closed 7 years ago
#14704 closed enhancement (wontfix)
Use GMP's mpz_t for bitsets
Reported by: | Stefan | Owned by: | jason |
---|---|---|---|
Priority: | minor | Milestone: | sage-duplicate/invalid/wontfix |
Component: | misc | Keywords: | |
Cc: | jason, dcoudert, ncohen, robertwb | Merged in: | |
Authors: | Reviewers: | Jeroen Demeyer | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
It was suggested on #14668 that GMP has a type mpz_t that is very much like Sage's bitset_t.
The goal of this ticket: attempt to use mpz_t as base data structure for bitsets and see if it results in a speedup.
Change History (16)
comment:1 Changed 8 years ago by
- Type changed from PLEASE CHANGE to enhancement
comment:2 Changed 8 years ago by
comment:3 follow-up: ↓ 4 Changed 8 years ago by
Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.
comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 8 years ago by
Replying to jason:
Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.
Maybe I missed something w.r.t. GMP, but MPIR appears to have asm implementations for a couple of operations.
comment:5 in reply to: ↑ 4 Changed 8 years ago by
Replying to leif:
Replying to jason:
Yes, I was afraid of that; that's why I think we should definitely test it before committing to it.
Maybe I missed something w.r.t. GMP, but MPIR appears to have asm implementations for a couple of operations.
P.S.:
(GMP does also have at least some.)
Which doesn't mean that using them is guaranteed to be faster in all cases; there still remains some overhead and a level of indirection.
For small or fixed-size bitsets for example, using a custom C implementation (with macros or inline functions) can still be faster. (And it could further be sped up by using inline assembly or compiler built-ins.)
comment:6 Changed 8 years ago by
I agree that for small bitsets, one can possibly produce a very fast implementation in C. But then you should implement a "small-size-bitset" type with all dedicated and optimized methods. I'm not sure it is worth the effort.
Since GMP / MPIR (I don't know which one is the best) are already used by sage and are reasonably fast, It is interesting to use them for bitsets or at least to try. I assume they use calls to low level / build in methods.
comment:7 Changed 8 years ago by
- Cc robertwb added
CCing Robert Bradshaw, since he's an expert in such things.
comment:8 Changed 8 years ago by
Well, the main disadvantage of using GMP's / MPIR's functions is that the compiler has no chance to inline them / do further optimizations depending on the context, and that you get another level of boxing (mpz_t
points to a structure containing a pointer to the actual data / the limbs).
So whether it's worth heavily depends on the application / use case.
But we could of course (probably dynamically) use different implementations, whichever fits best in the specific case.
comment:9 follow-up: ↓ 10 Changed 8 years ago by
The only way to know for sure is to code something up and compare performance, until then all discussion is rather hypothetical (and I could see things going both ways, but probably would place my bet on mpz_t.)
Note that mpz_t could be a member of the bitset struct (if indeed another struct is needed) which would avoid one level of indirection. It may be worth looking at using the mpn structures directly for this low-level behavior as well.
comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 8 years ago by
Replying to robertwb:
The only way to know for sure is to code something up and compare performance, until then all discussion is rather hypothetical (and I could see things going both ways, but probably would place my bet on mpz_t.)
Well, you do not only need different implementations, but also typical use cases to make comparisons meaningful.
Note that mpz_t could be a member of the bitset struct (if indeed another struct is needed) which would avoid one level of indirection. It may be worth looking at using the mpn structures directly for this low-level behavior as well.
I actually did not think of not making it part of some bitset structure. (But the pointer to the limbs remains.)
Note that GMP operates on limbs, so the (logical) "size" of a plain mpz_t
bitset would always be a multiple of BITS_PER_LIMB
. (And we certainly don't need the "sign" of a bitset; GMP's treatment may even disturb.)
It would probably make sense to (optionally) keep track of the least and largest element contained in a set, and perhaps its cardinality as well, depending on the application.
comment:11 in reply to: ↑ 10 Changed 8 years ago by
Replying to leif:
Note that GMP operates on limbs, so the (logical) "size" of a plain
mpz_t
bitset would always be a multiple ofBITS_PER_LIMB
. (And we certainly don't need the "sign" of a bitset; GMP's treatment may even disturb.)
Just to elaborate a bit the subtle issues one has to take care of:
#include <gmp.h> int main(void) { mpz_t bs; int i; mpz_init2(bs,(mp_bitcnt_t)(1UL<<16)); // 2^16 *bits* gmp_printf("empty bitset:\n" " size=%d alloc=%d value=%Zd popcnt=%lu\n", bs[0]._mp_size, bs[0]._mp_alloc, bs, (unsigned long)mpz_popcount(bs)); mpz_com(bs,bs); // complement gmp_printf("empty bitset complemented:\n" " size=%d alloc=%d value=... popcnt=%lu\n", bs[0]._mp_size, bs[0]._mp_alloc, /* bs, */ (unsigned long)mpz_popcount(bs)); // restore logical size: bs[0]._mp_size=1024; // 64-bit *limbs*; 2^16=64*1024 gmp_printf("with (logical) size \"restored\":\n" " size=%d alloc=%d value=... popcnt=%lu\n", bs[0]._mp_size, bs[0]._mp_alloc, /* bs, */ (unsigned long)mpz_popcount(bs)); // (construct) complement (of empty bitset) "manually": for(i=0;i<1024;i++) bs[0]._mp_d[i]=~0UL; gmp_printf("empty bitset manually complemented:\n" " size=%d alloc=%d value=... popcnt=%lu\n", bs[0]._mp_size, bs[0]._mp_alloc, /* bs, */ (unsigned long)mpz_popcount(bs)); mpz_clear(bs); return 0; }
gives
empty bitset: size=0 alloc=1024 value=0 popcnt=0 empty bitset complemented: size=-1 alloc=1024 value=... popcnt=18446744073709551615 with (logical) size "restored": size=1024 alloc=1024 value=... popcnt=1 empty bitset manually complemented: size=1024 alloc=1024 value=... popcnt=65536
(Note that I've "hardcoded" 64-bit limbs / assumption is that sizeof(long)==8
.)
comment:12 Changed 8 years ago by
Note that the current bitset code also deals with limbs...
comment:13 Changed 7 years ago by
Alternatively: keep the bitset_t
type but use some mpn
functions in the implementation, see #13352 for a proof-of-concept with the mpn_popcount()
function. I think that's a much better alternative than mpz_t
because
mpz_t
has a sign which we don't need.mpz_t
does not have a "size" (number of bits) which we do need.mpz_t
is used to represent numbers which have the most significant bit set to 1. It cannot really deal with a long bitstring consisting of only zeros for example.
Proposal: close this ticket as "wontfix".
comment:14 Changed 7 years ago by
- Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
- Reviewers set to Jeroen Demeyer
- Status changed from new to needs_review
comment:15 Changed 7 years ago by
- Status changed from needs_review to positive_review
comment:16 Changed 7 years ago by
- Resolution set to wontfix
- Status changed from positive_review to closed
Note that (AFAIK, haven't looked at it recently) the implementation of the usual operations is pretty dumb (i.e., generic and not using any machine-specific features), so you may get even worse performance (compared to "direct" implementations in C, say).