Opened 7 years ago

Closed 4 years ago

#18304 closed enhancement (fixed)

Integer and Rational comparisons

Reported by: vdelecroix Owned by:
Priority: major Milestone: sage-8.0
Component: performance Keywords:
Cc: jdemeyer Merged in:
Authors: Vincent Delecroix Reviewers: Marc Mezzarobba, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: e72c82d (Commits, GitHub, GitLab) Commit: e72c82de36e6838f1064eccf08382b232521ed0d
Dependencies: Stopgaps:

Status badges

Description (last modified by vdelecroix)

MPIR 3.0.0 is shipped with the function mpq_cmp_z(mpq_t, mpz_t). We can use it to make integer/rational comparisons much faster.

Comparing integers is very fast, comparing rationals is very fast but comparing integer with rational is about 5x slower:

sage: a = 1; b = 18
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 64.3 ns per loop
sage: a = QQ(1); b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 84 ns per loop
sage: a = 1; b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 405 ns per loop

With the branch applied, it gets better

sage: a = 1; b = 18
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 64 ns per loop
sage: a = QQ(1); b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 65.6 ns per loop
sage: a = 1; b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 87.1 ns per loop

Change History (37)

comment:1 Changed 7 years ago by vdelecroix

  • Authors set to Vincent Delecroix
  • Branch set to u/vdelecroix/18304
  • Description modified (diff)

comment:2 Changed 7 years ago by git

  • Commit set to dbb783170ea1989d32e538af4b025db2c8576cf4

Branch pushed to git repo; I updated commit sha1. New commits:

a7cbb88Remove _cmp_c_impl and _richcmp_c_impl
9b48fecDoctest .pxd files also
1ad339bImplement _rich_to_bool as inline function instead of member function
17bd067Merge tag '6.7.beta2' into t/17890/ticket/17890
dbb7831Trac 18304: faster comparisons Integer <-> Rational

comment:3 Changed 7 years ago by vdelecroix

  • Cc jdemeyer added
  • Status changed from new to needs_review

comment:4 Changed 7 years ago by git

  • Commit changed from dbb783170ea1989d32e538af4b025db2c8576cf4 to 554c4358bc0e470ec0c72c2a24251f4a6978a12e

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

313a400Optimize rich_to_bool_sgn
629f6a5Improve comparisons for permutation groups
0d1e049Improve _richcmp and documentation
39273f1Fix doctest formatting
04570b3Fix bad doctest in etaproducts
3976f2cAdd pointers for special uses of __richcmp__
7e8315fMerge branch 'u/jdemeyer/ticket/17890' into sage-6.7.beta3
554c435Trac 18304: faster comparisons for Rational

comment:5 Changed 7 years ago by vdelecroix

  • Description modified (diff)

comment:6 follow-up: Changed 7 years ago by mmezzarobba

It's not really a big deal, but I'm not much of a fan of these global mpz_tmp/mpq_tmp: this is one more thing that will break if it ever becomes possible to have multi-threaded sage code... Did you check if there is a measurable performance benefit?

Also, couldn't __richcmp__ just call itself instead of having duplicate code to handle the case where right is of the type defined by the class?

By the way, if you make further changes, perhaps you could take the occasion to make _cmp_ use rich_to_bool_sign too.

comment:7 in reply to: ↑ 6 ; follow-ups: Changed 7 years ago by vdelecroix

Hello,

Replying to mmezzarobba:

It's not really a big deal, but I'm not much of a fan of these global mpz_tmp/mpq_tmp: this is one more thing that will break if it ever becomes possible to have multi-threaded sage code...

Me neither (and there is actually one in sage.rings.integer that is not from me). The ideal would be for gmp/mpir to provide a mpq_cmp_mpz. Do you think that it is reasonable to ask them?

Did you check if there is a measurable performance benefit?

It is minor but timings are in the ticket description. In a more concrete situation

sage: l = [ZZ.random_element() for _ in range(100)]
sage: l.extend(QQ.random_element() for _ in range(100))
sage: shuffle(l)
sage: %time l.sort()
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 1.21 ms

compared to

sage: %time l.sort()
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 247 µs

Also, couldn't __richcmp__ just call itself instead of having duplicate code to handle the case where right is of the type defined by the class?

I do not know why it was done this way for integers. I just naively used the implementation of Integer.__richcmp__ as a template.

By the way, if you make further changes, perhaps you could take the occasion to make _cmp_ use rich_to_bool_sign too.

Yes.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 7 years ago by vdelecroix

Replying to vdelecroix:

By the way, if you make further changes, perhaps you could take the occasion to make _cmp_ use rich_to_bool_sign too.

Yes.

Actually, no. Because it is not a rich comparison but just comparison.

comment:9 in reply to: ↑ 7 Changed 7 years ago by mmezzarobba

Replying to vdelecroix:

Me neither (and there is actually one in sage.rings.integer that is not from me).

Yes, I saw that.

The ideal would be for gmp/mpir to provide a mpq_cmp_mpz. Do you think that it is reasonable to ask them?

I have no idea.

Did you check if there is a measurable performance benefit?

It is minor but timings are in the ticket description.

No, I mean specifically for using a global temporary variable instead of creating a new one each time.

comment:10 in reply to: ↑ 8 Changed 7 years ago by mmezzarobba

Replying to vdelecroix:

Replying to vdelecroix:

By the way, if you make further changes, perhaps you could take the occasion to make _cmp_ use rich_to_bool_sign too.

Yes.

Actually, no. Because it is not a rich comparison but just comparison.

Right, sorry.

comment:11 Changed 7 years ago by vdelecroix

  • Status changed from needs_review to needs_work

With

sage: l = [ZZ.random_element() for _ in range(100)]
sage: l.extend(QQ.random_element() for _ in range(100))

declaring global mpz_tmp and mpq_tmp I got

timeit("shuffle(l); l.sort()", number=5000, repeat=10)
5000 loops, best of 10: 131 µs per loop

and with local ones

sage: timeit("shuffle(l); l.sort()", number=5000, repeat=10)
5000 loops, best of 10: 177 µs per loop

Note that this bad scenario only occurs for comparisons of:

  • Python long with Sage Integer
  • Sage Rationals with Sage Integer, Python int, Python long

There is a slightly another annoying thing with the global mpz_tmp/mpq_tmp: mpir/gmp do not free the memory of a mpz/mpq if it is not asked explicitely. So if you compare a very big rational with a very big integer you will occupy some memory for the rest of the Sage session (similarly for a big Python long with a big Sage Integer).

I will try harder to see if there is a way to tweak mpir to avoid the creation of anything when we want to compare mpz to mpq. I think that having slow comparison Python long / Sage integer is not an issue. In other words, we should get rid of these global variables.

Vincent

comment:12 Changed 6 years ago by vdelecroix

  • Description modified (diff)
  • Milestone changed from sage-6.7 to sage-6.9

comment:13 Changed 6 years ago by git

  • Commit changed from 554c4358bc0e470ec0c72c2a24251f4a6978a12e to e03a26e63dd50f79fc7ada885daefd8c9970c482

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

cf21996Trac 18304: mpir patch for mpq_cmp_mpz
e03a26eTrac 18304: faster comparisons for Integer/Rational

comment:14 Changed 6 years ago by vdelecroix

  • Description modified (diff)
  • Status changed from needs_work to needs_review

comment:15 follow-up: Changed 6 years ago by mmezzarobba

Hi Vincent,

Do you still want to patch mpir after the answers you got on mpir-devel and gmp-devel? If you do: would the code from this ticket still work if sage is built with gmp?

comment:16 in reply to: ↑ 15 ; follow-up: Changed 6 years ago by vdelecroix

Hi Marc,

Thanks for having a look.

Replying to mmezzarobba:

Do you still want to patch mpir after the answers you got on mpir-devel and gmp-devel? If you do: would the code from this ticket still work if sage is built with gmp?

Last news from GMP: they kind to agree that a mpq_cmp_z would be good (note that it is mpq_cmp_mpz in my branch). But I am not sure anything concrete already happened.

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

comment:17 in reply to: ↑ 16 ; follow-up: Changed 6 years ago by mmezzarobba

Replying to vdelecroix:

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

I'd wait.

comment:18 in reply to: ↑ 17 Changed 6 years ago by vdelecroix

  • Description modified (diff)
  • Status changed from needs_review to needs_work

Replying to mmezzarobba:

Replying to vdelecroix:

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

I'd wait.

Looks more reasonable to me as well (I do not like to patch upstream). So I will push harder upstream to have these functions in the next releases.

comment:19 Changed 5 years ago by vdelecroix

  • Dependencies set to #22493
  • Description modified (diff)
  • Milestone changed from sage-6.9 to sage-8.0

comment:20 Changed 5 years ago by git

  • Commit changed from e03a26e63dd50f79fc7ada885daefd8c9970c482 to 81b637b30a06e19504eda1898f3e3ad57b0bc803

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

4d723dcAdd yasm as a standard package.
3057c79Add configure check for yasm
6377eaeupgrade mpir to 3-0-0
c39a193Add build time yasm dependency to mpir.
81b637bTrac 18304: faster comparisons for Integer/Rational

comment:21 Changed 5 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:22 Changed 5 years ago by chapoton

  • Status changed from needs_review to needs_work

does not apply

comment:23 Changed 5 years ago by git

  • Commit changed from 81b637b30a06e19504eda1898f3e3ad57b0bc803 to e2b72301cda6c9b36a8b128a5cceeac12a11e163

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

357103aMerge branch 'public/22493' of git://trac.sagemath.org/sage into mpir300
8ddc2f1MPIR 3.0.0 does not build yasm anymore.
8205366But MPIR always looks for it even on archs where it is not used.
9573e15Check GNU as version when building MPIR.
cb8a3f7Echo comment when skipping yasm check.
aa0b222This should work and be useful on non-linux systems.
c9eaefcRemove useless gcc check when looking for old as.
01c9097Remove broken spaces.
42e738aTrac 18304: faster comparisons for Integer/Rational
e2b7230Merge 8.0.beta8

comment:24 Changed 5 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:25 Changed 5 years ago by git

  • Commit changed from e2b72301cda6c9b36a8b128a5cceeac12a11e163 to 575d85593a6965ac5d3b7c008d8dce474eb572e2

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

575d855Trac 18304: faster comparisons for Integer/Rational

comment:26 Changed 5 years ago by vdelecroix

  • Dependencies #22493 deleted

comment:27 Changed 5 years ago by vdelecroix

  • Description modified (diff)

comment:28 Changed 5 years ago by tscrim

Overall, I think this is ready to go into Sage. However, close to the realm of bikeshedding, I would change:

-if not isinstance(left, Rational):
-   raise RuntimeError('this should not happen')
+assert isinstance(left, Rational)

comment:29 Changed 4 years ago by git

  • Commit changed from 575d85593a6965ac5d3b7c008d8dce474eb572e2 to 8c60c6ad58e14315213ed055a05383550408ab8f

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8c60c6aTrac 18304: faster comparisons for Integer/Rational

comment:30 Changed 4 years ago by git

  • Commit changed from 8c60c6ad58e14315213ed055a05383550408ab8f to 6294a5fdc0fdc9d4c977afe1b8c100af06ca9ffc

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

6294a5fTrac 18304: faster comparisons for Integer/Rational

comment:31 Changed 4 years ago by vdelecroix

rebased on 8.0.beta10 (and incorported your comment:28)

comment:32 Changed 4 years ago by tscrim

Thanks. However, I just had a strong sense of dread come over me when I thought about why the old code was checking left/right. In code like __eq__, Python will sometimes swap the arguments, see:

https://eev.ee/blog/2012/03/24/python-faq-equality/

I do not know if this is done for __richcmp__ with extension classes, but I probably won't be able to look into this until tomorrow.

comment:33 Changed 4 years ago by chapoton

  • Status changed from needs_review to needs_work

does not build, due to a changed import (richcmp stuff is now in sage.structure.richcmp)

comment:34 Changed 4 years ago by git

  • Commit changed from 6294a5fdc0fdc9d4c977afe1b8c100af06ca9ffc to e72c82de36e6838f1064eccf08382b232521ed0d

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e72c82dTrac 18304: faster comparisons for Integer/Rational

comment:35 Changed 4 years ago by vdelecroix

  • Status changed from needs_work to needs_review

Indeed. I rebased on 8.0.beta11.

comment:36 Changed 4 years ago by tscrim

  • Reviewers set to Marc Mezzarobba, Travis Scrimshaw
  • Status changed from needs_review to positive_review

Hmm...I could've sworn Python sometimes flips inputs for something like __eq__, but maybe I am thinking of some other operation. Anyways, looks good and ready to be merged.

comment:37 Changed 4 years ago by vbraun

  • Branch changed from u/vdelecroix/18304 to e72c82de36e6838f1064eccf08382b232521ed0d
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.