Opened 7 years ago

Closed 5 years ago

Integer and Rational comparisons

Reported by: Owned by: vdelecroix major sage-8.0 performance jdemeyer Vincent Delecroix Marc Mezzarobba, Travis Scrimshaw N/A e72c82d e72c82de36e6838f1064eccf08382b232521ed0d

MPIR 3.0.0 is shipped with the function `mpq_cmp_z(mpq_t, mpz_t)`. We can use it to make integer/rational comparisons much faster.

Comparing integers is very fast, comparing rationals is very fast but comparing integer with rational is about 5x slower:

```sage: a = 1; b = 18
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 64.3 ns per loop
sage: a = QQ(1); b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 84 ns per loop
sage: a = 1; b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 405 ns per loop
```

With the branch applied, it gets better

```sage: a = 1; b = 18
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 64 ns per loop
sage: a = QQ(1); b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 65.6 ns per loop
sage: a = 1; b = QQ(18)
sage: timeit("a < b", number=500000, repeat=100)
500000 loops, best of 100: 87.1 ns per loop
```

comment:1 Changed 7 years ago by vdelecroix

• Authors set to Vincent Delecroix
• Branch set to u/vdelecroix/18304
• Description modified (diff)

comment:2 Changed 7 years ago by git

• Commit set to dbb783170ea1989d32e538af4b025db2c8576cf4

Branch pushed to git repo; I updated commit sha1. New commits:

 ​a7cbb88 `Remove _cmp_c_impl and _richcmp_c_impl` ​9b48fec `Doctest .pxd files also` ​1ad339b `Implement _rich_to_bool as inline function instead of member function` ​17bd067 `Merge tag '6.7.beta2' into t/17890/ticket/17890` ​dbb7831 `Trac 18304: faster comparisons Integer <-> Rational`

comment:3 Changed 7 years ago by vdelecroix

• Status changed from new to needs_review

comment:4 Changed 7 years ago by git

• Commit changed from dbb783170ea1989d32e538af4b025db2c8576cf4 to 554c4358bc0e470ec0c72c2a24251f4a6978a12e

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​313a400 `Optimize rich_to_bool_sgn` ​629f6a5 `Improve comparisons for permutation groups` ​0d1e049 `Improve _richcmp and documentation` ​39273f1 `Fix doctest formatting` ​04570b3 `Fix bad doctest in etaproducts` ​3976f2c `Add pointers for special uses of __richcmp__` ​7e8315f `Merge branch 'u/jdemeyer/ticket/17890' into sage-6.7.beta3` ​554c435 `Trac 18304: faster comparisons for Rational`

comment:5 Changed 7 years ago by vdelecroix

• Description modified (diff)

comment:6 follow-up: ↓ 7 Changed 7 years ago by mmezzarobba

It's not really a big deal, but I'm not much of a fan of these global `mpz_tmp`/`mpq_tmp`: this is one more thing that will break if it ever becomes possible to have multi-threaded sage code... Did you check if there is a measurable performance benefit?

Also, couldn't `__richcmp__` just call itself instead of having duplicate code to handle the case where `right` is of the type defined by the class?

By the way, if you make further changes, perhaps you could take the occasion to make `_cmp_` use `rich_to_bool_sign` too.

comment:7 in reply to: ↑ 6 ; follow-ups: ↓ 8 ↓ 9 Changed 7 years ago by vdelecroix

Hello,

It's not really a big deal, but I'm not much of a fan of these global `mpz_tmp`/`mpq_tmp`: this is one more thing that will break if it ever becomes possible to have multi-threaded sage code...

Me neither (and there is actually one in `sage.rings.integer` that is not from me). The ideal would be for gmp/mpir to provide a `mpq_cmp_mpz`. Do you think that it is reasonable to ask them?

Did you check if there is a measurable performance benefit?

It is minor but timings are in the ticket description. In a more concrete situation

```sage: l = [ZZ.random_element() for _ in range(100)]
sage: l.extend(QQ.random_element() for _ in range(100))
sage: shuffle(l)
sage: %time l.sort()
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 1.21 ms
```

compared to

```sage: %time l.sort()
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 247 µs
```

Also, couldn't `__richcmp__` just call itself instead of having duplicate code to handle the case where `right` is of the type defined by the class?

I do not know why it was done this way for integers. I just naively used the implementation of `Integer.__richcmp__` as a template.

By the way, if you make further changes, perhaps you could take the occasion to make `_cmp_` use `rich_to_bool_sign` too.

Yes.

comment:8 in reply to: ↑ 7 ; follow-up: ↓ 10 Changed 7 years ago by vdelecroix

By the way, if you make further changes, perhaps you could take the occasion to make `_cmp_` use `rich_to_bool_sign` too.

Yes.

Actually, no. Because it is not a rich comparison but just comparison.

comment:9 in reply to: ↑ 7 Changed 7 years ago by mmezzarobba

Me neither (and there is actually one in `sage.rings.integer` that is not from me).

Yes, I saw that.

The ideal would be for gmp/mpir to provide a `mpq_cmp_mpz`. Do you think that it is reasonable to ask them?

I have no idea.

Did you check if there is a measurable performance benefit?

It is minor but timings are in the ticket description.

No, I mean specifically for using a global temporary variable instead of creating a new one each time.

comment:10 in reply to: ↑ 8 Changed 7 years ago by mmezzarobba

By the way, if you make further changes, perhaps you could take the occasion to make `_cmp_` use `rich_to_bool_sign` too.

Yes.

Actually, no. Because it is not a rich comparison but just comparison.

Right, sorry.

comment:11 Changed 7 years ago by vdelecroix

• Status changed from needs_review to needs_work

With

```sage: l = [ZZ.random_element() for _ in range(100)]
sage: l.extend(QQ.random_element() for _ in range(100))
```

declaring global `mpz_tmp` and `mpq_tmp` I got

```timeit("shuffle(l); l.sort()", number=5000, repeat=10)
5000 loops, best of 10: 131 µs per loop
```

and with local ones

```sage: timeit("shuffle(l); l.sort()", number=5000, repeat=10)
5000 loops, best of 10: 177 µs per loop
```

Note that this bad scenario only occurs for comparisons of:

• Python long with Sage Integer
• Sage Rationals with Sage Integer, Python int, Python long

There is a slightly another annoying thing with the global `mpz_tmp/mpq_tmp`: `mpir/gmp` do not free the memory of a `mpz/mpq` if it is not asked explicitely. So if you compare a very big rational with a very big integer you will occupy some memory for the rest of the Sage session (similarly for a big Python long with a big Sage Integer).

I will try harder to see if there is a way to tweak `mpir` to avoid the creation of anything when we want to compare `mpz` to `mpq`. I think that having slow comparison Python long / Sage integer is not an issue. In other words, we should get rid of these global variables.

Vincent

comment:12 Changed 7 years ago by vdelecroix

• Description modified (diff)
• Milestone changed from sage-6.7 to sage-6.9

comment:13 Changed 7 years ago by git

• Commit changed from 554c4358bc0e470ec0c72c2a24251f4a6978a12e to e03a26e63dd50f79fc7ada885daefd8c9970c482

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​cf21996 `Trac 18304: mpir patch for mpq_cmp_mpz` ​e03a26e `Trac 18304: faster comparisons for Integer/Rational`

comment:14 Changed 7 years ago by vdelecroix

• Description modified (diff)
• Status changed from needs_work to needs_review

comment:15 follow-up: ↓ 16 Changed 7 years ago by mmezzarobba

Hi Vincent,

Do you still want to patch mpir after the answers you got on mpir-devel and gmp-devel? If you do: would the code from this ticket still work if sage is built with gmp?

comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 7 years ago by vdelecroix

Hi Marc,

Thanks for having a look.

Do you still want to patch mpir after the answers you got on mpir-devel and gmp-devel? If you do: would the code from this ticket still work if sage is built with gmp?

Last news from GMP: they kind to agree that a `mpq_cmp_z` would be good (note that it is `mpq_cmp_mpz` in my branch). But I am not sure anything concrete already happened.

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

comment:17 in reply to: ↑ 16 ; follow-up: ↓ 18 Changed 7 years ago by mmezzarobba

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

I'd wait.

comment:18 in reply to: ↑ 17 Changed 7 years ago by vdelecroix

• Description modified (diff)
• Status changed from needs_review to needs_work

You are right, I need to also patch gmp in order to make Sage works. Another option is to wait for upstream upgrading from both GMP and mpir. What do you think? I mostly wanted to test it.

I'd wait.

Looks more reasonable to me as well (I do not like to patch upstream). So I will push harder upstream to have these functions in the next releases.

comment:19 Changed 5 years ago by vdelecroix

• Dependencies set to #22493
• Description modified (diff)
• Milestone changed from sage-6.9 to sage-8.0

comment:20 Changed 5 years ago by git

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​4d723dc `Add yasm as a standard package.` ​3057c79 `Add configure check for yasm` ​6377eae `upgrade mpir to 3-0-0` ​c39a193 `Add build time yasm dependency to mpir.` ​81b637b `Trac 18304: faster comparisons for Integer/Rational`

comment:21 Changed 5 years ago by vdelecroix

• Status changed from needs_work to needs_review

comment:22 Changed 5 years ago by chapoton

• Status changed from needs_review to needs_work

does not apply

comment:23 Changed 5 years ago by git

• Commit changed from 81b637b30a06e19504eda1898f3e3ad57b0bc803 to e2b72301cda6c9b36a8b128a5cceeac12a11e163

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

 ​357103a `Merge branch 'public/22493' of git://trac.sagemath.org/sage into mpir300` ​8ddc2f1 `MPIR 3.0.0 does not build yasm anymore.` ​8205366 `But MPIR always looks for it even on archs where it is not used.` ​9573e15 `Check GNU as version when building MPIR.` ​cb8a3f7 `Echo comment when skipping yasm check.` ​aa0b222 `This should work and be useful on non-linux systems.` ​c9eaefc `Remove useless gcc check when looking for old as.` ​01c9097 `Remove broken spaces.` ​42e738a `Trac 18304: faster comparisons for Integer/Rational` ​e2b7230 `Merge 8.0.beta8`

comment:24 Changed 5 years ago by vdelecroix

• Status changed from needs_work to needs_review

comment:25 Changed 5 years ago by git

• Commit changed from e2b72301cda6c9b36a8b128a5cceeac12a11e163 to 575d85593a6965ac5d3b7c008d8dce474eb572e2

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​575d855 `Trac 18304: faster comparisons for Integer/Rational`

comment:26 Changed 5 years ago by vdelecroix

• Dependencies #22493 deleted

comment:27 Changed 5 years ago by vdelecroix

• Description modified (diff)

comment:28 Changed 5 years ago by tscrim

Overall, I think this is ready to go into Sage. However, close to the realm of bikeshedding, I would change:

```-if not isinstance(left, Rational):
-   raise RuntimeError('this should not happen')
+assert isinstance(left, Rational)
```

comment:29 Changed 5 years ago by git

• Commit changed from 575d85593a6965ac5d3b7c008d8dce474eb572e2 to 8c60c6ad58e14315213ed055a05383550408ab8f

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​8c60c6a `Trac 18304: faster comparisons for Integer/Rational`

comment:30 Changed 5 years ago by git

• Commit changed from 8c60c6ad58e14315213ed055a05383550408ab8f to 6294a5fdc0fdc9d4c977afe1b8c100af06ca9ffc

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​6294a5f `Trac 18304: faster comparisons for Integer/Rational`

comment:31 Changed 5 years ago by vdelecroix

rebased on 8.0.beta10 (and incorported your comment:28)

comment:32 Changed 5 years ago by tscrim

Thanks. However, I just had a strong sense of dread come over me when I thought about why the old code was checking left/right. In code like `__eq__`, Python will sometimes swap the arguments, see:

I do not know if this is done for `__richcmp__` with extension classes, but I probably won't be able to look into this until tomorrow.

comment:33 Changed 5 years ago by chapoton

• Status changed from needs_review to needs_work

does not build, due to a changed import (richcmp stuff is now in sage.structure.richcmp)

comment:34 Changed 5 years ago by git

• Commit changed from 6294a5fdc0fdc9d4c977afe1b8c100af06ca9ffc to e72c82de36e6838f1064eccf08382b232521ed0d

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​e72c82d `Trac 18304: faster comparisons for Integer/Rational`

comment:35 Changed 5 years ago by vdelecroix

• Status changed from needs_work to needs_review

Indeed. I rebased on 8.0.beta11.

comment:36 Changed 5 years ago by tscrim

• Reviewers set to Marc Mezzarobba, Travis Scrimshaw
• Status changed from needs_review to positive_review

Hmm...I could've sworn Python sometimes flips inputs for something like `__eq__`, but maybe I am thinking of some other operation. Anyways, looks good and ready to be merged.

comment:37 Changed 5 years ago by vbraun

• Branch changed from u/vdelecroix/18304 to e72c82de36e6838f1064eccf08382b232521ed0d
• Resolution set to fixed
• Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.