Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#22152 closed enhancement (fixed)

speed up generic polynomials

Reported by: mmezzarobba Owned by:
Priority: major Milestone: sage-7.6
Component: performance Keywords:
Cc: cremona, nbruin, SimonKing Merged in:
Authors: Marc Mezzarobba, Travis Scrimshaw Reviewers: Travis Scrimshaw, Marc Mezzarobba
Report Upstream: N/A Work issues:
Branch: b846e12 (Commits, GitHub, GitLab) Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by mmezzarobba)

Some hopefully representative examples:

QQi.<i> = QuadraticField(-1)
Pol.<x> = QQi[]
p = x + i + 1
q = ( (8*i - 12)*x^10 + (-i + 2)*x^9 + (-i + 1)*x^8 + (-i + 7)*x^7 + (-i + 1)*x^6 + (-12*i + 2)*x^5 + (i + 1)*x^4 + (-7*i + 1)*x^3 + (2*i + 1)*x^2 + x - i - 2)
a = 42
b = QQi(42)
mat = matrix(QQi, [[1, i], [0,2]])
y = polygen(Pol, 'y')
foo = y^2 + x + 1
fl1 = 1.r
RR1 = 1.
for cmd in [
    'p*p',
    'p**5',
    'p._mul_trunc_(q, 3)',
    'p(0)',
    'q(0)',
    'p(a)',
    'p(b)',
    'q(b)',
    'q(fl1)',
    'q(RR1)',
    'q(a)',
    'q(mat)',
    'foo(x=1, y=2)',
]:
    timeit(cmd)
sage: pol = MatrixSpace(ZZ, 1000, 1000)['x'].random_element(degree=10)
sage: %timeit pol(42)

Results:

test                  old      new
--------------------------------------
p*p                   13.1 µs  8.14 µs
p**5                  53.7 µs  35.9 µs
p._mul_trunc_(q, 3)   17 µs    8.26 µs
p(0)                  3.49 µs  920 ns
q(0)                  3.5 µs   877 ns
p(a)                  6.72 µs  3.78 µs
p(b)                  6.68 µs  2.48 µs
q(b)                  13.8 µs  11.9 µs
q(fl1)                7.47 ms  8.24 ms
q(RR1)                7.48 ms  8.27 ms
q(a)                  16.8 µs  15.8 µs
q(mat)                290 µs   284 µs
foo(x=1, y=2)         64.1 µs  54 µs
---------------------------------------
pol(42)               208 ms   202 ms

(Evaluations at floating-point numbers are very slow, and suffer from a small regression, due to the slowness of the generic coercion map through CLF.)

Edit: add up to ~0.5 µs to some of the above timings due to subsequent fixes.

Change History (34)

comment:1 Changed 5 years ago by mmezzarobba

  • Cc cremona nbruin SimonKing added
  • Status changed from new to needs_review

comment:2 Changed 5 years ago by git

  • Commit changed from 9b7eec254e31c45e80cf0e81b9192e9152ccdf18 to 1fb9d2737f5a9e454f25bf80a1eeef7b4825ccfd

Branch pushed to git repo; I updated commit sha1. New commits:

0678134generic Polynomials: [] -> get_unsafe() when possible
1fb9d27speed up generic Polynomial.__call__()

comment:3 Changed 5 years ago by mmezzarobba

  • Description modified (diff)
  • Summary changed from speed up generic polynomials a little to speed up generic polynomials

The tests in rings/ pass; I haven't yet run the whole test suite.

comment:4 Changed 5 years ago by git

  • Commit changed from 1fb9d2737f5a9e454f25bf80a1eeef7b4825ccfd to 7488953070e2ee0f503c4b5a3258bc12492c9ee8

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

7488953speed up generic Polynomial.__call__()

comment:5 follow-up: Changed 5 years ago by tscrim

Looks good overall. From a quick look through the code, it seems like the inputs to the do_* methods always take lists as inputs. Likewise, should we make list a cpdef method that guarantees a list being returned? This might get us some extra speed as well.

comment:6 in reply to: ↑ 5 Changed 5 years ago by mmezzarobba

Replying to tscrim:

Looks good overall. From a quick look through the code, it seems like the inputs to the do_* methods always take lists as inputs. Likewise, should we make list a cpdef method that guarantees a list being returned? This might get us some extra speed as well.

Thanks for the suggestion. Adding type annotations to do_* and brutally replacing def list(...) by cdef list list(...) everywhere actually seems to make things slower, though the difference is small. I didn't try to investigate any further. And I must confess that it is not at all clear to me what implications those cdef have with the relatively complex hierarchy of subclasses that override list() (sometimes from Cython, sometimes from Python, sometimes with additional arguments).

comment:7 follow-up: Changed 5 years ago by tscrim

You might be loosing some speed by variables not being cdef'ed as a list. Did you happen to save your current branch? I can play around with it later tonight to see if I can get some extra speed out of it.

comment:8 in reply to: ↑ 7 Changed 5 years ago by mmezzarobba

I noticed a small problem with the last commit: it would cause terrible regression for cases like polynomials over large matrix spaces evaluated at scalars. I'll push a fix in a minute.

Replying to tscrim:

You might be loosing some speed by variables not being cdef'ed as a list. Did you happen to save your current branch?

Yes, see u/mmezzarobba/speed_up_generic_polynomials-alt.

comment:9 Changed 5 years ago by git

  • Commit changed from 7488953070e2ee0f503c4b5a3258bc12492c9ee8 to 25f884d48ae7392c908231dfcaa1931b66cbe900

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

25f884dspeed up generic Polynomial.__call__()

comment:10 Changed 5 years ago by mmezzarobba

  • Description modified (diff)

comment:11 Changed 5 years ago by tscrim

  • Branch changed from u/mmezzarobba/speed_up_generic_polynomials to u/tscrim/speed_up_generic_polynomials-22152
  • Commit changed from 25f884d48ae7392c908231dfcaa1931b66cbe900 to a75321719bd7de6637b740bf00c6f266e7acfda7
  • Milestone changed from sage-7.5 to sage-7.6
  • Reviewers set to Travis Scrimshaw

I went through and made all of the def list(self) methods into cpdef list list(self, bint copy=True) to take advantage of the times when we don't have to copy the list and to specify the output type. This didn't give as much of a speed bump as I was expecting. I also made some other improvements by using get_unsafe and other things. Here are my timings:

Current branch:

625 loops, best of 3: 6.68 µs per loop
625 loops, best of 3: 29.2 µs per loop
625 loops, best of 3: 6.77 µs per loop
625 loops, best of 3: 739 ns per loop
625 loops, best of 3: 699 ns per loop
625 loops, best of 3: 3.3 µs per loop
625 loops, best of 3: 2.23 µs per loop
625 loops, best of 3: 10.4 µs per loop
125 loops, best of 3: 7.11 ms per loop
125 loops, best of 3: 7.1 ms per loop
625 loops, best of 3: 13.2 µs per loop
625 loops, best of 3: 247 µs per loop
625 loops, best of 3: 46.7 µs per loop

vs my new branch

625 loops, best of 3: 6.5 µs per loop
625 loops, best of 3: 28.6 µs per loop
625 loops, best of 3: 6.64 µs per loop
625 loops, best of 3: 739 ns per loop
625 loops, best of 3: 737 ns per loop
625 loops, best of 3: 3.13 µs per loop
625 loops, best of 3: 2.19 µs per loop
625 loops, best of 3: 10.6 µs per loop
125 loops, best of 3: 6.88 ms per loop
125 loops, best of 3: 6.84 ms per loop
625 loops, best of 3: 13.4 µs per loop
625 loops, best of 3: 240 µs per loop
625 loops, best of 3: 44.4 µs per loop

The small timings have too much variation to say there is a definitive regression. However, the arithmetic operations have improved.

Something else I noticed that probably could get us some speed is requiring all polynomials to have a _new_constant_poly. Currently only the dense ones have such a method. Shall we do that here?


New commits:

f355638wip
b3878a2Merge branch 'u/mmezzarobba/speed_up_generic_polynomials-alt' of git://trac.sagemath.org/sage into u/mmezzarobba/speed_up_generic_polynomials-alt
526fdadUsing cpdef list list() and some other improvements.
6b53da5Making list always take a second input for the dense cases and some other speedups.
a753217A few more _new_generic instead of via the coercion framework.

comment:12 Changed 5 years ago by mmezzarobba

  • Authors changed from Marc Mezzarobba to Marc Mezzarobba, Travis Scrimshaw
  • Branch changed from u/tscrim/speed_up_generic_polynomials-22152 to u/mmezzarobba/speed_up_generic_polynomials
  • Commit changed from a75321719bd7de6637b740bf00c6f266e7acfda7 to 82554f906ead4a2693223115d25b20698c82e1ee
  • Reviewers changed from Travis Scrimshaw to Travis Scrimshaw, Travis Scrimshaw

Thanks for the improvements! I cleaned up the branch a little (rebased, squashed my "wip" commit into the following one) without changing the code.

Regarding _new_constant_poly() for sparse polynomials: I really don't think I'll have time to make further changes until Feb. 8 at best, though I'll do my best to review yours if you make some. In any case I'm happy with the current branch; you can set the ticket to positive review if you agree, assuming the patchbot doesn't find anything to complain about.


New commits:

40cabeddon't call parent() on result from methods of base Polynomial class
425aa01speed up truncated multiplication of generic polynomials
edb148dcache the result of PolynomialRing.is_exact()
4628080polynomial_element: minor code simplifications/speedups
901e40edisable cython checks in generic polynomial multiplication
67aadecgeneric Polynomials: [] -> get_unsafe() when possible
856a74aspeed up generic Polynomial.__call__()
6bca846Using cpdef list list() and some other improvements.
04e89aeMaking list always take a second input for the dense cases and some other speedups.
82554f9A few more _new_generic instead of via the coercion framework.

comment:13 Changed 5 years ago by mmezzarobba

  • Reviewers changed from Travis Scrimshaw, Travis Scrimshaw to Travis Scrimshaw, Marc Mezzarobba

comment:14 Changed 5 years ago by tscrim

  • Status changed from needs_review to positive_review

Let's get this into Sage now and do further optimizations on a followup ticket.

comment:15 Changed 5 years ago by vbraun

  • Status changed from positive_review to needs_work

Merge conflict...

comment:16 Changed 5 years ago by git

  • Commit changed from 82554f906ead4a2693223115d25b20698c82e1ee to 2e6ce3627e8cc20b0eaf9813d725969e61089871

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

232d8a2don't call parent() on result from methods of base Polynomial class
623fac3speed up truncated multiplication of generic polynomials
66be984cache the result of PolynomialRing.is_exact()
e56be91polynomial_element: minor code simplifications/speedups
2e71c9adisable cython checks in generic polynomial multiplication
7023518generic Polynomials: [] -> get_unsafe() when possible
421e9c5speed up generic Polynomial.__call__()
59ab09bUsing cpdef list list() and some other improvements.
bff9081Making list always take a second input for the dense cases and some other speedups.
2e6ce36A few more _new_generic instead of via the coercion framework.

comment:17 Changed 5 years ago by git

  • Commit changed from 2e6ce3627e8cc20b0eaf9813d725969e61089871 to 582bc1451110361c164f5af0f7c9e582bca7eba3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

9076574don't call parent() on result from methods of base Polynomial class
3bcc65dspeed up truncated multiplication of generic polynomials
00c32eecache the result of PolynomialRing.is_exact()
5a50716polynomial_element: minor code simplifications/speedups
f7ac5d3disable cython checks in generic polynomial multiplication
53aa45cgeneric Polynomials: [] -> get_unsafe() when possible
9c386fbspeed up generic Polynomial.__call__()
9d91c6cUsing cpdef list list() and some other improvements.
ff878d6Making list always take a second input for the dense cases and some other speedups.
582bc14A few more _new_generic instead of via the coercion framework.

comment:18 Changed 5 years ago by git

  • Commit changed from 582bc1451110361c164f5af0f7c9e582bca7eba3 to c4b9000794094b9b116a15d9ab66f831f7f9d126

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ce745fcdon't call parent() on result from methods of base Polynomial class
e86ab58speed up truncated multiplication of generic polynomials
25c536ecache the result of PolynomialRing.is_exact()
a87df2bpolynomial_element: minor code simplifications/speedups
9439459disable cython checks in generic polynomial multiplication
2de8e6bgeneric Polynomials: [] -> get_unsafe() when possible
9907674speed up generic Polynomial.__call__()
74b1d6bUsing cpdef list list() and some other improvements.
115876dMaking list always take a second input for the dense cases and some other speedups.
c4b9000A few more _new_generic instead of via the coercion framework.

comment:19 Changed 5 years ago by mmezzarobba

  • Status changed from needs_work to positive_review

Rebased.

comment:20 follow-up: Changed 5 years ago by vbraun

  • Status changed from positive_review to needs_work

merge conflict...

comment:21 in reply to: ↑ 20 Changed 5 years ago by mmezzarobba

Replying to vbraun:

merge conflict...

Again? With which branch? This ticket's branch (as of c4b9000) is based on the current trac:develop.

comment:22 Changed 5 years ago by git

  • Commit changed from c4b9000794094b9b116a15d9ab66f831f7f9d126 to 29bfdb557e99262cb4c01a6915a55033a4862535

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

660a603don't call parent() on result from methods of base Polynomial class
773036bspeed up truncated multiplication of generic polynomials
3429977cache the result of PolynomialRing.is_exact()
1925f00polynomial_element: minor code simplifications/speedups
095bdc5disable cython checks in generic polynomial multiplication
74b56bageneric Polynomials: [] -> get_unsafe() when possible
03e8691speed up generic Polynomial.__call__()
f0f1923Using cpdef list list() and some other improvements.
3532b34Making list always take a second input for the dense cases and some other speedups.
29bfdb5A few more _new_generic instead of via the coercion framework.

comment:23 Changed 5 years ago by mmezzarobba

  • Status changed from needs_work to positive_review

Rebased again, on top of 7.6.beta1.

comment:24 Changed 5 years ago by tscrim

  • Status changed from positive_review to needs_work

I am getting the same doctest failure as the patchbot:

sage -t --long src/sage/schemes/elliptic_curves/padics.py
**********************************************************************
File "src/sage/schemes/elliptic_curves/padics.py", line 1058, in sage.schemes.elliptic_curves.padics.padic_sigma
Failed example:
    for N in range(3, max_N):                     # long time
       sigma = E.padic_sigma(p, N, E2=E2)         # long time
       assert sigma == max_sigma
Exception raised:
    Traceback (most recent call last):
      File "/home/travis/sage-build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 498, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/travis/sage-build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 861, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.schemes.elliptic_curves.padics.padic_sigma[19]>", line 3, in <module>
        assert sigma == max_sigma
    AssertionError
**********************************************************************
1 item had failures:
   1 of  21 in sage.schemes.elliptic_curves.padics.padic_sigma
    [197 tests, 1 failure, 11.79 s]
----------------------------------------------------------------------
sage -t --long src/sage/schemes/elliptic_curves/padics.py  # 1 doctest failed
----------------------------------------------------------------------

I almost certainly did something wrong with my round of improvements. This will require some digging.

comment:25 Changed 5 years ago by tscrim

This was introduced by me in 3532b34. Now to figure out exactly what the problem is...

comment:26 follow-up: Changed 5 years ago by tscrim

  • Branch changed from u/mmezzarobba/speed_up_generic_polynomials to u/tscrim/speed_up_generic_polynomials-22152
  • Commit changed from 29bfdb557e99262cb4c01a6915a55033a4862535 to b846e12f17880b85c8929471308918ac572f30ea
  • Status changed from needs_work to needs_review

Okay, so the problem was in the change I made to the __iter__ in that iterating over the list is not the same as iterating up to the degree. I don't understand why this results in different behavior. However, for now, I think we can just let it be.


New commits:

b846e12Reverting change to __iter__.

comment:27 in reply to: ↑ 26 Changed 5 years ago by mmezzarobba

Replying to tscrim:

Okay, so the problem was in the change I made to the __iter__ in that iterating over the list is not the same as iterating up to the degree. I don't understand why this results in different behavior.

Since the failure has to do with p-adics, I suspect the reason may be that:

sage: Pol.<x> = Qp(5)[]
sage: pol = 1 + O(5)*x
sage: pol.degree()
0
sage: list(pol)
[1 + O(5^20), O(5)]

I don't know if this is intentional, nor why the implementation of p-adics take this convention. (I'd have expected degree() to check that the coefficients are nonzero using __nonzero__(), and __nonzero__() to return true except when its argument is exactly zero, like it does for some other inexact parents.) But perhaps we need to check that our other changes don't break things like this.

comment:28 Changed 5 years ago by tscrim

I don't think we made any other functional changes as get_unsafe has to be consistent with __getitem__ (for non-slices). Everything else is extending current behavior or passing type info to Cython.

I'm not sure if that is intentional either. We might need to ask someone more familiar with the p-adics. Although that could be something we push to another ticket too.

comment:29 Changed 5 years ago by mmezzarobba

  • Status changed from needs_review to positive_review

Agreed.

comment:30 Changed 5 years ago by vbraun

  • Branch changed from u/tscrim/speed_up_generic_polynomials-22152 to b846e12f17880b85c8929471308918ac572f30ea
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:31 Changed 5 years ago by mmezzarobba

  • Commit b846e12f17880b85c8929471308918ac572f30ea deleted

Looks like we have another problem:

sage: Pol = ZZ['x']['y']['z']
....: d = Pol.gens_dict_recursive()
....: x, y, z = d['x'], d['y'], d['z']
....: v = QQ(0)
....: (x*z)(x=v)
<BOOM>

comment:32 Changed 5 years ago by tscrim

This is something very special about it being 0 with the coercion.

comment:33 Changed 5 years ago by tscrim

This also works for p = x*z:

sage: p(1)(x=v)
0

comment:34 Changed 5 years ago by tscrim

Followup on #22317.

Note: See TracTickets for help on using tickets.