Good news! Things are now faster than without the patches!
I found that one can considerably improve the conversion of an element of the base ring into a polynomial ring. Some polynomial rings used a generic conversion map, some used a polynomial base injection map -- and both were slow.
My inspiration came from Algebras.ParentMethods.__init_extra__
: If R is a polynomial ring, then multiplication of a scalar with R.one()
often is a very fast method to convert the scalar into R.
Problems:
- We should not assume that any ring has a unit (ok, polynomial rings over a unital ring have...).
- Calling
R.one()
will usually trigger the creation of a generic conversion - hence, it would be difficult to register it as conversion.
- Not all flavours of polynomial elements have a
_rmul_
(polynomial_element_generic has not).
- Sometimes, other conversion maps are registered when one wants to register the polynomial base injection map.
So, I implemented _rmul_
and _lmul_
for polynomial_element_generic, try various ways (old and new coercion model) of creating a One bypassing conversion maps, and in one init method of polynomial rings I decided to re-initialise the conversion maps.
Timings
I tried to test as many cases (multi- versus univariate, libSingular
and others, different base rings,...). Without all the patches, we have the following:
sage: R.<x> = ZZ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 23.4 µs per loop
sage: R.<x> = QQ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 24.6 µs per loop
sage: R.<x> = GF(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 87.9 µs per loop
sage: R.<x> = QQ['t'][]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 113 µs per loop
sage: R.<x,y> = ZZ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 13 µs per loop
sage: R.<x,y> = QQ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 16.6 µs per loop
sage: R.<x,y> = GF(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 10.8 µs per loop
sage: R.<x,y> = QQ['t'][]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 238 µs per loop
sage: R.<x,y> = Qp(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 511 µs per loop
sage: R.<x> = Qp(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 1.06 ms per loop
With the patches, I get
sage: R.<x> = ZZ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 8.97 µs per loop
sage: R.<x> = QQ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 8.3 µs per loop
sage: R.<x> = GF(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 70.3 µs per loop
sage: R.<x> = QQ['t'][]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 82.6 µs per loop
sage: R.<x,y> = ZZ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 12.6 µs per loop
sage: R.<x,y> = QQ[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 16.4 µs per loop
sage: R.<x,y> = GF(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 10.5 µs per loop
sage: R.<x,y> = QQ['t'][]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 187 µs per loop
sage: R.<x,y> = Qp(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 503 µs per loop
sage: R.<x> = Qp(3)[]
sage: timeit('(2*x-1)^2+5', number=10^4)
10000 loops, best of 3: 1.08 ms per loop
So, there is no significant slow down at all, but a considerable speed up in most cases.
I suppose it can now be reviewed. I understood that the Robert's patches essentially have a positive review, except for the slow-down. So, would it suffice if some of you test my patch?
Apply 9944-poly-cat.patch 9944-poly-cat-doctests.patch trac-9944-poly-cat-review.patch trac9944_polynomial_speedup.patch