Opened 3 years ago

Closed 3 years ago

#19822 closed enhancement (fixed)

Fast polynomial evaluation fmpz_poly/ZZX with mpfr/mpfi input

Reported by: vdelecroix Owned by:
Priority: major Milestone: sage-7.1
Component: algebra Keywords:
Cc: Merged in:
Authors: Vincent Delecroix Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: 232a575 (Commits) Commit: 232a575c4933fc2e02d61fda511dca65b7a101e2
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

We implement functions that allow fast evaluations of integer polynomials with real input:

  • fmpz_poly_evaluation_mpfr(mpfr_t res, const fmpz_poly_t poly, const mpfr_t a)
  • fmpz_poly_evaluation_mpfi(mpfi_t res, const fmpz_poly_t poly, const mpfi_t a)
  • ZZX_evaluation_mpfr(mpfr_t res, ZZX_c poly, const mpfr_t a)
  • ZZX_evaluation_mpfi(mpfi_t res, ZZX_c poly, const mpfi_t a)

These functions are integrated in the polynomial code within the methods _eval_mpfr_ and _eval_mpfi_.

Follow up: #19841, #19824

See also the following discussion for upstream integrations:

And an algorithm for accuracte computations:

Attachments (10)

degree_20_64.png (45.2 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 precision=64
degree_20_128.png (46.1 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 precision=128
degree_10_64.png (45.1 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=10 precision=64
degree_10_128.png (46.5 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=10 precision=128
poly_20_1.png (54.7 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 nb_terms=1
poly_20_2.png (58.9 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 nb_terms=2
poly_20_5.png (57.9 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 nb_terms=5
poly_20_13.png (57.9 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 nb_terms=13
poly_20_21.png (63.2 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=20 nb_terms=21
degree_20_256.png (45.2 KB) - added by vdelecroix 3 years ago.
Horner evaluations degree=10 precision=256

Download all attachments as: .zip

Change History (61)

comment:1 Changed 3 years ago by vdelecroix

  • Branch set to u/vdelecroix/19822
  • Commit set to 683cd0de16503dc913750fd7c417129409b010c6
  • Status changed from new to needs_review

New commits:

e2287a6Trac 19822: polynomial evaluation
ea54d91Trac 19822: call from python
683cd0dTrac 19822: faster number field comparisons

comment:2 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:3 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to needs_work

Some quick comments:

  1. only do -> only does
  1. What's the point of the mpfr_rnd_t rnd parameter, given that you don't make any guarantees about exact rounding or rounding direction? Given the code, the only sensible rounding mode is MPFR_RNDN.
  1. Did you investigate special-casing small degrees (say, degree <= 1)?
  1. Both fmpz_poly_degree(poly) and ZZX_deg return long, so the loop variable must be long too.

comment:4 Changed 3 years ago by jdemeyer

  1. Please move the changes to src/sage/rings/number_field/number_field_element.pyx to a new ticket such that we can focus on the actual issue from the ticket here.

comment:5 Changed 3 years ago by git

  • Commit changed from 683cd0de16503dc913750fd7c417129409b010c6 to 131cd594f7a5b4977040b9f681dc8b02197e4247

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

131cd59Trac 19822: review 1

comment:6 follow-up: Changed 3 years ago by vdelecroix

  • Description modified (diff)
  • Status changed from needs_work to needs_review

For 2. I just forward the rounding mode to the mpfr functions. Having exact rounding for polynomials might actually be tricky... and also useful. The main reason for having this option is to handle simply polynomial evaluations with different RealField input.

For 3. I did not consider it on purpose. Having special case for degree 0,1 will slow down the code of higher degree cases which are indeed the interesting ones.

comment:7 Changed 3 years ago by vdelecroix

To be more complient with flint calling convention I will modify the names of the function eval -> evaluation.

comment:8 Changed 3 years ago by git

  • Commit changed from 131cd594f7a5b4977040b9f681dc8b02197e4247 to ad0721694614fe3546cdace8b3ec27e415539b74

Branch pushed to git repo; I updated commit sha1. New commits:

ad07216Trac 19822: change method names + warning in doc

comment:9 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:10 in reply to: ↑ 6 Changed 3 years ago by jdemeyer

Replying to vdelecroix:

For 2. I just forward the rounding mode to the mpfr functions.

I know that but it doesn't make sense to do that!

What would be the point of doing every single operation with MPFR_RNDU? That doesn't guarantee anything about the rounding of the result (if I take a number obtained with MPFR_RNDU and multiply it with -1, it effectively becomes rounded with MPFR_RNDD). Since polynomials use a mixture of additions and multiplications, it becomes difficult to avoid this problem.

comment:11 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to needs_work

Tiny detail: please keep the alphabetic order in src/module_list.py.

comment:12 Changed 3 years ago by jdemeyer

Typo: Hornder -> Horner.

comment:13 Changed 3 years ago by jdemeyer

Avoid sage/libs/ntl/decl.pxi: just cimport what you need from the .pxd files in NTL.

comment:14 follow-up: Changed 3 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

For 2. I just forward the rounding mode to the mpfr functions.

I know that but it doesn't make sense to do that!

I see and you are definitely right. Then what should with p(x) when p is an integer polynomial and x belongs to RealField(prec, rnd=MPFR_RNDU)?

comment:15 in reply to: ↑ 14 Changed 3 years ago by jdemeyer

Just use MPFR_RNDN for polynomial evaluation in all cases, it's the only sensible thing if you cannot guarantee anything about the rounding.

comment:16 Changed 3 years ago by git

  • Commit changed from ad0721694614fe3546cdace8b3ec27e415539b74 to 2453496c0c37e647e8a052c576dc975ff91e4ac0

Branch pushed to git repo; I updated commit sha1. New commits:

6693fa9Trac 19822: only use MPFR_RNDN
2453496Trac 19822: review 2

comment:17 Changed 3 years ago by vdelecroix

  • Description modified (diff)
  • Status changed from needs_work to needs_review

comment:18 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:19 Changed 3 years ago by vdelecroix

remark: in flint they have two polynomial evaluation algorithms: Horner and divide_and_conquer and they hardcoded the threshold to 50.

comment:20 Changed 3 years ago by git

  • Commit changed from 2453496c0c37e647e8a052c576dc975ff91e4ac0 to 82f121bfac062934185f741610c7f97ddd504882

Branch pushed to git repo; I updated commit sha1. New commits:

82f121bTrac 19822: add doctest for corner cases

comment:21 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:22 Changed 3 years ago by git

  • Commit changed from 82f121bfac062934185f741610c7f97ddd504882 to 5562a78b592125e698021743d545f176cc0604bb

Branch pushed to git repo; I updated commit sha1. New commits:

5562a78Trac 19822: remove two Cython directives that belong to #19841

comment:23 Changed 3 years ago by vdelecroix

ready for review again.

comment:24 Changed 3 years ago by vdelecroix

ping?

comment:25 Changed 3 years ago by jdemeyer

The code looks good but I need to benchmark it.

comment:26 Changed 3 years ago by vdelecroix

You can also command it if you have precise requests ;-)

BTW, I added special cases for degree 0 and 1 and I have seen no difference.

comment:27 follow-up: Changed 3 years ago by jdemeyer

Benchmarks indicate that the bound degree < 10 makes no sense at all. The ratio in speed between the old and the new code does not depend on the degree at all.

Something which is strange: it looks like the precision of the real number/interval matters: for small precisions, the new code is faster but for large precisions (> 50000 bits) the old code is faster. This makes no sense to me! I would guess that both implementations should scale the same way if the precision is increased. Any clues?

Last edited 3 years ago by jdemeyer (previous) (diff)

comment:28 in reply to: ↑ 27 ; follow-up: Changed 3 years ago by vdelecroix

Replying to jdemeyer:

Benchmarks indicate that the bound degree < 10 makes no sense at all. The ratio in speed between the old and the new code does not depend on the degree at all.

Something which is strange: it looks like the precision of the real number/interval matters: for small precisions, the new code is faster but for large precisions (> 50000 bits) the old code is faster. This makes no sense to me! I would guess that both implementations should scale the same way if the precision is increased. Any clues?

What kind of polynomials did you use? The old code tries to minimize the number of operation in evaluation. For example (x+1)^3 is infinitely smarter than 1 + x(3 + x(3 + x)).

comment:29 in reply to: ↑ 28 Changed 3 years ago by jdemeyer

Replying to vdelecroix:

What kind of polynomials did you use?

R.<x> = ZZ[]
pol = R.random_element(deg)

The old code tries to minimize the number of operation in evaluation.

I see. I guess that those "random" polynomials are somewhat sparse, so less operations are needed. I will try again with very dense polynomials.

comment:30 Changed 3 years ago by jdemeyer

  • Reviewers set to Jeroen Demeyer
  • Status changed from needs_review to needs_work

With dense polynomials and large precision, the timings of the old and new code are essentially the same. That's expected since the MPFR/MPFI operations dominate the runtime.

Can you remove the "degree < 10" check?

comment:31 follow-up: Changed 3 years ago by jdemeyer

On second thought, what if the polynomial is x^1000? Then we will lose a lot with your code.

comment:32 in reply to: ↑ 31 Changed 3 years ago by vdelecroix

Replying to jdemeyer:

On second thought, what if the polynomial is x^1000? Then we will lose a lot with your code.

Indeed. It is possible to implement a less stupid Horner with grouping. In other words x^6 + 3x^2 + 1 would be 1 + x^2(3 + x^4). There would be a small overhead for dense polynomials.

comment:33 Changed 3 years ago by jdemeyer

Or just skip the addition if the term is 0. It would be nice to benchmark that.

comment:34 Changed 3 years ago by vdelecroix

And there is no mpfi_pow nor mpfi_pow_ui!!

comment:35 Changed 3 years ago by vdelecroix

mpfr_pow is *more* reliable than trial multiplication! Hence we should use it

sage: a = RR(1.1)
sage: a*a*a*a == a**4    # understandable
False
sage: b = RealField(256)(1.1)
sage: ans=b*b*b*b
sage: (ans-a**4)
-4.44089209850063e-16
sage: (ans-a*a*a*a)
-6.66133814775094e-16

(EDIT: I was completely wrong in my first version)

Last edited 3 years ago by vdelecroix (previous) (diff)

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 precision=64

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 precision=128

Changed 3 years ago by vdelecroix

Horner evaluations degree=10 precision=64

Changed 3 years ago by vdelecroix

Horner evaluations degree=10 precision=128

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 nb_terms=1

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 nb_terms=2

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 nb_terms=5

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 nb_terms=13

Changed 3 years ago by vdelecroix

Horner evaluations degree=20 nb_terms=21

Changed 3 years ago by vdelecroix

Horner evaluations degree=10 precision=256

comment:36 follow-up: Changed 3 years ago by vdelecroix

I attached files which compare the various possibilities for Horner scheme:

  • eval1: the current version which is straightforward Horner expansion (red)
  • eval2: the same but skip addition if the term is 0 (green)
  • eval3: use mpfr_pow (blue)

There are files degree_{deg}_{prec}.png for fixed degrees where the number of terms vary and poly_{deg}_{nb_terms}.png where the percision vary.

Horner evaluations degree=20 precision=64 Horner evaluations degree=10 precision=256

Horner evaluations degree=20 nb_terms=2 Horner evaluations degree=20 nb_terms=13

eval2 seems superior over eval1. But eval3 seems only interesting for very sparse polynomials or very large precision. Indeed, for half dense polynomials (degree=20, nb_terms=13), eval3 gets better than the other two around prec=28

comment:37 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:38 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:39 in reply to: ↑ 36 Changed 3 years ago by jdemeyer

Do you have code for eval2? I think that might be the best in the end, it is sort of a compromise.

comment:40 follow-up: Changed 3 years ago by git

  • Commit changed from 5562a78b592125e698021743d545f176cc0604bb to 6970134df31c259484fd3d5fda0a2e6516a7db70

Branch pushed to git repo; I updated commit sha1. New commits:

97c15bfmerge u/vdelecroix/19822 in Sage-7.0.rc1
6970134Trac 19822: ignore zero coefficients + copyright

comment:41 Changed 3 years ago by git

  • Commit changed from 6970134df31c259484fd3d5fda0a2e6516a7db70 to 568f0d4c39d64f00a89b1a4293346b93317fcd9b

Branch pushed to git repo; I updated commit sha1. New commits:

568f0d4Trac 19822: remove condition on degree for evaluation

comment:42 Changed 3 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:43 in reply to: ↑ 40 Changed 3 years ago by jdemeyer

Replying to git:

6970134Trac 19822: ignore zero coefficients + copyright

Could you at least use the standard copyright template instead?

comment:44 Changed 3 years ago by jdemeyer

In doctests, use the x^n notation instead of x**n.

comment:45 Changed 3 years ago by jdemeyer

  • Milestone changed from sage-7.0 to sage-7.1
  • Status changed from needs_review to needs_work

comment:46 Changed 3 years ago by jdemeyer

Remove the mention of "complex" values:

This file provides fast evaluation of integer polynomials with a real
or complex value

comment:47 Changed 3 years ago by jdemeyer

The doctests with the various rounding modes are obsolete, since the rounding mode isn't used. I suggest to remove those tests.

comment:48 Changed 3 years ago by git

  • Commit changed from 568f0d4c39d64f00a89b1a4293346b93317fcd9b to 232a575c4933fc2e02d61fda511dca65b7a101e2

Branch pushed to git repo; I updated commit sha1. New commits:

1be1b6eTrac 19822: use the standard copyright template
11ae972Trac 19822: ** -> ^ in doctests
8d4c985Trac 19822: remove "complex value"
232a575Trac 19822: remove rounding mode doctests

comment:49 Changed 3 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:50 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:51 Changed 3 years ago by vbraun

  • Branch changed from u/vdelecroix/19822 to 232a575c4933fc2e02d61fda511dca65b7a101e2
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.