Opened 9 years ago
Last modified 4 weeks ago
#8558 needs_work enhancement
add a fast gcd algorithm for univariate polynomials over absolute number fields
Reported by:  lftabera  Owned by:  AlexGhitza 

Priority:  major  Milestone:  sage8.8 
Component:  algebra  Keywords:  gcd, pari, ntl, number field, days94 
Cc:  slelievre  Merged in:  
Authors:  Luis Felipe Tabera Alonso  Reviewers:  Jeroen Demeyer 
Report Upstream:  N/A  Work issues:  
Branch:  u/lftabera/ticket/8558 (Commits)  Commit:  b894a30815175395f814e9493636d8ecada2ffd1 
Dependencies:  #14186, #15803, #15804  Stopgaps: 
Description (last modified by )
Question arised here,
univariate gcd is performed using euclidean algorithm, which causes explosion of coefficients and is slow but for trivial examples.
For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.
Attachments (6)
Change History (72)
comment:1 Changed 9 years ago by
 Status changed from new to needs_work
Changed 9 years ago by
comment:2 Changed 9 years ago by
 Keywords ntl added
 Summary changed from Make gcd use pari for univariate polynomial rings over number fields to [with patch, needs work] add a fast gcd algorithm for univariate polynomials over absolute number fields
comment:3 Changed 9 years ago by
 Status changed from needs_work to needs_review
 Summary changed from [with patch, needs work] add a fast gcd algorithm for univariate polynomials over absolute number fields to add a fast gcd algorithm for univariate polynomials over absolute number fields
Here there is a cleaner patch.
The patch adds a new class of univariate polynomials whose ground field is an absolute number field. There is a new _gcd method for this class. It actually implements several approaches to the modular gcd algorithm:
LangemyrMcCallum? algorithm Encarnacion a mixture of the two previous (default)
moreover, a call to pari method and the old Euclidean method.
I think that there should be only one modular algorithm, but, as usual, there are cases in which any of them beats the others. So suggestions are welcome. It should probably be Encarnacion or the mixture of boths
some timmings for harder problems:
sage: N = NumberField(ZZ[x].random_element(15).monic(),'a') sage: K = N[x] sage: f = K.random_element(100) sage: g1 = K.random_element(100) sage: g2 = K.random_element(100) sage: g1 *= f sage: g2 *= f sage: %time _=gcd(g1,g2) CPU times: user 7.32 s, sys: 0.02 s, total: 7.34 s Wall time: 7.42 s
This example with the old implementation or even pari is unfeasible.
comment:4 Changed 9 years ago by
 Improved documentation
 Eliminated an improvement in a function that hos nothing to do with the ticket
comment:5 Changed 9 years ago by
Added the method also for number fields of degree 1 using Flint in this special case
Changed 9 years ago by
comment:6 Changed 9 years ago by
 Priority changed from minor to major
Now that #4000 has possitive review, I raise the priority of this one.
Updated patch to work in 4.5.3
Now, it works also for relative number fields, passing to an absolute representation and performing there the computations. This is not optimal but it is usable.
If the extension of the field is one, then the gcd of QQ[x] is used wich is faster.
Changed 9 years ago by
comment:7 Changed 8 years ago by
I can confirm that the patch applies fine to 4.6.rc0 and all tests (including long) pass.
There are some minor problems with the documentation (just punctuation)  now I am studying the code, since this looks like very useful functionality!
comment:8 Changed 8 years ago by
Some clean in the documentation, reorder of the methods (methods after the classes of the file) and a couple of small bugs (honor the method option for relative fields and duplicated code for degree 1 extensions). Rename correctly the patch.
Apply: trac8558.patch
Try new buildbot :)
comment:9 Changed 8 years ago by
Apply trac8558.patch
(I think such messages have to come *after* the patch is uploaded)
comment:10 Changed 8 years ago by
 Description modified (diff)
comment:11 Changed 8 years ago by
 Status changed from needs_review to needs_info
Hi, I don't know much about fast computation in polynomial rings over number fields. However, I'm wondering why you have this bias towards functions as opposed to methods? For example, lift_pm
looks like it should be a method on p
instead of a function? Perhaps the same applies to the gcd functions? Or am I missing something?
comment:12 Changed 8 years ago by
No bias intended. lift_pm can be a method and probably in more classes than polynomials in NTL.
I am reading again the code and it looks awful now. I think there are too many gcd alternatives with little gainamong them.
I will update the code.
comment:13 Changed 8 years ago by
 Status changed from needs_info to needs_review
Now the functions are methods.
lift_to_poly_ZZ and lift_to_poly_QQ are methods of ntl_ZZ_pEX
lift_pm is a method of ntl_ZZ_p
Still, I have left a lift_pm function in sage.rings.arith I think it is good for educational purposes in modular algorithms.
I have eliminated pure Encarnacion and pure Langemyr_McCallum methods. There is no real benefit compared to the extra code they add.
comment:15 Changed 8 years ago by
rebase against 4.7.1
comment:16 Changed 8 years ago by
Apply: trac8558.2.patch
comment:17 Changed 8 years ago by
 Description modified (diff)
comment:18 Changed 8 years ago by
Several comments:
 Why the name
lift_pm()
? In PARI, this function is calledcenterlift()
which is more understandable in my opinion. On the other hand, having the name begin withlift
is more friendly for TABcompletion, so why notlift_centered
or something?  A special class for polynomials over number fields looks like a good idea, but please put it in a separate file
polynomial_number_field.pyx
instead of adding topolynomial_element_generic.pyx
.  I have a hard time understand the text below. What is
c
? What isn
? Probably "reduction" is a better word than "project". And I also think the nameK
for a polynomial ring is badly chosen:def lift_to_poly_QQ(self,K): """ Compute a lift of poly to K. INPUT:  K: an univariate polynomial ring over an absolute number field QQ[a]. In order to make sense of this algorithm, the minimum polynomial of 'a' should project onto `c` modulo `m`. OUTPUT: A polynomial in QQ[a] such that its projection coefficientwise is poly. This algorithm uses rational reconstruction, so it may fail.
 Why
_gcd()
instead ofgcd()
?  Add some tests using the global
gcd()
function instead of always calling your_gcd()
method.  Thanks to #11904, the line
h1 = pari([coeff._pari_('y') for coeff in self.list()]).Polrev()
can be changed toh1 = pari(self)
 This looks very fishy:
#Experimental bound IMPROVE
 With
p = p.next_prime(False)
do you meanp = p.next_prime(proof=False)
which is much clearer?  I believe the keyword argument
algoritm=
is usually used instead ofmethod =
(note also the spacing!).
comment:19 Changed 8 years ago by
 Reviewers set to Jeroen Demeyer
 Status changed from needs_review to needs_work
comment:20 Changed 7 years ago by
rebase 4.7.2 still needs work
comment:21 Changed 6 years ago by
Back to business.
I have updated the patch with Jeroen's suggestions. Still I have to do something with
#Experimental bound IMPROVE p = ZZ(3+min(2**255, (max(map(abs,Bound.list())).n()**(0.4)).floor())).next_prime(proof=False)
p is the first prime to perform a modular gcd. If p=2, we will have to use a lot of primes and will be inefficient. On the other hand, if p is too big, we lose the advantages of modular gcd. So we have to give a sane, intermediate default.
The prime above is based on some experiments I made two years ago, the idea is to use a prime such that we will have to use few modular gcd, but limiting our stating prime to the interval [3, 2^255]
.
Ideas welcomed.
comment:22 Changed 6 years ago by
Apply trac8558.2.patch
comment:23 Changed 6 years ago by
 Dependencies set to #14186
comment:24 Changed 6 years ago by
 Status changed from needs_work to needs_review
Still, we have to look for a sensible default starting prime p, but the rest of the ticket can be reviewed.
Further improvements for other tickets and for big degree polynomials (different tickets) could be
 add a halfgcd algorithm for ntl_ZZ_p_EX class, ntl libraries do not provide one.
 Improve a lot the test of having a gcd, remainder algorithm is slow.
Apply: trac8558.2.patch
comment:25 Changed 6 years ago by
The patchbot wants to apply too many patches
Apply: trac8558.2.patch
comment:26 Changed 6 years ago by
Apply: trac8558.2.patch
comment:28 Changed 6 years ago by
Some quick comments:
_sig_on
and_sig_off
are very deprecated and should be changed tosig_on()
andsig_off()
(#10115). The change "somewhat" > "somewha" clearly is a mistake.
 Documentation and library formatting must be fixed.
 Could you replace
centerlift
bylift_centered
insage/rings/finite_rings/integer_mod.pyx
for consistency?  As I already mentioned, replace
method =
byalgorithm=
 Since
QQ
is a unique parent, replacebase_ring != QQ
bybase_ring is not QQ
.  Remove
# There has to be a better way to do this, self.change_ring()
does not work. I don't see any problem with that code.  It is better to use a
**kwds
argument forsparse
andimplementation
indef __init__(self, base_ring, name="x", sparse=False, element_class=None, implementation=None)
comment:29 Changed 6 years ago by
 Status changed from needs_review to needs_work
comment:30 Changed 6 years ago by
 Status changed from needs_work to needs_review
I have taken into acoount the suggestions. I have deprecated integer_mod.centerlift and have eliminated the PolynomialRing? new classes, since they only differed from PolynomialRing_field in the element_class and ignored the sparse option.
comment:31 Changed 6 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:32 Changed 5 years ago by
Apply: trac8558.2.2.patch
comment:33 Changed 5 years ago by
 Branch set to u/lftabera/ticket/8558
 Commit set to 85ff4b9e6b8db9019eb7f4720851878966eecf77
comment:34 Changed 5 years ago by
Certainly looks in much better shape than it used to be, but I'm afraid I don't know enough NTL to completely review this. Why not split up this ticket in two? The first which adds the new classes and implements gcd
via PARI and the second which implements the modular algorithm.
comment:35 Changed 5 years ago by
 Description modified (diff)
comment:36 Changed 5 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:37 Changed 5 years ago by
 Commit changed from 85ff4b9e6b8db9019eb7f4720851878966eecf77 to e29c687b79482a5fadca00b625dc1ffbeeee7f9e
Branch pushed to git repo; I updated commit sha1. New commits:
ddb7291  Add lift_centered method to more classes, deprecate centerlift

1adefd5  Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558

4cc4359  Create the classes of univariate polynomials over number fields

bb677fc  Add gcd method to polynomial_number_field using pari

e29c687  Merge branch 'u/lftabera/gcd_number_field_pari' into u/lftabera/ticket/8558

comment:38 Changed 5 years ago by
 Dependencies changed from #14186 to #14186, #15803, #15804
I have split the ticket into three tickets for easier reviewer and to allow partial merging.
comment:39 Changed 5 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:40 Changed 5 years ago by
 Status changed from needs_review to needs_work
 Work issues set to does not merge with 6.2
comment:41 Changed 5 years ago by
 Commit changed from e29c687b79482a5fadca00b625dc1ffbeeee7f9e to f57b6514c42c431a3849d28703b6fce105f25a5d
Branch pushed to git repo; I updated commit sha1. New commits:
e6bd63b  Deprecate centerlift method

169cb08  Make global lift_centered function fallback to lift if the object

868e571  Catch exceptions instead of using hassattr.

1266aaa  Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558

341a8f7  Use proper constructions for the library. Minor fixes in other parts of the code.

3ff4ae8  Merge branch 'u/lftabera/gcd_number_field_pari' into u/lftabera/ticket/8558

f39d0ab  Merge branch 'master' into u/lftabera/ticket/8558

f57b651  fix import of lift_centered

comment:42 Changed 5 years ago by
 Work issues does not merge with 6.2 deleted
(I'm not sure if the merge conflict was caused by this ticket or one of its dependencies; I just reported what the patchbot told me.)
I don't really have time to review any of this at the moment, but I agree with earlier reviewers that this looks like a very nice addition. I was just wondering whether it would make sense to do this via a NumberField._gcd_univariate_polynomial()
method as introduced by #13442. On the one hand, this would mean you didn't have to introduce new classes (potentially many of them: we have absolute and relative number fields, quadratic fields, cyclotomic fields, sparse and dense polynomials, who knows what other distinctions we'll have in the future). On the other hand, the code of this ticket has already been written and #13442 is still at needs_work
, so it probably isn't worth the effort for now.
comment:43 Changed 5 years ago by
Thanks anyway, this made me update my local branches :)
The code is ok, but the branch of patches is a little mess due to my poor "gitfu". As of now I think that it is more important to get the dependencies merged, they are much simpler and would allow to prepare cleaner patches for this ticket.
Concerning #13442, I had in mind an implementation of these rings as in ticket #10591, so it would be better to have different classes for polynomials over number fields. The long term goal would be to have nice multiple extensions without the need of computing a primitive element for basic arithmetic, IMO it would be faster than using singular (cf. #9541).
comment:44 Changed 5 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:45 Changed 4 years ago by
 Commit changed from f57b6514c42c431a3849d28703b6fce105f25a5d to add0e6314fc91175996bf4390a858dc5476b0539
Branch pushed to git repo; I updated commit sha1. New commits:
775f2c7  Merge tag '6.3' into u/lftabera/ticket/8558

a2622a9  Merge tag '6.4' into u/lftabera/ticket/8558

9da4eab  Merge branch 'master' into u/lftabera/lift_centered

406dca9  Fix import error

e81e7a0  Merge branch 'master' into u/lftabera/lift_centered

07f725e  Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558

add0e63  Fix an exception that has changed.

comment:46 Changed 4 years ago by
 Commit changed from add0e6314fc91175996bf4390a858dc5476b0539 to c89c73444d62d8f88b8cd0f6c0913b609fbac00e
Branch pushed to git repo; I updated commit sha1. New commits:
2f30718  Merge tag '6.5' into u/lftabera/lift_centered

fff5565  Merge tag '6.6' into u/lftabera/lift_centered

7a5e891  Merge tag '6.7' into u/lftabera/lift_centered

3dd232b  Ticket 15804

c89c734  Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558

comment:47 Changed 4 years ago by
 Status changed from needs_work to needs_review
comment:48 Changed 3 years ago by
 Status changed from needs_review to needs_work
After so much time, I want to check if the code needs refactoring.
comment:49 Changed 3 years ago by
 Commit changed from c89c73444d62d8f88b8cd0f6c0913b609fbac00e to 43d1c1237ca102a53e319045ac8e116c0e6cf268
comment:50 Changed 3 years ago by
 Status changed from needs_work to needs_review
It works again with latest beta
comment:51 Changed 3 years ago by
.. NOTE:
>.. NOTE::
 In the docstring of "lift_to_poly_QQ" it is written that "it may fail with a
ValueError?
exception.". But there is only an example which gives an "ArithmeticError?". So either fix the note or add a doctest.
 you should do more typing in Cython file (with
cdef
). For example,cdef list lifted = []
. It will help cython to speed up loops including the loops. And if I am not mistaken, this syntaxfor 0 <= i < r
is equivalent tofor i in range(r)
which is more Python friendly.
 the name
lift_to_poly_QQ
looks strange to me. What does theQQ
stands for?
 Could you provide more examples why the situation is it better with this new algorithm?
comment:52 Changed 3 years ago by
 Commit changed from 43d1c1237ca102a53e319045ac8e116c0e6cf268 to eef9fe12ac4578b08fbff4b8ca5264ad5e20a8b9
comment:53 Changed 3 years ago by
 I have added mor cdef, however, I cannot do that in polynomial_number_field since the class inherist from a python class. I was not able to cdef, rings or number field elements. The latter has at least two classes and I was not able to cdef the common parent.
 I have fixed types, more verbose method names, clean unusued variables.
 modular method can be further improved for high degree using halfgcd algorithms in ntl.
 Some more examples:
Pari seems to use either Euclid or some subresultant variation. I would expect pari to perform better when the gcd is big with respect to the degree of the inputs. Or, when the input has small degree.
The polynomials are generated with K.random_element() so small coefficients. This benefits pari.
Small extension, not impressing except for huge polynomials
sage: K=QQ[I]['x']
deg f, deg g, deg gcd, timeit pari, timeit modular 2, 2, 0, 111 µs, 176 µs 2, 2, 1, 173 µs, 360 µs 2, 2, 2, 185 µs, 466 µs 100, 100, 0, 30.6 ms, 7.06 ms 100, 100, 50, 22.5 ms, 34.7 ms 100, 100, 100, 4.82 ms, 6.58 ms 300, 300, 0, 1.8 s, 45.1 ms 300, 300, 150, 534 ms, 201 ms 300, 300, 300, 13.3 ms, 11.3 ms 1000, 1000, 100, 2min 52s, 2.07 s
A degree 3, easy extension
sage: R=NumberField(x^32,'a')['x']
deg f, deg g, deg gcd, timeit pari, timeit modular 2, 2, 0, 163 µs, 243 µs 2, 2, 1, 239 µs, 597 µs 2, 2, 2, 257 µs, 587 µs 100, 100, 0, 19 s, 11.2 ms (1000x faster) 100, 100, 50, 8.6 s, 47.3 ms (100x faster) 100, 100, 100, 6.57 ms, 11.9 ms (2x slower) 300, 300, 150, > 600 s, 403 ms
Tu support my claim that big coefficients benefits modular
sage: K=QQ[I]['x'] sage: f=K.random_element(2,10**10)
100, 100, 0, 282 ms, 6.92 ms 100, 100, 50, 117 ms, 15.4 ms 100, 100, 100, 4.69 ms, 5.01 ms
comment:54 Changed 3 years ago by
 Status changed from needs_review to needs_work
lift methods still need better documentation according to #18
comment:55 Changed 3 years ago by
If I were a pari developer reasing this, I would ask: why not implement the algorithm in pari itself, rather than in Sage?
comment:56 Changed 3 years ago by
I am not a pari developer, it would certainly benefit more people than an implementation in Sage. I thought about an implementation in ntl, but I do not touch C++ since 15 years ago. Flint would also be a good place to put a similar code.
According to the documentation:
gcd uses:
 integers: use modified rightshift binary ("plusminus" variant).
 univariate polynomials with coefficients in the same number field (in particular rational): use modular gcd algorithm.
 general polynomials: use the subresultant algorithm if coefficient explosion is likely (non modular coefficients).
So, it may be the case that we are not translating polynomials to the correct pari setting and so I get bad timings? It may be the case, in order not to compute the discriminant of the number field we are dealing with...
comment:57 Changed 3 years ago by
One thing to watch when comparing timings with pari is that when Sage calls pari to compute class number s and units in a number field, unless Proof=False it assumes no hypothesis (like GRH) so is often a lot slower than the same thing done in pari, where the default is to assume everything (and you can ask for certification later). This should be as issue if you only do basic number field arithmetic, but if you do anything which requires knowing the units or class group it can be quite significant. You might want to try setting Proof=False in Sage and seeing if your timings change.
comment:58 Changed 3 years ago by
It it is not that, I only need basic arithmetic, at most, pari could be interested in the discriminant, but I am working in QQ[I] in my examples. Which should be easy.
I have tried to write polynomials in pure gp as
f= Mod(3*y,y^21)*x^2+Mod(1,y^21)*x+Mod(y,y^21)
or as
w=quadgen(4) f= 3*w*x^2+1*x+w
in both cases I get bad timing. Reading the documentation now to see how to deal with polynomials with number field coefficients in pari...
comment:59 Changed 3 years ago by
 Commit changed from eef9fe12ac4578b08fbff4b8ca5264ad5e20a8b9 to 5cfbce8851be4af6fceba9a37599b1150135540e
comment:60 Changed 3 years ago by
 Status changed from needs_work to needs_review
It looks that Pari is not really using a modular algorithm...
In any case, I have fixed the documentation, I have also fixed some heuristic assumptions about the primes. Do not use proof=False and be more conservative about the size of the primes. In principle, with proof=False we could try a composite number such that the gcd in the residue ring success but some prime factors are good and some are bad. Not sure if this is possible. Also, chinese remainder could potentially fail, in this case I think that can only happens if the gcd is already unfeasible by any method. But let's be conservative and use only failproof methods.
comment:61 Changed 14 months ago by
 Commit changed from 5cfbce8851be4af6fceba9a37599b1150135540e to 02639b776efa6fc3511657a7ec0a5655abfd63d9
comment:62 Changed 10 months ago by
 Keywords days94 added
comment:63 Changed 10 months ago by
 Commit changed from 02639b776efa6fc3511657a7ec0a5655abfd63d9 to b894a30815175395f814e9493636d8ecada2ffd1
comment:64 Changed 10 months ago by
 Status changed from needs_review to needs_work
This code uses ntl ZZ_pEX while it should use the faster lzz_pEX for wordsized primes.
comment:65 Changed 3 months ago by
 Cc slelievre added
 Milestone changed from sage6.4 to sage8.7
comment:66 Changed 4 weeks ago by
 Milestone changed from sage8.7 to sage8.8
Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)
The pari algorithm is not fast enough. I have implemented a modular algorithm using ntl.
The patch needs doctest and integration in sage (it is, right now, a separate function).
It adds a new function modular_gcd
To test it, apply the patch and
from sage.rings.polynomial.polynomial_absolute_number_field import modular_gcd
Some timmings: (800 Mhz i386 linux, 1Gb ram)