Opened 8 years ago
Closed 8 years ago
#16964 closed defect (fixed)
Speed up comparisons in QQbar
Reported by:  Martin von Gagern  Owned by:  

Priority:  critical  Milestone:  sage6.5 
Component:  number fields  Keywords:  variety qqbar cmp singular 
Cc:  Merged in:  
Authors:  Martin von Gagern  Reviewers:  Vincent Delecroix 
Report Upstream:  N/A  Work issues:  
Branch:  3f4afef (Commits, GitHub, GitLab)  Commit:  3f4afef46ab6a042cb2678394031cb5c26d89b1a 
Dependencies:  Stopgaps: 
Description (last modified by )
When computing the variety of some ideal, then excessive amounts of time are apparently spent sorting the solutions. (In one example, the variety was essentially computed in 7½ hours but the sorting wasn't finished even 1½ hours after that.) This is due to the fact that comparison between complex algebraic numbers is done lexicographically, which means that quite often the real parts of complex conjugate numbers will have to be compared for equality, which can be a very costly operation.
Originally the computation even resulted in a Pari exception (“not enough precomputed primes”). This no longer occurs (since the pari upgrade from #15767). So the focus of this ticket is now the excessive amount of time required for comparisons, even without an exception.
Change History (33)
comment:2 Changed 8 years ago by
Replying to jdemeyer:
Please mention the version of Sage, in particular whether or not #15767 was applied.
This was sage 6.3 on Gentoo. With current develop (6.4.beta2) at least it doesn't fail as “quickly”: where the previous computation was 8 hours all in all up to the exception, my patched version without sorting took 7½ hours to compute the list, and after that sage has now been busy sorting that list for 1½ hours and still isn't done. I'll report back when it is, or when it gave up, but I'd say some need for action remains, even if the exception no longer occurs in this form.
comment:3 Changed 8 years ago by
Here is a reproducing example which at least demonstrates that comparisons take way longer than they should:
sage: r = QQ[x](69721504*x^8 + 251777664*x^6 + 329532012*x^4 + 184429548*x^2 + 37344321).roots(QQbar, False) sage: r [0.0221204634374360?  1.090991904211621?*I, 0.0221204634374360? + 1.090991904211621?*I, 0.8088604911480535?*I, 0.7598602580415435?*I, 0.7598602580415435?*I, 0.8088604911480535?*I, 0.0221204634374360?  1.090991904211621?*I, 0.0221204634374360? + 1.090991904211621?*I] sage: [r[0], r[1]].sort()
This is because the comparison of the real parts takes like forever. Which in turn is because the computation of its minimal polynomial takes forever.
Looking at the set of all zeros, I can see that there are 4 clearly distinct real parts, and each comes with a pair of conjugate solutions since the polynomial has real coefficients. This is enough to conclude that if the intervals for two real parts overlap, then they must be equal and I don't have to do an exact computation for this. Should we try to implement some of this reasoning as a special case for AlgebraicNumber.__cmp__
, for the case where the descriptor is exact and the minpoly is the same?
comment:4 Changed 8 years ago by
Summary:  MPolynomialIdeal_singular_repr.variety: sage.libs.pari.gen.PariError: not enough precomputed primes → Fix comparisons in QQbar 

comment:5 Changed 8 years ago by
Looking for ways to address this, I noticed that the a lot of time apparently is spent inside
class ANBinaryExpr(ANDescr): ⋮ def exactify(self): ⋮ gen = left._exact_field().union(right._exact_field()) ⋮
I wonder whether we can avoid that union for the case where both fields have the same defining polynomial. I wonder whether we can assume that the root of the field will form a power basis, and if so, whether there is any reasonably cheap way to find the conversion between different power bases, so we could express one root in terms of another.
Since I don't have any good ideas how to achieve this, my best idea still is tacking this at the __cmp__
level, but if anyone has an idea for solving this more generic issue, that would be really great since it would help other computations as well. So I'm sharing my thoughts.
Here is what I've tried and discarded, so you can avoid that same cul de sac. I started by writing down a generic linear combination w = a₀ + a₁z + a₂z² + … and then computed p(w) reduced by p(z), where p is the polynomial of the field. This gives a polynomial in z, and if enough powers of z are irrational then all the coefficients have to be zero if w is a root and the a_i are to be rational. So this gave me d conditions on these a₀ through a_{d1}, which I could combine into an ideal and try to compute a variety. But that variety computation takes like forever in the above example, so this generic approach of finding other roots is not feasible in this fashion. Been there, tried that and discarded it.
comment:6 Changed 8 years ago by
Component:  algebraic geometry → number fields 

Priority:  major → critical 
Summary:  Fix comparisons in QQbar → Speed up comparisons in QQbar 
comment:7 Changed 8 years ago by
Description:  modified (diff) 

Milestone:  sage6.4 → sage6.5 
comment:8 Changed 8 years ago by
Branch:  → u/gagern/ticket/16964 

Created:  Sep 11, 2014, 6:23:53 AM → Sep 11, 2014, 6:23:53 AM 
Modified:  Dec 13, 2014, 10:20:12 PM → Dec 13, 2014, 10:20:12 PM 
comment:9 Changed 8 years ago by
Authors:  → Martin von Gagern 

Commit:  → 9f6fdc209d62e055df4bdd38f9f946332fbf3842 
Status:  new → needs_review 
Here is my first stab at the approach outlined in comment:3. If you have a nicer way to handle things, feel free to suggest an alternative. If you think that my special case should not call minpoly()
but instead examine whether the descriptor is ANRoot
with matching polynomial, please state your reason for this argument. Likewise, if you think I should also handle more than one real value or conjugate pair for a given real component, then I'd love to hear how you'd implement this without making the code too hard to read and maintain.
As it is, I consider my code not particularly beautiful, and perhaps not the best solution either, but it is way better than the current state of affairs, and since we have other things depending on this, e.g. #14239, I'd like to see this merged pretty soon and possible improvements dealt with in a later ticket.
New commits:
9f6fdc2  Faster qqbar comparison for case with equal minpoly.

comment:10 Changed 8 years ago by
Commit:  9f6fdc209d62e055df4bdd38f9f946332fbf3842 → 2c4ac1be555a9c81dee84acfc7d7c2cb2602db22 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
2c4ac1b  Faster qqbar comparison for case with equal minpoly.

comment:11 followup: 13 Changed 8 years ago by
Just a quick comment: I like your general approach, but I have to check the details...
comment:12 Changed 8 years ago by
Commit:  2c4ac1be555a9c81dee84acfc7d7c2cb2602db22 → e7acd96040fb43eebed0e3df0f1b1c7613d49bc1 

Branch pushed to git repo; I updated commit sha1. New commits:
e7acd96  Fix term in documentation.

comment:13 Changed 8 years ago by
Description:  modified (diff) 

Changing description to turn focus from pari exception to performance.
Replying to jdemeyer:
I like your general approach, but I have to check the details...
Have you found time for a closer look? Since this is currently the only ticket in the review queue with priority critical, perhaps we should try to get this into Sage 6.5?
comment:14 Changed 8 years ago by
Status:  needs_review → needs_info 

Hello,
This improvement is really cool!
EDIT: the thing below is basically one part of comment:9...
Couldn't we do a little bit better? In some cases we might identify numerically pairs of conjugated roots (and possibly one real root) and still be able to decide which one is which. There exists polynomials whose roots have the same real part:
sage: x = polygen(ZZ) sage: P = x^4  4*x^3 + 9*x^2  10*x + 5 sage: P.roots(CC,False) [1.00000000000000  1.61803398874989*I, 1.00000000000000  0.618033988749895*I, 1.00000000000000 + 0.618033988749895*I, 1.00000000000000 + 1.61803398874989*I]
With the implementation proposed in this ticket, the comparison of the roots of the above polynomial falls back to the generic implementation. We could at least compare the conjugated ones without exactification of the real part. But we can leave that for a further ticket.
Vincent
comment:15 Changed 8 years ago by
And you should also avoid this special case if self._value.complex()
and other._value.complex()
do not overlap. That would potentially save a call to minpoly
.
comment:16 Changed 8 years ago by
Branch:  u/gagern/ticket/16964 → u/vdelecroix/16964 

Commit:  e7acd96040fb43eebed0e3df0f1b1c7613d49bc1 → 1294bf20bb6fe915a5371318cb8f9feacfa3f86a 
Status:  needs_info → needs_review 
comment:17 Changed 8 years ago by
Commit:  1294bf20bb6fe915a5371318cb8f9feacfa3f86a → e6a5a00acfb25935788ee67fc2defe28060343be 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e6a5a00  trac #16964: review

comment:18 Changed 8 years ago by
Commit:  e6a5a00acfb25935788ee67fc2defe28060343be → 0bf7dba36d3935c23ac385de7c28cf6d85055cd5 

comment:19 Changed 8 years ago by
Commit:  0bf7dba36d3935c23ac385de7c28cf6d85055cd5 → 731ec907ee9cdc81bb06d647ebf0be26bd6ba117 

comment:20 Changed 8 years ago by
Sorry I messed up the commits! Now everything looks good and doctest are fine...
comment:21 Changed 8 years ago by
Commit:  731ec907ee9cdc81bb06d647ebf0be26bd6ba117 → 297c68ad67657dc0e4c04a362827934a2e8ed17c 

comment:22 Changed 8 years ago by
Branch:  u/vdelecroix/16964 → u/gagern/16964 

comment:23 followup: 24 Changed 8 years ago by
Commit:  297c68ad67657dc0e4c04a362827934a2e8ed17c → 16f62a2ea8e36bf3ac9f781677e404e01366a594 

I like how you avoid unneccessary minpoly computation. But I think we can do better than you did, by not having the intervals span negative and positive imaginary parts, but instead considering the absolute value of the imaginary part. I did something along these lines, but I now see that I'll have to rebase that on your latest forced push…
New commits:
731ec90  trac #16964: review

16f62a2  trac #16964: Use absolute value of imaginary part

comment:24 followup: 30 Changed 8 years ago by
Replying to gagern:
I like how you avoid unneccessary minpoly computation. But I think we can do better than you did, by not having the intervals span negative and positive imaginary parts, but instead considering the absolute value of the imaginary part. I did something along these lines, but I now see that I'll have to rebase that on your latest forced push…
sorry for that... Your version is much simpler by the way!
But I am not completely convinced that this is optimal. Do you know if there is a cheap way to assert if two roots of a given (irreducible) polynomial have the same real value? In the example I added in the commit 297c68a sorted(p2.roots(QQbar,False)
still takes hours.
Vincent
comment:26 Changed 8 years ago by
Would I be correct to assume that all this ii_minus
and ii_plus
handling in a96f3f1 is essentially equivalent to my use of absolute values in 16f62a2? If that is the case, then I'd rather drop a96f3f1 from the history, and instead rebase 297c68a onto 16f62a2. How does that sound to you?
comment:27 Changed 8 years ago by
Commit:  16f62a2ea8e36bf3ac9f781677e404e01366a594 → aaf3eb648e1feca6e27c6dd3ffc5fdd9cee98191 

Branch pushed to git repo; I updated commit sha1. New commits:
aaf3eb6  trac #16964: (review) add an example

comment:28 Changed 8 years ago by
Branch:  u/gagern/16964 → u/vdelecroix/16964 

Commit:  aaf3eb648e1feca6e27c6dd3ffc5fdd9cee98191 → 3f4afef46ab6a042cb2678394031cb5c26d89b1a 
Reviewers:  → Vincent Delecroix 
I choose a more meaningful test. Could you have a look?
As a consequence of your changes, we get great improvements on comparing rationals!!
new timings
sage: a = QQbar(1/3) sage: b = QQbar(1/2) sage: %timeit cmp(a,b) 1000000 loops, best of 3: 881 ns per loop
old timings
sage: a = QQbar(1/3) sage: b = QQbar(1/2) sage: %timeit cmp(a,b) 10000 loops, best of 3: 34.4 µs per loop
This is completely crazy.
If you are happy with my changes you can set to positive review.
Vincent
New commits:
3f4afef  trac #16964: better doctest

comment:29 Changed 8 years ago by
Status:  needs_review → positive_review 

I like it. Thanks for the review!
The speed gain for the rational numbers appear to be due to the fact that we no longer have to construct an algebraic number representation for the real part. a._value.real()
is a lot faster than a.real()
:
sage: %timeit a._value.real() 1000000 loops, best of 3: 201 ns per loop sage: %timeit a.real() 100000 loops, best of 3: 10.7 µs per loop
(I still hope that someone, some day, will make all this work here obsolete by coming up with a better way to compare QQbar elements even if they don't share a minpoly. No idea how, though.)
comment:30 followup: 31 Changed 8 years ago by
Replying to vdelecroix:
But I am not completely convinced that this is optimal. Do you know if there is a cheap way to assert if two roots of a given (irreducible) polynomial have the same real value?
Using resultants, you can find a polynomial which has (a  b) as a root. And then you use interval arithmetic to find which root. Determining whether the imaginary parts are equal then is equivalent to checking whether a real polynomial has a pure imaginary root. The latter can probably be done using some Sturmlike algorithm.
comment:31 followup: 32 Changed 8 years ago by
Replying to jdemeyer:
Using resultants, you can find a polynomial which has (a  b) as a root.
Up to now I had assumed that most arithmetic in QQbar would eventually be performed using resultants. But it seems I was mistaken.
sage: x = polygen(ZZ) sage: p1 = x^5 + 6*x^4  42*x^3  142*x^2 + 467*x + 422 sage: p2 = p1((x1)^2) sage: r1 = QQbar.polynomial_root(p2, CIF(1, (2.1, 2.2))) sage: r2 = QQbar.polynomial_root(p2, CIF(1, (2.8, 2.9))) sage: a,b = polygens(QQ, 'a,b') sage: %time p3 = r1.minpoly()(a + b).resultant(r2.minpoly()(b), b) CPU times: user 62 ms, sys: 0 ns, total: 62 ms Wall time: 68 ms sage: [r for f in p3.factor() ....: for r in f[0].univariate_polynomial().roots(QQbar, False) ....: if r._value.overlaps(r1._value  r2._value)] [0.7266314748516305?*I] sage: %time p4 = (r1  r2).minpoly()
One possible root of p3
is b=r2
and a+b=r1
which means a=r1r2
. So eliminating b
we get a (reducible, not minimal) polynomial in a
which has that difference as one of its roots, just as you indicated. I try to identify that by looking at the roots r
of the factors f
, checking whether they overlap the numeric interval. The single result I obtain has zero real part, thus indicating that we should sort by imaginary part.
I lost patience waiting for that final result of p4
, which should do pretty much the same thing in my opinion. My question is, why is it computed the way it is? Why do arithmetic operators for algebraic numbers compute some costly unions of number fields (which I believe is what they are doing), instead of using resultants to describe their results? And should we start some major rewrite effort to change that, i.e. to base most if not all arithmetic operations on resultants?
I can think of two possible problems. One is that we might be dealing with a special case here, and that perhaps number field unions are in general cheaper than resultants. If that were the case, what does that mean for speeding up comparisons? Another possible problem I can imagine is that the resultant could factor into several distinct polynomials, some of which might share a root. If that were the case, numeric refinement wouldn't be able to help choosing the right factor. Should we perhaps not factor the resultant polynomial, but instead compute roots for the fully expanded form?
I get the impression that this might trigger some major work. I hope that the reviewed changes can land as they are, while we investigate (in some other branch) how to tackle this more generic approach.
comment:32 Changed 8 years ago by
Replying to gagern:
Why do arithmetic operators for algebraic numbers compute some costly unions of number fields (which I believe is what they are doing), instead of using resultants to describe their results? And should we start some major rewrite effort to change that, i.e. to base most if not all arithmetic operations on resultants?
I get the impression that this might trigger some major work. I hope that the reviewed changes can land as they are, while we investigate (in some other branch) how to tackle this more generic approach.
Just created #17886 about using resultants to speed up most qqbar operations.
comment:33 Changed 8 years ago by
Branch:  u/vdelecroix/16964 → 3f4afef46ab6a042cb2678394031cb5c26d89b1a 

Resolution:  → fixed 
Status:  positive_review → closed 
Please mention the version of Sage, in particular whether or not #15767 was applied.