Opened 7 years ago

Closed 6 years ago

#16964 closed defect (fixed)

Speed up comparisons in QQbar

Reported by: gagern Owned by:
Priority: critical Milestone: sage-6.5
Component: number fields Keywords: variety qqbar cmp singular
Cc: Merged in:
Authors: Martin von Gagern Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: 3f4afef (Commits, GitHub, GitLab) Commit: 3f4afef46ab6a042cb2678394031cb5c26d89b1a
Dependencies: Stopgaps:

Status badges

Description (last modified by gagern)

When computing the variety of some ideal, then excessive amounts of time are apparently spent sorting the solutions. (In one example, the variety was essentially computed in 7½ hours but the sorting wasn't finished even 1½ hours after that.) This is due to the fact that comparison between complex algebraic numbers is done lexicographically, which means that quite often the real parts of complex conjugate numbers will have to be compared for equality, which can be a very costly operation.

Originally the computation even resulted in a Pari exception (“not enough precomputed primes”). This no longer occurs (since the pari upgrade from #15767). So the focus of this ticket is now the excessive amount of time required for comparisons, even without an exception.

Change History (33)

comment:1 follow-up: Changed 7 years ago by jdemeyer

Please mention the version of Sage, in particular whether or not #15767 was applied.

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:2 in reply to: ↑ 1 Changed 7 years ago by gagern

Replying to jdemeyer:

Please mention the version of Sage, in particular whether or not #15767 was applied.

This was sage 6.3 on Gentoo. With current develop (6.4.beta2) at least it doesn't fail as “quickly”: where the previous computation was 8 hours all in all up to the exception, my patched version without sorting took 7½ hours to compute the list, and after that sage has now been busy sorting that list for 1½ hours and still isn't done. I'll report back when it is, or when it gave up, but I'd say some need for action remains, even if the exception no longer occurs in this form.

comment:3 Changed 7 years ago by gagern

Here is a reproducing example which at least demonstrates that comparisons take way longer than they should:

sage: r = QQ[x](69721504*x^8 + 251777664*x^6 + 329532012*x^4 + 184429548*x^2 + 37344321).roots(QQbar, False)
sage: r
[-0.0221204634374360? - 1.090991904211621?*I,
 -0.0221204634374360? + 1.090991904211621?*I,
 -0.8088604911480535?*I,
 -0.7598602580415435?*I,
 0.7598602580415435?*I,
 0.8088604911480535?*I,
 0.0221204634374360? - 1.090991904211621?*I,
 0.0221204634374360? + 1.090991904211621?*I]
sage: [r[0], r[1]].sort()

This is because the comparison of the real parts takes like forever. Which in turn is because the computation of its minimal polynomial takes forever.

Looking at the set of all zeros, I can see that there are 4 clearly distinct real parts, and each comes with a pair of conjugate solutions since the polynomial has real coefficients. This is enough to conclude that if the intervals for two real parts overlap, then they must be equal and I don't have to do an exact computation for this. Should we try to implement some of this reasoning as a special case for AlgebraicNumber.__cmp__, for the case where the descriptor is exact and the minpoly is the same?

comment:4 Changed 7 years ago by jdemeyer

  • Summary changed from MPolynomialIdeal_singular_repr.variety: sage.libs.pari.gen.PariError: not enough precomputed primes to Fix comparisons in QQbar

comment:5 Changed 7 years ago by gagern

Looking for ways to address this, I noticed that the a lot of time apparently is spent inside

class ANBinaryExpr(ANDescr):
    ⋮
    def exactify(self):
        ⋮
            gen = left._exact_field().union(right._exact_field())
        ⋮

I wonder whether we can avoid that union for the case where both fields have the same defining polynomial. I wonder whether we can assume that the root of the field will form a power basis, and if so, whether there is any reasonably cheap way to find the conversion between different power bases, so we could express one root in terms of another.

Since I don't have any good ideas how to achieve this, my best idea still is tacking this at the __cmp__ level, but if anyone has an idea for solving this more generic issue, that would be really great since it would help other computations as well. So I'm sharing my thoughts.

Here is what I've tried and discarded, so you can avoid that same cul de sac. I started by writing down a generic linear combination w = a₀ + a₁z + a₂z² + … and then computed p(w) reduced by p(z), where p is the polynomial of the field. This gives a polynomial in z, and if enough powers of z are irrational then all the coefficients have to be zero if w is a root and the a_i are to be rational. So this gave me d conditions on these a₀ through a_{d-1}, which I could combine into an ideal and try to compute a variety. But that variety computation takes like forever in the above example, so this generic approach of finding other roots is not feasible in this fashion. Been there, tried that and discarded it.

comment:6 Changed 6 years ago by jdemeyer

  • Component changed from algebraic geometry to number fields
  • Priority changed from major to critical
  • Summary changed from Fix comparisons in QQbar to Speed up comparisons in QQbar

comment:7 Changed 6 years ago by jdemeyer

  • Description modified (diff)
  • Milestone changed from sage-6.4 to sage-6.5

comment:8 Changed 6 years ago by gagern

  • Branch set to u/gagern/ticket/16964
  • Created changed from 09/11/14 06:23:53 to 09/11/14 06:23:53
  • Modified changed from 12/13/14 22:20:12 to 12/13/14 22:20:12

comment:9 Changed 6 years ago by gagern

  • Authors set to Martin von Gagern
  • Commit set to 9f6fdc209d62e055df4bdd38f9f946332fbf3842
  • Status changed from new to needs_review

Here is my first stab at the approach outlined in comment:3. If you have a nicer way to handle things, feel free to suggest an alternative. If you think that my special case should not call minpoly() but instead examine whether the descriptor is ANRoot with matching polynomial, please state your reason for this argument. Likewise, if you think I should also handle more than one real value or conjugate pair for a given real component, then I'd love to hear how you'd implement this without making the code too hard to read and maintain.

As it is, I consider my code not particularly beautiful, and perhaps not the best solution either, but it is way better than the current state of affairs, and since we have other things depending on this, e.g. #14239, I'd like to see this merged pretty soon and possible improvements dealt with in a later ticket.


New commits:

9f6fdc2Faster qqbar comparison for case with equal minpoly.

comment:10 Changed 6 years ago by git

  • Commit changed from 9f6fdc209d62e055df4bdd38f9f946332fbf3842 to 2c4ac1be555a9c81dee84acfc7d7c2cb2602db22

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

2c4ac1bFaster qqbar comparison for case with equal minpoly.

comment:11 follow-up: Changed 6 years ago by jdemeyer

Just a quick comment: I like your general approach, but I have to check the details...

comment:12 Changed 6 years ago by git

  • Commit changed from 2c4ac1be555a9c81dee84acfc7d7c2cb2602db22 to e7acd96040fb43eebed0e3df0f1b1c7613d49bc1

Branch pushed to git repo; I updated commit sha1. New commits:

e7acd96Fix term in documentation.

comment:13 in reply to: ↑ 11 Changed 6 years ago by gagern

  • Description modified (diff)

Changing description to turn focus from pari exception to performance.

Replying to jdemeyer:

I like your general approach, but I have to check the details...

Have you found time for a closer look? Since this is currently the only ticket in the review queue with priority critical, perhaps we should try to get this into Sage 6.5?

comment:14 Changed 6 years ago by vdelecroix

  • Status changed from needs_review to needs_info

Hello,

This improvement is really cool!

EDIT: the thing below is basically one part of comment:9...

Couldn't we do a little bit better? In some cases we might identify numerically pairs of conjugated roots (and possibly one real root) and still be able to decide which one is which. There exists polynomials whose roots have the same real part:

sage: x = polygen(ZZ)
sage: P = x^4 - 4*x^3 + 9*x^2 - 10*x + 5
sage: P.roots(CC,False)
[1.00000000000000 - 1.61803398874989*I,
 1.00000000000000 - 0.618033988749895*I,
 1.00000000000000 + 0.618033988749895*I,
 1.00000000000000 + 1.61803398874989*I]

With the implementation proposed in this ticket, the comparison of the roots of the above polynomial falls back to the generic implementation. We could at least compare the conjugated ones without exactification of the real part. But we can leave that for a further ticket.

Vincent

Last edited 6 years ago by vdelecroix (previous) (diff)

comment:15 Changed 6 years ago by vdelecroix

And you should also avoid this special case if -self._value.imag() and other._value.imag() do not overlap. That would potentially save a call to minpoly.

Last edited 6 years ago by vdelecroix (previous) (diff)

comment:16 Changed 6 years ago by vdelecroix

  • Branch changed from u/gagern/ticket/16964 to u/vdelecroix/16964
  • Commit changed from e7acd96040fb43eebed0e3df0f1b1c7613d49bc1 to 1294bf20bb6fe915a5371318cb8f9feacfa3f86a
  • Status changed from needs_info to needs_review

I pushed a review commit. Tell me what you think. It still lacks a nice complicated example... (I am looking for it)


New commits:

8c0cf63trac #17863: remove stuff from src/ext
2777e6bMerge branch 'u/gagern/ticket/16964' of trac.sagemath.org:sage into tmp
1294bf2trac #16964: review

comment:17 Changed 6 years ago by git

  • Commit changed from 1294bf20bb6fe915a5371318cb8f9feacfa3f86a to e6a5a00acfb25935788ee67fc2defe28060343be

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e6a5a00trac #16964: review

comment:18 Changed 6 years ago by git

  • Commit changed from e6a5a00acfb25935788ee67fc2defe28060343be to 0bf7dba36d3935c23ac385de7c28cf6d85055cd5

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8c0cf63trac #17863: remove stuff from src/ext
dfc820dtrac #16964: merge sage-6.6.beta1
0bf7dbatrac #16964: review

comment:19 Changed 6 years ago by git

  • Commit changed from 0bf7dba36d3935c23ac385de7c28cf6d85055cd5 to 731ec907ee9cdc81bb06d647ebf0be26bd6ba117

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

4f082b4trac #16964: merge sage-6.6.beta1
731ec90trac #16964: review

comment:20 Changed 6 years ago by vdelecroix

Sorry I messed up the commits! Now everything looks good and doctest are fine...

comment:21 Changed 6 years ago by git

  • Commit changed from 731ec907ee9cdc81bb06d647ebf0be26bd6ba117 to 297c68ad67657dc0e4c04a362827934a2e8ed17c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a96f3f1trac #16964: review
297c68atrac #16964: (review) add an example

comment:22 Changed 6 years ago by gagern

  • Branch changed from u/vdelecroix/16964 to u/gagern/16964

comment:23 follow-up: Changed 6 years ago by gagern

  • Commit changed from 297c68ad67657dc0e4c04a362827934a2e8ed17c to 16f62a2ea8e36bf3ac9f781677e404e01366a594

I like how you avoid unneccessary minpoly computation. But I think we can do better than you did, by not having the intervals span negative and positive imaginary parts, but instead considering the absolute value of the imaginary part. I did something along these lines, but I now see that I'll have to rebase that on your latest forced push…


New commits:

731ec90trac #16964: review
16f62a2trac #16964: Use absolute value of imaginary part

comment:24 in reply to: ↑ 23 ; follow-up: Changed 6 years ago by vdelecroix

Replying to gagern:

I like how you avoid unneccessary minpoly computation. But I think we can do better than you did, by not having the intervals span negative and positive imaginary parts, but instead considering the absolute value of the imaginary part. I did something along these lines, but I now see that I'll have to rebase that on your latest forced push…

sorry for that... Your version is much simpler by the way!

But I am not completely convinced that this is optimal. Do you know if there is a cheap way to assert if two roots of a given (irreducible) polynomial have the same real value? In the example I added in the commit 297c68a sorted(p2.roots(QQbar,False) still takes hours.

Vincent

Last edited 6 years ago by vdelecroix (previous) (diff)

comment:25 Changed 6 years ago by vdelecroix

I can also rebase my changes on yours BTW.

comment:26 Changed 6 years ago by gagern

Would I be correct to assume that all this ii_minus and ii_plus handling in a96f3f1 is essentially equivalent to my use of absolute values in 16f62a2? If that is the case, then I'd rather drop a96f3f1 from the history, and instead rebase 297c68a onto 16f62a2. How does that sound to you?

comment:27 Changed 6 years ago by git

  • Commit changed from 16f62a2ea8e36bf3ac9f781677e404e01366a594 to aaf3eb648e1feca6e27c6dd3ffc5fdd9cee98191

Branch pushed to git repo; I updated commit sha1. New commits:

aaf3eb6trac #16964: (review) add an example

comment:28 Changed 6 years ago by vdelecroix

  • Branch changed from u/gagern/16964 to u/vdelecroix/16964
  • Commit changed from aaf3eb648e1feca6e27c6dd3ffc5fdd9cee98191 to 3f4afef46ab6a042cb2678394031cb5c26d89b1a
  • Reviewers set to Vincent Delecroix

I choose a more meaningful test. Could you have a look?

As a consequence of your changes, we get great improvements on comparing rationals!!

new timings

sage: a = QQbar(1/3)
sage: b = QQbar(1/2)
sage: %timeit cmp(a,b)
1000000 loops, best of 3: 881 ns per loop

old timings

sage: a = QQbar(1/3)
sage: b = QQbar(1/2)
sage: %timeit cmp(a,b)
10000 loops, best of 3: 34.4 µs per loop

This is completely crazy.

If you are happy with my changes you can set to positive review.

Vincent


New commits:

3f4afeftrac #16964: better doctest

comment:29 Changed 6 years ago by gagern

  • Status changed from needs_review to positive_review

I like it. Thanks for the review!

The speed gain for the rational numbers appear to be due to the fact that we no longer have to construct an algebraic number representation for the real part. a._value.real() is a lot faster than a.real():

sage: %timeit a._value.real()
1000000 loops, best of 3: 201 ns per loop
sage: %timeit a.real()
100000 loops, best of 3: 10.7 µs per loop

(I still hope that someone, some day, will make all this work here obsolete by coming up with a better way to compare QQbar elements even if they don't share a minpoly. No idea how, though.)

comment:30 in reply to: ↑ 24 ; follow-up: Changed 6 years ago by jdemeyer

Replying to vdelecroix:

But I am not completely convinced that this is optimal. Do you know if there is a cheap way to assert if two roots of a given (irreducible) polynomial have the same real value?

Using resultants, you can find a polynomial which has (a - b) as a root. And then you use interval arithmetic to find which root. Determining whether the imaginary parts are equal then is equivalent to checking whether a real polynomial has a pure imaginary root. The latter can probably be done using some Sturm-like algorithm.

comment:31 in reply to: ↑ 30 ; follow-up: Changed 6 years ago by gagern

Replying to jdemeyer:

Using resultants, you can find a polynomial which has (a - b) as a root.

Up to now I had assumed that most arithmetic in QQbar would eventually be performed using resultants. But it seems I was mistaken.

sage: x = polygen(ZZ)
sage: p1 = x^5 + 6*x^4 - 42*x^3 - 142*x^2 + 467*x + 422
sage: p2 = p1((x-1)^2)
sage: r1 = QQbar.polynomial_root(p2, CIF(1, (2.1, 2.2)))
sage: r2 = QQbar.polynomial_root(p2, CIF(1, (2.8, 2.9)))
sage: a,b = polygens(QQ, 'a,b')
sage: %time p3 = r1.minpoly()(a + b).resultant(r2.minpoly()(b), b)
CPU times: user 62 ms, sys: 0 ns, total: 62 ms
Wall time: 68 ms
sage: [r for f in p3.factor()
....:  for r in f[0].univariate_polynomial().roots(QQbar, False)
....:  if r._value.overlaps(r1._value - r2._value)]
[-0.7266314748516305?*I]
sage: %time p4 = (r1 - r2).minpoly()

One possible root of p3 is b=r2 and a+b=r1 which means a=r1-r2. So eliminating b we get a (reducible, not minimal) polynomial in a which has that difference as one of its roots, just as you indicated. I try to identify that by looking at the roots r of the factors f, checking whether they overlap the numeric interval. The single result I obtain has zero real part, thus indicating that we should sort by imaginary part.

I lost patience waiting for that final result of p4, which should do pretty much the same thing in my opinion. My question is, why is it computed the way it is? Why do arithmetic operators for algebraic numbers compute some costly unions of number fields (which I believe is what they are doing), instead of using resultants to describe their results? And should we start some major rewrite effort to change that, i.e. to base most if not all arithmetic operations on resultants?

I can think of two possible problems. One is that we might be dealing with a special case here, and that perhaps number field unions are in general cheaper than resultants. If that were the case, what does that mean for speeding up comparisons? Another possible problem I can imagine is that the resultant could factor into several distinct polynomials, some of which might share a root. If that were the case, numeric refinement wouldn't be able to help choosing the right factor. Should we perhaps not factor the resultant polynomial, but instead compute roots for the fully expanded form?

I get the impression that this might trigger some major work. I hope that the reviewed changes can land as they are, while we investigate (in some other branch) how to tackle this more generic approach.

comment:32 in reply to: ↑ 31 Changed 6 years ago by gagern

Replying to gagern:

Why do arithmetic operators for algebraic numbers compute some costly unions of number fields (which I believe is what they are doing), instead of using resultants to describe their results? And should we start some major rewrite effort to change that, i.e. to base most if not all arithmetic operations on resultants?

I get the impression that this might trigger some major work. I hope that the reviewed changes can land as they are, while we investigate (in some other branch) how to tackle this more generic approach.

Just created #17886 about using resultants to speed up most qqbar operations.

comment:33 Changed 6 years ago by vbraun

  • Branch changed from u/vdelecroix/16964 to 3f4afef46ab6a042cb2678394031cb5c26d89b1a
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.