Opened 3 years ago

Closed 3 years ago

#28296 closed defect (fixed)

Random failure in src/sage/rings/qqbar.py

Reported by: vbraun Owned by:
Priority: major Milestone: sage-8.9
Component: number theory Keywords: random_fail
Cc: Merged in:
Authors: Vincent Delecroix Reviewers: Volker Braun
Report Upstream: N/A Work issues:
Branch: 12c0b20 (Commits, GitHub, GitLab) Commit: 12c0b209f9215c8c303253900b75d9f7613e9682
Dependencies: Stopgaps:

Status badges

Description

I'm seeing this with a rather high frequency:

**********************************************************************
File "src/sage/rings/qqbar.py", line 520, in sage.rings.qqbar
Failed example:
    z2 = QQbar.polynomial_root(p4, ival)
Exception raised:
    Traceback (most recent call last):
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 681, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 1105, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.rings.qqbar[183]>", line 1, in <module>
        z2 = QQbar.polynomial_root(p4, ival)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 1411, in polynomial_root
        return AlgebraicNumber(ANRoot(poly, interval, multiplicity))
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 6209, in __init__
        self._interval = self.refine_interval(interval, 64)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 6367, in refine_interval
        v = self._complex_refine_interval(interval, prec)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 6594, in _complex_refine_interval
        return self._complex_isolate_interval(interval, prec)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 6688, in _complex_isolate_interval
        rts = self._poly.complex_roots(prec, self._multiplicity)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/qqbar.py", line 6107, in complex_roots
        roots_mult = complex_roots(p, min_prec=prec)
      File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/rings/polynomial/complex_roots.py", line 258, in complex_roots
        rts = cfac.roots(multiplicities=False)
      File "sage/rings/polynomial/polynomial_element.pyx", line 7629, in sage.rings.polynomial.polynomial_element.Polynomial.roots (build/cythonized/sage/rings/polynomial/polynomial_element.c:59323)
        ext_rts = self.__pari__().polroots(precision=L.prec())
      File "cypari2/gen.pyx", line 4122, in cypari2.gen.Gen.polroots
    AlarmInterrupt
**********************************************************************
1 item had failures:
   1 of 186 in sage.rings.qqbar
    [1483 tests, 1 failure, 96.81 s]
----------------------------------------------------------------------
sage -t --long src/sage/rings/qqbar.py  # 1 doctest failed
----------------------------------------------------------------------

I don't understand how it can run into an AlarmInterrupt

Change History (12)

comment:1 Changed 3 years ago by vdelecroix

This doctest is not supposed to take more than a second

    sage: alarm(1.0)
    sage: z2 = QQbar.polynomial_root(p4, ival)
    sage: cancel_alarm()

Should I increase to 5?

comment:2 follow-up: Changed 3 years ago by vbraun

Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.

The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.

Last edited 3 years ago by vbraun (previous) (diff)

comment:3 in reply to: ↑ 2 ; follow-up: Changed 3 years ago by vdelecroix

Replying to vbraun:

Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.

The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.

Of course I did not check on raspberry pi. How should I test for speed regression then? The only framework available are doctests. The only purpose of #17895 was to speed up execution. There is nothing changed from an input/output point of view.

comment:4 Changed 3 years ago by vbraun

Sure. The point that I'm trying to make is that, at least for now, you have to be *very* conservative with the upper time limit in a doctest.

comment:5 Changed 3 years ago by vdelecroix

  • Authors set to Vincent Delecroix
  • Branch set to public/28296
  • Commit set to d7174fffce47efc9230e9408dc502b2cfe16e111
  • Status changed from new to needs_review

New commits:

d7174ffincrease alarm time set in #17895

comment:6 Changed 3 years ago by git

  • Commit changed from d7174fffce47efc9230e9408dc502b2cfe16e111 to 12c0b209f9215c8c303253900b75d9f7613e9682

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

12c0b20increase alarm time set in #17895

comment:7 Changed 3 years ago by vbraun

  • Reviewers set to Volker Braun
  • Status changed from needs_review to positive_review

comment:8 in reply to: ↑ 3 Changed 3 years ago by gh-timokau

Replying to vdelecroix:

Replying to vbraun:

Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.

The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.

Of course I did not check on raspberry pi.

It's not just problematic when testing on a raspberry pi. For example our (nixos) buildservers rebuild and retest sage regularly. Sometimes the build servers may be under heavy load. Then a test can take an excessive amount of time, but it should not fail (which would fail the whole sage package).

Performance regressions are hard to measure, although worthwhile. But they should be tested entirely separately from the functionality tests.

comment:9 follow-up: Changed 3 years ago by gh-timokau

So as a summary, I propose to remove the alarm completely. If I put my laptop in suspend in the middle of running the test suite and wake it up again an hour later, the test suite should still pass. If we want to keep performance tests in the main test suite, they should at least be behind an optional flag so they are disable-able.

comment:10 in reply to: ↑ 9 Changed 3 years ago by vdelecroix

Replying to gh-timokau:

So as a summary, I propose to remove the alarm completely. If I put my laptop in suspend in the middle of running the test suite and wake it up again an hour later, the test suite should still pass. If we want to keep performance tests in the main test suite, they should at least be behind an optional flag so they are disable-able.

Could you open a ticket? This is not the only test concerned. And this should be documented in the developer guide: do not use alarm to test performance.

comment:11 Changed 3 years ago by vbraun

Hiding it behind a # optional - benchmark (or so) sounds like a good solution. If you make a ticket I'll review it ;-)

comment:12 Changed 3 years ago by vbraun

  • Branch changed from public/28296 to 12c0b209f9215c8c303253900b75d9f7613e9682
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.