Opened 3 years ago
Closed 3 years ago
#28296 closed defect (fixed)
Random failure in src/sage/rings/qqbar.py
Reported by:  vbraun  Owned by:  

Priority:  major  Milestone:  sage8.9 
Component:  number theory  Keywords:  random_fail 
Cc:  Merged in:  
Authors:  Vincent Delecroix  Reviewers:  Volker Braun 
Report Upstream:  N/A  Work issues:  
Branch:  12c0b20 (Commits, GitHub, GitLab)  Commit:  12c0b209f9215c8c303253900b75d9f7613e9682 
Dependencies:  Stopgaps: 
Description
I'm seeing this with a rather high frequency:
********************************************************************** File "src/sage/rings/qqbar.py", line 520, in sage.rings.qqbar Failed example: z2 = QQbar.polynomial_root(p4, ival) Exception raised: Traceback (most recent call last): File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 681, in _run self.compile_and_execute(example, compiler, test.globs) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 1105, in compile_and_execute exec(compiled, globs) File "<doctest sage.rings.qqbar[183]>", line 1, in <module> z2 = QQbar.polynomial_root(p4, ival) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 1411, in polynomial_root return AlgebraicNumber(ANRoot(poly, interval, multiplicity)) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 6209, in __init__ self._interval = self.refine_interval(interval, 64) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 6367, in refine_interval v = self._complex_refine_interval(interval, prec) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 6594, in _complex_refine_interval return self._complex_isolate_interval(interval, prec) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 6688, in _complex_isolate_interval rts = self._poly.complex_roots(prec, self._multiplicity) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/qqbar.py", line 6107, in complex_roots roots_mult = complex_roots(p, min_prec=prec) File "/Users/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/rings/polynomial/complex_roots.py", line 258, in complex_roots rts = cfac.roots(multiplicities=False) File "sage/rings/polynomial/polynomial_element.pyx", line 7629, in sage.rings.polynomial.polynomial_element.Polynomial.roots (build/cythonized/sage/rings/polynomial/polynomial_element.c:59323) ext_rts = self.__pari__().polroots(precision=L.prec()) File "cypari2/gen.pyx", line 4122, in cypari2.gen.Gen.polroots AlarmInterrupt ********************************************************************** 1 item had failures: 1 of 186 in sage.rings.qqbar [1483 tests, 1 failure, 96.81 s]  sage t long src/sage/rings/qqbar.py # 1 doctest failed 
I don't understand how it can run into an AlarmInterrupt
Change History (12)
comment:1 Changed 3 years ago by
comment:2 followup: ↓ 3 Changed 3 years ago by
Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.
The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.
comment:3 in reply to: ↑ 2 ; followup: ↓ 8 Changed 3 years ago by
Replying to vbraun:
Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.
The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.
Of course I did not check on raspberry pi. How should I test for speed regression then? The only framework available are doctests. The only purpose of #17895 was to speed up execution. There is nothing changed from an input/output point of view.
comment:4 Changed 3 years ago by
Sure. The point that I'm trying to make is that, at least for now, you have to be *very* conservative with the upper time limit in a doctest.
comment:5 Changed 3 years ago by
 Branch set to public/28296
 Commit set to d7174fffce47efc9230e9408dc502b2cfe16e111
 Status changed from new to needs_review
New commits:
d7174ff  increase alarm time set in #17895

comment:6 Changed 3 years ago by
 Commit changed from d7174fffce47efc9230e9408dc502b2cfe16e111 to 12c0b209f9215c8c303253900b75d9f7613e9682
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
12c0b20  increase alarm time set in #17895

comment:7 Changed 3 years ago by
 Reviewers set to Volker Braun
 Status changed from needs_review to positive_review
comment:8 in reply to: ↑ 3 Changed 3 years ago by
Replying to vdelecroix:
Replying to vbraun:
Did you try running it on a raspberry pi, for example? I understand why you would want to test for speed regressions, but this isn't a good way of doing it.
The doctest framework has already an overall speed factor of the machine, collected from previous runs. It is used for the "slow doctest" warning. At the very least this shoud be taken into account. For failed tests it should also display the actual time taken, not just crash in an AlarmInterrupt.
Of course I did not check on raspberry pi.
It's not just problematic when testing on a raspberry pi. For example our (nixos) buildservers rebuild and retest sage regularly. Sometimes the build servers may be under heavy load. Then a test can take an excessive amount of time, but it should not fail (which would fail the whole sage package).
Performance regressions are hard to measure, although worthwhile. But they should be tested entirely separately from the functionality tests.
comment:9 followup: ↓ 10 Changed 3 years ago by
So as a summary, I propose to remove the alarm completely. If I put my laptop in suspend in the middle of running the test suite and wake it up again an hour later, the test suite should still pass. If we want to keep performance tests in the main test suite, they should at least be behind an optional flag so they are disableable.
comment:10 in reply to: ↑ 9 Changed 3 years ago by
Replying to ghtimokau:
So as a summary, I propose to remove the alarm completely. If I put my laptop in suspend in the middle of running the test suite and wake it up again an hour later, the test suite should still pass. If we want to keep performance tests in the main test suite, they should at least be behind an optional flag so they are disableable.
Could you open a ticket? This is not the only test concerned. And this should be documented in the developer guide: do not use alarm to test performance
.
comment:11 Changed 3 years ago by
Hiding it behind a # optional  benchmark (or so) sounds like a good solution. If you make a ticket I'll review it ;)
comment:12 Changed 3 years ago by
 Branch changed from public/28296 to 12c0b209f9215c8c303253900b75d9f7613e9682
 Resolution set to fixed
 Status changed from positive_review to closed
This doctest is not supposed to take more than a second
Should I increase to 5?