Opened 11 years ago
Closed 8 years ago
#9449 closed defect (duplicate)
The summary printed after running doctests is inaccurate.
Reported by: | drkirkby | Owned by: | mvngu |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | doctest coverage | Keywords: | |
Cc: | SimonKing, jhpalmieri, mpatel, leif | Merged in: | |
Authors: | Reviewers: | Jeroen Demeyer | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The following shows a summary of a doctest failures. They were actually observed on a Solaris machine, but that is unlikely to be relevant. After building Sage
$ make ptestlong
was executed.
- 5 of the 6 doctest failures in the summary have 0 tests failing.
- BSD.py is the only one of the 6 doctest failures which has a non-zero number of failures.
The following tests failed: sage -t -long devel/sage/doc/fr/tutorial/programming.rst # 0 doctests failed sage -t -long devel/sage/sage/schemes/plane_curves/constructor.py # 0 doctests failed sage -t -long devel/sage/sage/schemes/elliptic_curves/BSD.py # 1 doctests failed sage -t -long devel/sage/sage/parallel/decorate.py # 0 doctests failed sage -t -long devel/sage/sage/libs/galrep/wrapper.pyx # 0 doctests failed ---------------------------------------------------------------------- Total time for all tests: 3990.8 seconds
In the case of one of the tests, it would appear from the log that it actually passed
sage -t -long devel/sage/sage/parallel/decorate.py [44.0 s]
This is pretty damn serious, as it means we can not rely on the doctest results.
Hardware and software used
- sage-4.5.alpha4
- A Sun T5240
- 2 x 8 core, 64-thread UltraSPARC T2+ 1167 MHz
- 32 GB RAM
- Solaris 10 update 7 (05/09)
- t2.math.washtington.edu
- gcc 4.4.1 configured to use both the Sun linker and assembler.
- 32-bit build (This is the default). The environment variable
SAGE64
was not used.
Attachments (2)
Change History (22)
Changed 11 years ago by
comment:1 follow-up: ↓ 3 Changed 11 years ago by
comment:2 Changed 11 years ago by
- Description modified (diff)
I've attached a log.
Also corrected something which was cut and pasted from another ticket
comment:3 in reply to: ↑ 1 Changed 11 years ago by
Replying to wjp:
Could you try with patches #8641, #9243, #9316 to the doctest framework? They clean up and fix parts of the error handling.
Yes. The only problem I have is that these test failures are not always reproducible. Tests which failed when I run
$ make ptestlong
actually passed later. There's a report from John H Palmieri on sage-release of devel/sage/sage/misc/trace.py
failing for him three times on this machine, then working. This makes any sort of testing difficult, and means it will be hard for me to know if s #8641, #9243, #9316 really solve the problem or not. But I will try them and report back later.
If they don't fix it yet, I can give you a patch which adds some extra debugging info to track this down.
OK, thank you.
I know there was a recent patch to the doctesting to remove spurious errors about files not found - #9316. I don't know if these could be related or not.
I've got other strange results from doctests too, but I'll put that on another ticket.
Dave
comment:4 Changed 11 years ago by
I was unable to apply the last patch cleanly - the first two were ok.
kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qimport http://trac.sagemath.org/sage_trac/raw-attachment/ticket/8641/trac_8641-doctest_exit_codes.3.patch adding trac_8641-doctest_exit_codes.3.patch to series file kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qpush applying trac_8641-doctest_exit_codes.3.patch now at: trac_8641-doctest_exit_codes.3.patch kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qimport http://trac.sagemath.org/sage_trac/raw-attachment/ticket/9243/trac_9243.patch adding trac_9243.patch to series file kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qpush applying trac_9243.patch now at: trac_9243.patch kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qimport http://trac.sagemath.org/sage_trac/raw-attachment/ticket/9316/scripts9316_timeout_rebased.patch adding scripts9316_timeout_rebased.patch to series file kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg qpush applying scripts9316_timeout_rebased.patch patching file sage-doctest Hunk #1 FAILED at 18 1 out of 2 hunks FAILED -- saving rejects to file sage-doctest.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir errors during apply, please fix and refresh scripts9316_timeout_rebased.patch kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg status kirkby@t2:[/tmp/kirkby/sage-4.5.alpha4/local/bin] $ hg log | more changeset: 1541:6510dc91cc05 tag: scripts9316_timeout_rebased.patch tag: qtip tag: tip user: Willem Jan Palenstijn <wpalenst@math.leidenuniv.nl> date: Tue Jul 06 22:58:13 2010 +0200 summary: #9316: Add exit code 64 for time out to doctests changeset: 1540:6e69aa36dd4d tag: trac_9243.patch user: Dan Drake <drake@kaist.edu> date: Wed Jun 16 13:48:32 2010 +0900 summary: trac 9243: sage-doctest should use powers of 2 for return codes changeset: 1539:08d8519d03dc tag: trac_8641-doctest_exit_codes.3.patch tag: qbase user: Mitesh Patel <qed777@gmail.com> date: Wed Jun 16 00:06:03 2010 -0700 summary: trac 8641: return nonzero code if tests fail. Dan Drake, John Palmieri
comment:5 follow-up: ↓ 6 Changed 11 years ago by
You're missing the second patch at #9243.
comment:6 in reply to: ↑ 5 Changed 11 years ago by
comment:7 follow-up: ↓ 19 Changed 11 years ago by
Here's the results after applying #8641, #9243, #9316. As you can see, there are still tests which fail with 0 doctest failures (These results are again on the Sun T5240 t2.math.washington.edu, which is the same as before).
sage -t -long devel/sage/sage/modular/ssmod/ssmod.py [3001.1 s] sage -t -long devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py [3137.3 s] ---------------------------------------------------------------------- The following tests failed: sage -t -long devel/sage/doc/en/constructions/number_fields.rst # 0 doctests failed sage -t -long devel/sage/sage/categories/examples/hopf_algebras_with_basis.py # 0 doctests failed sage -t -long devel/sage/sage/libs/galrep/wrapper.pyx # 0 doctests failed sage -t -long devel/sage/sage/parallel/decorate.py # 1 doctests failed ---------------------------------------------------------------------- Total time for all tests: 4264.8 seconds
Here's the first one, which fails, but with 0 recorded failures.
sage -t -long devel/sage/doc/en/constructions/number_fields.rst python: can't open file '/rootpool2/local/kirkby/.sage//tmp/number_fields.py': [Errno 2] No such file or directory [1.9 s]
Here's the second recorded failure
sage -t -long devel/sage/sage/categories/examples/hopf_algebras_with_basis.py python: can't open file '/rootpool2/local/kirkby/.sage//tmp/hopf_algebras_with_basis.py': [Errno 2] No such file or directory [1.7 s]
The third failure is where Sage crashed:
sage -t -long devel/sage/sage/libs/galrep/wrapper.pyx ------------------------------------------------------------ Unhandled SIGBUS: A bus error occurred in Sage. This probably occurred because a *compiled* component of Sage has a bug in it (typically accessing invalid memory) or is not properly wrapped with _sig_on, _sig_off. You might want to run Sage under gdb with 'sage -gdb' to debug this. Sage will now terminate (sorry). ------------------------------------------------------------ [21.5 s]
This is recorded as 0 failures, despite the fact Sage crashed.
The only test that is recorded as a failure, with a non-zero number of failures, is the last test
sage -t -long devel/sage/sage/parallel/decorate.py ********************************************************************** File "/tmp/kirkby/sage-4.5.alpha4/devel/sage-testing/sage/parallel/decorate.py", line 152: sage: v = list(f([1,2,4])); v.sort(); v Exception raised: Traceback (most recent call last): File "/tmp/kirkby/sage-4.5.alpha4/local/bin/ncadoctest.py", line 1231, in run_one_test self.run_one_example(test, example, filename, compileflags) File "/tmp/kirkby/sage-4.5.alpha4/local/bin/sagedoctest.py", line 38, in run_one_example OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags) File "/tmp/kirkby/sage-4.5.alpha4/local/bin/ncadoctest.py", line 1172, in run_one_example compileflags, 1) in test.globs File "<doctest __main__.example_4[9]>", line 1, in <module> v = list(f([Integer(1),Integer(2),Integer(4)])); v.sort(); v###line 152: sage: v = list(f([1,2,4])); v.sort(); v File "/tmp/kirkby/sage-4.5.alpha4/local/lib/python/site-packages/sage/parallel/multiprocessing_sage.py", line 64, in parallel_iter p = Pool(processes) File "/tmp/kirkby/sage-4.5.alpha4/local/lib/python2.6/multiprocessing/__init__.py", line 227, in Pool return Pool(processes, initializer, initargs) File "/tmp/kirkby/sage-4.5.alpha4/local/lib/python2.6/multiprocessing/pool.py", line 104, in __init__ w.start() File "/tmp/kirkby/sage-4.5.alpha4/local/lib/python2.6/multiprocessing/process.py", line 104, in start self._popen = Popen(self) File "/tmp/kirkby/sage-4.5.alpha4/local/lib/python2.6/multiprocessing/forking.py", line 94, in __init__ self.pid = os.fork() OSError: [Errno 12] Not enough space ********************************************************************** 1 items had failures: 1 of 12 in __main__.example_4 ***Test Failed*** 1 failures. For whitespace errors, see the file /rootpool2/local/kirkby/.sage//tmp/.doctest_decorate.py [51.3 s]
Dave
comment:8 Changed 11 years ago by
- Priority changed from critical to blocker
I've marked this as a blocker, as I can't see how we can justify making a release if the summary of the doc tests can't be trusted to be correct. If we can't trust the tests, what can we trust?
comment:9 Changed 11 years ago by
- Cc mpatel added
Changed 11 years ago by
comment:10 Changed 11 years ago by
- Cc leif added
comment:11 Changed 11 years ago by
The "# failures" that sage reports is calculated very simply by counting the number of instances of "Expected:" "Expected nothing" and "Exception raised:".
Sage correctly detected that things went wrong, as you can tell from the fact that the files causing errors are listed in the summary at the end.
As far as I can tell, the only thing that's not quite right is the reporting of what exactly went wrong. There's likely a better way of reporting these specific crashes and missing file errors you're getting, but I don't see any reason to believe something is as seriously wrong with the error reporting as you seem to be implying on this ticket and on the mailing list.
(Reverting priority to major.)
The file not found errors (Are you out of disk space?) should probably be caught before the doctesting process is launched in the first place. I'll have to think about the SIGBUS case some more.
comment:12 Changed 11 years ago by
- Priority changed from blocker to major
comment:13 Changed 11 years ago by
To reproduce the singal reporting issue:
./sage -tp 1 bus.sage
with bus.sage containing
""" sage: import os sage: os.kill(os.getpid(), 7) """
comment:14 Changed 11 years ago by
It's not convenient for me to test these now, but take a look at the doctest sage -t -long devel/sage/sage/parallel/decorate.py
on line 4732 of
http://trac.sagemath.org/sage_trac/attachment/ticket/9449/ptestlong.log
As far as I can see, there is absolutely no error message there, so nothing to indicate that this failed, but rather that it passed in 44.0s. But at the end it is listed as failing.
I did 64 tests in parallel. It's possible the file system may have got too full. I will retest with fewer parallel tasks.
Dave
comment:15 Changed 11 years ago by
I can only tell there are indeed flaws and inconsistencies in the doctest framework (I started examining it a few days ago, made some notes, but haven't finished yet).
E.g. sage-doctest
exit code 1 can either mean "file not found", a command line syntax error or even a keyboard interrupt.
I don't understand why sage-test
and sage-ptest
are separated, though they contain common code that might get (or yet is) out of sync.
Different scripts use different temporary directories...
Some make
targets for testing do ./sage -b
, others do not...
Just to name a few; I'll continue to investigate the bunch of involved scripts further, but don't really know how soon.
-Leif
comment:16 Changed 11 years ago by
comment:17 follow-up: ↓ 18 Changed 11 years ago by
I don't know if this one is the origin, but it's simply horrible:
def test_file(F): """ This is the function that actually tests a file """ outfile = tempfile.NamedTemporaryFile() base, ext = os.path.splitext(F) cmd = 'doctest ' + opts if SAGE_SITE in os.path.realpath(F) and not '-force_lib' in cmd: cmd += ' -force_lib' filestr = os.path.split(F)[1] for i in range(0,numiteration): os.chdir(os.path.dirname(F)) command = os.path.join(SAGE_ROOT, 'local', 'bin', 'sage-%s' % cmd) s = 'bash -c "%s %s > %s" ' % (command, filestr, outfile.name) try: t = time.time() ret = os.system(s) finished_time = time.time() - t if ret>=256: ret=ret/256 ret = -ret except: ol = outfile.read() return (-5, 0, ol) ol = outfile.read() if ret != 0: break return (F, ret, finished_time, ol)
(Excerpt from sage-ptest
, note not just the returned tuples...)
Beat me if I've missed something here...
comment:18 in reply to: ↑ 17 Changed 11 years ago by
Replying to leif:
I don't know if this one is the origin, but it's simply horrible:
def test_file(F): """ This is the function that actually tests a file """ outfile = tempfile.NamedTemporaryFile() base, ext = os.path.splitext(F) cmd = 'doctest ' + opts if SAGE_SITE in os.path.realpath(F) and not '-force_lib' in cmd: cmd += ' -force_lib' filestr = os.path.split(F)[1] for i in range(0,numiteration): os.chdir(os.path.dirname(F)) command = os.path.join(SAGE_ROOT, 'local', 'bin', 'sage-%s' % cmd) s = 'bash -c "%s %s > %s" ' % (command, filestr, outfile.name) try: t = time.time() ret = os.system(s) finished_time = time.time() - t if ret>=256: ret=ret/256 ret = -ret except: ol = outfile.read() return (-5, 0, ol) ol = outfile.read() if ret != 0: break return (F, ret, finished_time, ol)(Excerpt from
sage-ptest
, note not just the returned tuples...)Beat me if I've missed something here...
I barely know Python, so there is nothing I can do about this. But if you have a better solution, it would be worth trying it. I feel a bit uneasy about Sage, when we can't trust the mechanism for testing it.
It's odd why failures on some test is pretty repeatable when running make ptestlong
but run individually, they pass. I'm pretty sure that is not a memory issue, as I've noticed that on my Sun Blade 2000 SPARC where I'm the only user. That machine has 8 GB RAM and only two processors (its not multi-cored or multi-threaded), so I don't run more than 3 tests in parallel.
comment:19 in reply to: ↑ 7 Changed 11 years ago by
comment:20 Changed 8 years ago by
- Milestone changed from sage-5.8 to sage-duplicate/invalid/wontfix
- Resolution set to duplicate
- Reviewers set to Jeroen Demeyer
- Status changed from new to closed
Since #12415 completely rewrites the doctest framework with much better code, I'm assuming this problem is fixed.
Results from long doctests on t2.math.washington.edu (Solaris 10, UltraSPARC CPUs)