Opened 11 years ago

Closed 11 years ago

#8641 closed enhancement (fixed)

"sage -t" should exit with nonzero exit code if doctests fail

Reported by: ddrake Owned by: ddrake
Priority: minor Milestone: sage-4.5.2
Component: doctest coverage Keywords:
Cc: wjp Merged in: sage-4.5.2.alpha1
Authors: Dan Drake, John Palmieri, Mitesh Patel Reviewers: Willem Jan Palenstijn
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description

Right now, if I run doctests with "sage -t" and some doctests fail, the exit code is zero -- but it would be very handy to have a nonzero exit code; for example, it would make using Mercurial's bisect command very useful.

In #7995, it seems like sage-doctest is passing back some useful exit codes, so we just need to pass those on in a reasonable way.

Attachments (6)

trac_8641-sage-test.patch (1.8 KB) - added by jhpalmieri 11 years ago.
scripts repo
trac_8641.patch (2.3 KB) - added by ddrake 11 years ago.
apply to scripts repo. Replaces previous.
trac_8641-part2.patch (1.3 KB) - added by jhpalmieri 11 years ago.
apply on top of trac_8641.patch
trac_8641-doctest_exit_codes.patch (2.3 KB) - added by mpatel 11 years ago.
Combined patch rebased vs 4.4.4.alpha0 + #8891. Replaces all previous.
trac_8641-doctest_exit_codes.2.patch (7.1 KB) - added by mpatel 11 years ago.
Modify sage-ptest, too. Apply to 4.4.4.alpha0 + #8891. Apply only this patch.
trac_8641-doctest_exit_codes.3.patch (7.2 KB) - added by mpatel 11 years ago.
Use err = err | 2 for KeyboardInterrupt. Apply only this patch, to 4.4.4.alpha0 + #8891.

Download all attachments as: .zip

Change History (37)

comment:1 Changed 11 years ago by wjp

  • Cc wjp added

comment:2 Changed 11 years ago by jhpalmieri

I'm following up on the discussion at the end of #7995.

First, the code err = err // 256 seems okay: putting a print statement at the end of that python function shows that err seems to have an appropriate value.

Other parts of the python code in sage-test confuse me, though. It seems to me that the function test_file always returns 0 (if the file exists), regardless of the success or failure of the test. The function test has a return value which depends on whether tests passed, but test_file never uses the return value from test, does it?

In practice, if I run sage -sh and then do sage-doctest on a file with errors, I get a nonzero return value, but if I run sage-test on the same file, I get a return value of 0.

comment:3 Changed 11 years ago by jhpalmieri

  • Status changed from new to needs_review

Here's a patch; I'll mark this ready for review, but it should perhaps be considered a first draft.

comment:4 Changed 11 years ago by jhpalmieri

  • Status changed from needs_review to needs_work

This needs to deal with the non-platform-independent behavior of return codes -- see the discussion at the bottom of #7995.

comment:5 Changed 11 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

Perhaps replacing os.system(s) with check_call(s, shell=True) (after importing check_call from subprocess) will work? Here's draft number two.

Changed 11 years ago by jhpalmieri

scripts repo

comment:6 Changed 11 years ago by jhpalmieri

This seems to work for me on both my mac and on sage.math. It needs to be tested much more extensively, though.

comment:7 Changed 11 years ago by ddrake

  • Status changed from needs_review to needs_work

Your patch seems to work as far as hitting Ctrl-C goes, but if doctesting more than one file, if doctests fail, the entire run fails:

Traceback (most recent call last):
  File "/scratch/sage-4.3.5/local/bin/sage-test", line 177, in <module>
    err = test_file(F)
  File "/scratch/sage-4.3.5/local/bin/sage-test", line 125, in test_file
    err = test(F, 'doctest ' + opts + extra_opts)
  File "/scratch/sage-4.3.5/local/bin/sage-test", line 84, in test
    err = check_call(s, shell=True)
  File "/scratch/sage-4.3.5/local/lib/python/subprocess.py", line 488, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/scratch/sage-4.3.5/local/bin/sage-doctest  "devel/sage/sage/combinat/tableau.py"' returned non-zero exit status 100

I think we want subprocess.call, not check_call; that will allow us to pass the exit code back up if we want to; "call" returns the exit code, but check_call either returns or raises CalledProcessError?. Of course, if you like, we can catch the exception and just use that information.

Also, I think we can avoid the shell=True bit, since our command line is so simple: just do

err = call([os.path.join(SAGE_ROOT, 'local', 'bin', 'sage-%s' % cmd), F])

and, on line 127, change 'doctest ' (note the space) to 'doctest'.

Finally, if using the subprocess module, I think we get the genuine return code and there's no need for any "err = err 256" business.

comment:8 Changed 11 years ago by ddrake

Also, there's a couple loops where we need to use the "err = err | return code" bit, otherwise we will lose the information that a doctest failed. I've updated your patch to do this; I'll upload a new patch shortly.

Changed 11 years ago by ddrake

apply to scripts repo. Replaces previous.

comment:9 follow-up: Changed 11 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

This isn't quite working for me: I get errors if I do "sage -t -long FILE". I'm attaching a patch on top of yours which fixes it for me on my iMac, on sage.math, and on t2 (Solaris).

Changed 11 years ago by jhpalmieri

apply on top of trac_8641.patch

comment:10 in reply to: ↑ 9 Changed 11 years ago by ddrake

Replying to jhpalmieri:

This isn't quite working for me: I get errors if I do "sage -t -long FILE". I'm attaching a patch on top of yours which fixes it for me on my iMac, on sage.math, and on t2 (Solaris).

Ah, nice catch. I didn't test with -long. I took your patch and it works properly on my Ubuntu machines and on bsd.math.

wjp, could you take a look at these patches? I think everything is fine, but another opinion would be nice.

Also, we still should work on sage-ptest; as pointed out at #7995, there's duplicated code there. But at least with this ticket, we can easily use Mercurial's bisect command to track down failing doctests.

comment:11 Changed 11 years ago by wjp

I wonder if something else could be happening. Possibly by coincidence, 2 is also SIGINT. Could it be that on sage.math the sage-doctest script itself is being killed by the Ctrl-C instead of the doctest it is running?

After all, if you trigger one of the other error cases (like regular error failures), they are caught on sage.math properly.

Also, a transcript from sage.math:

sage: os.system("true")
0
sage: os.system("false")
256
sage: os.system("sleep 100")
[press Ctrl-C]
2

So the return values of os.system seem to match the specs.

comment:12 follow-ups: Changed 11 years ago by wjp

P.S. It does sound like using subprocess.call might be more portable; its docs have fewer warnings about undefined return values.

It does throw a KeyboardInterrupt? itself when the called program gets a SIGINT, though, so we may have to catch that to execute the line

failed.append(sage_test_command(F)+" # KeyboardInterrupt")

I'll take a closer look at the patch later today.

comment:13 in reply to: ↑ 12 ; follow-up: Changed 11 years ago by mpatel

Replying to wjp:

I'll take a closer look at the patch later today.

Do you recommend any changes to trac_8641.patch or trac_8641-part2.patch?

By the way, we may need to rebase these for #8891.

comment:14 Changed 11 years ago by mpatel

  • Milestone set to sage-5.0

Changed 11 years ago by mpatel

Combined patch rebased vs 4.4.4.alpha0 + #8891. Replaces all previous.

comment:15 in reply to: ↑ 13 ; follow-ups: Changed 11 years ago by mpatel

  • Authors set to Dan Drake, John Palmieri

Replying to mpatel:

By the way, we may need to rebase these for #8891.

I've attached a combined, rebased patch.

comment:16 in reply to: ↑ 15 Changed 11 years ago by ddrake

Replying to mpatel:

Replying to mpatel:

By the way, we may need to rebase these for #8891.

I've attached a combined, rebased patch.

Hmm, on 4.4.4.alpha0, I get a hunk reject: on line 123 or so, my copy of sage-test says

   elif os.path.isdir(F) and not (F[:1] == '.') \
            and not '#' in F and not os.sep + 'notes' in F:

and your patch says

     elif (os.path.isdir(F) and  not '#' in F and
           not os.sep + 'notes' in F):

...so somebody changed the backslash line continuation to parentheses. I'll manually fix the hunk and test your patch.

comment:17 in reply to: ↑ 15 ; follow-ups: Changed 11 years ago by ddrake

Replying to mpatel:

I've attached a combined, rebased patch.

I fixed the hunk reject and your patch seems to work fine. The exit codes match up with what sage-doctest says (although I didn't check exit code 4, since I don't know how to make the doctesting framework raise an exception). I tried with individual files, directories, with and without "-long", so it seems to work pretty well on my Ubuntu machine.

comment:18 in reply to: ↑ 17 Changed 11 years ago by ddrake

Replying to ddrake:

I fixed the hunk reject and your patch seems to work fine. The exit codes match up with what sage-doctest says (although I didn't check exit code 4, since I don't know how to make the doctesting framework raise an exception). I tried with individual files, directories, with and without "-long", so it seems to work pretty well on my Ubuntu machine.

I tested on bsd.math and the patch works there, too.

comment:19 in reply to: ↑ 17 Changed 11 years ago by wjp

Replying to ddrake:

(although I didn't check exit code 4, since I don't know how to make the doctesting framework raise an exception).

Should you still want to try, the example in ticket #7993 will trigger one of those.

Changed 11 years ago by mpatel

Modify sage-ptest, too. Apply to 4.4.4.alpha0 + #8891. Apply only this patch.

comment:20 Changed 11 years ago by mpatel

I've attached an attempt to get the same behavior for sage -tp. The patch should apply to 4.4.4.alpha0 + #8891, unless I've qfolded improperly.

It seems there's much room for improvement in sage-ptest. For example, are the mutex locks really necessary? I think both skip and populatefilelist run only in the main process. But I suppose we could leave this and other potential oddities for #9224.

comment:21 Changed 11 years ago by ddrake

  • Owner changed from tbd to ddrake

mpatel: on line 337 of sage-ptest and line 176 of sage-test, your patch (attachment:trac_8641-doctest_exit_codes.2.patch) always sets err = 2. Can you change that to

   err = 2 | err

? Then, if doctesting a bunch of files and hit ctrl-c, you can tell the difference between "all the tests that ran passed" and "there was at least one doctest failure". (And other errors that might have happened before you hit ctrl-c.)

Changed 11 years ago by mpatel

Use err = err | 2 for KeyboardInterrupt. Apply only this patch, to 4.4.4.alpha0 + #8891.

comment:22 Changed 11 years ago by mpatel

Done. V3 uses err = 2 | err for KeyboardInterrupts to both scripts.

comment:23 Changed 11 years ago by ddrake

This looks good, and I've used it a bunch and it works properly. But I'm listed as one of the authors, so perhaps I shouldn't flip it to positive review...wjp, can you look this over? I did check exit code 4 and that works properly.

comment:24 Changed 11 years ago by wjp

I'll try to find time to take a proper look friday or this weekend, but at first glance it seems like my comment 12 about SIGINT/KeyboardInterrupt is still valid.

comment:25 in reply to: ↑ 12 Changed 11 years ago by ddrake

Replying to wjp:

[subprocess.call] does throw a KeyboardInterrupt? itself when the called program gets a SIGINT, though, so we may have to catch that to execute the line

failed.append(sage_test_command(F)+" # KeyboardInterrupt")

Hrm, I'm not sure this is true -- at least on my Ubuntu machine. With the most recent version of patch, I changed the bit starting at line 84 of sage-test to this:

  try:
      err = call(s, shell=True)
      print '%s returned code %s' % (s, err)
  except KeyboardInterrupt:
      print '*****sage-text line 87 caught keyboard interrupt'
      raise

I then ran a doctest (combinat/words/finite_word.py) and in another terminal, sent the called process a SIGINT with

kill -2 `ps ax | grep finite_word | grep sage-doctest | grep python | awk '{print $1}'

The result is:

sage -t  "devel/sage/sage/combinat/words/finite_word.py"
KeyboardInterrupt -- interrupted after 3.37038588524 seconds!
/home/drake/s/sage-4.4.4.alpha0-test/local/bin/sage-doctest  "devel/sage/sage/combinat/words/finite_word.py" returned code 2
Aborting further tests.

The "interrupted after 3.370..." is from sage-doctest, line 668, which then exits with return code 2. It seems like subprocess.call in this instance notes that the process finished, and dutifully passes on the return code -- without raising a KeyboardInterrupt. Then, on line 95 the failed.append(sage_test_command(F)+" # KeyboardInterrupt") bit is executed and a KeyboardInterrupt is raised; that causes the "Aborting further tests" to be printed.

If I hit ctrl-c while running the doctest, it gets caught as you would expect:

sage -t  "devel/sage/sage/combinat/words/finite_word.py"
^CKeyboardInterrupt -- interrupted after 2.75523304939 seconds!
*****sage-text line 87 caught keyboard interrupt
Aborting further tests.

What is interesting is, if instead of sending SIGINT to the Python sage-doctest process, I send a SIGINT to the *shell* process created by subprocess.call, it seems to do nothing -- the doctest runs normally, finishes, but the return code is -2; this agrees with the Popen returncode documentation. Then, the entire process exits with return code 254, which is a bit strange, but at least it's indicating that something strange happened.

comment:26 Changed 11 years ago by wjp

Hm, you're right. I may have mixed up process IDs the previous time I tried.

In any case, it might make sense to explicitly catch this KeyboardInterrupt?. But it's not strictly necessary since it'll be caught on a higher level anyway. (It may be something to save for further doctesting cleanup patches.)

I don't like the lines err = err | test_file(F), err = err | ret and err = err | 2, though. They suggest that err is a bitmask or boolean, while it isn't. We could make it a bitmask, of course, so that 0 = all ok, 1 = failed tests, 2 = interrupted, 3 = failed tests and interrupted. Are there any other special cases we would want to consider in that case? Maybe a separate bit for timeouts? (See also #9316)

comment:27 Changed 11 years ago by wjp

  • Status changed from needs_review to positive_review

Mpatel pointed out that #9243 already tackles the powers-of-two thing.

I'm giving this a positive review, provided that #9243 is merged too. (I'll review that ticket too.)

comment:28 Changed 11 years ago by mpatel

  • Reviewers set to Willem Jan Palenstijn

comment:29 Changed 11 years ago by mpatel

  • Authors changed from Dan Drake, John Palmieri to Dan Drake, John Palmieri, Mitesh Patel

To release manager: Apply just trac_8641-doctest_exit_codes.3.patch to 4.5.alpha4.

comment:30 Changed 11 years ago by wjp

  • Milestone changed from sage-5.0 to sage-4.5.1

comment:31 Changed 11 years ago by ddrake

  • Merged in set to sage-4.5.2.alpha1
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.