Opened 17 months ago

Closed 17 months ago

Last modified 17 months ago

#28454 closed defect (duplicate)

py3 failure in src/sage/libs/eclib/interface.py

Reported by: jhpalmieri Owned by:
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: python3 Keywords:
Cc: cremona Merged in:
Authors: John Cremona, John Palmieri Reviewers: John Cremona
Report Upstream: None of the above - read trac for reasoning. Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

On some machines, there is a Python 3 doctest failure at line 597 in the file src/sage/libs/eclib/interface.py:

sage -t --warn-long 39.6 src/sage/libs/eclib/interface.py
**********************************************************************
File "src/sage/libs/eclib/interface.py", line 597, in sage.libs.eclib.interface.mwrank_EllipticCurve.saturate
Failed example:
    E.saturation([Q1,Q2])
Expected:
    ([(1 : -27 : 1), (157 : 1950 : 1)], 3, 0.801588644684981)
Got:
    Attempt to round -0.2617840677e25 to a long int fails, aborting!
    ([(1 : -27 : 1), (157 : 1950 : 1)], 3, 0.801588644684981)
**********************************************************************
1 item had failures:
   1 of   9 in sage.libs.eclib.interface.mwrank_EllipticCurve.saturate
    [192 tests, 1 failure, 3.62 s]
----------------------------------------------------------------------
sage -t --warn-long 39.6 src/sage/libs/eclib/interface.py  # 1 doctest failed

Attachments (1)

eclib-flush.patch (559 bytes) - added by jhpalmieri 17 months ago.

Download all attachments as: .zip

Change History (32)

comment:1 Changed 17 months ago by jhpalmieri

I was trying to track this down. The message "Attempt to round ..." is from the eclib source code, so I put in some print statements, after which the "Attempt to round ..." message appeared after a different doctest. So then I thought I should flush the output after the "Attempt to round ..." message, to try to clarify where it was appearing. For reasons I don't understand, flushing the output caused the message to disappear entirely. Is this a valid fix for the problem?

In a bit more detail: I added the attached file to build/pkgs/eclib/patches (after creating that directory). Then I rebuilt eclib, after which this file passed its doctests.

Changed 17 months ago by jhpalmieri

comment:2 Changed 17 months ago by jhpalmieri

  • Cc cremona added

comment:3 Changed 17 months ago by cremona

It seems weird to me the switching python versions could affect what happens in a C++ library, which is what eclib is. The warning from eclib just means that an integer has been computed as a floating point which it then tries to fit into a long int (not a multiprecision int) and fails since it has too many digits.

I will raise an Issue in eclib for this, as there may be some underlying problem being revealed.

Note that the doctest in question was added in 2015 in https://trac.sagemath.org/ticket/18031 after an upstream fix.

Last edited 17 months ago by cremona (previous) (diff)

comment:4 follow-up: Changed 17 months ago by cremona

The "Attempt to round..." warning in eclib comes from a function which as far as I can tell should not be called by E.saturate().

Could someone who saw this message and has a python3 build try one thing: in line 2604 of src/schemes/elliptic_curves/ell_rational_field change

prec = 100

to

prec = 200

and test again? This is only the starting precision (in bits) and the code increases the precision by 50 in a loop until eclib reports that saturation has been completed. Not long ago eclib changed from using decimal precision to bit precision, and at that time I reviewed all the places in sage where eclib's precision was being set, but may not have got this right.

If that fails I'll dig deeper.

comment:5 Changed 17 months ago by cremona

  • Report Upstream changed from N/A to None of the above - read trac for reasoning.

In the "Report upstream" box there should be an option "developers do not know whether it is a bug or not"! Right now I neither acknowledge that there is a bug in eclib nor deny it. Anyway, I think that the author of a library has the right to issue warnings, and it is surely a Sage problem if a warning in a dependent library is triggered under python2 but not under python3?

comment:6 in reply to: ↑ 4 Changed 17 months ago by jhpalmieri

Replying to cremona:

The "Attempt to round..." warning in eclib comes from a function which as far as I can tell should not be called by E.saturate().

As I said in comment:1, I first added some print statements to mrank1.cc to figure out which call of ROUNDUP or ROUNDDOWN was causing the message, and this made the message appear after a different doctest. So then I tried flushing the output, as in the patch, and this made the error disappear altogether. So I have no idea what's going on.

Could someone who saw this message and has a python3 build try one thing: in line 2604 of src/schemes/elliptic_curves/ell_rational_field change

prec = 100

to

prec = 200

and test again?

No change.

I hope others look at this, too. I am running Debian using VirtualBox on my Mac, which is not an ideal environment. I have taken it for granted that this a Python3-only bug (as reported at #28298), but I don't have a Python 2 build on this machine to test. I'm trying to build one now, but I may run out of disk space.

Last edited 17 months ago by jhpalmieri (previous) (diff)

comment:7 Changed 17 months ago by cremona

Thanks, that is helpful -- my guess is that the warning message is being created in an earlier doctest but only appearing in this one. That code in mrank1.cc is only run when doing 2-descent on elliptic curves with no 2-torsion. So the next step would be to manually run some subset of the doctests in that file interface.py.

I'll make myself a new python3 build and see if I can get to the bottom of this.

comment:8 Changed 17 months ago by jhpalmieri

Note that it is also platform-specific: I don't see it on OS X, for example. It is conceivable that it depends on a combination of the OS and the processor, or it could just be the OS. It makes it (along with a few other Python 3 failures) annoying to debug and fix.

comment:9 Changed 17 months ago by jhpalmieri

And of course, I am happy to test things on my virtual machine. Let me know how I can help.

comment:10 Changed 17 months ago by jhpalmieri

I've done more investigating. I think the message actually comes from this doctest:

    sage: EllipticCurve([0, prod(prime_range(100))]).mwrank_curve().two_descent()
    Traceback (most recent call last):
    ...
    RuntimeError: Aborted

If I create a new file with just

r"""
EXAMPLES::

    sage: EllipticCurve([0, prod(prime_range(100))]).mwrank_curve().two_descent()
    Traceback (most recent call last):
    ...
    RuntimeError: Aborted

    sage: E = EllipticCurve([0,-1,1,-266,968])
    sage: Q1 = E([-1995,3674,125])
    sage: Q2 = E([157,1950,1])
    sage: E.saturation([Q1,Q2])
    ([(1 : -27 : 1), (157 : 1950 : 1)], 3, 0.801588644684981)
"""

then I get the doctest failure. If I run the command

EllipticCurve([0, prod(prime_range(100))]).mwrank_curve().two_descent()

directly in Sage, I get the message "Attempt to round ..." as part of the traceback (and I see this on my Mac, not just linux, and I see it with Python 2 and 3). So I think that when I flush the output, this message is absorbed into the "..." part of the doctest output, as it should be. When it's not flushed, it appears somewhere else and causes the failure. So maybe this is a valid fix.

I still don't know why this only happens with some operating systems, and why only with Python 3, and why the "Attempt to round ..." message appears in this particular place, but it now makes a bit more sense. Should we flush the output in the ROUNDDOWN function as well?

comment:11 follow-up: Changed 17 months ago by cremona

Thanks, that is extremely helpful. mwrank bails out almost immediately if the input curve has very large coefficients and no 2-torsion, since when certain variables which are loop bounds are too large to fit into long ints (64-bit) the loop will not run in reasonable time, so it does not make sense for the code to use multiprecision ints at that point. But it should catch this and issue a better warning/error, specifically saying that the curve's coefficients are too big. I will look into that. When this example is run with mwrank from the command line it core dumps after this message, certainly not good.

However that is just half the story: it should be when this happens from within Sage, it raises a proper python error, and this output appears immediately. Right now the error message goes into a buffer and it is very system-dependent when the message appears.

If you can see how to flush the output without changing eclib source code that would be the fastest fix, while I try to do something better for the longer term.

comment:12 in reply to: ↑ 11 Changed 17 months ago by jhpalmieri

Replying to cremona:

If you can see how to flush the output without changing eclib source code that would be the fastest fix, while I try to do something better for the longer term.

I don't know how to do this. I mean, I can find sites like https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/ or https://stackoverflow.com/questions/5081657/how-do-i-prevent-a-c-shared-library-to-print-on-stdout-in-python or others, but they don't quite address the issue, so they would require modification. In addition, they look complicated and I don't understand them. Instead, as a stopgap, I would suggest a patch in build/pkgs/eclib/patches like the one I've posted here. It's not going to help with a version of eclib installed system-wide (#28333), but it's simple and will work for Sage-installed versions of eclib.

Do you have some objection to flushing the output in those functions?

comment:13 follow-up: Changed 17 months ago by cremona

No objection. I know that it is a *bad thing* for library functions to call abort() and ideally I would recode the places where that happens, but there are hundreds of such places since I seem to use it whenever something invalid is tried such as adding two vectors of different lengths, as well as when memory allocation fails or (as here) when an attempt is made to put a too large integer into a long int. As a stopgap I will make sure that when some warning is output just before abort() then the output is flushed.

This will need a new eclib version, so patching ROUNDUP and ROUNDOWN in mwrank1.cc as suggested would be a reasonable stopgap.

comment:14 in reply to: ↑ 13 Changed 17 months ago by cremona

Replying to cremona:

No objection. I know that it is a *bad thing* for library functions to call abort() and ideally I would recode the places where that happens, but there are hundreds of such places since I seem to use it whenever something invalid is tried such as adding two vectors of different lengths, as well as when memory allocation fails or (as here) when an attempt is made to put a too large integer into a long int. As a stopgap I will make sure that when some warning is output just before abort() then the output is flushed.

This will need a new eclib version, so patching ROUNDUP and ROUNDOWN in mwrank1.cc as suggested would be a reasonable stopgap.

There will soon be a new version of eclib (tarball is at http://homepages.warwick.ac.uk/staff/J.E.Cremona/ftp/eclib-20190909.tar.bz2) which removes all calls to abort() and does something reasonable wherever that used to happen, including flushing output. I have not made this new version into an official release yet. The behaviour of mwrank on being given a curve with huge coefficients is

$ ./progs/mwrank -v 0 
Program mwrank: uses 2-descent (via 2-isogeny if possible) to
determine the rank of an elliptic curve E over Q, and list a
set of points which generate E(Q) modulo 2E(Q).
and finally saturate to obtain generating points on the curve.
For more details see the mwrank documentation.
For details of algorithms see the author's book.

Please acknowledge use of this program in published work, 
and send problems to john.cremona@gmail.com.

Version compiled on Sep  9 2019 at 15:05:13 by GCC 5.4.0 20160609
using NTL bigints and NTL real and complex multiprecision floating point
Enter curve: 0 0 0 0 429347239472947294723947203947209422347204720472021
Curve [0,0,0,0,429347239472947294723947203947209422347204720472021] :   Attempt to convert -0.85368380442727e34 to long fails!
2-descent: lower bound -0.85368380442727e34 on c too large
Attempt to convert -0.13658940870836e36 to long fails!
2-descent: lower bound -0.13658940870836e36 on c too large
Failed to compute rank

resulting in something (a success flag) which Sage could test (as it already does in some circumstances). It will need a little more work before Sage can install this version and pass tests, but it could be tested as is.

Last edited 17 months ago by cremona (previous) (diff)

comment:15 Changed 17 months ago by cremona

The tarball in comment 14 is looking good (to me). After installing it in a python2 Sage I only needed to adjust one doctest in this file interface.py: line 359 now says

RuntimeError: A 2-descent did not complete successfully.

I have not yet done any python3 testing.

comment:16 follow-up: Changed 17 months ago by jhpalmieri

It looks good to me, too, at least as far as the doctests in libs/eclib/ are concerned: I just see the one problem you mentioned. This is with Python 3.

I think I will still prepare a patch for the old version, since it's easy to do, and if we end up not needing it and going straight to your new version, that's fine with me.

Last edited 17 months ago by jhpalmieri (previous) (diff)

comment:17 in reply to: ↑ 16 Changed 17 months ago by cremona

Replying to jhpalmieri:

It looks good to me, too, at least as far as the doctests in libs/eclib/ are concerned: I just see the one problem you mentioned. This is with Python 3.

Good. I had a successful "make ptestlong" too (python2).

I think I will still prepare a patch for the old version, since it's easy to do, and if we end up not needing it and going straight to your new version, that's fine with me.

Thanks.

comment:18 Changed 17 months ago by jhpalmieri

  • Branch set to u/jhpalmieri/eclib-flush

comment:19 Changed 17 months ago by jhpalmieri

  • Commit set to ad86cbe9689b2488f481a9295f5c25e5a3e69d7a

Okay, done. I think I need you to decide whether to use the patch or wait for a new version, since I don't know your timeline. If a new release won't be ready within a week or so, I would suggest using the patch. I'd certainly rather do that than have you feel under any time pressure to produce a new release. (Full Python 3 support, with all tests passing, is very close, and I want to make progress quickly when possible.)


New commits:

ad86cbetrac 28454: patch eclib to flush output after printing some messages.

comment:20 Changed 17 months ago by jhpalmieri

The other good news is that because of this ticket, I realized that the same non-flushing problem causes another Python 3 doctest failure (#28334), which I had no idea how to diagnose before.

Last edited 17 months ago by jhpalmieri (previous) (diff)

comment:21 Changed 17 months ago by cremona

  • Authors set to John Cremona, John Palmieri
  • Branch changed from u/jhpalmieri/eclib-flush to u/cremona/28454
  • Commit changed from ad86cbe9689b2488f481a9295f5c25e5a3e69d7a to 6ab659a47a1318932020dc2edec2dbb7eefc9295
  • Reviewers set to John Cremona

I hope I have not broken any etiquette by replacing your patch with my own, which just patches the file sage.libs.eclib.wrap.cpp to ensure that both cout and cerr are flushed after every eclib library call which might bail out. (I have a good idea of which these are having gone through over 200 places where eclib library functions call abort() yesterday).

The behaviour originally reported never happened for me anyway (on ubuntu) so I cannot really test, but "make ptestlong" had failures in 28 files (around 80 doctests in all) none of which are related to this as far as I can see.


New commits:

6ab659a#28454: systematic flushing of cout and cerr after calling eclib functions

comment:22 Changed 17 months ago by cremona

  • Status changed from new to needs_review

Ticket #28472 has an eclib upgrade. No reason why the patch here cannot be merged first.

comment:23 Changed 17 months ago by jhpalmieri

I get a doctest failure with Python 3 on debian: similar, but the output is flushed almost (but not quite) immediately, so the message appears in the doctest after the one where it should:

File "src/sage/libs/eclib/interface.py", line 368, in sage.libs.eclib.interface.mwrank_EllipticCurve.?
Failed example:
    E.two_descent(verbose=False)
Expected:
    True
Got:
    Attempt to round -0.2617840677e25 to a long int fails, aborting!
    True
**********************************************************************
1 item had failures:
   1 of   6 in sage.libs.eclib.interface.mwrank_EllipticCurve.?
    [192 tests, 1 failure, 7.16 s]
----------------------------------------------------------------------
sage -t --warn-long 206.6 src/sage/libs/eclib/interface.py  # 1 doctest failed
----------------------------------------------------------------------

comment:24 Changed 17 months ago by jhpalmieri

(No breach of etiquette at all, by the way. I just want to get the problem fixed.)

comment:25 follow-up: Changed 17 months ago by cremona

Strange. I'll double check that I put in all the flushes, and perhaps insert some more before as well as after each library call.

comment:26 in reply to: ↑ 25 Changed 17 months ago by cremona

Replying to cremona:

Strange. I'll double check that I put in all the flushes, and perhaps insert some more before as well as after each library call.

I think that accepting #28472 now is the best way to go.

comment:27 Changed 17 months ago by jhpalmieri

  • Milestone changed from sage-8.9 to sage-duplicate/invalid/wontfix
  • Status changed from needs_review to positive_review

I agree. I'm marking this as a duplicate.

comment:28 Changed 17 months ago by jhpalmieri

  • Branch u/cremona/28454 deleted
  • Commit 6ab659a47a1318932020dc2edec2dbb7eefc9295 deleted

(deleting the branch so it doesn't get merged by accident)

comment:29 Changed 17 months ago by chapoton

  • Resolution set to duplicate
  • Status changed from positive_review to closed

comment:30 Changed 17 months ago by embray

I know this is closed, but can you reproduce the issue if you run the tests with --memlimit=0?

comment:31 Changed 17 months ago by jhpalmieri

./sage -t --memlimit=0 src/sage/lib/eclib/interface.py fails the same way.

Note: See TracTickets for help on using tickets.