Opened 3 years ago

Closed 3 years ago

#28334 closed defect (fixed)

py3: polynomial_rational_flint.pyx problem

Reported by: John Palmieri Owned by:
Priority: blocker Milestone: sage-9.0
Component: python3 Keywords: flint, flush, py3
Cc: Samuel Lelièvre, spancratz, wbhart, Frédéric Chapoton Merged in:
Authors: John Palmieri Reviewers: Steven Trogdon
Report Upstream: Fixed upstream, but not in a stable release. Work issues:
Branch: ed66988 (Commits, GitHub, GitLab) Commit: ed66988e948dd61850b03d8bc282e8effad0aa4c
Dependencies: Stopgaps:

Status badges

Description

On some systems, Debian for example, we see this failure with Python 3:

Failed example:
    G = f.galois_group(); G
Expected:
    Transitive group number 5 of degree 4
Got:
    Exception (FLINT memory_manager). Unable to allocate memory.
    Transitive group number 5 of degree 4
**********************************************************************
1 item had failures:
   1 of  16 in sage.rings.polynomial.polynomial_rational_flint.Polynomial_rational_flint.galois_group
    [397 tests, 1 failure, 2.56 s]

Change History (36)

comment:1 Changed 3 years ago by John Palmieri

I installed a Debian virtual machine, and now I can see this error.

This problem does not seem related to the particular doctest: if I remove lots of other doctests from that file but keep this one, the error goes away. If I instead keep the file intact and add some print statements, the error is printed as this line is executed:

return PermutationGroup(H)

PermutationGroup has nothing to do with flint, I think.

Any ideas how to debug this?

comment:2 Changed 3 years ago by John Palmieri

It could be related to the problem discussed at #28106.

comment:3 in reply to:  2 Changed 3 years ago by John Palmieri

Replying to jhpalmieri:

It could be related to the problem discussed at #28106.

On the other hand, I've tried increasing the default memlimit in bin/sage-runtests, with no effect.

comment:4 Changed 3 years ago by John Palmieri

Also, I get the same error in a Sage session with

sage: from sage.doctest.control import DocTestController, DocTestDefaults
sage: DC = DocTestController(DocTestDefaults(), ['/home/john/Sage/sage-8.9.beta5/src/sage/rings/polynomial/polynomial_rational_flint.pyx'])
sage: DC.run()

I don't think this uses sage-runtests and the memory limits there. So maybe it's not the same as #28106.

comment:5 Changed 3 years ago by John Palmieri

I added a few print statements to flint's memory_manager.c. This error is being printed by flint_calloc.

comment:6 Changed 3 years ago by Kwankyu Lee

I don't have a debian machine and know nothing about flint. But John's experiment seems to indicate that the memory error is genuine. I guess somehow the polynomial_rational_flint module of sage on debian machine leaks memory.

comment:7 Changed 3 years ago by Kwankyu Lee

Cc: spancratz added

comment:8 Changed 3 years ago by John Palmieri

I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.

comment:9 in reply to:  8 Changed 3 years ago by Eric Gourgoulhon

Replying to jhpalmieri:

I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.

As reported in https://groups.google.com/d/msg/sage-release/VuAt1Sc46IQ/n9ejPwoiAwAJ, I also get the error on a Xeon E5-2623 processor (with Ubuntu 18.04).

comment:10 Changed 3 years ago by Eric Gourgoulhon

According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...

Last edited 3 years ago by Eric Gourgoulhon (previous) (diff)

comment:11 in reply to:  10 Changed 3 years ago by Eric Gourgoulhon

Replying to egourgoulhon:

According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...

Well, actually it is reported as failed in https://patchbot.sagemath.org/log/0/Ubuntu/18.04/x86_64/4.15.0-54-generic/petitbonum/2019-09-03%2009:06:17?short

Is this the "reference patchbot" for Python 3 Sage? If yes, then the table of #28298 must be updated.

comment:12 Changed 3 years ago by John Palmieri

Okay, I think I know what's going on. This is the same problem as in #28454: strings are being sent to stdout by a C or C++ library but the output is not flushed, or whether it's flushed is system/OS-dependent, so messages like "Exception (FLINT memory_manager). Unable to allocate memory." do not get printed when we expect them, but rather some time later. In this particular case, the message should be printed when this doctest is run:

        FLINT memory errors do not crash Sage (:trac:`17629`)::

            sage: t^(sys.maxsize//2)
            Traceback (most recent call last):
            ...
            RuntimeError: FLINT exception

in which case it's captured by the "...". But on some platforms, with Python 3 only, the message is not printed at the right time. (I don't know why, but this really does seem to be what's going on.)

One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?

comment:13 in reply to:  12 Changed 3 years ago by John Palmieri

Replying to jhpalmieri:

One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?

I should also note that patching FLINT won't help if a user is using the system's version of FLINT. So can we do this just with Sage's FLINT interface?

comment:14 Changed 3 years ago by John Palmieri

Branch: u/jhpalmieri/flint-flush

comment:15 Changed 3 years ago by John Palmieri

Commit: d256999e0711075497d6ab54f8e91a20556a6ce9

Having said that, here is a patch for FLINT. This fixes the problem for me. I won't mark as "needs review" yet, since I would like to know if there is a fix for this on the Sage/Python side first. If that looks possible but complicated, we can use this patch as a stopgap.


New commits:

d256999trac 28334: patch FLINT to flush output when printing memory error message
Last edited 3 years ago by John Palmieri (previous) (diff)

comment:16 Changed 3 years ago by git

Commit: d256999e0711075497d6ab54f8e91a20556a6ce9b44f1cac24196ee496cf0805d237f7850d618f71

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b44f1catrac 28334: patch FLINT to flush output when printing memory error message

comment:17 Changed 3 years ago by git

Commit: b44f1cac24196ee496cf0805d237f7850d618f71ed66988e948dd61850b03d8bc282e8effad0aa4c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ed66988trac 28334: patch FLINT to flush output when printing memory error message

comment:18 Changed 3 years ago by John Palmieri

A stupid workaround, by the way, would be to label the doctest in comment:12 as py2. I would rather not do that.

comment:19 Changed 3 years ago by Steven Trogdon

I see this failure, as well as that at #28454, on a sage-on-gentoo build using py3. Of course both flint and eclib are system installed.

comment:20 Changed 3 years ago by John Palmieri

John Cremona is working on a new eclib release, so that problem should be fixed pretty soon, without any patching needed (but after updating the system eclib). I don't know what to do about FLINT, if patching is not the right choice.

comment:21 Changed 3 years ago by John Palmieri

I think these are the options:

  • label one doctest as py2, thereby sweeping the problem under the rug
  • add a patch to flint, which won't help with system-installed copies of flint
  • convince the Python/flint interface to flush the output

The last one should be the best, but I don't know how to do it. Do we do either of the others as a stopgap, or wait?

comment:22 Changed 3 years ago by Samuel Lelièvre

Cc: Samuel Lelièvre wbhart added
Keywords: flint flush py3 added

cc-ing Bill Hart who might have some insight on this.

comment:23 Changed 3 years ago by John Palmieri

Priority: majorblocker

comment:24 Changed 3 years ago by John Palmieri

Authors: John Palmieri
Status: newneeds_review

There has been no progress on this. I'm going to mark this as "needs review", since the other options (new flint, or get Sage to flush flint's output) can coexist with this. As I've said before, I would welcome other solutions.

comment:25 Changed 3 years ago by Steven Trogdon

Would it help to file an issue or a pull request at https://github.com/wbhart/flint2. There are numerous locations in the trunk branch where code had been added of the type

flint_print("...");
fflush(stdout);

Then again adding a fflush to memory_manager.c may be too trivial a change prior to a major release.

comment:26 Changed 3 years ago by John Palmieri

Report Upstream: N/AReported upstream. No feedback yet.

I created a pull request, but since the last release was four years ago, I don't have high hopes.

comment:27 in reply to:  26 Changed 3 years ago by Samuel Lelièvre

Replying to jhpalmieri:

I created a pull request, but since the last release was four years ago, I don't have high hopes.

The FLINT repo is active, and has a roadmap for FLINT 2.6.0, itself part of the Oscar roadmap, so there are reasons to hope.

comment:28 Changed 3 years ago by John Palmieri

Report Upstream: Reported upstream. No feedback yet.Fixed upstream, but not in a stable release.

The pull request has been merged.

comment:29 Changed 3 years ago by John Palmieri

This is the main remaining Py3 doctest failure. Shall we merge this, and then later hope to find a systematic way to flush output, or upgrade flint when it's ready?

comment:30 Changed 3 years ago by John Palmieri

Cc: Frédéric Chapoton added

comment:31 Changed 3 years ago by Steven Trogdon

I think we should merge it, but I'll wait for comment.

comment:32 Changed 3 years ago by Nils Bruin

+1 to merging this. If you end up inserting "flush" in the interface wrappers, you'll end up flushing a lot when there's nothing to flush. Since flush is a system call, such wasteful use of resources might end up quite noticeable. Better flush at the source, where you know when it's necessary.

comment:33 Changed 3 years ago by Steven Trogdon

Reviewers: Steven Trogdon
Status: needs_reviewpositive_review

I'll push the button. It can always be undone.

comment:34 Changed 3 years ago by Frédéric Chapoton

Milestone: sage-8.9sage-9.0

comment:35 Changed 3 years ago by John Palmieri

See #28649 for a followup. I still think we should merge this, especially since the patch has been accepted upstream. #28649 uses nbruin's suggestion for flushing the output when we have a non-patched FLINT.

Last edited 3 years ago by John Palmieri (previous) (diff)

comment:36 Changed 3 years ago by Volker Braun

Branch: u/jhpalmieri/flint-flushed66988e948dd61850b03d8bc282e8effad0aa4c
Resolution: fixed
Status: positive_reviewclosed
Note: See TracTickets for help on using tickets.