Opened 3 months ago

Closed 4 weeks ago

#28334 closed defect (fixed)

py3: polynomial_rational_flint.pyx problem

Reported by: jhpalmieri Owned by:
Priority: blocker Milestone: sage-9.0
Component: python3 Keywords: flint, flush, py3
Cc: slelievre, spancratz, wbhart, chapoton Merged in:
Authors: John Palmieri Reviewers: Steven Trogdon
Report Upstream: Fixed upstream, but not in a stable release. Work issues:
Branch: ed66988 (Commits) Commit: ed66988e948dd61850b03d8bc282e8effad0aa4c
Dependencies: Stopgaps:

Description

On some systems, Debian for example, we see this failure with Python 3:

Failed example:
    G = f.galois_group(); G
Expected:
    Transitive group number 5 of degree 4
Got:
    Exception (FLINT memory_manager). Unable to allocate memory.
    Transitive group number 5 of degree 4
**********************************************************************
1 item had failures:
   1 of  16 in sage.rings.polynomial.polynomial_rational_flint.Polynomial_rational_flint.galois_group
    [397 tests, 1 failure, 2.56 s]

Change History (36)

comment:1 Changed 3 months ago by jhpalmieri

I installed a Debian virtual machine, and now I can see this error.

This problem does not seem related to the particular doctest: if I remove lots of other doctests from that file but keep this one, the error goes away. If I instead keep the file intact and add some print statements, the error is printed as this line is executed:

return PermutationGroup(H)

PermutationGroup has nothing to do with flint, I think.

Any ideas how to debug this?

comment:2 follow-up: Changed 3 months ago by jhpalmieri

It could be related to the problem discussed at #28106.

comment:3 in reply to: ↑ 2 Changed 3 months ago by jhpalmieri

Replying to jhpalmieri:

It could be related to the problem discussed at #28106.

On the other hand, I've tried increasing the default memlimit in bin/sage-runtests, with no effect.

comment:4 Changed 3 months ago by jhpalmieri

Also, I get the same error in a Sage session with

sage: from sage.doctest.control import DocTestController, DocTestDefaults
sage: DC = DocTestController(DocTestDefaults(), ['/home/john/Sage/sage-8.9.beta5/src/sage/rings/polynomial/polynomial_rational_flint.pyx'])
sage: DC.run()

I don't think this uses sage-runtests and the memory limits there. So maybe it's not the same as #28106.

comment:5 Changed 3 months ago by jhpalmieri

I added a few print statements to flint's memory_manager.c. This error is being printed by flint_calloc.

comment:6 Changed 3 months ago by klee

I don't have a debian machine and know nothing about flint. But John's experiment seems to indicate that the memory error is genuine. I guess somehow the polynomial_rational_flint module of sage on debian machine leaks memory.

comment:7 Changed 3 months ago by klee

  • Cc spancratz added

comment:8 follow-up: Changed 3 months ago by jhpalmieri

I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.

comment:9 in reply to: ↑ 8 Changed 3 months ago by egourgoulhon

Replying to jhpalmieri:

I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.

As reported in https://groups.google.com/d/msg/sage-release/VuAt1Sc46IQ/n9ejPwoiAwAJ, I also get the error on a Xeon E5-2623 processor (with Ubuntu 18.04).

comment:10 follow-up: Changed 3 months ago by egourgoulhon

According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...

Last edited 3 months ago by egourgoulhon (previous) (diff)

comment:11 in reply to: ↑ 10 Changed 3 months ago by egourgoulhon

Replying to egourgoulhon:

According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...

Well, actually it is reported as failed in https://patchbot.sagemath.org/log/0/Ubuntu/18.04/x86_64/4.15.0-54-generic/petitbonum/2019-09-03%2009:06:17?short

Is this the "reference patchbot" for Python 3 Sage? If yes, then the table of #28298 must be updated.

comment:12 follow-up: Changed 2 months ago by jhpalmieri

Okay, I think I know what's going on. This is the same problem as in #28454: strings are being sent to stdout by a C or C++ library but the output is not flushed, or whether it's flushed is system/OS-dependent, so messages like "Exception (FLINT memory_manager). Unable to allocate memory." do not get printed when we expect them, but rather some time later. In this particular case, the message should be printed when this doctest is run:

        FLINT memory errors do not crash Sage (:trac:`17629`)::

            sage: t^(sys.maxsize//2)
            Traceback (most recent call last):
            ...
            RuntimeError: FLINT exception

in which case it's captured by the "...". But on some platforms, with Python 3 only, the message is not printed at the right time. (I don't know why, but this really does seem to be what's going on.)

One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?

comment:13 in reply to: ↑ 12 Changed 2 months ago by jhpalmieri

Replying to jhpalmieri:

One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?

I should also note that patching FLINT won't help if a user is using the system's version of FLINT. So can we do this just with Sage's FLINT interface?

comment:14 Changed 2 months ago by jhpalmieri

  • Branch set to u/jhpalmieri/flint-flush

comment:15 Changed 2 months ago by jhpalmieri

  • Commit set to d256999e0711075497d6ab54f8e91a20556a6ce9

Having said that, here is a patch for FLINT. This fixes the problem for me. I won't mark as "needs review" yet, since I would like to know if there is a fix for this on the Sage/Python side first. If that looks possible but complicated, we can use this patch as a stopgap.


New commits:

d256999trac 28334: patch FLINT to flush output when printing memory error message
Last edited 2 months ago by jhpalmieri (previous) (diff)

comment:16 Changed 2 months ago by git

  • Commit changed from d256999e0711075497d6ab54f8e91a20556a6ce9 to b44f1cac24196ee496cf0805d237f7850d618f71

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b44f1catrac 28334: patch FLINT to flush output when printing memory error message

comment:17 Changed 2 months ago by git

  • Commit changed from b44f1cac24196ee496cf0805d237f7850d618f71 to ed66988e948dd61850b03d8bc282e8effad0aa4c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ed66988trac 28334: patch FLINT to flush output when printing memory error message

comment:18 Changed 2 months ago by jhpalmieri

A stupid workaround, by the way, would be to label the doctest in comment:12 as py2. I would rather not do that.

comment:19 Changed 2 months ago by strogdon

I see this failure, as well as that at #28454, on a sage-on-gentoo build using py3. Of course both flint and eclib are system installed.

comment:20 Changed 2 months ago by jhpalmieri

John Cremona is working on a new eclib release, so that problem should be fixed pretty soon, without any patching needed (but after updating the system eclib). I don't know what to do about FLINT, if patching is not the right choice.

comment:21 Changed 2 months ago by jhpalmieri

I think these are the options:

  • label one doctest as py2, thereby sweeping the problem under the rug
  • add a patch to flint, which won't help with system-installed copies of flint
  • convince the Python/flint interface to flush the output

The last one should be the best, but I don't know how to do it. Do we do either of the others as a stopgap, or wait?

comment:22 Changed 2 months ago by slelievre

  • Cc slelievre wbhart added
  • Keywords flint flush py3 added

cc-ing Bill Hart who might have some insight on this.

comment:23 Changed 7 weeks ago by jhpalmieri

  • Priority changed from major to blocker

comment:24 Changed 6 weeks ago by jhpalmieri

  • Authors set to John Palmieri
  • Status changed from new to needs_review

There has been no progress on this. I'm going to mark this as "needs review", since the other options (new flint, or get Sage to flush flint's output) can coexist with this. As I've said before, I would welcome other solutions.

comment:25 Changed 6 weeks ago by strogdon

Would it help to file an issue or a pull request at https://github.com/wbhart/flint2. There are numerous locations in the trunk branch where code had been added of the type

flint_print("...");
fflush(stdout);

Then again adding a fflush to memory_manager.c may be too trivial a change prior to a major release.

comment:26 follow-up: Changed 6 weeks ago by jhpalmieri

  • Report Upstream changed from N/A to Reported upstream. No feedback yet.

I created a pull request, but since the last release was four years ago, I don't have high hopes.

comment:27 in reply to: ↑ 26 Changed 6 weeks ago by slelievre

Replying to jhpalmieri:

I created a pull request, but since the last release was four years ago, I don't have high hopes.

The FLINT repo is active, and has a roadmap for FLINT 2.6.0, itself part of the Oscar roadmap, so there are reasons to hope.

comment:28 Changed 6 weeks ago by jhpalmieri

  • Report Upstream changed from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.

The pull request has been merged.

comment:29 Changed 4 weeks ago by jhpalmieri

This is the main remaining Py3 doctest failure. Shall we merge this, and then later hope to find a systematic way to flush output, or upgrade flint when it's ready?

comment:30 Changed 4 weeks ago by jhpalmieri

  • Cc chapoton added

comment:31 Changed 4 weeks ago by strogdon

I think we should merge it, but I'll wait for comment.

comment:32 Changed 4 weeks ago by nbruin

+1 to merging this. If you end up inserting "flush" in the interface wrappers, you'll end up flushing a lot when there's nothing to flush. Since flush is a system call, such wasteful use of resources might end up quite noticeable. Better flush at the source, where you know when it's necessary.

comment:33 Changed 4 weeks ago by strogdon

  • Reviewers set to Steven Trogdon
  • Status changed from needs_review to positive_review

I'll push the button. It can always be undone.

comment:34 Changed 4 weeks ago by chapoton

  • Milestone changed from sage-8.9 to sage-9.0

comment:35 Changed 4 weeks ago by jhpalmieri

See #28649 for a followup. I still think we should merge this, especially since the patch has been accepted upstream. #28649 uses nbruin's suggestion for flushing the output when we have a non-patched FLINT.

Last edited 4 weeks ago by jhpalmieri (previous) (diff)

comment:36 Changed 4 weeks ago by vbraun

  • Branch changed from u/jhpalmieri/flint-flush to ed66988e948dd61850b03d8bc282e8effad0aa4c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.