Opened 18 months ago
Closed 15 months ago
#28334 closed defect (fixed)
py3: polynomial_rational_flint.pyx problem
Reported by: | jhpalmieri | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | sage-9.0 |
Component: | python3 | Keywords: | flint, flush, py3 |
Cc: | slelievre, spancratz, wbhart, chapoton | Merged in: | |
Authors: | John Palmieri | Reviewers: | Steven Trogdon |
Report Upstream: | Fixed upstream, but not in a stable release. | Work issues: | |
Branch: | ed66988 (Commits) | Commit: | ed66988e948dd61850b03d8bc282e8effad0aa4c |
Dependencies: | Stopgaps: |
Description
On some systems, Debian for example, we see this failure with Python 3:
Failed example: G = f.galois_group(); G Expected: Transitive group number 5 of degree 4 Got: Exception (FLINT memory_manager). Unable to allocate memory. Transitive group number 5 of degree 4 ********************************************************************** 1 item had failures: 1 of 16 in sage.rings.polynomial.polynomial_rational_flint.Polynomial_rational_flint.galois_group [397 tests, 1 failure, 2.56 s]
Change History (36)
comment:1 Changed 18 months ago by
comment:2 follow-up: ↓ 3 Changed 18 months ago by
It could be related to the problem discussed at #28106.
comment:3 in reply to: ↑ 2 Changed 18 months ago by
Replying to jhpalmieri:
It could be related to the problem discussed at #28106.
On the other hand, I've tried increasing the default memlimit
in bin/sage-runtests
, with no effect.
comment:4 Changed 18 months ago by
Also, I get the same error in a Sage session with
sage: from sage.doctest.control import DocTestController, DocTestDefaults sage: DC = DocTestController(DocTestDefaults(), ['/home/john/Sage/sage-8.9.beta5/src/sage/rings/polynomial/polynomial_rational_flint.pyx']) sage: DC.run()
I don't think this uses sage-runtests
and the memory limits there. So maybe it's not the same as #28106.
comment:5 Changed 18 months ago by
I added a few print statements to flint's memory_manager.c
. This error is being printed by flint_calloc
.
comment:6 Changed 18 months ago by
I don't have a debian machine and know nothing about flint. But John's experiment seems to indicate that the memory error is genuine. I guess somehow the polynomial_rational_flint module of sage on debian machine leaks memory.
comment:7 Changed 18 months ago by
- Cc spancratz added
comment:8 follow-up: ↓ 9 Changed 18 months ago by
I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.
comment:9 in reply to: ↑ 8 Changed 17 months ago by
Replying to jhpalmieri:
I'm seeing this on an Ubuntu virtual machine (running on the same OS X box). I wonder if it's the processor as much as which type of linux is running. This machine has a Core i7 processor.
As reported in https://groups.google.com/d/msg/sage-release/VuAt1Sc46IQ/n9ejPwoiAwAJ, I also get the error on a Xeon E5-2623 processor (with Ubuntu 18.04).
comment:10 follow-up: ↓ 11 Changed 17 months ago by
According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...
comment:11 in reply to: ↑ 10 Changed 17 months ago by
Replying to egourgoulhon:
According to the entry (G) in the summary table of #28298, the doctest discussed here is passed on the reference patchbot, which is a Ubuntu machine though...
Well, actually it is reported as failed in https://patchbot.sagemath.org/log/0/Ubuntu/18.04/x86_64/4.15.0-54-generic/petitbonum/2019-09-03%2009:06:17?short
Is this the "reference patchbot" for Python 3 Sage? If yes, then the table of #28298 must be updated.
comment:12 follow-up: ↓ 13 Changed 17 months ago by
Okay, I think I know what's going on. This is the same problem as in #28454: strings are being sent to stdout by a C or C++ library but the output is not flushed, or whether it's flushed is system/OS-dependent, so messages like "Exception (FLINT memory_manager). Unable to allocate memory." do not get printed when we expect them, but rather some time later. In this particular case, the message should be printed when this doctest is run:
FLINT memory errors do not crash Sage (:trac:`17629`):: sage: t^(sys.maxsize//2) Traceback (most recent call last): ... RuntimeError: FLINT exception
in which case it's captured by the "...". But on some platforms, with Python 3 only, the message is not printed at the right time. (I don't know why, but this really does seem to be what's going on.)
One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?
comment:13 in reply to: ↑ 12 Changed 17 months ago by
Replying to jhpalmieri:
One solution would be to modify FLINT to flush the output. Is there a pure Python 3 solution, which we can incorporate in the Sage library without patching or modifying FLINT?
I should also note that patching FLINT won't help if a user is using the system's version of FLINT. So can we do this just with Sage's FLINT interface?
comment:14 Changed 17 months ago by
- Branch set to u/jhpalmieri/flint-flush
comment:15 Changed 17 months ago by
- Commit set to d256999e0711075497d6ab54f8e91a20556a6ce9
Having said that, here is a patch for FLINT. This fixes the problem for me. I won't mark as "needs review" yet, since I would like to know if there is a fix for this on the Sage/Python side first. If that looks possible but complicated, we can use this patch as a stopgap.
New commits:
d256999 | trac 28334: patch FLINT to flush output when printing memory error message
|
comment:16 Changed 17 months ago by
- Commit changed from d256999e0711075497d6ab54f8e91a20556a6ce9 to b44f1cac24196ee496cf0805d237f7850d618f71
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
b44f1ca | trac 28334: patch FLINT to flush output when printing memory error message
|
comment:17 Changed 17 months ago by
- Commit changed from b44f1cac24196ee496cf0805d237f7850d618f71 to ed66988e948dd61850b03d8bc282e8effad0aa4c
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
ed66988 | trac 28334: patch FLINT to flush output when printing memory error message
|
comment:18 Changed 17 months ago by
A stupid workaround, by the way, would be to label the doctest in comment:12 as py2
. I would rather not do that.
comment:19 Changed 17 months ago by
I see this failure, as well as that at #28454, on a sage-on-gentoo build using py3. Of course both flint
and eclib
are system installed.
comment:20 Changed 17 months ago by
John Cremona is working on a new eclib
release, so that problem should be fixed pretty soon, without any patching needed (but after updating the system eclib
). I don't know what to do about FLINT, if patching is not the right choice.
comment:21 Changed 16 months ago by
I think these are the options:
- label one doctest as
py2
, thereby sweeping the problem under the rug - add a patch to flint, which won't help with system-installed copies of flint
- convince the Python/flint interface to flush the output
The last one should be the best, but I don't know how to do it. Do we do either of the others as a stopgap, or wait?
comment:22 Changed 16 months ago by
- Cc slelievre wbhart added
- Keywords flint flush py3 added
cc-ing Bill Hart who might have some insight on this.
comment:23 Changed 16 months ago by
- Priority changed from major to blocker
comment:24 Changed 16 months ago by
- Status changed from new to needs_review
There has been no progress on this. I'm going to mark this as "needs review", since the other options (new flint, or get Sage to flush flint's output) can coexist with this. As I've said before, I would welcome other solutions.
comment:25 Changed 16 months ago by
Would it help to file an issue or a pull request at https://github.com/wbhart/flint2. There are numerous locations in the trunk
branch where code had been added of the type
flint_print("..."); fflush(stdout);
Then again adding a fflush
to memory_manager.c
may be too trivial a change prior to a major release.
comment:26 follow-up: ↓ 27 Changed 16 months ago by
- Report Upstream changed from N/A to Reported upstream. No feedback yet.
I created a pull request, but since the last release was four years ago, I don't have high hopes.
comment:27 in reply to: ↑ 26 Changed 16 months ago by
Replying to jhpalmieri:
I created a pull request, but since the last release was four years ago, I don't have high hopes.
The FLINT repo is active, and has a roadmap for FLINT 2.6.0, itself part of the Oscar roadmap, so there are reasons to hope.
comment:28 Changed 16 months ago by
- Report Upstream changed from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.
The pull request has been merged.
comment:29 Changed 15 months ago by
This is the main remaining Py3 doctest failure. Shall we merge this, and then later hope to find a systematic way to flush output, or upgrade flint when it's ready?
comment:30 Changed 15 months ago by
- Cc chapoton added
comment:31 Changed 15 months ago by
I think we should merge it, but I'll wait for comment.
comment:32 Changed 15 months ago by
+1 to merging this. If you end up inserting "flush" in the interface wrappers, you'll end up flushing a lot when there's nothing to flush. Since flush is a system call, such wasteful use of resources might end up quite noticeable. Better flush at the source, where you know when it's necessary.
comment:33 Changed 15 months ago by
- Reviewers set to Steven Trogdon
- Status changed from needs_review to positive_review
I'll push the button. It can always be undone.
comment:34 Changed 15 months ago by
- Milestone changed from sage-8.9 to sage-9.0
comment:35 Changed 15 months ago by
See #28649 for a followup. I think we should merge this, especially since the patch has been accepted upstream. #28649 uses nbruin's suggestion for flushing the output when we have a non-patched FLINT.
comment:36 Changed 15 months ago by
- Branch changed from u/jhpalmieri/flint-flush to ed66988e948dd61850b03d8bc282e8effad0aa4c
- Resolution set to fixed
- Status changed from positive_review to closed
I installed a Debian virtual machine, and now I can see this error.
This problem does not seem related to the particular doctest: if I remove lots of other doctests from that file but keep this one, the error goes away. If I instead keep the file intact and add some print statements, the error is printed as this line is executed:
PermutationGroup
has nothing to do with flint, I think.Any ideas how to debug this?