#12173 closed enhancement (fixed)
Update FLINT to 2.3
Reported by: | mhansen | Owned by: | tbd |
---|---|---|---|
Priority: | major | Milestone: | sage-5.10 |
Component: | packages: standard | Keywords: | flint spkg |
Cc: | jpflori, spancratz, was, leif | Merged in: | sage-5.10.beta3 |
Authors: | Mike Hansen, Fredrik Johansson, Jean-Pierre Flori | Reviewers: | John Cremona, Jeroen Demeyer, Burcin Erocal |
Report Upstream: | Reported upstream. Developers acknowledge bug. | Work issues: | |
Branch: | Commit: | ||
Dependencies: | sage-5.9.beta5 | Stopgaps: |
Description (last modified by )
Install:
- the FLINT spkg at http://boxen.math.washington.edu/home/jdemeyer/spkg/flint-2.3.p1.spkg (spkg diffs: flint-2.3.diff and flint-2.3.p1.diff)
Apply:
- trac_12173-script-v8.patch, this is the result of running 12173_script-v2.sh;
- trac_12173-fixes-v7.patch, non-easily scriptable changes and fixes for spurious changes, together with reviewers fixes;
- trac_12173-doctests-v2.patch, fix doctests, add some.
Apply to SAGE_ROOT:
Upstream bug with OS X PPC: https://groups.google.com/forum/?hl=en&fromgroups#!topic/flint-devel/KrgY0v09SwI
Attachments (75)
Change History (419)
comment:1 Changed 10 years ago by
Changed 10 years ago by
Changed 10 years ago by
comment:2 Changed 10 years ago by
zmod.patch replaces references to zmod_poly with nmod_poly and (hopefully) fixes places where the interface of a function changed. Completely untested at this point, of course.
Some remarks:
I deleted the zmod_poly_pow function implemented in Sage, since FLINT now provides such a function.
There is no longer a _nmod_poly_set_coeff_ui, which was used in one place (creating a polynomial from a list). I replaced it with nmod_poly_set_coeff_ui rather than manipulating coefficients directly, which is a bit slower, but shouldn't matter that much.
nmod_poly_resultant needs to be added in FLINT 2. If we don't get it done, the code calling it could just be commented out to fall back to the default Sage code that already handles the non-prime case.
I didn't change any filenames or Sage types (so the corresponding Sage files/functions/types are all still called zmod_poly).
The new pxd file uses the GMP mp_limb_t (etc) types rather than unsigned long (etc). Some search and replace might be needed, or importing the types from gmp.h.
Changed 10 years ago by
comment:3 Changed 10 years ago by
Added another patch for fmpz_poly functions.
fmpz_poly.pxi still needs rewriting, and I didn't touch the cases in polynomial_rational_flint.pyx (since that file needs fmpq_poly fixes as well).
comment:4 Changed 10 years ago by
- Cc jpflori added
comment:5 Changed 10 years ago by
I just want to let you know that I have incorporated the current state of progress into an installation of Sage at
/scratch/spancratz/sflint
on sage.math.washington.edu. Step by step, I have gotten rid of some of the problems, but I didn't have enough time last week to finish this. By all means, please go ahead and continue working on this copy of the work.
Best wishes
Sebastian
comment:6 Changed 10 years ago by
That might be a stupid question, but how do I access your working copy?
Should I ask for an account on the mentioned machine (to whom?)?
comment:7 Changed 10 years ago by
You should e-mail wstein@… and say you're working on getting FLINT into Sage. He'll give you an account on sage.math.washington.edu
comment:8 Changed 10 years ago by
- Component changed from PLEASE CHANGE to packages
- Type changed from PLEASE CHANGE to enhancement
comment:9 Changed 10 years ago by
Was the directory lost after the recent sage.math problems ? I can't locate a spancratz directory on /scratch... I'll have a look at the google groups in case I can find more info there.
comment:10 Changed 10 years ago by
Yes, it seems to be gone.
comment:11 Changed 10 years ago by
Any chance that one of you backed it up somewhere else ?
comment:13 Changed 10 years ago by
This new spkg and patch should apply to a clean install of sage-5.0, and almost completely allows building Sage against flint-2.3:
http://sage.math.washington.edu/home/fredrik/flint-2.3.spkg
http://sage.math.washington.edu/home/fredrik/flint-2.3-sage-5.0.diff
There is one small remaining issue building Sage: at one point, the definition of ulong in zn_poly.h apparently clashes with the one in flint. A quick workaround for development purposes is to add #ifndef ulong ... #endif in sage/local/include/zn_poly.h. I don't know the best proper fix. But with the workaround,
sage -f flint-2.3.spkg
sage -b
should work.
When starting Sage, there is a link time error (finding fmpz_set_ZZ), so it seems that the NTL interface is not being built or included correctly. Presumably, fixing that, Sage will start up successfully, though most likely there will be some bugs to hunt down...
comment:14 Changed 10 years ago by
I've had a very quick look and the spkg needs major simplification. Most of the logic which was present is now in the flint configure script. I'll try to provide an updated version.
I see that there was some restriction in the previous spkg-install reported by Michael Abshoff on an "US IIIi" with GCCC 4.3.2 which is not reflected in the configure script. Some reason for this ?
To answer my own question, by default -funroll-loops is not included by default, which covers the solaris case.
comment:15 Changed 10 years ago by
I've just reuploaded fredrik patch as an attachment so that it can be easily viewed in trac.
comment:16 Changed 10 years ago by
The NTL problem is just that interfaces/NTL-interface.lo is not included into libflint anymore. The configure and Makefile.in files should be modified.
There was some partial magic added here (*_EXTRAS): https://github.com/wbhart/flint2/commit/64b5b59977cf1b29bc01faf4d4ff20e121ab7198 but removed here: https://github.com/wbhart/flint2/commit/0103ccbef63781436ad51e060a7adcb0e6b23865 if I'm not wrong.
Violently changing the Makefile.in to include NTL-interface.lo solves the problem, although this should certainly be included upstream rather than here.
comment:17 Changed 10 years ago by
Now sage launches but I get a crash when quitting due to an undefined symbol flint_stack_cleanup. I'll have a look later.
comment:18 Changed 10 years ago by
Another complication might be that the headers used to be installed into $SAGE_LOCAL/include/FLINT which is quite meaningful because there are a lot of them.
It would be nice to use the install target, but it would put everything into $S_L/include/.
Somebody has a thought about that? Modify flint Makefile? Install the headers into $S_L/include/ and modify Sage source (mainly the module thing at the top of Sage and flint*.px* files)? Continue to copy the files by hand?
comment:19 follow-ups: ↓ 20 ↓ 21 Changed 10 years ago by
The call to flint_stack_cleanup() should be replaced with a call to _fmpz_cleanup(), I think.
I agree that it would be nice to keep the header files in a subdirectory. Dunno what the best solution is.
comment:20 in reply to: ↑ 19 Changed 10 years ago by
Replying to fredrik.johansson:
I agree that it would be nice to keep the header files in a subdirectory. Dunno what the best solution is.
My prefered solution would be that make install puts them in a separate directory by default. I'll ask on flint-devel to get bill's preference.
comment:21 in reply to: ↑ 19 Changed 10 years ago by
Replying to fredrik.johansson:
The call to flint_stack_cleanup() should be replaced with a call to _fmpz_cleanup(), I think.
After quickly looking at the memory management in flint 1.* and flint 2.3, this seems the right solution to me (without real insight nonetheless) and indeed fixes the problem!
comment:22 Changed 10 years ago by
Now I get a LOT of segfaults. For example, running:
sage: GF(4, 'a')
comment:23 Changed 10 years ago by
Two things :
- the order of the structure changed, now exp is before p, not sure this is important...
- n_factor_t has to be initialized.
With these changes I can create GF(4)!
comment:24 Changed 10 years ago by
The function n_factor is also called in descent_two_isogeny.
comment:25 Changed 10 years ago by
Running make ptest raises a limited number of problem. The first one I saw is quite bad: I get time out caused by some segfaults...
Apparently, something is now deallocated twice (I get a double linked blah from glibc, more precisely "* glibc detected * python: corrupted double-linked list: 0x00... *"). (This is not caused by the _fmps_cleanup() call, it also happens without it)
This is not sytemcan be reproduced by testing ell_tate_curve.py for example.
comment:26 follow-up: ↓ 27 Changed 10 years ago by
Aa far as the other bugs are concerned, it seems that #6919 is back??? At least, the same doctest fails.
comment:27 in reply to: ↑ 26 Changed 10 years ago by
comment:28 Changed 10 years ago by
About other easy errors:
- failures in polynomial_zmod_flint are just a change in the behavior of _mul_trunc_opposite which calls nmod_poly_mulhigh
- failures in combinat/sf/jack are just changes in ordering and normalization
- a failure in rings.fraction_field_FpT is just a change in a choice of a sqrt, the other error is more mysterious, in fact it is just because of a different normalization (the minus goes into the denominator rather the numerator)... the problem is that equality testing is consequently wrong, or the result of sqrt should be normalized with the corresponding Sage function (note for later, the nmod_poly_cmp defined there should be removed and nmod_poly_equal used instead)
- the failure in function_field_element is a chang in the choice of a sqrt
There is a segfault in function_field_ideal in a call to _nmod_poly_xgcd_euclidean
The error in flint.pyx is mentioned in the previous post.
The error in jacobian_morphism might be related to the FFT problem, otherwise, I have no idea. Quite strangely Q*239 where Q is a point on a jacobian of hyperelliptic curve should return Q because Q is of order 238... Evaluating Q*237+Q*2 is ok equal to Q, but Q*2+Q*237 is wrong equal to -Q !! This is not true for 3, but true for 4, so I guess there is a problem when a point is doubled.
And the following files (at least) randomly produced memory managment errors (although all test passed):
- book_stein_modform
- ell_tate_curve (less often, potentially different problem)
- ell_padic_field
- overconvergent/genus0
comment:29 Changed 10 years ago by
Good work. I'll look at some of the failures tonight.
Until the FFT gets fixed, we could temporarily disable SS multiplication in _fmpz_poly_mul/low/high to suppress those errors.
Do you have a diff of your edits so far?
comment:30 follow-up: ↓ 31 Changed 10 years ago by
Some progress: The nmod_xgcd problem seems to be that it segfaults on constant input (that is polynomials of degree 0), this should be reported on flint-devel as well.
comment:31 in reply to: ↑ 30 Changed 10 years ago by
Replying to jpflori:
Some progress: The nmod_xgcd problem seems to be that it segfaults on constant input (that is polynomials of degree 0), this should be reported on flint-devel as well.
Ah, great! I love when the segfault is due to "constant input" and not "this specific product of five polynomials of length 2564148" :-)
comment:32 Changed 10 years ago by
Replying to fredrik.johansson:
Do you have a diff of your edits so far?
I have. I also have to quit my office right now, but I'll quickly copy my working directory and try to upload something usable this afternoon. The patch will surely not be clean, but sufficient to make further testing, because then I'll be away for 3 days (until tuesday morning).
comment:33 Changed 10 years ago by
Sorry, it seems I failed to upload anything...
Anyway, I just found out that you can doctest a file within gdb which will vastly simplify solving the remaining issues!
comment:34 Changed 10 years ago by
Quite strangely, something got broken on my 5.0 install and I was not able to reinstall zn_poly while trying to provide a patched spkg (by the way I just found out #12433).
The problem seems to be that while building profiler.c, the inclusion of profiler.h is skipped so lots of things are undefined (even stderr!). I don't get why yet...
Anyway I just started rebuilding stuff on top of a 5.1.beta0 install, first the updated zn_poly spkg from #12433) and then my wip flint-2.3 package and no more tweaking of zn_poly.h is needed to build flint-2.3, which is good news. I'm also able to rebuild zn_poly afterward, so maybe I just did something wrong on the previous install.
Fredrik's patch needs some rebasing to apply on top of 5.1.beta0. I'll rebase it and also upload the few further changes and an updated spkg I made for testing purposes.
comment:35 Changed 10 years ago by
- Dependencies set to 12433
comment:36 Changed 10 years ago by
An updated spkg, including Fredrik's fix for the xgcd segfaults, NTL interface build, header files installation, simplified spkg-* scripts, is available at: http://perso.telecom-paristech.fr/~flori/sage/flint-2.3.jp.spkg
This is for testing only.
comment:37 Changed 10 years ago by
comment:38 Changed 10 years ago by
My bad... the memory problems were indeed caused by the call to _fmpz_cleanup as I found out using valgrind.
I wrote pass instead of return so that the function was still called...
comment:39 Changed 10 years ago by
I'm not sure how everything gets freed when Sage exits but I think I got the culprit (or one of them).
When quit_all() in sage.all is called
- in the beginning free_flint_stack() is called which calls _fmpz_cleanup(),
- but then there must be other calls to fmpz_poly_clear() for example in ..quaternion_algebra_element._clear_globals(),
whence potential double free().
Also the dealloc methods of the classes based on fmp?_poly call fmp?_clear_poly. I don't know when Python objects are deallocated but if this happens after the call to _fmpz_cleanup(), that might be problematic.
As far as PARI is concerned, the dealloc function of gen objects first checks that the pointer to the PARI object is not 0, maybe the same should be done for Sage objects based on flint objects. However, calls to fmpz_poly_clear as in quaternion_algebra_element will be more tedious to deal with.
After a first experiment, putting the call to free_flint_stack() a little later, just after twisted is closed solves memory corruption when doctesting ell_padic_field. This is quite mysterious to me...
comment:40 Changed 10 years ago by
Mmm, all of this seems quite more involved... The calls to *_poly_clear should not be problematic, the problem comes from somewhere else.
comment:41 Changed 10 years ago by
Here is some progress hopefully: the easiiest way I found to reproduce the memory corruption error is through doctesting sage/schemes/elliptic_curves/ell_padic_field.py
Within this file the problem seems to come (meaning that if I remove it I cannont repoduce the bug in a reasonable time) from the line
uN = (1 + h(x0)/y0**(2*p)).sqrt()
and more precisely from the evaluation
h(x0)
AND because it is done within a locally defined function whereas h is defined in the outer function.
I'm not sure all the memory corruption problems come from such a case, but at least sage/schemes/elliptic_curves/padics.py where I've seen one time such a bug and sage/schemes/elliptic_curves/ell_tate_curve.py where I've seen it several times use similarly locally defined functions.
comment:42 follow-up: ↓ 44 Changed 10 years ago by
That's really strange. Does building flint in reentrant mode change anything?
comment:43 Changed 10 years ago by
I think that unfortunately the real reason is deeper but the error occurs only in this situation... I'll post some valgrind logs and point out what I think might be hintful.
comment:44 in reply to: ↑ 42 ; follow-up: ↓ 46 Changed 10 years ago by
Replying to fredrik.johansson:
That's really strange. Does building flint in reentrant mode change anything?
Compiling flint in reentrant mode seems to "solve" the ell_padic_field error. I just launched "make ptest" and will report about that when it finishes.
comment:45 Changed 10 years ago by
- Dependencies changed from 12433 to #12433
Changed 10 years ago by
Doctest of ell_padic_field.py on 5.1.beta1 + new zn_poly spkg + new flint spkg.
Changed 10 years ago by
Doctest of ell_padic_field.py on 5.1.beta1 + new zn_poly spkg + new flint spkg + moving the "h =" line.
Changed 10 years ago by
Doctest of ell_padic_field.py on 5.1.beta1 + new zn_poly spkg + new flint spkg compiled reentrant.
comment:46 in reply to: ↑ 44 Changed 10 years ago by
Replying to jpflori:
Replying to fredrik.johansson:
That's really strange. Does building flint in reentrant mode change anything?
Compiling flint in reentrant mode seems to "solve" the ell_padic_field error. I just launched "make ptest" and will report about that when it finishes.
Indeed, make ptest only returns non memory related errors.
comment:47 Changed 10 years ago by
Here is what might be the important part from the valgrind log of 5.1.beta1 + zn_poly + flint (single)
==17534== Invalid read of size 8 ==17534== at 0x829DE0D: __gmpn_copyi (in /media/local/flori/sage/sage-5.1.beta1+flint-2.3/local/lib/libgmp.so.7.4.0) ==17534== by 0x8256F49: __gmpz_export (in /media/local/flori/sage/sage-5.1.beta1+flint-2.3/local/lib/libgmp.so.7.4.0) ==17534== by 0x1318FD2C: __pyx_f_4sage_4libs_4pari_3gen_12PariInstance__new_GEN_from_mpz_t (gen.c:43564) ==17534== by 0x131B31E2: __pyx_f_4sage_4libs_4pari_3gen_12PariInstance_new_gen_from_padic (gen.c:43941) ==17534== by 0x1DBFF444: __pyx_f_4sage_5rings_6padics_29padic_capped_relative_element_26pAdicCappedRelativeElement__to_gen (padic_capped_relative_element.c:15445) ==17534== by 0x1DBFECED: __pyx_pf_4sage_5rings_6padics_29padic_capped_relative_element_26pAdicCappedRelativeElement_25_pari_ (padic_capped_relative_element.c:15492) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x131ECF4D: __pyx_pf_4sage_4libs_4pari_3gen_12PariInstance_14__call__ (gen.c:44625) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x1D5A9CDB: __pyx_pf_4sage_5rings_6padics_21padic_generic_element_19pAdicGenericElement_19square_root (padic_generic_element.c:7417) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x1D398D26: __pyx_pf_4sage_5rings_6padics_21local_generic_element_19LocalGenericElement_6sqrt (local_generic_element.c:2604) ==17534== by 0x4F20DBC: PyEval_EvalFrameEx (ceval.c:4013) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4F20690: PyEval_EvalFrameEx (ceval.c:4109) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4F20690: PyEval_EvalFrameEx (ceval.c:4109) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4F22551: PyEval_EvalCode (ceval.c:667) ==17534== by 0x4F20C5E: PyEval_EvalFrameEx (ceval.c:4710) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4EA53DB: function_call (funcobject.c:526) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x4E8AD0E: instancemethod_call (classobject.c:2578) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== Address 0x2f78a098 is 0 bytes after a block of size 8 alloc'd ==17534== at 0x4C2B313: malloc (vg_replace_malloc.c:263) ==17534== by 0x75DF272: sage_malloc (in /media/local/flori/sage/sage-5.1.beta1+flint-2.3/devel/sage-main/c_lib/libcsage.so) ==17534== by 0x75DF30D: sage_mpir_malloc (in /media/local/flori/sage/sage-5.1.beta1+flint-2.3/devel/sage-main/c_lib/libcsage.so) ==17534== by 0x14BDA9E9: __pyx_f_4sage_5rings_7integer_fast_tp_new (integer.c:33356) ==17534== by 0x14BFAB8A: __pyx_f_4sage_5rings_7integer_7Integer__valuation (integer.c:21279) ==17534== by 0x14BDB3F6: __pyx_pf_4sage_5rings_7integer_7Integer_65valuation (integer.c:21643) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x1DC06CF3: __pyx_pf_4sage_5rings_6padics_29padic_capped_relative_element_26pAdicCappedRelativeElement_6__pow__ (padic_capped_relative_element.c:9477) ==17534== by 0x4E78891: ternary_op.isra.9 (abstract.c:1065) ==17534== by 0x4F1CA3C: PyEval_EvalFrameEx (ceval.c:1254) ==17534== by 0x4F21260: PyEval_EvalFrameEx (ceval.c:4099) ==17534== by 0x4F21260: PyEval_EvalFrameEx (ceval.c:4099) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4EA53DB: function_call (funcobject.c:526) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x4E8AD0E: instancemethod_call (classobject.c:2578) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x4F1F46C: PyEval_EvalFrameEx (ceval.c:4231) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4EA53DB: function_call (funcobject.c:526) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x4E8AD0E: instancemethod_call (classobject.c:2578) ==17534== by 0x4E7D5B2: PyObject_Call (abstract.c:2529) ==17534== by 0x4EDE2AF: slot_tp_init (typeobject.c:5657) ==17534== by 0x4EDA0A7: type_call (typeobject.c:737) ==17534== ==17534== Invalid write of size 8 ==17534== at 0x14E76501: _fmpz_clear_mpz (fmpz.c:78) ==17534== by 0x14E86E2B: fmpz_poly_clear (fmpz.h:75) ==17534== by 0x1CD20BB8: __pyx_tp_dealloc_4sage_5rings_10polynomial_30polynomial_integer_dense_flint_Polynomial_integer_dense_flint(_object*) (polynomial_integer_dense_flint.cpp:4390) ==17534== by 0x4EB6F6E: dict_dealloc (dictobject.c:985) ==17534== by 0x4ED5EF3: subtype_dealloc (typeobject.c:999) ==17534== by 0x4E88E6D: cell_dealloc (cellobject.c:50) ==17534== by 0x4ED3282: tupledealloc (tupleobject.c:220) ==17534== by 0x4EA5646: func_dealloc (funcobject.c:461) ==17534== by 0x4EB8B1A: PyDict_Clear (dictobject.c:891) ==17534== by 0x4EB8C78: dict_tp_clear (dictobject.c:2088) ==17534== by 0x4F593B6: collect (gcmodule.c:769) ==17534== by 0x4F59F27: _PyObject_GC_Malloc (gcmodule.c:996) ==17534== by 0x4F59F8D: _PyObject_GC_NewVar (gcmodule.c:1477) ==17534== by 0x4ED374D: PyTuple_New (tupleobject.c:90) ==17534== by 0x4F3D20D: r_object (marshal.c:874) ==17534== by 0x4F3D743: r_object (marshal.c:1013) ==17534== by 0x4F3D254: r_object (marshal.c:880) ==17534== by 0x4F3D743: r_object (marshal.c:1013) ==17534== by 0x4F3FB47: PyMarshal_ReadObjectFromString (marshal.c:1181) ==17534== by 0x4F3FC2B: PyMarshal_ReadLastObjectFromFile (marshal.c:1142) ==17534== by 0x4F398F4: load_source_module (import.c:773) ==17534== by 0x4F3A97B: import_submodule (import.c:2596) ==17534== by 0x4F3AEFF: ensure_fromlist (import.c:2507) ==17534== by 0x4F3B393: import_module_level.isra.9 (import.c:2175) ==17534== by 0x4F3B849: PyImport_ImportModuleLevel (import.c:2189) ==17534== Address 0x60fac08 is 728 bytes inside a block of size 1,024 free'd ==17534== at 0x4C2A654: free (vg_replace_malloc.c:427) ==17534== by 0x14E765B0: _fmpz_cleanup (fmpz.c:88) ==17534== by 0x3C310E4A: __pyx_pf_4sage_4libs_5flint_5flint_free_flint_stack (flint.c:447) ==17534== by 0x4F20B52: PyEval_EvalFrameEx (ceval.c:3997) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4F20690: PyEval_EvalFrameEx (ceval.c:4109) ==17534== by 0x4F22414: PyEval_EvalCodeEx (ceval.c:3253) ==17534== by 0x4F22551: PyEval_EvalCode (ceval.c:667) ==17534== by 0x4F44CEF: PyRun_FileExFlags (pythonrun.c:1346) ==17534== by 0x4F4578E: PyRun_SimpleFileExFlags (pythonrun.c:936) ==17534== by 0x4F586B4: Py_Main (main.c:599) ==17534== by 0x544876C: (below main) (libc-start.c:226)
comment:48 follow-ups: ↓ 49 ↓ 50 Changed 10 years ago by
The first Invalid read of size 8 is likely not a bug. On various platforms, mpn_copyi relies on page boundaries being aligned to a multiple of 16 bytes so that it can safely read an extra 8 bytes, IIRC. (This pops up all over the place when valgrinding flint unit tests, and it's convenient to add it to the valgrind suppression file.)
The second one has to be what it looks like: some polynomial lives until after _fmpz_cleanup has been called.
So the question is how we can guarantee that all Python objects referencing flint objects get deallocated before _fmpz_cleanup is called.
If this is impossible, I guess we could compile flint in reentrant mode for Sage (but this would slow things down, unfortunately).
One possibility is that we change _fmpz_cleanup to free all mpz limb data but leave the mpz array. This might still be a bit fragile, but if it works, it should solve the problem of polluting valgrind logs since there will only be one large block of unfreed memory.
We must also have this problem with flint_compute_primes.
Does Sage free the MPFR caches on exit, and if so where?
Perhaps this should be brought to sage-devel.
comment:49 in reply to: ↑ 48 Changed 10 years ago by
Replying to fredrik.johansson:
Does Sage free the MPFR caches on exit, and if so where?
Actually, this is irrelevant, because it can't cause the same kind of errors.
comment:50 in reply to: ↑ 48 ; follow-up: ↓ 51 Changed 10 years ago by
Replying to fredrik.johansson:
The first Invalid read of size 8 is likely not a bug. On various platforms, mpn_copyi relies on page boundaries being aligned to a multiple of 16 bytes so that it can safely read an extra 8 bytes, IIRC. (This pops up all over the place when valgrinding flint unit tests, and it's convenient to add it to the valgrind suppression file.)
Thanks, I wasn't aware of that.
The second one has to be what it looks like: some polynomial lives until after _fmpz_cleanup has been called.
So the question is how we can guarantee that all Python objects referencing flint objects get deallocated before _fmpz_cleanup is called.
I tried to modify the dealloc methods of all classes involving flint polynomials and in the different files directly using them by first checking that the pointer is not 0 before calling the clear function, but it did not solve the problem. So either I forgot some of them or my test was stupid (I just copied the idea from the PARI's GEN class, I didn't actually think).
If this is impossible, I guess we could compile flint in reentrant mode for Sage (but this would slow things down, unfortunately).
Slowing down the library would not be that great. As i mentioned earlier, jsut moving the call to _flint_cleanup within quit_sage solves the problem, even though this is surely not the "right" solution. I'm not convinced compiling flint in reentrant mode would be although I have no real knowledge of reentrancy.
One possibility is that we change _fmpz_cleanup to free all mpz limb data but leave the mpz array. This might still be a bit fragile, but if it works, it should solve the problem of polluting valgrind logs since there will only be one large block of unfreed memory.
We must also have this problem with flint_compute_primes.
Does Sage free the MPFR caches on exit, and if so where?
Deallocation is done by quit_sage() defined in $SAGE_ROOT/devel/sage/sage/all.py
Different libraries are concerned, e.g. PARI, but MPFR is not. Potentially, and if this was possible, this was just forgotten.
Perhaps this should be brought to sage-devel.
comment:51 in reply to: ↑ 50 Changed 10 years ago by
Replying to jpflori:
or my test was stupid (I just copied the idea from the PARI's GEN class, I didn't actually think).
This was surely the case :)
comment:52 follow-up: ↓ 53 Changed 10 years ago by
I do think moving the _fmpz_cleanup call within quit_sage is the correct solution if it does work.
If it doesn't work, that means some part of Sage is hanging on to some Python objects, and that code should arguably be fixed so that they can be freed first (perhaps with an explicit call).
comment:53 in reply to: ↑ 52 Changed 10 years ago by
Replying to fredrik.johansson:
I do think moving the _fmpz_cleanup call within quit_sage is the correct solution if it does work.
I'm quite ok with this decision, although, as I pointed before, I have no idea of when Python/Sage? objects actually get deallocated, so this might be a flaky solution.
Now I'm getting a little lost: I have trouble to understand how the error (in the different forms it takes, I assumed it was actually an underlying "double free") can be caused by a call to fmpz_poly_clean after _fmpz_cleanup was called. Indeed, all fmpz_poly_clean does is to deallocate the space needed to store pointers to the coefficients and write in the fmpz_unused_arr that the coeff itself can be reused. This array was already deallocated by _fmpz_cleanup, so we end up writing somewhere we should not really, but whatever should I say: it is quite unlikely that this will actually break something.
Sorry if that sounds stupid or is deeply wrong, but I prefer to be sure everything is clear in my mind.
comment:54 Changed 10 years ago by
fmpz_poly_clear calls fmpz_clear on all its coefficients. fmpz_clear adds the pointer to the unused array to make it available for subsequent reuse. So we can't deallocate fmpz_unused_arr until all fmpz_clear calls have been made.
comment:55 Changed 10 years ago by
Ok, but then some fmpz won't be freed and that's all. This won't lead to any segfault, memory error or horrible breaks.
In case someone actually used Sage in library mode, and played at launching and closing it several times, that would lead to memory leaks (surely among a lot of other ones), but nothing more.
And I do not see any problematic calls to fmpz_clear within Sage. The few ones (its definition within fmpz.pxi, 4 calls in polynomial_integer_dense_flint.pyx and one call in quaternion_algebra_element.pyx) are typically made on local variables which are allocated and freed within functions.
comment:56 follow-up: ↓ 57 Changed 10 years ago by
It (calling some flint clear function after _fmpz_cleanup) can segfault -- fmpz_clear ends up writing to fmpz_unused_arr, and that array might have been freed.
Any explicit "local" calls in Sage should be fine -- presumably those will only be triggered by explicit computation. What could go wrong is that Sage say caches some polynomial in some coercion or IO code somewhere, and due to some cyclic reference the cache doesn't gets deallocated until the Python interpreter shuts down (i.e. after having already gone through sage_exit).
comment:57 in reply to: ↑ 56 Changed 10 years ago by
Replying to fredrik.johansson:
It (calling some flint clear function after _fmpz_cleanup) can segfault -- fmpz_clear ends up writing to fmpz_unused_arr, and that array might have been freed.
I agree with that. I would have just expected it to happen less often, especially that it doesn't segfault but return errors typical of a double free (from glibc, which seems to me even more unfortunate than getting a nice segfault). I indeed get an error every time I doctest ell_padic_field.py
comment:58 Changed 10 years ago by
Anyway, after some more testing I'm rather convinced the call to fmpz_poly_clear is the problem.
comment:59 Changed 10 years ago by
Although there are several calls to fmpz_poly_clear in the quatalg stuff which does not seem to cause similar problem.
As a solution, let's just put the call to _fmpz_cleanup at the end of the quit_sage function and let the buildbot decide if it's really acceptable.
I'll go on with the other problems now.
comment:60 Changed 10 years ago by
The patch makes the tests in fraction_field_FpT and function_field_element pass. I have not checked whether there are tests elsewhere in Sage that disagree with the chosen convention.
comment:61 follow-up: ↓ 67 Changed 10 years ago by
With the most recent changes in my repository, and moving the fmpz_cleanup call to the end of the quit function, the following no longer crash/fail: sage_object.pyx, elliptic_curves/padics.py, hyperelliptic_curves/jacobian_morphism.py, flint.pyx, modform/element.py, ell_point.py
The following still fail with what only appears to be simple normalization problems: polynomial_zmod_flint.pyx, combinat/sf/jack.py
Only one failure remains that appears to be a genuine bug: graphs/matchpoly.pyx
This one seems a bit difficult to debug. Anyone familiar with the math?
comment:62 Changed 10 years ago by
As far as jack.py goes, I'm not sure exactly where the difference in normalization comes from, but the new output is nicer, so I suggest just changing those doctests.
comment:63 follow-up: ↓ 65 Changed 10 years ago by
Jean-Pierre, could you push your changes to the build scripts so we can merge them in the official repository? (It would be nice to get a new 2.3 beta version out ASAP and base a new spkg on that.)
comment:64 follow-up: ↓ 66 Changed 10 years ago by
Unfortunately, there are still memory errors; they just show up inconsistently. The ones that sometimes crash for me include:
sage -t devel/sage-main/sage/modular/abvar/torsion_subgroup.py # Killed/crashed sage -t devel/sage-main/sage/modular/modform/ambient_g1.py # Killed/crashed sage -t devel/sage-main/sage/tests/book_stein_modform.py # Time out sage -t devel/sage-main/sage/modular/hecke/element.py # Killed/crashed
comment:65 in reply to: ↑ 63 Changed 10 years ago by
Replying to fredrik.johansson:
Jean-Pierre, could you push your changes to the build scripts so we can merge them in the official repository? (It would be nice to get a new 2.3 beta version out ASAP and base a new spkg on that.)
I've got a github account with the same login as here. Feel free to get what you want from there, merge it, modify it, or whatever you want. If you need me to do any further processing, please let me know.
comment:66 in reply to: ↑ 64 Changed 10 years ago by
Replying to fredrik.johansson:
Unfortunately, there are still memory errors; they just show up inconsistently. The ones that sometimes crash for me include:
sage -t devel/sage-main/sage/modular/abvar/torsion_subgroup.py # Killed/crashed sage -t devel/sage-main/sage/modular/modform/ambient_g1.py # Killed/crashed sage -t devel/sage-main/sage/tests/book_stein_modform.py # Time out sage -t devel/sage-main/sage/modular/hecke/element.py # Killed/crashed
Too bad...
comment:67 in reply to: ↑ 61 ; follow-up: ↓ 69 Changed 10 years ago by
Replying to fredrik.johansson:
With the most recent changes in my repository, and moving the fmpz_cleanup call to the end of the quit function, the following no longer crash/fail: sage_object.pyx, elliptic_curves/padics.py, hyperelliptic_curves/jacobian_morphism.py, flint.pyx, modform/element.py, ell_point.py
The following still fail with what only appears to be simple normalization problems: polynomial_zmod_flint.pyx, combinat/sf/jack.py
Only one failure remains that appears to be a genuine bug: graphs/matchpoly.pyx
This one seems a bit difficult to debug. Anyone familiar with the math?
That's strange, I did not get this failure earlier. I'll have a look this afternoon.
comment:68 Changed 10 years ago by
No memory errors if I comment out free_flint_stack(), so hopefully there really isn't anything else going on. We just need to figure out how to fix this one...
comment:69 in reply to: ↑ 67 Changed 10 years ago by
Replying to jpflori:
Replying to fredrik.johansson:
Only one failure remains that appears to be a genuine bug: graphs/matchpoly.pyx
This one seems a bit difficult to debug. Anyone familiar with the math?
That's strange, I did not get this failure earlier. I'll have a look this afternoon.
I've just tested graphs/matchpoly.pyx with the latest flint's commit (see what is on my github account), together with all patches to the Sage library posted here (and some minor additional stuff including moving flint_free_stack just before cleaning symmetrica), and cannot repoduce errors in that file.
Could you post a log here?
comment:70 Changed 10 years ago by
File "/scratch/fjohanss/sage-5.1.beta1/devel/sage-main/sage/graphs/matchpoly.pyx", line 333: sage: len(str(f)) Expected: 406824 Got: 6179056
The new value definitely seems to be wrong: complete_poly(n) has coefficients with about n digits for n = 100, 200, ..., 900, but then complete_poly(1000) evaluates to have coefficients with 6000 digits.
This is on a 32-bit system, by the way.
comment:71 Changed 10 years ago by
I'm on a 64 bit system and cannot see the error... we should get confirmation form someone else (on both architectures), it's possible something got screwed on my system.
By the way, it seems to me that the FFT bug is still present on my system although I've merged the latest commits. Could you confirm or infirm that (testing sage/libs/flint/flint.pyx)?
comment:72 Changed 10 years ago by
The FFT bug is fixed for me (same system; the standalone .c program also works on a 64-bit system, but I have not tried Sage there).
comment:73 Changed 10 years ago by
Ok, so something went wrong on my side.
I'll double check everything tomorrow.
comment:74 Changed 10 years ago by
I indeed ended up with a different version of my directory on the machine where I test Sage....
comment:75 Changed 10 years ago by
Ok, everything should be fine now. I don't have errors anymore in flint.pyx,. Nonetheless I cannot reproduce the problem in matchpoly.pyx, so this may indeed be related to a 32/64 bits difference?
comment:76 Changed 10 years ago by
On my amd64 Ubuntu 12.04 (processor must be some kind of Xeon), with sage 5.1.beta1 + the new zn_poly spkg at #12433 + custom flint 2.3 spkg at http://perso.telecom-paristech.fr/~flori/sage/flint-2.3.jp.spkg + the following patches:
- flint-2.3-sage-5.1.beta1.patch
- flint-2.3-jp.patch (this one should be cleaned, split, etc.)
- sqrt_normalization_fix.diff
- headers.patch
- multrunc.patch
make ptest only spat 3 errors:
- the ordering problem in jack.py
- memory error in ell_curve_is_isogeny.py
- memory error in ell_tate_curve
comment:77 Changed 10 years ago by
I pushed an experimental alternative to _fmpz_cleanup that does not free the main arrays. Could you check if using this avoids memory problems?
comment:78 Changed 10 years ago by
I'll test this once the doc of Sage 5.1.beta2 finally finishes to build.
comment:79 Changed 10 years ago by
A quick test of the file ell_tate_curve which produced an error each time I ran it with the latest version of everything, seems to indicate that not freeing the array is enough.
I've just launched a make ptest to be sure that the problem was not just moved somewhere else once more.
comment:80 Changed 10 years ago by
No errors except for jack.py after a first make ptest.
comment:81 Changed 10 years ago by
Very good news. I guess we should try running long tests as well?
comment:82 Changed 10 years ago by
Same results after a second make ptest on sage-5.1.beta1 and another one on sage-5.1.beta2: only errors in jack.py (no memory errors and no error in matchpoly)
Yes, we should try running the long tests at least once, but then that will be the patchbot job I think
comment:83 follow-up: ↓ 84 Changed 10 years ago by
By the way, the only problem I get in jack.py is that some fractions which used to be "minus sthg over minus sthg" are now "plus sthg over plus sthg".
comment:84 in reply to: ↑ 83 Changed 10 years ago by
Replying to jpflori:
By the way, the only problem I get in jack.py is that some fractions which used to be "minus sthg over minus sthg" are now "plus sthg over plus sthg".
Yes, same here. The new output is clearly nicer, so we should just update the doctests.
comment:85 Changed 10 years ago by
Here come some more proper patches.
After running make ptestlong on 5.1.beta2 I got failures:
- in jack.py as before
- in sage/misc/cython.py because I renamed the FLINT directory into flint and fmpq.h is not found anymore, this also use the old fmpq_poly.c file which is now included in FLINT itself I guess. So the test has to be changed somehow.
- in polynomial_rational_flint.pyx: time out...
comment:86 Changed 10 years ago by
- Description modified (diff)
comment:87 Changed 10 years ago by
- Description modified (diff)
The cython stuff should be fixed now.
comment:88 Changed 10 years ago by
The polynomial_rational_flint timeout problem seem to be that calling "getitem" on a slice is indeed horribly slow. Playing with the (only) long test of the file, Sage returns f[1:3] instantly but needs 2.6 seconds to return g[1:2] !!! And I've removed the test for signal activation.
comment:89 Changed 10 years ago by
Using the same code as in polynomial_integer_dense_flint yields similar performance.
Maybe there's a problem in fmpq_poly_get_slice, although the code seems ok at first sight.
comment:90 Changed 10 years ago by
Or the call to "len(list(self))" to get the degree of the polynomial plus one is just too bizarre.
comment:91 Changed 10 years ago by
That was this bizarre construction indeed. Here comes a fix.
comment:92 Changed 10 years ago by
- Description modified (diff)
comment:93 Changed 10 years ago by
I've built Sage 5.1.beta2 within a 32 bits virtual machine running Ubuntu 12.04 and could reproduce the matchpoly error.
comment:94 Changed 10 years ago by
Running make ptestlong within the 32 bits virtual machine only raised this error as well (and another one in polynomial_real_mpfr_dense which could not be reproduced when ran alone).
comment:95 Changed 10 years ago by
I've updated the spkg available at http://perso.telecom-paristech.fr/~flori/sage/flint-2.3.jp.spkg with new git commits. Moreover it contains better cleaned up spkg-install and spkg-check scripts, together with an updated SPKG.txt (for the moment nothing is committed).
I'm not sure about some stuff for that spkg :
- There is some stuff related to Cygwin, I've tried to let it act as before, but have no idea why this was needed, nor if it's still needed.
- There is some stuff related to 64 bits build when SAGE64 is set, not sure about the status of this (in fact I don't really know on what systems this is needed for any spkg, not usual Linux install for sure). With FLINT current configure system, behaving as before overwrites the default CFLAGS FLINT would have chosen (ok, the only relevant part is -O2 -g which is also set together with -m64 when SAGE64 is set, but I'm not really happy with such a solution, could FLINT use some kind of ABI variable?).
- We could or should treat the SAGE_DEBUG flag correctly. As far as CFLAGS are concerned, we can set them ourselves but there is no way to prevent FLINT from using the FLINT_TUNE magic.
- I've put Fredrik and I as maintainers for the moment, any more reasonable proposition is welcomed.
comment:96 Changed 10 years ago by
- Description modified (diff)
comment:97 Changed 10 years ago by
The spkg-check is in fact broken because of undefined symbols related to NTL in libflint.so. This also occurs on my personal install of FLINT when built with the NTL interface.
comment:98 Changed 10 years ago by
My bad, I forgot to add -lntl when building FLINT with the NTL interface. This is now fixed on my github account and in the spkg.
comment:99 Changed 10 years ago by
- Description modified (diff)
I've uploaded a new spkg (same address) based on Fredrik's latest git commit (c644b926cd6f...) plus an additional modification to Makefile.in so that the NTL interface is correctly build (not committed, let's wait for Bill to properly fix it, if that's actually needed).
With this latest spkg, Sage 5.1.beta3 builds correctly and passes "make ptestlong" on my amd64 Ubuntu 12.04.
Within the 32 bits virtual machine, Sage 5.1.beta2 passes "./sage -t sage/graphs/matchpoly.pyx" (although the mul_SS bug is still present, now another algorithm is used I presume).
So I guess we just have to wait for a proper release of FLINT, potentially modify the way it is configure by spkg-install, and properly review every patch to the Sage library and changes to the spkg.
comment:100 Changed 10 years ago by
- Description modified (diff)
Updated http://sage.math.washington.edu/home/fredrik/flint-2.3.spkg to the release version.
Possibly this now just needs changing the milestone to 5.2, rebasing any patches that have broken, and getting it reviewed?
comment:101 Changed 10 years ago by
I'm currently tweaking the spkg-install to take into account the build system changes in FLINT, should not be too long.
I've also build 5.1.rc0 to rebase the patches on it.
comment:102 Changed 10 years ago by
- Description modified (diff)
comment:103 Changed 10 years ago by
There is some work to be done to rebase on #10617. I'm working on this.
As far as the spkg is concerned, I think there is only the Cygwin question to deal with. It seems to me that FLINT can not be build as a shared library on Cygwin (at least Bill disabled this possibility). I'm not sure this is required by Sage, but would guess so. So potentially we should disable the Cygwin hacks altogether and only build a static library (which would link to the static NTL library).
comment:104 Changed 10 years ago by
- Description modified (diff)
comment:105 Changed 10 years ago by
By the way "make ptest" gives no error on my computer.
comment:106 Changed 10 years ago by
Is 17060.patch the right file? It just seems to be the flint1-based patch from #10617, which won't work.
comment:107 Changed 10 years ago by
Indeed, wrong file. I just hope that I did not lose the patch on top of that one...
comment:109 Changed 10 years ago by
I can verify that the patches apply and build cleanly on 5.1-rc0, and the test suite passes (have not tried long tests).
Is this ready for review? I guess updating the build scripts to use static libraries on Cygwin is necessary first?
comment:110 Changed 10 years ago by
I've completely failed to croospost on both flint-devel and sage-devel yesterday, and have just posted on sage-devel to get some info about the shared/static and cygwin stuff.
I'm also installing Cygwin right now to get a hint of what happens there.
Once I've gathered more info, I'll update the Cygwin part of spkg-install and I guess the spkg will be ready for review.
On the other hand, I guess that all the patches to the Sage library deserve proper review now. If you feel ok with mine, please post about it here. I'll have a(nother) look at what Mike, Sebastian and you did to be sure we've not let anything go through.
What might be nice to check is that all functions defined in the pxi files still correspond to actual functions (for example the compose stuff now has an _fmpz trailing), because Im not sure that defining something in these files that does not actually exist in the library would lead to a build failure (if it does, then there's no problem cos everything builds fine here :))
comment:111 Changed 10 years ago by
We should do something about this horrible horror:
sage: a = ZZ['x'](range(100000)); R = Integers(3)['x'] sage: %time R(a); CPU times: user 51.96 s, sys: 0.00 s, total: 51.96 s
It was this slow before as well, but there's no reason not to fix it.
comment:112 follow-up: ↓ 118 Changed 10 years ago by
Hmm, I seem to be unable to fix this in the obvious way. If I add
from sage.rings.polynomial.polynomial_integer_dense_flint cimport Polynomial_integer_dense_flint
to polynomial_zmod_flint.pyx in order to be able to access the value as an fmpz_poly_t, I get
In file included from /scratch/fjohanss/sage-5.1.rc0/local/include/NTL/ZZ.h:19:0, from /scratch/fjohanss/sage-5.1.rc0/local/include/flint/NTL-interface.h:36, from sage/rings/polynomial/polynomial_zmod_flint.c:250: /scratch/fjohanss/sage-5.1.rc0/local/include/NTL/tools.h:11:19: fatal error: cstdlib: No such file or directory
Any idea how to fix this?
comment:113 Changed 10 years ago by
Is this a flint related problem? If we convert a
into a list first we get a huge speedup:
sage: a = ZZ['x'](range(100000)) sage: R = Integers(3)['x'] sage: %timeit R(list(a)) 5 loops, best of 3: 116 ms per loop
Perhaps we should look into R._element_constructor_
for the root of the issue, maybe some ring coercion done in a weird way?
comment:114 Changed 10 years ago by
Right, this is not a flint problem as such, and we should probably open a separate ticket for it.
But in this particular case, we should be doing a direct conversion between the flint polynomial types, which will be much faster still.
comment:115 Changed 10 years ago by
I have created #13257 for this issue.
comment:116 Changed 10 years ago by
I've posted an updated spkg on my website (http://perso.telecom-paristech.fr/~flori/sage/flint-2.3.spkg) based on my github trunk. Apart from updating the src directory to the FLINT final 2.3 release and committing, tagging the changes in the HG repository, nothing should really change (except for fixing bugs :) ).
comment:117 Changed 10 years ago by
- Description modified (diff)
- Keywords flint added
comment:118 in reply to: ↑ 112 Changed 10 years ago by
Replying to fredrik.johansson:
Hmm, I seem to be unable to fix this in the obvious way. If I add
from sage.rings.polynomial.polynomial_integer_dense_flint cimport Polynomial_integer_dense_flintto polynomial_zmod_flint.pyx in order to be able to access the value as an fmpz_poly_t, I get
In file included from /scratch/fjohanss/sage-5.1.rc0/local/include/NTL/ZZ.h:19:0, from /scratch/fjohanss/sage-5.1.rc0/local/include/flint/NTL-interface.h:36, from sage/rings/polynomial/polynomial_zmod_flint.c:250: /scratch/fjohanss/sage-5.1.rc0/local/include/NTL/tools.h:11:19: fatal error: cstdlib: No such file or directoryAny idea how to fix this?
I once had such a problem. IIRC a hackish but working fix is to "cdef extern blahblah" directly the structure in the file where you want to use it, so that you avoid to include all of the corresponding file with cimport and the corresponding dependencies and conflicts hell, in particular here where C and C++ get mixed because of NTL.
comment:119 Changed 9 years ago by
I've rebased patches on top of 5.2. I guess this is quite of useless because all patches related to #715 should get remerged soon and we'll need to rebase on top of that which is basically the same as the latest 5.1 betas. Anyway...
Apart from that I've posted a patch to get faster conversion between integer and mod polys, this is quite dirty, but seems to work ok. I did not have the time to rerun the test suite yet.
The trick about ntl was to tell cython to use g++ rather than gcc.
Two other remarks:
- there is some file still named zmod_poly, I guess that's from FLINT 1.x and zn_poly, we might rename it to nmod_poly,
- the integer_dense class does not use the template thing used by other polynomial class, I guess it should be treated in another ticket, but will be potentially a big amount of work. this is also true for other classes there.
- the fmpz_mod_poly module could be wrapped, once again a fairly big amount of work.
comment:120 Changed 9 years ago by
And obviously I did not changed the doc of the new function I created. Any more sensible name or code organization is welcomed.
comment:121 Changed 9 years ago by
I've updated a more proper spkg at the same address as before. And updated some patches.
comment:122 Changed 9 years ago by
- Description modified (diff)
- Status changed from new to needs_review
comment:123 Changed 9 years ago by
In case this helps the patchbot:
Use the spkg:
- the FLINT spkg at http://perso.telecom-paristech.fr/~flori/sage/flint-2.3.spkg;
Apply:
- 01-flint-2.3-sage-5.2.patch,
- doctests.patch (optional),
- flint_stack_cleanup.patch,
- n_factor.patch,
- degree.patch,
- sqrt_normalization_fix.patch,
- 07-headers-sage-5.2.patch,
- multrunc.patch,
- fmpz_cleanup.patch,
- jack.patch,
- cython.patch,
- slice.patch,
- ref_flint1.patch,
- compose_eval.patch,
- 15-integer_to_mod.patch,
- 16-rename_zmod_to_nmod.patch.
comment:124 Changed 9 years ago by
Is there a reason for not just using fmpz_poly_get_nmod_poly?
comment:125 Changed 9 years ago by
Not at all. Except that I did not now it existed...
Changed 9 years ago by
Changed 9 years ago by
comment:127 Changed 9 years ago by
- Keywords spkg added
- Status changed from needs_review to needs_work
- Work issues set to spkg order
We need to make sure the MPFR (and the PARI and NTL) spkg get build before FLINT.
comment:128 Changed 9 years ago by
Just as a reminder, we should updated the Singular spkg with FLINT support when this gets in. I've created #13331.
comment:129 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
- Work issues spkg order deleted
comment:130 Changed 9 years ago by
I did not look into the patch, but according to Jean-Pierre, you had problems with segfaults at quitting Sage.
I am pretty much sure that this comes from something similar I had to struggle with at #12215 in the case of Pari. Namely, if you want to deallocate C-data of pari, flint, ... instances, you should do so in a __dealloc__
method, but not like this:
import sage.matrix.matrix_mod2_dense sage.matrix.matrix_mod2_dense.free_m4ri() import sage.libs.flint.flint sage.libs.flint.flint.free_flint_stack() pari._unsafe_deallocate_pari_stack()
That is code from sage.all.quit_sage. I have replaced the pari._unsafe_deallocate... by a proper __dealloc__
method in #12215, but perhaps you should do the same for the flint stack here?
comment:131 Changed 9 years ago by
I think there is two differences here which complicate the situation:
- we don't have an equivalent of PARIUniqueInstance to which every Sage objects using (in)directly fmpz's could point to
- I fear some classes keep C pointers to fmpz's without any Python level wrapping (such as gen in the PARI case) and free them when they are deleted.
So without implementing such a unique FLINT instance and pointing to it in every stuff using part of FLINT, it will be hard to ensure that every Sage object is deleted before calling the FLINT cleanup function.
comment:132 Changed 9 years ago by
How is this going? Is this waiting on a review or does the problem Simon mentioned need to be solved first?
comment:133 Changed 9 years ago by
- Description modified (diff)
I think the spkg here is (almost?) completely usable.
Simon's remark is sensible, but currently there should be no segfaults occurring at all. It would surely lead to a better integration of Sage/Python? and FLINT memory management, but this would need a huge amount of work, modifying all the pieces of code in Sage which directly use FLINT at the C/Cython level. This could be a "follow-up" ticket (although the problem also exists with the previous versions of FLINT).
The main blocker now is to wait for the final FLINT 2.3 release.
(I've update the spkg link to point to a working address)
comment:134 follow-up: ↓ 135 Changed 9 years ago by
2.3 final is out.
I guess we need a separate ticket for upgrading MPIR to 2.6.0?
comment:135 in reply to: ↑ 134 Changed 9 years ago by
Replying to fredrik.johansson:
2.3 final is out.
Indeed :)
I guess we need a separate ticket for upgrading MPIR to 2.6.0?
I don't think this is mandatory to upgrade FLINT? Of course it would be nice to update MPIR as well.
I'll try to upload an updated spkg containing the 2.3 final later today, so this should quite trivially get positive review.
comment:136 Changed 9 years ago by
Oops, I just updated my repo and saw that Bill made MPIR 2.6.0 a prereq for FLINT 2.3...
comment:137 Changed 9 years ago by
The MPIR upgrade could be dealt with in #13137, although this was originally targetted for 2.5.*.
Changed 9 years ago by
comment:138 Changed 9 years ago by
- Description modified (diff)
Here comes an updated spkg including FLINT 2.3 final at http://boxen.math.washington.edu/home/jpflori/flint-2.3.spkg and two combined and rebased patch to apply to Sage 5.5.rc0.
The only real additions were make to jack.patch because a lot of new doctests were included in sage/combinat/sf/jack.py and had to be fixed as well.
The other rebasing stuff was quite trivial.
I've not included the renaming done by the previous 16-... patch to stick with FLINT terminology as it was surely a bad idea and broke some unpickling tests in Sage. Anyway, it can be done later.
Remark that as Sage only includes MPIR 2.4.x and not MPIR 2.5.x, the FLINT spkg builds correctly and its and Sage test suite pass, so we could first include this, then work on upgrading MPIR directly to 2.6.x, skipping the problematic 2.5.x versions.
comment:139 Changed 9 years ago by
- Description modified (diff)
comment:140 follow-up: ↓ 141 Changed 9 years ago by
I was delighted to see that people have done all the work in getting flint-2.3 into Sage, as I am planning to make it a dependency for eclib.
I am happy to test the new spkg and patch from this ticket, and hope that doing so without reading all the above comments will still be useful input.
comment:141 in reply to: ↑ 140 Changed 9 years ago by
Replying to cremona:
I was delighted to see that people have done all the work in getting flint-2.3 into Sage, as I am planning to make it a dependency for eclib.
I am happy to test the new spkg and patch from this ticket, and hope that doing so without reading all the above comments will still be useful input.
The new spkg built fine for me (64-bit ubuntu), the patches both applied fine to sage-5.5, and all tests pass. Hence:
+1 to a positive review.
comment:142 follow-up: ↓ 143 Changed 9 years ago by
The spkg should be rebased to the flint-1.5.2.p2 package currently in Sage.
Also, the deprecated _sig_on
and _sig_off
should be replaced by sig_on()
and sig_off()
.
comment:143 in reply to: ↑ 142 Changed 9 years ago by
- Description modified (diff)
Replying to jdemeyer:
The spkg should be rebased to the flint-1.5.2.p2 package currently in Sage.
Done.
Also, the deprecated
_sig_on
and_sig_off
should be replaced bysig_on()
andsig_off()
.
Done.
comment:144 Changed 9 years ago by
I removed the longlong patch as the asm code completely changed. Hopefully no new fix is needed but that will need testing on ARM, not sure that was done during the FLINT release process.
comment:145 Changed 9 years ago by
- Reviewers set to John Cremona
The new spkg builds fine for me (64-bit ubuntu) with SAGE_CHECK=yes. I am running a full test after applying both patches and doing sage -br.
All tests pass. I'm not sure that I will test it again after Jeroen's suggested edits!
comment:146 follow-up: ↓ 147 Changed 9 years ago by
Allow me to suggest a few small changes to spkg-install
:
- Change
CFLAGS="-O0 -g"; export CFLAGS
to
export CFLAGS="-O0 -g $CFLAGS"
- Rename
$EXTRA_CONF
to$FLINT_CONFIGURE
for analogy with other packages, don't quote it in the./configure
invocation and changeEXTRA_CONF="--disable-static"
to
FLINT_CONFIGURE="--disable-static $FLINT_CONFIGURE"
- Quote
$SAGE_LOCAL
:rm -f "$SAGE_LOCAL"/lib/libflint* rm -rf "$SAGE_LOCAL"/include/flint
- Why doesn't
spkg-check
have a newline at the end of the file?
comment:147 in reply to: ↑ 146 ; follow-up: ↓ 148 Changed 9 years ago by
Replying to jdemeyer:
Allow me to suggest a few small changes to
spkg-install
:
- Change
CFLAGS="-O0 -g"; export CFLAGSto
export CFLAGS="-O0 -g $CFLAGS"
Done for CFLAGS and similar exports.
- Rename
$EXTRA_CONF
to$FLINT_CONFIGURE
for analogy with other packages, don't quote it in the./configure
invocation and changeEXTRA_CONF="--disable-static"to
FLINT_CONFIGURE="--disable-static $FLINT_CONFIGURE"
Done. Why not quote FLINT_CONFIGURE?
- Quote
$SAGE_LOCAL
:rm -f "$SAGE_LOCAL"/lib/libflint* rm -rf "$SAGE_LOCAL"/include/flint
Done.
- Why doesn't
spkg-check
have a newline at the end of the file?
No reason, it's in now.
comment:148 in reply to: ↑ 147 Changed 9 years ago by
Replying to jpflori:
Why not quote FLINT_CONFIGURE?
To allow
FLINT_CONFIGURE="--enable-foo --disable-bar"
You want this to be two separate words, similar to how $CC
and $CFLAGS
should not be quoted.
comment:149 follow-up: ↓ 151 Changed 9 years ago by
This breaks all over the place on OS X, see http://build.sagemath.org/sage/builders/UW%20bsd%20%28OSX%2010.6%20x86_64%29/builds/192/steps/shell_7/logs/stdio
This also shows that some cdef
functions should have except
clauses, in particular
cdef void _set_fmpz_poly(self, fmpz_poly_t)
Otherwise, what's the point of sig_on()
?
comment:150 Changed 9 years ago by
- Status changed from needs_review to needs_work
- Work issues set to cdef except
comment:151 in reply to: ↑ 149 ; follow-up: ↓ 152 Changed 9 years ago by
Replying to jdemeyer:
This breaks all over the place on OS X, see http://build.sagemath.org/sage/builders/UW%20bsd%20%28OSX%2010.6%20x86_64%29/builds/192/steps/shell_7/logs/stdio
Does FLINT pass its testsuite there? (I don't have access to bsd so cannot check for myself.)
This also shows that some
cdef
functions should haveexcept
clauses, in particularcdef void _set_fmpz_poly(self, fmpz_poly_t)Otherwise, what's the point of
sig_on()
?
comment:152 in reply to: ↑ 151 Changed 9 years ago by
Replying to jpflori:
Replying to jdemeyer:
This breaks all over the place on OS X, see http://build.sagemath.org/sage/builders/UW%20bsd%20%28OSX%2010.6%20x86_64%29/builds/192/steps/shell_7/logs/stdio
Does FLINT pass its testsuite there?
Yes, it does.
comment:153 Changed 9 years ago by
- Work issues changed from cdef except to cdef except, C++/C99
This is problematic:
Extension('sage.rings.fraction_field_FpT', sources = ['sage/rings/fraction_field_FpT.pyx'], language = 'c++', libraries = ["csage", "flint", "gmp", "gmpxx", "ntl", "zn_poly"], extra_compile_args=["-std=c99", "-D_XPG6"], include_dirs = [SAGE_INC + 'flint/'], depends = flint_depends),
A file cannot be both C++ and C99.
This causes a build failure on OpenSolaris (buildbot machine hawk
).
Changed 9 years ago by
comment:154 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
- Work issues cdef except, C++/C99 deleted
comment:155 Changed 9 years ago by
- Description modified (diff)
comment:156 follow-up: ↓ 157 Changed 9 years ago by
- Status changed from needs_review to positive_review
Your changes look good, so I'm putting back this to positive review with the idea it will relaunch a set of tests on the buildbots (am I wrong?) even though I cannot really think of how this could fix the bsd problems.
comment:157 in reply to: ↑ 156 ; follow-up: ↓ 158 Changed 9 years ago by
- Milestone changed from sage-5.6 to sage-5.7
- Status changed from positive_review to needs_work
Replying to jpflori:
Your changes look good, so I'm putting back this to positive review with the idea it will relaunch a set of tests on the buildbots (am I wrong?) even though I cannot really think of how this could fix the bsd problems.
Not so fast. First of all, the main FLINT patch never received positive_review. And then indeed, it's unlikely the OS X problems are resolved.
comment:158 in reply to: ↑ 157 Changed 9 years ago by
Replying to jdemeyer:
Replying to jpflori:
Your changes look good, so I'm putting back this to positive review with the idea it will relaunch a set of tests on the buildbots (am I wrong?) even though I cannot really think of how this could fix the bsd problems.
Not so fast. First of all, the main FLINT patch never received positive_review.
Hum, that's true.
And then indeed, it's unlikely the OS X problems are resolved.
Anyway, it leaves you the pleasure to launch tests on bsd. Cannot do much about that as I don't have access to such systems and I won't have time to stare at the code thinking about what goes wrong in factor() or whatever.
comment:159 follow-up: ↓ 160 Changed 9 years ago by
Looks like a chicken-and-egg situation, if the only way to get buildbots to test it is to give it a positive review!
comment:160 in reply to: ↑ 159 Changed 9 years ago by
Replying to cremona:
Looks like a chicken-and-egg situation, if the only way to get buildbots to test it is to give it a positive review!
That's not really true. I sometimes test tickets (like this one) which don't have positive review.
However: if a reviewer thinks a ticket is good, except perhaps that it hasn't been tested on some systems, then that reviewer should put the ticket to positive_review to get it tested. But that's not the situation here, no reviewer has looked sufficiently well at the patch to approve it.
comment:161 follow-up: ↓ 253 Changed 9 years ago by
For easier reviewing, it would really be good to split up the big patchbomb into a "rename methods" patch (e.g. from zmod to nmod) which does only that and should be automatically generated and a "remainder" patch.
comment:162 follow-up: ↓ 255 Changed 9 years ago by
In fact the changes used to be split among different patches, but at some point I've gather them back together thinking it would be easier to maintain them that way.
You can see the previous bunch of patch in the diff at http://trac.sagemath.org/sage_trac/ticket/12173#comment:138 Except for additional changes in jack.pyx and the removal of a bad change I made, the changes did not really evolve IIRC, maybe some rebasing only. I don't really feel like resplitting everything back now...
comment:163 Changed 9 years ago by
- Work issues set to OS X
In any case, things still look very bad on OS X: http://build.sagemath.org/sage/builders/UW%20bsd%20%28OSX%2010.6%20x86_64%29/builds/194/steps/shell_7/logs/stdio
comment:164 Changed 9 years ago by
On OS X 10.4 PPC, Sage doesn't even start:
Testing that Sage starts... [2013-01-18 19:51:11] Sage version 5.7.alpha0, released 2013-01-17 Traceback (most recent call last): File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/bin/sage-eval", line 4, in <module> from sage.all import * File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/lib/python2.7/site-packages/sage/all.py", line 66, in <module> from sage.misc.all import * # takes a while File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/lib/python2.7/site-packages/sage/misc/all.py", line 83, in <module> from functional import (additive_order, File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/lib/python2.7/site-packages/sage/misc/functional.py", line 36, in <module> from sage.rings.complex_double import CDF File "integer.pxd", line 9, in init sage.rings.complex_double (sage/rings/complex_double.c:16552) ImportError: dlopen(/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/lib/python2.7/site-packages/sage/rings/integer.so, 2): Symbol not found: _n_factor Referenced from: /Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.7.alpha0/local/lib/python2.7/site-packages/sage/rings/integer.so Expected in: dynamic lookup
comment:165 Changed 9 years ago by
On all buildbot systems which aren't OS X, this builds fine and all doctests pass.
comment:166 follow-up: ↓ 167 Changed 9 years ago by
So are there any developers around with access to an OS X system who'd be able to take a look at this?
comment:167 in reply to: ↑ 166 Changed 9 years ago by
- Cc was added
Replying to fredrik.johansson:
So are there any developers around with access to an OS X system who'd be able to take a look at this?
Does not look so. Maybe the easiest way is to ask William for access to bsd.
comment:168 Changed 9 years ago by
Hum, that's not just OS X that OS X on PPC. By the look of the log I am wondering if flint was built properly. I don't know if the buildbot run with spkg-check but I would be curious to see the result on that machine. And the build log too.
Karl-dieter has (or had) such a machine.
comment:169 Changed 9 years ago by
There are problems on all OS X machines, it's just that the problem on OS X 10.4 PPC is different from the problems with other OS X versions.
comment:170 Changed 9 years ago by
Then I can have a look at it (10.5.8). I'll see if beta0 build at the same time, that may be another issue in and of itself.
comment:171 follow-up: ↓ 172 Changed 9 years ago by
It built for me on 10.6.8 and all tests passed. What kind of errors should I be looking for?
comment:172 in reply to: ↑ 171 Changed 9 years ago by
Replying to roed:
It built for me on 10.6.8 and all tests passed.
That's totally wierd. I doesn't work on any OS X buildbot machine (there are 3 of them).
What kind of errors should I be looking for?
The following tests failed: sage -t --long -force_lib devel/sage/doc/de/tutorial/tour_advanced.rst # 1 doctests failed sage -t --long -force_lib devel/sage/doc/en/constructions/modular_forms.rst # 1 doctests failed sage -t --long -force_lib devel/sage/doc/en/bordeaux_2008/birds_other.rst # 1 doctests failed sage -t --long -force_lib devel/sage/doc/en/tutorial/tour_advanced.rst # 1 doctests failed sage -t --long -force_lib devel/sage/doc/fr/tutorial/tour_advanced.rst # 1 doctests failed sage -t --long -force_lib devel/sage/doc/ru/tutorial/tour_advanced.rst # 1 doctests failed sage -t --long -force_lib devel/sage/sage/matrix/matrix_modn_dense_template.pxi # 1 doctests failed sage -t --long -force_lib devel/sage/sage/matrix/matrix2.pyx # 2 doctests failed sage -t --long -force_lib devel/sage/sage/modular/buzzard.py # 1 doctests failed sage -t --long -force_lib devel/sage/sage/modular/abvar/abvar.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed sage -t --long -force_lib devel/sage/sage/modular/modform/ambient.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modform/eis_series.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modform/constructor.py # 1 doctests failed sage -t --long -force_lib devel/sage/sage/modular/modform/find_generators.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modform/hecke_operator_on_qexp.py # 2 doctests failed sage -t --long -force_lib devel/sage/sage/modular/modform/vm_basis.py # 2 doctests failed sage -t --long -force_lib devel/sage/sage/modular/modform/space.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modsym/ambient.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modsym/subspace.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/overconvergent/genus0.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/overconvergent/hecke_series.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/rings/qqbar.py # 1 doctests failed sage -t --long -force_lib devel/sage/sage/rings/number_field/number_field.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/rings/number_field/number_field_element.pyx # Killed/crashed sage -t --long -force_lib devel/sage/sage/rings/polynomial/complex_roots.py # 1 doctests failed sage -t --long -force_lib devel/sage/sage/rings/polynomial/polynomial_element.pyx # 1 doctests failed sage -t --long -force_lib devel/sage/sage/schemes/elliptic_curves/gal_reps.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/schemes/elliptic_curves/heegner.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # Killed/crashed sage -t --long -force_lib devel/sage/sage/modular/modform/element.py # Time out sage -t --long -force_lib devel/sage/sage/tests/book_stein_modform.py # 2 doctests failed
comment:173 Changed 9 years ago by
The first patch to the library doesn't apply properly over 5.7.beta0. Looking into it but I hope it won't need more changes for beta1.
comment:174 follow-up: ↓ 175 Changed 9 years ago by
All tests passed on top of 5.6.rc0 (which I had handy) on an iMAC core2 duo running 10.5.8. No clues what could go wrong on the build bots. This build used the gcc shipped with this particular version of sage.
comment:175 in reply to: ↑ 174 Changed 9 years ago by
Replying to fbissey:
All tests passed on top of 5.6.rc0 (which I had handy) on an iMAC core2 duo running 10.5.8. No clues what could go wrong on the build bots.
Perhaps you just have a lucky version? I could very well be that versions >= 10.6 have a problem and versions <= 10.4 a different problem.
This build used the gcc shipped with this particular version of sage.
The buildbot machines also.
comment:176 Changed 9 years ago by
But I'm having no trouble with 10.6.8. What version of OS X are the buildbot machines running?
comment:177 follow-up: ↓ 182 Changed 9 years ago by
May be the order things are done matters. I had a plain 5.6.rc0 on which I applied the patches. Then I updated flint. Finally I did sage -ba.
I assume the buildbots never built flint-1.5 and used a prepatched sage-root and sage spkg. I cannot prepare this particular test before tomorrow.
I actually cannot access my iMac before tomorrow (it's shutdown down and someone is borrowing it for video conferencing) but I am wondering if my build and David's still have an old dylib of the previous flint. The way flint is upgraded probably don't remove everything of flint 1.5/ It merely overwrite the common files and leave the other in place. If there is a versionned libflint-1.5.dylib it probably is still there.
Could you try to find if there are still bits of flint 1.5 on your install and remove them David?
comment:178 Changed 9 years ago by
The buildbots always do a complete install from source, so there was never a FLINT-1.5 with the buildbot tests. The OS X buildbots are 10.4 PCC, 10.6 x86_64, 10.8 x86_64.
comment:179 Changed 9 years ago by
OK my iMAC is 10.5 "x86_32" that may be part of it.
comment:180 Changed 9 years ago by
- Description modified (diff)
comment:181 Changed 9 years ago by
Rebased patch on top of 5.7.beta1
comment:182 in reply to: ↑ 177 ; follow-up: ↓ 183 Changed 9 years ago by
Replying to fbissey:
May be the order things are done matters. I had a plain 5.6.rc0 on which I applied the patches. Then I updated flint. Finally I did sage -ba.
I started with a freshly built 5.6, installed the spkg and patches and then did sage -b.
The way flint is upgraded probably don't remove everything of flint 1.5/ It merely overwrite the common files and leave the other in place. If there is a versionned libflint-1.5.dylib it probably is still there.
I ran "find $SAGE_ROOT -name *flint* -print," and the only versioned files I found were
./spkg/installed/flint-1.5.2.p2 ./spkg/installed/flint-2.3 ./spkg/logs/flint-1.5.2.p2.log ./spkg/logs/flint-2.3.log ./spkg/optional/flint-2.3.spkg ./spkg/standard/flint-1.5.2.p2.spkg
The .a, .o, .so and .dylib files were all unversioned:
./devel/sage-main/build/temp.macosx-10.6-x86_64-2.7/sage/libs/flint/flint.o ./local/lib/libflint.a ./local/lib/libflint.dylib
Could you try to find if there are still bits of flint 1.5 on your install and remove them David?
Are there any of the above that actually matter removing?
comment:183 in reply to: ↑ 182 Changed 9 years ago by
Replying to roed:
Replying to fbissey:
May be the order things are done matters. I had a plain 5.6.rc0 on which I applied the patches. Then I updated flint. Finally I did sage -ba.
I started with a freshly built 5.6, installed the spkg and patches and then did sage -b.
The way flint is upgraded probably don't remove everything of flint 1.5/ It merely overwrite the common files and leave the other in place. If there is a versionned libflint-1.5.dylib it probably is still there.
I ran "find $SAGE_ROOT -name *flint* -print," and the only versioned files I found were
./spkg/installed/flint-1.5.2.p2 ./spkg/installed/flint-2.3 ./spkg/logs/flint-1.5.2.p2.log ./spkg/logs/flint-2.3.log ./spkg/optional/flint-2.3.spkg ./spkg/standard/flint-1.5.2.p2.spkgThe .a, .o, .so and .dylib files were all unversioned:
./devel/sage-main/build/temp.macosx-10.6-x86_64-2.7/sage/libs/flint/flint.o ./local/lib/libflint.a ./local/lib/libflint.dylibCould you try to find if there are still bits of flint 1.5 on your install and remove them David?
Are there any of the above that actually matter removing?
It was an idea that I had but it turns out flint is indeed unversioned and the spkg tries to clean previous install before putting the new libs. So that's a dead end. No need to remove anything. I am trying with a build including flint 2.3 from scratch.
comment:184 Changed 9 years ago by
The build from scratch didn't behave any differently on my system, all tests pass. So is flint making weird assumptions about OS X on ppc and x86_64?
comment:185 Changed 9 years ago by
Flint has some inline assembly for x86, x86_64, arm and itanium, but nothing for ppc. You could try editing longlong.h to fall back to the generic code and see if that changes anything on x86_64.
The only way to get to the bottom of this is probably to take one of the failing tests and isolate the point in the code where results start to differ from a passing system...
comment:186 Changed 9 years ago by
I could put my hand on a Mac OS X 10.7 where the problems occur.
At least one of the really worrying things is as follow:
- converting an integer coefficients poly to a rational coeffs one kill all coeffs after any coeff whose size is larger than 64.stg bits.
- but they are not really killed as the poly is still aware of it degree and you can still test equality of the original int poly with the new rat poly and get True.
That's the problem behind all the factor() calls giving stupid results (and maybe other failing doctests).
There seems to be a problem in the call to fmpq_poly_set_fmpz_poly in polynomial_rational_flint.pyx at line 257.
comment:187 Changed 9 years ago by
It's quite funny, if I call fmpz_poly_print before entering fmpq_poly_set_fmpz_poly then its ok, but inside I get 0s instead of the right MPZ values (for things not stored directly). And when outside again, its ok.
And if I first create an fmpq_poly with small coeffs and then multipliy it and convert it to int coeffs and convert it back to rat coeffs its ok...
comment:188 follow-up: ↓ 189 Changed 9 years ago by
Thanks a bunch for looking at this, jp.
It sounds really bizarre. Just from looking at the code, I see no reason that it should fail.
What happens if you manually print the actual values of the fmpzs (as longs) in both the input fmpz_poly and the output fmpq_poly, and what does the output show?
Does reentrant mode help?
comment:189 in reply to: ↑ 188 Changed 9 years ago by
Replying to fredrik.johansson:
What happens if you manually print the actual values of the fmpzs (as longs) in both the input fmpz_poly and the output fmpq_poly, and what does the output show?
Inside fmpq_set_fmpz_poly, printing each (in fact Ive tried to bedug only the case a*x where a is either 1 or huge so "each" is "the two") fmpz independently using fmpz_print(op->coeffs+i) or through fmpz_print_poly gives the same result: I get 0 for what are actually mpz_t rather than (tweaked) longs (IIRC I don't have access to the Mac right now) which at least is consistent. What disturbing is that calling COEFF_IS_MPZ gives the correct answer inside the function call (i.e. 1 for the x coefficient when a is huge, although 0 is printed, and 0 for the constant coefficient which is 0).
I've checked the value of x._ _poly/ op is the same outside and inside the function call, and the address op->coeffs inside the function call looks fine.
Does reentrant mode help?
I'll try that tomorrow.
FYI I tried using the fallback routines in longlong.h and that did not solve the problem.
comment:190 follow-up: ↓ 191 Changed 9 years ago by
I was wondering what would show if you print them with printf("%ld", poly[i])
comment:191 in reply to: ↑ 190 ; follow-up: ↓ 194 Changed 9 years ago by
Using a reentrant version did not solve the problem... Replying to fredrik.johansson:
I was wondering what would show if you print them with printf("%ld", poly[i])
I modified FLINT as follow:
- in fmpz_poly.h:
static __inline__ int fmpz_poly_print(const fmpz_poly_t poly) { int i = 0; for (; i < fmpz_poly_length(poly); i++) { printf("exp: %d ", i); printf("MPZ: %d ", (COEFF_IS_MPZ(*(poly->coeffs+i)))?1:0); printf("addr: %lX ", (unsigned long) poly->coeffs+i); printf("val: %ld ", *(poly->coeffs+i)); printf("val: %lX ", *(poly->coeffs+i)); printf("fmpz: "); fmpz_print(poly->coeffs+i); printf(" !! "); } printf("\n"); return fmpz_poly_fprint(stdout, poly); }
- in fmpq_poly/set_fmpz_poly.c:
void fmpq_poly_set_fmpz_poly(fmpq_poly_t rop, const fmpz_poly_t op) { if (fmpz_poly_is_zero(op)) { fmpq_poly_zero(rop); } else { int i = 0; fmpz_poly_print(op); printf("\n"); fmpq_poly_fit_length(rop, fmpz_poly_length(op)); fmpz_poly_print(op); printf("\n"); _fmpq_poly_set_length(rop, fmpz_poly_length(op)); printf("length = %ld\n", fmpz_poly_length(op)); printf("id = %lX\n", (unsigned long)op); fmpz_poly_print(op); printf("\n"); for (; i < fmpz_poly_length(op); i++) { printf("exp: %d ", i); printf("MPZ: %d ", (COEFF_IS_MPZ(*(op->coeffs+i)))?1:0); printf("addr: %lX ", (unsigned long) op->coeffs+i); printf("val: %ld ", *(op->coeffs+i)); printf("val: %lX ", *(op->coeffs+i)); printf("fmpz: "); fmpz_print(op->coeffs+i); printf(" !! "); } printf("\n"); _fmpz_vec_set(rop->coeffs, op->coeffs, rop->length); fmpz_one(rop->den); } }
In Sage:
- in polynomial_integer_dense_flint.pyx:
def jprint(self): fmpz_poly_print(self.__poly)
- in polynomial_rational_flint.pyx
elif PY_TYPE_CHECK(x, Polynomial_integer_dense_flint): print hex(<long>((<Polynomial_integer_dense_flint>x).__poly)) printf("%X\n",(<Polynomial_integer_dense_flint>x).__poly) print x x.jprint() print "" (<Polynomial_integer_dense_flint>x).jprint() print "" fmpq_poly_set_fmpz_poly(self.__poly, (<Polynomial_integer_dense_flint>x).__poly)
Here is some data after modifying Sage and FLINT:
sage: a = 2**61 sage: b = 2**62 sage: c = 2**63 sage: R.<x> = ZZ[] sage: f = R([1,2,3,a,b,c,3,2,1]) sage: f.change_ring(QQ) 0x1152447d0 152447D0 x^8 + 2*x^7 + 3*x^6 + 9223372036854775808*x^5 + 4611686018427387904*x^4 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 length = 9 id = 1152447D0 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 exp: 0 MPZ: 0 addr: 7FE1D1919410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FE1D1919411 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FE1D1919412 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FE1D1919413 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FE1D1919414 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FE1D1919415 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FE1D1919416 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FE1D1919417 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FE1D1919418 val: 1 val: 1 fmpz: 1 !! x^8 + 2*x^7 + 3*x^6 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1
comment:192 follow-ups: ↓ 193 ↓ 198 Changed 9 years ago by
If I'm reading this right, it prints the fmpz_poly four times before even entering fmpq_poly, and it's wrong the third time, that is, when calling (<Polynomial_integer_dense_flint>x).jprint()?
But the fmpz coefficients are the same, so it is (seemingly) the entries in flint's global mpz array that are being zeroed when they shouldn't.
It smells a lot like one of the bugs we've had in the past with the non-reentrant mode, i.e., somewhere, some code loads an mpz pointer, and hangs on to it while the mpz array gets resized. It could potentially be triggered by a compiler bug which, say, makes it overlook constness somewhere.
You might be able to track down exactly where the mpz gets zeroed (say, by saving the pointer somewhere, and printing it in various places). Or, since the fmpz just encodes an index into the mpz array, you could perhaps edit the COEFF_TO_PTR macro to print the index, and find out the places where the particular indices of the mpzs in the coefficients of this polynomial are being accessed. It might also help to print where mpz array is being resized.
comment:193 in reply to: ↑ 192 Changed 9 years ago by
Replying to fredrik.johansson:
If I'm reading this right, it prints the fmpz_poly four times before even entering fmpq_poly, and it's wrong the third time, that is, when calling (<Polynomial_integer_dense_flint>x).jprint()?
Nope, the first wrong printing is the the first one inside fmpq_set_fmpz_poly. The first "print x" command at the Sage level corresponds to "x^{8 + 2*x}7 + 3*x^{6 + 92233...". }
But the fmpz coefficients are the same, so it is (seemingly) the entries in flint's global mpz array that are being zeroed when they shouldn't.
I'm not sure they really get zeroed. Ok it sounds fishy, but I'm able to print again f correctly after having done f.change_ring(QQ)...
I'll give the same piece of code a shot on Linux to see if anything looks different but I doubt it, then I'll try to watch the addresses with gdb.
comment:194 in reply to: ↑ 191 Changed 9 years ago by
Replying to jpflori: This line
void fmpq_poly_set_fmpz_poly(fmpq_poly_t rop, const fmpz_poly_t op) { .... printf("addr: %lX ", (unsigned long) op->coeffs+i); ....
was supposed to be
void fmpq_poly_set_fmpz_poly(fmpq_poly_t rop, const fmpz_poly_t op) { .... printf("addr: %lX ", (unsigned long) (op->coeffs+i)); ....
The additional one appearing sometimes there is strange:
0x1152447d0 152447D0 ... id = 1152447D0
Maybe some casting problems?
On Linux, I get the same value the three times.
comment:195 Changed 9 years ago by
My bad, its just because I used %X and on the Mac, the address is more than 4 bytes.
comment:196 Changed 9 years ago by
It seems there have been some problems when updating FLINT, with a flint directory created, but a FLINT one somehow still there.
I did not test reentrant mode correctly because there were multiple spaces before --reentrant and it was not parsed by FLINT configure.
comment:197 Changed 9 years ago by
And with reentrant mode, the problem goes away:
sage: a = 2**61 sage: b = 2**62 sage: c = 2**63 sage: R.<x> = ZZ[] sage: f = R([1,2,3,a,b,c,3,2,1]) sage: f.change_ring(QQ) 0x1118f67d0L 1118F67D0 x^8 + 2*x^7 + 3*x^6 + 9223372036854775808*x^5 + 4611686018427387904*x^4 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 length = 9 id = 1118F67D0 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 exp: 0 MPZ: 0 addr: 7FFA4B407410 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FFA4B407418 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FFA4B407420 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FFA4B407428 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FFA4B407430 val: 4611721196672653224 val: 40001FFE92D017A8 fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FFA4B407438 val: 4611721196672654616 val: 40001FFE92D01D18 fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FFA4B407440 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FFA4B407448 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FFA4B407450 val: 1 val: 1 fmpz: 1 !! x^8 + 2*x^7 + 3*x^6 + 9223372036854775808*x^5 + 4611686018427387904*x^4 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1
comment:198 in reply to: ↑ 192 Changed 9 years ago by
Replying to fredrik.johansson:
If I'm reading this right, it prints the fmpz_poly four times before even entering fmpq_poly, and it's wrong the third time, that is, when calling (<Polynomial_integer_dense_flint>x).jprint()?
But the fmpz coefficients are the same, so it is (seemingly) the entries in flint's global mpz array that are being zeroed when they shouldn't.
So that must be it.
It smells a lot like one of the bugs we've had in the past with the non-reentrant mode, i.e., somewhere, some code loads an mpz pointer, and hangs on to it while the mpz array gets resized. It could potentially be triggered by a compiler bug which, say, makes it overlook constness somewhere.
You might be able to track down exactly where the mpz gets zeroed (say, by saving the pointer somewhere, and printing it in various places). Or, since the fmpz just encodes an index into the mpz array, you could perhaps edit the COEFF_TO_PTR macro to print the index, and find out the places where the particular indices of the mpzs in the coefficients of this polynomial are being accessed. It might also help to print where mpz array is being resized.
I'll try this now.
comment:199 Changed 9 years ago by
I does not seem the global array gets resized. Ive put some printf lines in the if(fmpz_allocated) part of _fmpz_new_mpz and ot does not get triggered.
But somehow the fmpz_arr value changes...
sage: f.change_ring(QQ) 0x1170d57d0L 1170D57D0 x^8 + 2*x^7 + 3*x^6 + 9223372036854775808*x^5 + 4611686018427387904*x^4 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1 global array at 0x7FEB3C002A00 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 global array at 0x7FEB3C002A00 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 global array at 0x7FEB3BBCF200 global array at 0x7FEB3BBCF200 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 global array at 0x7FEB3BBCF200 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 length = 9 id = 1170D57D0 global array at 0x7FEB3BBCF200 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 exp: 0 MPZ: 0 addr: 7FEB3B304C80 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FEB3B304C88 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FEB3B304C90 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FEB3B304C98 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FEB3B304CA0 val: 4611686018427387966 val: 400000000000003E fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FEB3B304CA8 val: 4611686018427387965 val: 400000000000003D fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FEB3B304CB0 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FEB3B304CB8 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FEB3B304CC0 val: 1 val: 1 fmpz: 1 !! x^8 + 2*x^7 + 3*x^6 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1
comment:200 Changed 9 years ago by
But somehow it seems we get two global arrays:
********************************************************************** * * * Warning: this is a prerelease version, and it may be unstable. * * * ********************************************************************** Allocating fmpz_arr at 0x7FC702AC2600 Allocating fmpz_unused_arr at 0x7FC704123480 sage: a = 2**61 sage: b = 2**62 sage: c = 2**63 sage: R.<x> = ZZ[] sage: f = R([1,2,3,a,b,c,3,2,1]) Allocating fmpz_arr at 0x7FC703850000 Allocating fmpz_unused_arr at 0x7FC702366530 sage: f = R([1,2,3,a,b,c,3,2,1]) sage: f.jprint() global array at 0x7FC703850000 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2f.change_ring(QQ) 0x10dce0838L 10DCE0838 x^8 + 2*x^7 + 3*x^6 + 9223372036854775808*x^5 + 4611686018427387904*x^4 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1 global array at 0x7FC703850000 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 global array at 0x7FC703850000 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 4611686018427387904 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 9223372036854775808 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 4611686018427387904 9223372036854775808 3 2 1 global array at 0x7FC702AC2600 global array at 0x7FC702AC2600 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 global array at 0x7FC702AC2600 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 length = 9 id = 10DCE0838 global array at 0x7FC702AC2600 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! 9 1 2 3 2305843009213693952 0 0 3 2 1 exp: 0 MPZ: 0 addr: 7FC7026D2530 val: 1 val: 1 fmpz: 1 !! exp: 1 MPZ: 0 addr: 7FC7026D2538 val: 2 val: 2 fmpz: 2 !! exp: 2 MPZ: 0 addr: 7FC7026D2540 val: 3 val: 3 fmpz: 3 !! exp: 3 MPZ: 0 addr: 7FC7026D2548 val: 2305843009213693952 val: 2000000000000000 fmpz: 2305843009213693952 !! exp: 4 MPZ: 1 addr: 7FC7026D2550 val: 4611686018427387964 val: 400000000000003C fmpz: 0 !! exp: 5 MPZ: 1 addr: 7FC7026D2558 val: 4611686018427387963 val: 400000000000003B fmpz: 0 !! exp: 6 MPZ: 0 addr: 7FC7026D2560 val: 3 val: 3 fmpz: 3 !! exp: 7 MPZ: 0 addr: 7FC7026D2568 val: 2 val: 2 fmpz: 2 !! exp: 8 MPZ: 0 addr: 7FC7026D2570 val: 1 val: 1 fmpz: 1 !! x^8 + 2*x^7 + 3*x^6 + 2305843009213693952*x^3 + 3*x^2 + 2*x + 1 sage:
comment:201 Changed 9 years ago by
This is stranger than ever :-)
If you can figure out where the two arrays are coming from, it might just be the solution of the problem.
If nothing else works, perhaps we could just update flint to include the patch https://github.com/fredrik-johansson/flint2/commit/03487ed18f40f08cea7f5a71dcfd43cb6fd70115 but with the __thread
prefixes removed. (If my own benchmarking was correct, this will even make flint faster, too.)
comment:202 Changed 9 years ago by
Think I got it. Something does not like the dylib64 extension so some parts of Sage where linked with the static libflint.a, but some others with libflint.dylib64.
Ive copied libflint.dylib64 to liblint.dylib, touched everything in sage/libs/flint and sage/rings/polynomial and rebuilt Sage and could not reproduce the error.
All of this is because of me... https://groups.google.com/d/msg/flint-devel/Vy0PlkeEgf8/TbpSDTZBgg8J So keeping dylib64 extension from FLINT 1 series was not a good idea.
comment:203 Changed 9 years ago by
In fact I think I got confused between the make target and the extension of the library in FLINT 1. Sorry about that.
comment:204 Changed 9 years ago by
- Status changed from needs_work to needs_review
- Work issues OS X deleted
Updated spkg. Please test.
comment:205 Changed 9 years ago by
- Description modified (diff)
Patches corrected and rebased on top of 5.8.b2, hoping that b3 did not break them...
comment:206 Changed 9 years ago by
I've just reupped an spkg which actually apply the patch.
comment:207 Changed 9 years ago by
Now it's also clear why this problem affects only OS X systems.
comment:208 Changed 9 years ago by
This should be rebased on top of #14241 (expect for hg and SPKG.txt history this should be trivial).
comment:209 Changed 9 years ago by
And even on top of #14268
comment:210 Changed 9 years ago by
- Cc leif added
comment:211 follow-up: ↓ 212 Changed 9 years ago by
So, what needs to be done on this ticket? Is it just the rebase? It would be great to finally get this in Sage (#14304 depends on it for example).
comment:212 in reply to: ↑ 211 Changed 9 years ago by
Replying to roed:
So, what needs to be done on this ticket? Is it just the rebase? It would be great to finally get this in Sage (#14304 depends on it for example).
I'd say it "just" needs review now that the crap I introduced for OS X is fixed. John Cremona was already happy with the patches if that's relevant. See some comment above.
As far as rebasing is concerned, I did it not so long ago, but it might need another one unfortunately...
comment:213 follow-up: ↓ 214 Changed 9 years ago by
What has been done beyond the patches listed at comment 123? Anything besides OS X fixes and rebasing?
comment:214 in reply to: ↑ 213 Changed 9 years ago by
Replying to roed:
What has been done beyond the patches listed at comment 123? Anything besides OS X fixes and rebasing?
I don't think they really changed except for the fact they were merged together without the last of the list which was not such a good idea, or at least deserve more thoughts.
After comment 123 they were basically just rebased over and over, and at some point additional fixes were added by Jeroen, see http://trac.sagemath.org/sage_trac/ticket/12173?replyto=213#comment:153.
The spkg itself needs rebasing for sure, on top of #14241 and potentially #14268. I could surely do that next week. Don't know about the patches to the Sage library.
comment:215 Changed 9 years ago by
Ok I've reupped a rebased p0 spkg including the OS X fix and porting the ulong stuff from #14268. Not completely sure the ulong stuff is completely right, but it seems to pass all tests.
Spkg at:
Please review the new spkg and the Sage's library patches!
comment:216 Changed 9 years ago by
- Dependencies changed from #12433 to #12433, #14268
(Of course there may be some proper spkg rebasing hell because #14268 is not merged yet and so the flint spkg there is not tagged.)
comment:217 Changed 9 years ago by
- Description modified (diff)
comment:218 Changed 9 years ago by
Should it really depend on #14268?
Minor thing: spkg-check
: No newline at end of file.
comment:219 Changed 9 years ago by
I think spkg-install
still needs some clean-up (overwriting CFLAGS
, quoting, superfluous ;
:-) ... )
Haven't tested anything yet...
comment:220 Changed 9 years ago by
P.S.: For consistency, we could also support a user-definable FLINT_CONFIGURE
(and [append to and] use that instead of $EXTRA_CONF
).
comment:221 Changed 9 years ago by
Replying to leif:
Should it really depend on #14268?
I think the changes from #14268 make sense. Using one typedef rather than macros (and bunch of #undef in FLINT 2.3 all over the code to mitigate conflicts with system headers in FLINT itself) is a good thing. I'll report it upstream and that will hopefully change in a future minor release (also including the OS X fix).
So even though the patch for FLINT 2.3 is completely different from the one from #14268, the idea is from there and I think it's a good think maybe not to depend on the ticket itself, but at least to inherit from the spkg there. Therefore if we get this ticket positively reviewed before #14268 we could just remove the dependency on #14268 and the 1.5.2.p4 spkg from there. I don't think making #14268 on this ticket would be fair.
Minor thing:
spkg-check
: No newline at end of file.
I'll fix that.
comment:222 Changed 9 years ago by
Ok I've reupped a cleaner spkg. In fact most of the spkg-* fixes were in a previous version, but I must have mixed up some versions when working on the OS X fix and was too lazy to double check that.
comment:223 Changed 9 years ago by
FWIW, I don't see how you compile FLINT as C99.
comment:224 follow-up: ↓ 225 Changed 9 years ago by
I think we don't...
Do you suggest we keep the "#define ulong unsigned long"? I don't really care.
comment:225 in reply to: ↑ 224 Changed 9 years ago by
Replying to jpflori:
Do you suggest we keep the "#define ulong unsigned long"? I don't really care.
I think we should [better?] coordinate that with upstream (probably including changes to the weird configure
and Makefile.in
, too -- haven't looked at FLINT's current state recently at all).
comment:226 follow-up: ↓ 228 Changed 9 years ago by
I cannot review the Sage library patch, "since the file size exceeds 262144 bytes"... ;-)
comment:227 Changed 9 years ago by
Replying to leif:
Replying to jpflori:
Do you suggest we keep the "#define ulong unsigned long"? I don't really care.
I think we should [better?] coordinate that with upstream (probably including changes to the weird
configure
andMakefile.in
, too -- haven't looked at FLINT's current state recently at all).
You can just propose whatever changes you want to include here, I'm in close contact with upstream, especially for the build system, so I'll take care of coordination. I agree there is some fishy things going on with FLINT_TUNE and CFLAGS and surely other things.
And don't say "let's switch to autotools" unless you wanna be in charge of the build system maintenance as Bill Hart will definitely not want to maintain it.
Be also aware there won't be any FLINT release before at least the dev meeting in may. So I think we should just make minor modifications and get this in quickly. It's been bitrotting for so long... I'll up another spkg based on #14241 not touching the ulong stuff.
comment:228 in reply to: ↑ 226 Changed 9 years ago by
Replying to leif:
I cannot review the Sage library patch, "since the file size exceeds 262144 bytes"... ;-)
The big patch bomb is modulo rebasing issues made from the patches at
plus fixes by Jeroen from
comment:229 Changed 9 years ago by
Ok so we have now:
- http://boxen.math.washington.edu/home/jpflori/flint-2.3.p0.spkg.14241 based on #14241, and mirrored at http://boxen.math.washington.edu/home/jpflori/flint-2.3.p0.spkg,
- http://boxen.math.washington.edu/home/jpflori/flint-2.3.p0.spkg.14268 based on #14268.
Rename the one you prefer and give it a try.
comment:230 Changed 9 years ago by
- Dependencies changed from #12433, #14268 to #12433
comment:231 Changed 9 years ago by
Replying to jpflori:
Replying to leif:
Replying to jpflori:
Do you suggest we keep the "#define ulong unsigned long"? I don't really care.
for f in $SOURCES; do sed -i 's/ulong/mp_limb_t/g' $f; done
;-)
I think we should [better?] coordinate that with upstream (probably including changes to the weird
configure
andMakefile.in
, too -- haven't looked at FLINT's current state recently at all).You can just propose whatever changes you want to include here, I'm in close contact with upstream, especially for the build system, so I'll take care of coordination. I agree there is some fishy things going on with FLINT_TUNE and CFLAGS and surely other things.
And don't say "let's switch to autotools" unless you wanna be in charge of the build system maintenance as Bill Hart will definitely not want to maintain it.
:-) Yes. I just meant I don't know if there are any changes in progress w.r.t. this.
It doesn't seem we'd need any Sage-specific changes, so IMHO we should take fixes from upstream, or push (generic) changes there.
Be also aware there won't be any FLINT release before at least the dev meeting in may.
Yes. But FLINT 2.3.1 should (already) include any fixes we (and others) need, see above.
comment:232 Changed 9 years ago by
Replying to leif:
Replying to jpflori:
Replying to leif:
Replying to jpflori:
Do you suggest we keep the "#define ulong unsigned long"? I don't really care.
for f in $SOURCES; do sed -i 's/ulong/mp_limb_t/g' $f; done
;-)
I think we should [better?] coordinate that with upstream (probably including changes to the weird
configure
andMakefile.in
, too -- haven't looked at FLINT's current state recently at all).You can just propose whatever changes you want to include here, I'm in close contact with upstream, especially for the build system, so I'll take care of coordination. I agree there is some fishy things going on with FLINT_TUNE and CFLAGS and surely other things.
And don't say "let's switch to autotools" unless you wanna be in charge of the build system maintenance as Bill Hart will definitely not want to maintain it.
:-) Yes. I just meant I don't know if there are any changes in progress w.r.t. this.
Not that I am aware of, so unless Bill posts here and rants we all should assume so.
It doesn't seem we'd need any Sage-specific changes, so IMHO we should take fixes from upstream, or push (generic) changes there.
Be also aware there won't be any FLINT release before at least the dev meeting in may.
Yes. But FLINT 2.3.1 should (already) include any fixes we (and others) need, see above.
Of course, its just I'm getting tired by this ticket and don't want to wait for the 2.3.1 release which at best will happen in mid-may and then all the stuff here will have to be rebased again... And that's optimistic because I guess most people at the FLINT meeting will expect to work with FLINT 2.3 and this spkg and will implement new features in FLINT/Sage, rather than working on the FLINT build system and trying to push a new release.
So I feel our best shot is to get http://boxen.math.washington.edu/home/jpflori/flint-2.3.p0.spkg.14241 in as it just modifies flint-2.3 minimally to make it work on OS X. Let's keep the C99, strange env variables fun for a later FLINT release/Sage spkg if its not definitely needed to let Sage work on supported systems.
comment:233 Changed 9 years ago by
Ok, too many uploads today, it seems the last version I posted are broken. Please wait for a couple of minutes so that I fix that.
comment:234 Changed 9 years ago by
Should be ok now. At least I've checked the two spkgs seem to build and pass their testsuite on my system.
comment:235 Changed 9 years ago by
- Description modified (diff)
comment:236 follow-up: ↓ 238 Changed 9 years ago by
Works fine on the buildbot.
comment:237 follow-up: ↓ 239 Changed 9 years ago by
As for ulong
: I think the correct solution is
typedef unsigned long ulong;
instead of #define
. This typedef
is compatible with the common C libraries. Of course, the workaround currently in zn_poly
would need to be removed too.
comment:238 in reply to: ↑ 236 ; follow-up: ↓ 240 Changed 9 years ago by
Replying to jdemeyer:
Works fine on the buildbot.
I've also tested it with Sage 5.9.beta2 on Linux x86 (GCC 4.7.2, with '-O3') and Linux x86_64 (GCC 4.8.0, with '-O3') without any issues, i.e., patches apply, sage -b
works as expected (all modules that need rebuilding get rebuilt -- no need to brute-force sage -ba
), and all long tests ([p]testlong) passed.
I haven't reviewed the 8000+ lines patch though...
comment:239 in reply to: ↑ 237 Changed 9 years ago by
Replying to jdemeyer:
As for
ulong
: I think the correct solution istypedef unsigned long ulong;instead of
#define
. Thistypedef
is compatible with the common C libraries. Of course, the workaround currently inzn_poly
would need to be removed too.
Well, the disadvantage of typedef
s (vs. #define
s) is always that there's no way to test (with the preprocessor) whether a type is already (type
)def'd (other than checking such in configure
and defining HAVE_TYPEDEF_FOO
etc.).
As mentioned elsewhere, in the present cases it would IMHO be cleaner to use mp_limb_t
anyway (which of course isn't that much shorter or easier to type than unsigned long
).
comment:240 in reply to: ↑ 238 Changed 9 years ago by
Replying to leif:
I've also tested it with Sage 5.9.beta2 on Linux x86 (GCC 4.7.2, with '-O3') and Linux x86_64 (GCC 4.8.0, with '-O3') without any issues, i.e., patches apply,
sage -b
works as expected (all modules that need rebuilding get rebuilt -- no need to brute-forcesage -ba
), and all long tests ([p]testlong) passed.
Oh, and of course FLINT 2.3's test suite passed in both cases as well; I just noticed there's no message regarding that (and when running the test suite in parallel, the output isn't synced, i.e., you may get foo...bar...
and a couple of PASS
s where it's not obvious what they refer to, which isn't really a problem until a test fails).
comment:241 follow-ups: ↓ 242 ↓ 243 Changed 9 years ago by
About the typedef thing, I propose we wait until the FLINT meeting in early may which I should attend. I'll discuss it there with Bill Hart.
But if you think its really an issue and want #14268 merged first (or at the same time, I mean going from define to typedef in FLINT whether its 1.x or 2.x), feel free to discuss it on flint-devel group. You can then use the spkg a few comments above I based on the one from #14268.
And it's know that the output of tests are out of sync when run in parallel, but before they were not even run in parallel, so I think the situation is better now, although not optimal.
comment:242 in reply to: ↑ 241 Changed 9 years ago by
Replying to jpflori:
About the typedef thing, I propose we wait until the FLINT meeting in early may which I should attend. I'll discuss it there with Bill Hart.
But if you think its really an issue and want #14268 merged first (or at the same time, I mean going from define to typedef in FLINT whether its 1.x or 2.x), feel free to discuss it on flint-devel group. You can then use the spkg a few comments above I based on the one from #14268.
I think we should postpone that to a follow-up ticket, i.e., get FLINT 2.3 in as is first.
And it's know that the output of tests are out of sync when run in parallel, but before they were not even run in parallel, so I think the situation is better now, although not optimal.
:-) Yes, but adding (back?) a message to spkg-check
in case all tests passed wouldn't be bad, nor hard...
The end of the log currently looks like this:
... g++-4.8.0 -march=native -mtune=native -O3 -fno-strict-aliasing -DHONORS_CFLAGS -funroll-loops -I${SAGE_ROOT}/spkg/build/flint-2.3.p0/src -I${SAGE_ROOT}/local/include -I${SAGE_ROOT}/local/include -I${SAGE_ROOT}/local/include interfaces/test/t-NTL-interface.cpp build/interfaces/NTL-interface.o -o build/interfaces/test/t-NTL-interface -L${SAGE_ROOT}/spkg/build/flint-2.3.p0/src -L${SAGE_ROOT}/local/lib -L${SAGE_ROOT}/local/lib -L${SAGE_ROOT}/local/lib -lflint -lntl -lmpfr -lmpir -lm -lpthread; make[1]: Leaving directory `${SAGE_ROOT}/spkg/build/flint-2.3.p0/src' ZZ_to_fmpz....PASS ZZX_to_fmpz_poly....PASS real 57m16.835s user 50m49.950s sys 1m56.670s Deleting temporary build directory ${SAGE_ROOT}/spkg/build/flint-2.3.p0 Finished installing flint-2.3.p0.spkg
comment:243 in reply to: ↑ 241 ; follow-up: ↓ 244 Changed 9 years ago by
Replying to jpflori:
About the typedef thing, I propose we wait until the FLINT meeting in early may which I should attend. I'll discuss it there with Bill Hart.
I agree. The main argument in favour of typedef
is that some other libraries (including glibc
, which should be a strong precedent and zn_poly
) also use typedef
. And it's not a problem to have the same typedef
twice.
comment:244 in reply to: ↑ 243 Changed 9 years ago by
Replying to jdemeyer:
The main argument in favour of
typedef
is that some other libraries (includingglibc
, which should be a strong precedent andzn_poly
) also usetypedef
. And it's not a problem to have the sametypedef
twice.
typedef unsigned long ulong; typedef unsigned long ulong;
gives an error (redefinition of ulong) at least with earlier versions of GCC. clang
is more verbose:
typedef.c:3:23: warning: redefinition of typedef 'ulong' is a C11 feature [-Wtypedef-redefinition] typedef unsigned long ulong; ^ typedef.c:1:23: note: previous definition is here typedef unsigned long ulong; ^ 1 warning generated.
comment:245 Changed 9 years ago by
GCC 4.6.3 warns with -pedantic
:
typedef.c:2:23: warning: redefinition of typedef 'ulong' [-pedantic] typedef.c:1:23: note: previous declaration of 'ulong' was here
Without -pedantic
(even with -Wall -Wextra
), there is no warning or error.
comment:246 Changed 9 years ago by
GCC 4.4.3 errors out, regardless what -std
is active.
comment:247 follow-up: ↓ 248 Changed 9 years ago by
But g++ always accepts it, this is legal in C++.
comment:248 in reply to: ↑ 247 Changed 9 years ago by
Replying to jdemeyer:
But g++ always accepts it, this is legal in C++.
... while both FLINT and zn_poly are (mostly) written in C, not C++. ;-)
comment:249 follow-up: ↓ 250 Changed 9 years ago by
comment:250 in reply to: ↑ 249 Changed 9 years ago by
Replying to jdemeyer:
Some good read: http://stackoverflow.com/questions/8594954/repeated-typedefs-invalid-in-c-but-valid-in-c
So the optimal solution is
#define typedef please typedef ... typedef unsigned long ulong;
comment:251 Changed 9 years ago by
Or maybe, a bit safer:
#ifndef typedef #define typedef please typedef #endif ...
comment:252 Changed 9 years ago by
- Milestone changed from sage-5.9 to sage-5.10
comment:253 in reply to: ↑ 161 Changed 9 years ago by
Reminder:
Replying to jdemeyer:
For easier reviewing, it would really be good to split up the big patchbomb into a "rename methods" patch (e.g. from zmod to nmod) which does only that and should be automatically generated and a "remainder" patch.
comment:254 Changed 9 years ago by
- Dependencies changed from #12433 to #12433, #14393
- Status changed from needs_review to needs_work
This should be rebased to #14393.
comment:255 in reply to: ↑ 162 Changed 9 years ago by
Replying to jpflori:
I don't really feel like resplitting everything back now...
But you expect a reviewer to do it... not me.
comment:256 follow-up: ↓ 258 Changed 9 years ago by
As I pointed above before, all the content of the patchbomb is from:
except that:
- the last patch from there is omitted in the merged patchbomb;
- the patch bomb was rebased several times;
- one or two newly introduced doctests had to be changed;
- your fixes from http://trac.sagemath.org/sage_trac/ticket/12173?replyto=213#comment:153 were included.
But if you feel we must really have the latest version resplitted to do a proper review, I'll give it a try when rebasing once more against #14393.
comment:257 Changed 9 years ago by
- Work issues set to rebasing, split patches back
comment:258 in reply to: ↑ 256 Changed 9 years ago by
Replying to jpflori:
But if you feel we must really have the latest version resplitted to do a proper review
You don't need to resplit everything as before. The problem is that a large part of this patch is simply renaming stuff. This should be done automatically (e.g. with a sed
command).
That's the minimal thing that I require: a script which does the automatic parts of the patch, so I can review that script instead of its output (which is much more).
comment:259 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
- Work issues rebasing, split patches back deleted
Ok here you go. At least Sage builds correctly, I've just launched make ptestlong.
comment:260 Changed 9 years ago by
- Description modified (diff)
comment:261 Changed 9 years ago by
Remainder: the current spkg does not take #14268 into account, anyway it is not merged yet, so could rather be rebased on top of this ticket. For an spkg based on #14268, look at the links at comment 229: http://trac.sagemath.org/sage_trac/ticket/12173#comment:229.
comment:262 Changed 9 years ago by
Hum, there is something wrong. Having a look at it right now.
comment:263 Changed 9 years ago by
(when testing cython.py)
comment:264 Changed 9 years ago by
The last version of script.sh does not delete fmpq_poly.h...
comment:265 Changed 9 years ago by
In fact that's not the good script... Reuploading everything in a few minutes.
comment:266 Changed 9 years ago by
- Description modified (diff)
comment:267 Changed 9 years ago by
I've upped nmod_poly.patch which gives a more readable view of the changes in nmod_poly.pxd than what mercurial spits out in trac_12173-fixes-v4.patch.
comment:268 Changed 9 years ago by
"make ptestlong" went fine on a 64 bits Ubuntu 12.04.2 install (except for two unrelated files, cachefunc.py wich passes when run afterward and gen_interpreters.py which fails because of NFS timestamps issues).
comment:269 Changed 9 years ago by
Will this package make it possible to drop the flintqs spkg too?
comment:270 follow-up: ↓ 271 Changed 9 years ago by
Everything went fine on a fresh 5.9.beta5 build with the patched ecl package as at #14426 (which has been merged into 5.9.rc0 now). I did not do make ptestlong, but I did do sage -tp10 --long on the whole library and had no failures.
I don't know the answer to the question about flintqs.
comment:271 in reply to: ↑ 270 Changed 9 years ago by
Replying to cremona:
I don't know the answer to the question about flintqs.
Nor do I. Latest info I remember is from:
- https://groups.google.com/d/msg/flint-devel/lmRdEPV6el8/TjHd0NzSOWsJ
- https://groups.google.com/d/msg/flint-devel/lmRdEPV6el8/JNbZ4eTy_n0J
but it seems Julien is aware of that.
Anyway, this should be dealt with at #11792.
comment:272 Changed 9 years ago by
Just to be clear: these three patches when applied should result in exactly the same end result as the original flint-2.3-library-5.9.b2.patch, right?
comment:273 Changed 9 years ago by
No.
comment:274 Changed 9 years ago by
But functionally yes.
Here is what I did:
- look thru the original patch and write the sed script;
- remove the scripted part from the original part by hand;
- remove the purely doctests, doc, parts of the original part and put them apart;
- fix the punctured original patch by hand, and especially, completely reorder the nmod_poly.pxd part so that it can be read by a human;
- remark some other parts could be scripted and go back to the first step until Sage builds and passes its testsuite.
comment:275 Changed 9 years ago by
Slightly updated. Now the three patches together are exactly the same as the original, except for changes in module_list.py
(which needed rebasing anyway) and except for the CRLF line endings.
comment:276 Changed 9 years ago by
- Description modified (diff)
comment:277 Changed 9 years ago by
- Description modified (diff)
comment:278 Changed 9 years ago by
Sorry, I was wrong. The three patches aren't the same as the original, there is an additional move of zmod_poly.pxd
to nmod_poly.pxd
and similar for zmod_poly_linkage.pxi
.
comment:279 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_review to needs_info
Except for the renaming of the two zmod_ files to nmod_, except for the CRLF line endings and except for module_list.py
, the first three patches correspond the the previous split-in-3 patches, while the four patches (with trac_12173-extra.patch) correspond to the original all-in-one patch.
Therefore, the file trac_12173-extra.patch contains some unexplained changes which were in the original all-in-one patch but not in jpflori's 3 patches. This needs to be clarified.
comment:280 Changed 9 years ago by
- Reviewers changed from John Cremona to John Cremona, Jeroen Demeyer
comment:281 follow-up: ↓ 283 Changed 9 years ago by
Indeed, some pieces went missing or were inadvertently added.
- sage/ext/gen_interpreters.py, spurious case change not fixed in the "fixes" patch, so this hunk is needed;
- sage/groups/perm_gps/partn_ref/data_structures_pxd.pxi, I missed this one, we should modify "s|sage\.libs\.flint\.long_extras|sage.libs.flint.ulong_extras|g" in the script by omitting the "sage.libs" part
- sage/libs/flint/fmpq_poly.pxd, ignore this hunk, it just adds a spurious newline
- sage/libs/flint/fmpz_poly.pxi, the first hunk is needed, not sure where it went; the second hunk is more or less moving lines, in fact we could even order the lines better; the third hunk was plain wrong and now defines twice "fmpz_poly_scalar_fdiv_ui", so the solution is to remove that line completely.
- sage/libs/flint/ntl_interface.pxd, should be applied; ZZ_limbs does not exists anymore, and in fact the previous declaration of fmpz_to_mpz here was misplaced... even though fmpz_get_ZZ is unused, its place is here so the all in one patch is right.
- sage/modular/modform/eis_series.py, discard this hunk, we forgot to rename a zmod to nmod before (within a comment).
An additional change would make sense:
- sage/libs/flint/fmpz_poly.pxi: replace "fmpz_t fmpz_poly_get_coeff_ptr" by "fmpz* fmpz_poly_get_coeff_ptr", and maybe remove the commented out line (fmpz_poly_scalar_mul), cleanup spaces
Changed 9 years ago by
Changed 9 years ago by
Changed 9 years ago by
comment:282 Changed 9 years ago by
- Description modified (diff)
Removed trailing whitespace from all patches.
comment:283 in reply to: ↑ 281 ; follow-ups: ↓ 284 ↓ 285 Changed 9 years ago by
Replying to jpflori:
- sage/groups/perm_gps/partn_ref/data_structures_pxd.pxi, I missed this one, we should modify "s|sage\.libs\.flint\.long_extras|sage.libs.flint.ulong_extras|g" in the script by omitting the "sage.libs" part
Done.
comment:284 in reply to: ↑ 283 Changed 9 years ago by
Replying to jdemeyer:
Replying to jpflori:
- sage/groups/perm_gps/partn_ref/data_structures_pxd.pxi, I missed this one, we should modify "s|sage\.libs\.flint\.long_extras|sage.libs.flint.ulong_extras|g" in the script by omitting the "sage.libs" part
Done.
Don't you mind taking care of the changes I proposed? I also started modifying the script and the fixes patch, but let's not duplicate work.
comment:285 in reply to: ↑ 283 ; follow-up: ↓ 288 Changed 9 years ago by
Replying to jdemeyer:
Replying to jpflori:
- sage/groups/perm_gps/partn_ref/data_structures_pxd.pxi, I missed this one, we should modify "s|sage\.libs\.flint\.long_extras|sage.libs.flint.ulong_extras|g" in the script by omitting the "sage.libs" part
Done.
Nevermind, that won't fix the problem. We should rather add "s|flint/long_extras\.h|flint/ulong_extras\.h|g". I'm taking care of that and other fixes now.
comment:286 Changed 9 years ago by
Oops, me too.
Changed 9 years ago by
Changed 9 years ago by
comment:287 Changed 9 years ago by
OK, not making further changes.
I removed some hunks and, for reference, I put all removed hunks in undo.patch. So this should not be applied, it contains stuff from the original all-in-one patch which are undone.
comment:288 in reply to: ↑ 285 Changed 9 years ago by
Replying to jpflori:
We should rather add "s|flint/long_extras\.h|flint/ulong_extras\.h|g".
That's essentially the change I made, it works.
comment:289 Changed 9 years ago by
- Description modified (diff)
comment:290 Changed 9 years ago by
Your patches actually undid some of the changes that I made, please fix this. For example, I don't think it is needed to change the script.
comment:291 Changed 9 years ago by
Indeed, I did not have the latest version of your script...
comment:292 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_info to needs_review
Should be fixed now, I'm getting lost on what versions of everything I actually use now.
Apart from the suggestion you had not inlcuded yet, I only modified the "fixes" patch including new cosmetic changes to "fmpz_poly.pxi". The v6 is based on the v5 you posted (and that I hopefully got the right version of and did not mess with some versions I crafted earlier).
comment:293 Changed 9 years ago by
I also remove a blank line at the top of sage/groups/perm_gps/partn_ref/data_structures_pxd.pxi and break a long line in fmpq_poly.pxd. That should be all.
I did not split long lines in nmod_poly.pxd, there are too many of them and that would not be so useful and the patch would be even less readable, so unless someone really want that we conform strictly to our coding guidelines in that file, I won't do it.
comment:294 Changed 9 years ago by
There is still something wrong, your trac_12173-fixes-v6.patch doesn't actually apply cleanly on top of trac_12173-script-v7.patch (probably because I removed some trailing white space).
comment:295 Changed 9 years ago by
I think I know how to fix the merge conflict, but I give you a chance to fix it first.
Changed 9 years ago by
comment:296 Changed 9 years ago by
I think the patch was not copied onto my USB key. This one should be ok.
comment:297 Changed 9 years ago by
OK, now let's continue the actual review...
comment:298 follow-up: ↓ 299 Changed 9 years ago by
jpflori: please confirm that you agree to not apply the stuff in undo.patch
comment:299 in reply to: ↑ 298 Changed 9 years ago by
Replying to jdemeyer:
jpflori: please confirm that you agree to not apply the stuff in undo.patch
Yes, I do.
Changed 9 years ago by
comment:300 follow-ups: ↓ 301 ↓ 302 Changed 9 years ago by
- Status changed from needs_review to needs_info
The following is quite strange:
Configuring FLINT. Configuring...x86_64-Linux Testing __builtin_popcountl...yes Testing native popcount..../configure : ligne 315 : 8361 Instruction non permise build/test-popcnt > /dev/null 2>&1 no Building FLINT shared library.
The configure script *dies*, but the result is still zero, so spkg-install still starts to compile!
Notice that the death of the configure script is present in upstream's 2.3 and today's git checkout, so it's not sage-specific.
I have tried to get more information about it : on x86_64, compiling with -mpopcnt works, but running the resulting test file raises an illegal instruction... which kills the configure script without error (!).
I still haven't reported upstream because I saw the problem while testing the spkg in this report. You'll have to actually look at your spkg/logs/flint-2.3.p0.log to see the problem, because the package compiles perfectly.
comment:301 in reply to: ↑ 300 Changed 9 years ago by
- Status changed from needs_info to needs_review
Replying to Snark:
The configure script *dies*
Are you sure about this? I'm pretty sure it's not the configure
script which dies, but a subprogram started by configure
to test popcount. The test-popcnt
program dies a horrible death and configure
correctly concludes (note the "no") there is no popcount support.
comment:302 in reply to: ↑ 300 Changed 9 years ago by
Replying to Snark:
The following is quite strange:
Configuring FLINT. Configuring...x86_64-Linux Testing __builtin_popcountl...yes Testing native popcount..../configure : ligne 315 : 8361 Instruction non permise build/test-popcnt > /dev/null 2>&1 no Building FLINT shared library.
What does bash --version
say?
comment:303 Changed 9 years ago by
Confirmed with GNU bash, version 4.2.37(1)-release (x86_64-pc-linux-gnu)
But really, the bug is just that
./configure : ligne 315 : 8361 Illegal instruction build/test-popcnt > /dev/null 2>&1
is displayed.
Imagine the output simply says
Configuring FLINT. Configuring...x86_64-Linux Testing __builtin_popcountl...yes Testing native popcount... no Building FLINT shared library.
comment:304 Changed 9 years ago by
Oh, indeed! This is in fact the last test, so it's normal that no more tests are shown. I'll perhaps write and propose a patch to upstream to make the configure script less noisy about this error, and more noisy about the fact that it successfully generated a few files.
I'll just review this ticket from the start now that is settled.
comment:305 follow-up: ↓ 307 Changed 9 years ago by
- Status changed from needs_review to needs_info
In the doctests-v2 patch, I notice there are comments about "todo: missing auto normalization" ; but isn't the chunk about normalizing, in which case the comments should go too?
The patches don't apply to sage-5.8 ; the description didn't mention I was to use something else than the latest stable version.
comment:306 Changed 9 years ago by
- Dependencies changed from #12433, #14393 to sage-5.9.beta5
comment:307 in reply to: ↑ 305 ; follow-up: ↓ 308 Changed 9 years ago by
Replying to Snark:
The patches don't apply to sage-5.8 ; the description didn't mention I was to use something else than the latest stable version.
Well, the patches' names tell you what they're based on.
comment:308 in reply to: ↑ 307 Changed 9 years ago by
comment:309 Changed 9 years ago by
- Status changed from needs_info to needs_review
I think the auto normalizaiton thing has nothing to do with FLINT, it something with the Sage code used.
The fact that the output of the doctests slightly changed, is because FLINT now deals with fractions a little differently and is better at normalizing the output (less minus signs), but that's not the same normalization as the one quoted in the comment I think.
comment:310 follow-up: ↓ 313 Changed 9 years ago by
- Status changed from needs_review to needs_work
Unfortunately, there is a problem on OS X 10.4 PPC:
Traceback (most recent call last): File "./spkg-install", line 4, in <module> from sage.all import save File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/all.py", line 81, in <module> from sage.misc.all import * # takes a while File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/misc/all.py", line 87, in <module> from functional import (additive_order, File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/misc/functional.py", line 36, in <module> from sage.rings.complex_double import CDF File "integer.pxd", line 9, in init sage.rings.complex_double (sage/rings/complex_double.c:16711) ImportError: dlopen(/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/rings/integer.so, 2): Symbol not found: _n_factor Referenced from: /Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/rings/integer.so Expected in: dynamic lookup
comment:311 follow-up: ↓ 312 Changed 9 years ago by
Here are some comments after reading attachment:trac_12173-fixes-v6.patch.
I don't understand this hunk:
diff --git a/sage/libs/flint/fmpz_poly.pyx b/sage/libs/flint/fmpz_poly.pyx --- a/sage/libs/flint/fmpz_poly.pyx +++ b/sage/libs/flint/fmpz_poly.pyx @@ -53,7 +53,7 @@ cdef long c cdef Integer w if PY_TYPE_CHECK(v, str): - if fmpz_poly_set_str(self.poly, v): + if not fmpz_poly_set_str(self.poly, v): return else: raise ValueError, "Unable to create Fmpz_poly from that string."
nmod_poly_pow(..., 2)
should be used below instead of nmod_poly_mul()
.
diff --git a/sage/libs/flint/nmod_poly_linkage.pxi b/sage/libs/flint/nmod_poly_linkage.pxi --- a/sage/libs/flint/nmod_poly_linkage.pxi +++ b/sage/libs/flint/nmod_poly_linkage.pxi @@ -496,7 +495,7 @@ elif e == 1: nmod_poly_set(res, x) elif e == 2: - nmod_poly_sqr(res, x) + nmod_poly_mul(res, x, x) else: if res == x: nmod_poly_set(tmp, x) @@ -510,7 +509,7 @@ nmod_poly_set_coeff_ui(res, 0, 1) e = e >> 1 while(e != 0): - nmod_poly_sqr(pow2, pow2) + nmod_poly_mul(pow2, pow2, pow2) if e % 2: nmod_poly_mul(res, res, pow2) e = e >> 1
This seems to remove the fast exit path:
diff --git a/sage/rings/fraction_field_FpT.pyx b/sage/rings/fraction_field_FpT.pyx --- a/sage/rings/fraction_field_FpT.pyx +++ b/sage/rings/fraction_field_FpT.pyx @@ -688,26 +688,34 @@ """ if nmod_poly_degree(self._numer) == -1: return self - if not nmod_poly_sqrt_check(self._numer) or not nmod_poly_sqrt_check(self._denom): - return None
Fredrik says that FLINT also does the same checks in nmod_poly_sqrt()
, but some memory allocations are performed by the time we get to that point. If the input to this function is not a perfect square most of the time, then IMHO removing these checks will introduce a performance regression.
In the same hunk as above:
+ # Make denominator monic + a = nmod_poly_leading(denom) + if a != 1: + a = mod_inverse_int(a, self.p) + nmod_poly_scalar_mul_nmod(numer, numer, a) + nmod_poly_scalar_mul_nmod(denom, denom, a) + # Choose numerator with smaller leading coefficient + a = nmod_poly_leading(numer) + if a > self.p - a: + nmod_poly_neg(numer, numer)
Is this necessary? If the input is normalized, the denominator will have 1 as a leading coefficient anyway. If we really want to normalize, the normalize()
method in the same file should be used.
_nmod_poly_set_coeff_ui()
was removed from FLINT2. Removing the underscore introduces a potential branch for each function call. Shall we implement our own underscore version?
diff --git a/sage/rings/polynomial/polynomial_zmod_flint.pyx b/sage/rings/polynomial/polynomial_zmod_flint.pyx --- a/sage/rings/polynomial/polynomial_zmod_flint.pyx +++ b/sage/rings/polynomial/polynomial_zmod_flint.pyx @@ -157,17 +170,47 @@ <snip> + sig_on() for i from 0 <= i < length: - _nmod_poly_set_coeff_ui(&self.x, i, l_in[i]) - # we need to set the length manually, we used _nmod_poly_set_coeff_ui - self.x.length = length + nmod_poly_set_coeff_ui(&self.x, i, l_in[i]) + sig_off() + return 0
comment:312 in reply to: ↑ 311 Changed 9 years ago by
Replying to burcin:
Here are some comments after reading attachment:trac_12173-fixes-v6.patch.
I don't understand this hunk:
diff --git a/sage/libs/flint/fmpz_poly.pyx b/sage/libs/flint/fmpz_poly.pyx --- a/sage/libs/flint/fmpz_poly.pyx +++ b/sage/libs/flint/fmpz_poly.pyx @@ -53,7 +53,7 @@ cdef long c cdef Integer w if PY_TYPE_CHECK(v, str): - if fmpz_poly_set_str(self.poly, v): + if not fmpz_poly_set_str(self.poly, v): return else: raise ValueError, "Unable to create Fmpz_poly from that string."
Indeed, this looks wrong. This also shows this function does not seem to be well tested.
nmod_poly_pow(..., 2)
should be used below instead ofnmod_poly_mul()
.diff --git a/sage/libs/flint/nmod_poly_linkage.pxi b/sage/libs/flint/nmod_poly_linkage.pxi --- a/sage/libs/flint/nmod_poly_linkage.pxi +++ b/sage/libs/flint/nmod_poly_linkage.pxi @@ -496,7 +495,7 @@ elif e == 1: nmod_poly_set(res, x) elif e == 2: - nmod_poly_sqr(res, x) + nmod_poly_mul(res, x, x) else: if res == x: nmod_poly_set(tmp, x) @@ -510,7 +509,7 @@ nmod_poly_set_coeff_ui(res, 0, 1) e = e >> 1 while(e != 0): - nmod_poly_sqr(pow2, pow2) + nmod_poly_mul(pow2, pow2, pow2) if e % 2: nmod_poly_mul(res, res, pow2) e = e >> 1
sure, but I wouldn't say that's a critical issue.
This seems to remove the fast exit path:
diff --git a/sage/rings/fraction_field_FpT.pyx b/sage/rings/fraction_field_FpT.pyx --- a/sage/rings/fraction_field_FpT.pyx +++ b/sage/rings/fraction_field_FpT.pyx @@ -688,26 +688,34 @@ """ if nmod_poly_degree(self._numer) == -1: return self - if not nmod_poly_sqrt_check(self._numer) or not nmod_poly_sqrt_check(self._denom): - return NoneFredrik says that FLINT also does the same checks in
nmod_poly_sqrt()
, but some memory allocations are performed by the time we get to that point. If the input to this function is not a perfect square most of the time, then IMHO removing these checks will introduce a performance regression.
I think the best place for such a function would be in FLINT itself. The problem is that if we plan on integrating this ticket without calling this to be implemented function (I'd prefer to go this way, let's not wait for another couple of monthes/years to merge this ticket), we should make sure to not forget to call it from Sage when it actually gets implemented in (a stable release of) FLINT, so we should surely open a follow-up ticket.
In the same hunk as above:
+ # Make denominator monic + a = nmod_poly_leading(denom) + if a != 1: + a = mod_inverse_int(a, self.p) + nmod_poly_scalar_mul_nmod(numer, numer, a) + nmod_poly_scalar_mul_nmod(denom, denom, a) + # Choose numerator with smaller leading coefficient + a = nmod_poly_leading(numer) + if a > self.p - a: + nmod_poly_neg(numer, numer)Is this necessary? If the input is normalized, the denominator will have 1 as a leading coefficient anyway. If we really want to normalize, the
normalize()
method in the same file should be used.
Sure, I'll have alook at the code.
_nmod_poly_set_coeff_ui()
was removed from FLINT2. Removing the underscore introduces a potential branch for each function call. Shall we implement our own underscore version?
I'd say no, if that's useful, it should be in FLINT.
diff --git a/sage/rings/polynomial/polynomial_zmod_flint.pyx b/sage/rings/polynomial/polynomial_zmod_flint.pyx --- a/sage/rings/polynomial/polynomial_zmod_flint.pyx +++ b/sage/rings/polynomial/polynomial_zmod_flint.pyx @@ -157,17 +170,47 @@ <snip> + sig_on() for i from 0 <= i < length: - _nmod_poly_set_coeff_ui(&self.x, i, l_in[i]) - # we need to set the length manually, we used _nmod_poly_set_coeff_ui - self.x.length = length + nmod_poly_set_coeff_ui(&self.x, i, l_in[i]) + sig_off() + return 0
comment:313 in reply to: ↑ 310 Changed 9 years ago by
Replying to jdemeyer:
Unfortunately, there is a problem on OS X 10.4 PPC:
Traceback (most recent call last): File "./spkg-install", line 4, in <module> from sage.all import save File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/all.py", line 81, in <module> from sage.misc.all import * # takes a while File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/misc/all.py", line 87, in <module> from functional import (additive_order, File "/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/misc/functional.py", line 36, in <module> from sage.rings.complex_double import CDF File "integer.pxd", line 9, in init sage.rings.complex_double (sage/rings/complex_double.c:16711) ImportError: dlopen(/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/rings/integer.so, 2): Symbol not found: _n_factor Referenced from: /Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta1/local/lib/python2.7/site-packages/sage/rings/integer.so Expected in: dynamic lookup
Any chance to get the FLINt build log? (and to know what file it produced? libflint.dylib i presume.) And some "otool -L" output on integer.so? (By the way is it expected that the cython produced file end in .so and not in .dylib on Mac OS?)
comment:314 Changed 9 years ago by
I think i found one build log at http://build.sagemath.org/sage/builders/UGent%20moufang%20%28OSX%2010.4%20ppc32%29/builds/197/steps/shell_6/logs/flint
And it seems something is broken in FLINT configuration:
gcc -fPIC -O2 -g -ansi -pedantic -Wall -m64 -mcpu=970 -mtune=970 -mpowerpc64 -falign-loops=16 -falign-functions=16 -falign-labels=16 -falign-jumps=16 -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta0/spkg/build/flint-2.3.p0/src -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta0/local/include -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta0/local/include -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta0/local/include -c clz_tab.c -o build/clz_tab.lo;
does not look that good on a ppc32 (which I assume to be 32 bits) computer.
comment:315 Changed 9 years ago by
And the problematic lines are:
elif [ "$KERNEL" = "Darwin" -a "$ARCH" = "ppc" ]; then FLINT_TUNE=" -funroll-loops" elif [ "`uname -p`" = "powerpc" ]; then FLINT_TUNE="-m64 -mcpu=970 -mtune=970 -mpowerpc64 -falign-loops=16 -falign-functions=16 -falign-labels=16 -falign-jumps=16"
in FLINT configure. Of course I don't remember for which kind of cpus the uname -p stuff was put in FLINT 1.
comment:316 follow-up: ↓ 318 Changed 9 years ago by
Hum, its not from FLINT 1, but was added in FLINT 2 to catch PowerG5 cpus I guess. The problem is likely to be that Sage builds 32 bits by default on OS X 10.4 (and 10.5) (see http://www.sagemath.org/doc/installation/source.html and the stuff about SAGE64), but FLINT 2 wants to be too smart and forces the build of a 64 bits library (because it has detected the cpu has such capabilities).
comment:317 follow-up: ↓ 319 Changed 9 years ago by
Replying to jpflori:
The problem is that if we plan on integrating this ticket without calling this to be implemented function (I'd prefer to go this way, let's not wait for another couple of monthes/years to merge this ticket), we should make sure to not forget to call it from Sage when it actually gets implemented in (a stable release of) FLINT, so we should surely open a follow-up ticket.
I suggest we revert these hunks and put the code for nmod_poly_sqrt_check()
back. I wasn't trying to suggest that we implement these in FLINT and wait for the next release. I'd really like to get this ticket resolved by the end of the week.
comment:318 in reply to: ↑ 316 ; follow-up: ↓ 320 Changed 9 years ago by
Replying to jpflori:
FLINT 2 wants to be too smart
For certain values of "smart" I guess. Rolling your own configure
system is almost certainly a bad idea. One should never just add -m32
or -m64
to the CFLAGS
. I would also avoid adding the -mtune
and -mcpu
options.
comment:319 in reply to: ↑ 317 Changed 9 years ago by
Replying to burcin:
Replying to jpflori:
The problem is that if we plan on integrating this ticket without calling this to be implemented function (I'd prefer to go this way, let's not wait for another couple of monthes/years to merge this ticket), we should make sure to not forget to call it from Sage when it actually gets implemented in (a stable release of) FLINT, so we should surely open a follow-up ticket.
I suggest we revert these hunks and put the code for
nmod_poly_sqrt_check()
back. I wasn't trying to suggest that we implement these in FLINT and wait for the next release.
The function used to be three lines of calls to FLINT C functions, so it should be trivial to add to FLINT 2 (although refactoring the nmod_poly_sqrt function so that it uses the _check function in a proper way could need (a little) more thoughts). Therefore, what I'd prefer to do is to push something on github and let Bill pull it to FLINT's trunk. And we should then use this backported patch in our spkg. So when this gets into a stable version of FLINT, we'll only have to update the spkg and delete the patch. Same feeling for the underscore set_ui method.
I'd really like to get this ticket resolved by the end of the week.
Same here!
comment:320 in reply to: ↑ 318 Changed 9 years ago by
Replying to jdemeyer:
Replying to jpflori:
FLINT 2 wants to be too smart
For certain values of "smart" I guess. Rolling your own
configure
system is almost certainly a bad idea. One should never just add-m32
or-m64
to theCFLAGS
. I would also avoid adding the-mtune
and-mcpu
options.
I agree, the compiler should be smart enough to decide what to compile by default, and the user should take the responsibility to pass such options.
comment:321 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
Ok, here comes a new patch to apply taking Burcin's review into account.
The spkg at the previous address has been updated to include the fix for PowerPC G5. It's being discussed upstream and should get in at some point.
I'm starting a "make ptestlong" right now on top of 5.10.beta1 (already tested some files directly related to FLINT).
comment:322 Changed 9 years ago by
By the way, the doctests in fmpz_poly.pyx are formatted in an ugly way.
comment:323 Changed 9 years ago by
Ill be updating the script to clean up that file and merge patch which should be (i.e fixes + review + ffixes) when I've built a clean Sage 5.10.beta1.
Changed 9 years ago by
comment:324 Changed 9 years ago by
- Description modified (diff)
I've updated the patch, please check tat I've not messed up everything. Apart from that, I think what really needs to be done is to check that the spkg works fine on PowerPC G5 OS X 10.4.
comment:325 follow-up: ↓ 327 Changed 9 years ago by
For powerpc
, why not remove the whole
elif [ "`uname -p`" = "powerpc" ]; then FLINT_TUNE="-falign-loops=16 -falign-functions=16 -falign-labels=16 -falign-jumps=16"
block?
comment:326 Changed 9 years ago by
- Reviewers changed from John Cremona, Jeroen Demeyer to John Cremona, Jeroen Demeyer, Burcin Erocal
comment:327 in reply to: ↑ 325 Changed 9 years ago by
Replying to jdemeyer:
For
powerpc
, why not remove the wholeelif [ "`uname -p`" = "powerpc" ]; then FLINT_TUNE="-falign-loops=16 -falign-functions=16 -falign-labels=16 -falign-jumps=16"block?
I'd say to perform minimal modification. I don't really care. In fact I'd prefer that FLINT gets its flags from gmp/mpfr if CFLAGS is empty and does not do anything if not.
comment:328 Changed 9 years ago by
- Status changed from needs_review to positive_review
Looks good to me.
Many thanks for the great work.
comment:329 Changed 9 years ago by
- Status changed from positive_review to needs_work
It doesn't actually build on OS X 10.4 PPC G5:
g++ -fPIC -O2 -g -ansi -pedantic -Wall -falign-loops=16 -falign-functions=16 -falign-labels=16 -falign-jumps=16 -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta2/spkg/build/flint-2.3.p0/src -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta2/local/include -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta2/local/include -I/Users/buildbot/build/sage/moufang-1/moufang_full/build/sage-5.10.beta2/local/include -c interfaces/NTL-interface.cpp -o build/interfaces/NTL-interface.lo; /usr/bin/ld: common symbols not allowed with MH_DYLIB output format with the -multi_module option build/fmpz.lo definition of common _fmpz_arr (size 4) build/profiler.lo definition of common _clock_accum (size 160) build/profiler.lo definition of common _clock_last (size 160) build/ulong_extras.lo definition of common _flint_num_primes_mutex (size 44) build/ulong_extras.lo definition of common _flint_prime_inverses (size 4) build/ulong_extras.lo definition of common _flint_primes (size 4) build/ulong_extras.lo definition of common _sieve (size 4) build/fmpz.lo definition of common _fmpz_unused_arr (size 4) collect2: error: ld returned 1 exit status make[4]: *** [libflint.dylib] Error 1 make[3]: *** [library] Error 2 make[3]: Target `all' not remade because of errors. Error: Failed to build FLINT shared library.
I can try a few variations of the flags to see if that helps.
comment:330 Changed 9 years ago by
Needs to pass "-fno-common" on darwin. http://gcc.gnu.org/ml/gcc/2005-06/msg00375.html and follow up is interesting on that subject.
comment:331 Changed 9 years ago by
OK, I am making and testing a new spkg. Please leave yours alone in order to avoid duplicate work.
comment:332 Changed 9 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
comment:333 Changed 9 years ago by
- Description modified (diff)
comment:334 Changed 9 years ago by
The new spkg with -fno-common
at least builds on OS X 10.4 PPC G5. I'll need some more time to actually completely test it.
comment:335 Changed 9 years ago by
- Report Upstream changed from N/A to Not yet reported upstream; Will do shortly.
comment:336 Changed 9 years ago by
FLINT's testsuite (./sage -i -c
) passes on OS X 10.4 PPC G5.
comment:337 Changed 9 years ago by
Jeroen, could you please test te spkg on the PPC G5 with FLINT_TUNE="-fno-common -funroll-loops"? If that works fine, I guess we should just go with that upstream rather than the "*=16" thing.
comment:338 Changed 9 years ago by
- Status changed from needs_review to positive_review
Anyway, I'm ok with Jeroen changes to the spkg/patches, so I'm putting this back to positive_review.
comment:339 Changed 9 years ago by
- Report Upstream changed from Not yet reported upstream; Will do shortly. to Reported upstream. Developers acknowledge bug.
comment:340 Changed 9 years ago by
- Description modified (diff)
comment:341 Changed 9 years ago by
FYI, some patches similar to what Jeroen did have been integrated upstream and should be in the next release (and the other patch about dylib64 as well).
comment:342 Changed 9 years ago by
- Merged in set to sage-5.10.beta3
- Resolution set to fixed
- Status changed from positive_review to closed
comment:343 Changed 9 years ago by
It would probably be better to keep a symlink $SAGE_LOCAL/include/FLINT -> flint
for a while... which would also have avoided this.
comment:344 Changed 9 years ago by
The FLINT include directory issue (FLINT
vs. flint
) is now #14603.
Building flint2 in Sage: