#13947 closed defect (fixed)
zn_poly segfaults during tuning and tests on OS X and Cygwin when built on a busy system
Reported by:  jpflori  Owned by:  tbd 

Priority:  blocker  Milestone:  sage5.10 
Component:  packages: standard  Keywords:  zn_poly spkg cygwin osx nuss_mul fail 
Cc:  leif, jhpalmieri, jdemeyer, kcrisman, klee  Merged in:  sage5.10.beta5 
Authors:  Leif Leonhardy  Reviewers:  Jeroen Demeyer 
Report Upstream:  N/A  Work issues:  
Branch:  Commit:  
Dependencies:  Stopgaps: 
Description
See #13137 for more info. This happens with different versions of MPIR, so the cause seems to be zn_poly rather than MPIR. No problems were spotted on Linux.
New spkg: http://boxen.math.washington.edu/home/leif/Sage/spkgs/zn_poly0.9.p11.spkg
md5sum: 012e63d181151c19ddc71bdfaeb14e03 zn_poly0.9.p11.spkg
zn_poly0.9.p11 (Leif Leonhardy, May 24th, 2013)
 #13947: Fix nuss_mul() test failing especially if tuning happened under "heavy" load (at least on MacOS X and Cygwin). Add fix_fudge_factor_in_nusstest.c.patch; fix suggested by David Harvey.
Attachments (2)
Change History (63)
comment:1 Changed 8 years ago by
comment:2 Changed 8 years ago by
P.S.: I was actually going to create a zn_poly spkg which simply saves the tuning parameters in case the tests fail, asking the user to submit them to sage-devel or sage-release... (and probably making a few more attempts to get working tuning parameters, and/or informing the user that he/she should reinstall the spkg when the system load is lower). :)
comment:3 followup: ↓ 10 Changed 8 years ago by
P.P.S.: John, is it always (just) nuss_mul() that fails the test?
comment:5 Changed 8 years ago by
Also reproduced on hawk (OpenSolaris i386).
comment:6 in reply to: ↑ 4 ; followup: ↓ 7 Changed 8 years ago by
Replying to jdemeyer:
I also reproduced it on bsd.math.
How? I tried hard yesterday, but didn't manage.
Even with John's tuning parameters that made the test(s) fail for him, still all tests (quick as well as extensive) pass for me on bsd.math (with Sage 5.6.beta3 [with GCC 4.6.3 built], and the included MPIR 2.4.0, FWIW).
I was actually hoping we could reproduce the test failures on e.g. Linux as well with such "failing" parameters, although probably depending on the GCC version, too.
comment:7 in reply to: ↑ 6 ; followup: ↓ 8 Changed 8 years ago by
Replying to leif:
Replying to jdemeyer:
I also reproduced it on bsd.math.
How? I tried hard yesterday, but didn't manage.
Hmmm, I probably forgot to run make test (which rebuilds test/test with a debug version of src/tuning.c) in addition to make (which apparently just rebuilds the static library, not used by test; IMHO a flaw in the Makefiles).
But still, I cannot reproduce the failure with John's parameters.
comment:8 in reply to: ↑ 7 Changed 8 years ago by
Replying to leif:
Ooops, not true. In the last attempt, I missed that nuss_mul() failed, but only when tested "extensively":
(sage-sh) leif@bsd:src$ test/test quick all
mpn_smp_basecase()... ok
mpn_smp_kara()... ok
mpn_smp()... ok
mpn_mulmid()... ok
zn_array_recover_reduce()... ok
zn_array_pack()... ok
zn_array_unpack()... ok
zn_array_mul_KS1()... ok
zn_array_mul_KS2()... ok
zn_array_mul_KS3()... ok
zn_array_mul_KS4()... ok
zn_array_sqr_KS1()... ok
zn_array_sqr_KS2()... ok
zn_array_sqr_KS3()... ok
zn_array_sqr_KS4()... ok
zn_array_mulmid_KS1()... ok
zn_array_mulmid_KS2()... ok
zn_array_mulmid_KS3()... ok
zn_array_mulmid_KS4()... ok
nuss_mul()... ok
pmfvec_fft_dc()... ok
pmfvec_fft_huge()... ok
pmfvec_ifft_dc()... ok
pmfvec_ifft_huge()... ok
pmfvec_tpfft_dc()... ok
pmfvec_tpfft_huge()... ok
pmfvec_tpifft_dc()... ok
pmfvec_tpifft_huge()... ok
zn_array_mul_fft()... ok
zn_array_sqr_fft()... ok
zn_array_mulmid_fft()... ok
zn_array_mul_fft_dft()... ok
zn_array_invert()... ok
All tests passed.
(sage-sh) leif@bsd:src$ time test/test all
mpn_smp_basecase()... ok
mpn_smp_kara()... ok
mpn_smp()... ok
mpn_mulmid()... ok
zn_array_recover_reduce()... ok
zn_array_pack()... ok
zn_array_unpack()... ok
zn_array_mul_KS1()... ok
zn_array_mul_KS2()... ok
zn_array_mul_KS3()... ok
zn_array_mul_KS4()... ok
zn_array_sqr_KS1()... ok
zn_array_sqr_KS2()... ok
zn_array_sqr_KS3()... ok
zn_array_sqr_KS4()... ok
zn_array_mulmid_KS1()... ok
zn_array_mulmid_KS2()... ok
zn_array_mulmid_KS3()... ok
zn_array_mulmid_KS4()... ok
nuss_mul()... FAIL!
At least one test FAILED!
Presumably because those tests take a pretty long time... ;)
comment:9 Changed 8 years ago by
The (extensive) tests that didn't get run because test exits upon the first failure (nuss_mul()) all pass for me:
(sage-sh) leif@bsd:src$ time test/test pmfvec_fft_dc pmfvec_fft_huge pmfvec_ifft_dc pmfvec_ifft_huge pmfvec_tpfft_dc pmfvec_tpfft_huge pmfvec_tpifft_dc pmfvec_tpifft_huge zn_array_mul_fft zn_array_sqr_fft zn_array_mulmid_fft zn_array_mul_fft_dft zn_array_invert && echo OK
pmfvec_fft_dc()... ok
pmfvec_fft_huge()... ok
pmfvec_ifft_dc()... ok
pmfvec_ifft_huge()... ok
pmfvec_tpfft_dc()... ok
pmfvec_tpfft_huge()... ok
pmfvec_tpifft_dc()... ok
pmfvec_tpifft_huge()... ok
zn_array_mul_fft()... ok
zn_array_sqr_fft()... ok
zn_array_mulmid_fft()... ok
zn_array_mul_fft_dft()... ok
zn_array_invert()... ok
All tests passed.
real 1m15.127s
user 1m15.054s
sys 0m0.027s
OK
(This is with John's "failing" tuning parameters, bsd.math.)
comment:10 in reply to: ↑ 3 ; followup: ↓ 11 Changed 8 years ago by
Replying to leif:
P.P.S.: John, is it always (just) nuss_mul() that fails the test?
That's my recollection. In my experiment yesterday, I only ran that test (using ./test nuss_mul).
comment:11 in reply to: ↑ 10 ; followup: ↓ 13 Changed 8 years ago by
Replying to jhpalmieri:
With your tuning parameters, I also only get the extensive test of nuss_mul() failing. Reproducible on Linux, with GCC 4.7.0. (Haven't tried other versions yet, but this shows at least it's not limited to GCC 4.6.3.)
comment:12 followup: ↓ 23 Changed 8 years ago by
IIRC (and I'm quite sure I do) I got segfaults as well during the tuning itself on Cygwin (64-bit Windows 7), mostly when issuing make with MAKE="make -j4" so the system must have been busy as well, but IIRC (less sure) it also happened when building zn_poly alone.
The segfaults happened while tuning KS/FFT things, mostly the last one which is mulmid, but I seem to remember it also happened during the previous KS/FFT things sometimes.
Of course I tried to reproduce that this morning and could not (I let ATLAS build in parallel to keep the system busy but that did not seem to do the trick).
I'll give it another shot in the next couple of days.
comment:13 in reply to: ↑ 11 ; followup: ↓ 14 Changed 8 years ago by
Replying to leif:
Could you give it a shot by only testing the MPIR part and disabling the comparison in the test code? And do the same with zn_poly code only? So if it's really a bug in MPIR, or calling a function on invalid (let's say really too small) parameters we'll be settled.
comment:14 in reply to: ↑ 13 Changed 8 years ago by
Replying to jpflori:
Could you give it a shot by only testing the MPIR part and disabling the comparison in the test code? And do the same with zn_poly code only?
???
It's just the comparison that fails (or, more precisely, the tests make the "success" depend on the comparison only); no segfaults, no failed assertions.
I removed the "exit on first failure" and got 5 failures (from the "extensive" nuss_mul() test; all other tests passed, as mentioned).
So if it's really a bug in MPIR, or calling a function on invalid (let's say really too small) parameters we'll be settled.
Well, since the failure depends on zn_poly's thresholds (for zn_poly's functions), it's IMHO clearly in zn_poly, not MPIR. (Unless zn_poly was right only with the "failing" tuning parameters, and incidentally MPIR [2.4.0 and 2.6.0] and zn_poly would give the same wrong results otherwise. Or am I missing something?)
There are still random numbers involved though, so the tests may pass or fail under different circumstances.
comment:15 followup: ↓ 16 Changed 8 years ago by
The offending parameter in John's tuning.c seems to be tuning_info[62].mul_fft_thresh (=90, which is extraordinarily low), i.e., that's the one that (for me) causes the nuss_mul() test failures here.
(There are others, but those aren't relevant for the tests, apparently.)
comment:16 in reply to: ↑ 15 ; followup: ↓ 17 Changed 8 years ago by
Replying to leif:
When I set all mul_fft_threshs to 1, I get a lot more failures (although not all tests/comparisons fail).
More interestingly, the failures only happen when squaring. If I use separate "buffers" for both operands (in test/nusstest.c), all failures vanish, so this seems to be an aliasing problem. (Still strange the error doesn't happen for all inputs; someone™ should investigate further... ;) )
comment:17 in reply to: ↑ 16 Changed 8 years ago by
Replying to leif:
I also get the quick test to fail with all (2...64 bits) mul_fft_thresh entries set to 1.
And I meanwhile managed to get "invalid" tuning parameters on Linux x86_64, too (although just once, but unintentionally).
I don't think the bug (or test failure) is in any way related to the compiler / GCC version or compilation options, as I've so far been able to force it with every GCC version I tried (4.4.3, 4.6.3, 4.7.0, 4.7.2), regardless of whether I used e.g. -O0 or -O3, or -fno-strict-aliasing.
Still don't know whether (just) testcase_nuss_mul() is broken (in violating preconditions by using the same array for both [identical] operands when squaring [sqr==1], although assertion checking is enabled when compiling for the test program), or whether it actually triggers a real bug by doing so. Someone more knowledgeable than me should probably check this.
[As mentioned, all failures vanish when buf1 != buf2, i.e., when they don't alias even if sqr==1.]
comment:18 Changed 8 years ago by
 Cc kcrisman added
comment:19 followup: ↓ 20 Changed 8 years ago by
Ok, leif, can you put your recipe to trigger the failure in the summary?
comment:20 in reply to: ↑ 19 ; followups: ↓ 21 ↓ 22 Changed 8 years ago by
Replying to fbissey:
Ok, leif, can you put your recipe to trigger the failure in the summary?
Oh, I don't recall right now (searching logs ...), but I think I just faked the values in tuning.c (generated by tune/tune[.c]) by modifying test/test.c (i.e., added something like { int i; for (i=2;i<=64;i++) tuning_info[i].mul_fft_thresh=1; } to the beginning of test/test.c(?)'s main()).
After running sage -f -s zn_poly, start a Sage subshell and enter the build directory.
Then you can play with it, i.e., modify the code (or tuning values), and run make test && test/test [quick] [tests_to_run]* (in $SAGE_ROOT/spkg/build/zn_poly0.9.p{9,10}/src/) IIRC.
(More to come if I find the logs, otherwise also see the comments above for more info.)
comment:21 in reply to: ↑ 20 Changed 8 years ago by
Replying to leif:
(More to come if I find the logs, otherwise also see the comments above for more info.)
Hmmm, sorry, cannot find any. I vaguely remember I had a power outage before I saved anything... 8/
comment:22 in reply to: ↑ 20 Changed 8 years ago by
Replying to leif:
Yep:

zn_poly0.9.p5/src/test/test.c
@@ -209,6 +209,11 @@
 
    int all_success = 1, any_targets = 0, quick = 0, success, i, j;
 
+#if 1 || defined(FAKE_THRESHOLDS)
+   for(i=2;i<=64;i++)
+      tuning_info[i].mul_fft_thresh=1; // always (I think)
+#endif
+
    for (j = 1; j < argc; j++)
    {
       if (!strcmp (argv[j], "quick"))
I've also found

zn_poly0.9.p5/src/test/nusstest.c
    ref_zn_array_scalar_mul (res, res, n, x, mod);
    int success = !zn_array_cmp (ref, res, n);
 
+#if 1 || defined(TEST_VERBOSE)
+   if(!success)
+   {
+      fprintf(stderr,
+         "testcase_nuss_mul(): comparison FAILED: lgL=%u (n=%lu) sqr=%d mod.m=%lu mod.bits=%d\n",
+         lgL, n, sqr,
+         mod->m, mod->bits);
+   }
+#endif
+
    pmfvec_clear (vec2);
    pmfvec_clear (vec1);
 
…
    if (!sqr)
       free (buf2);
    free (buf1);
 
    return success;
 }
 
…
    zn_mod_t mod;
 
    for (i = 0; i < num_test_bitsizes; i++)
+#if 0
    for (lgL = 2; lgL <= (quick ? 11 : 13) && success; lgL++)
    for (trial = 0; trial < (quick ? 1 : 5) && success; trial++)
    {
…
       success = success && testcase_nuss_mul (lgL, 1, mod);
       zn_mod_clear (mod);
    }
+#else /* don't stop upon first failure: */
+   for (lgL = 2; lgL <= (quick ? 11 : 13) /* && success */; lgL++)
+   for (trial = 0; trial < (quick ? 1 : 5) /* && success */; trial++)
+   {
+      zn_mod_init (mod, random_modulus (test_bitsizes[i], 1));
+      success &= testcase_nuss_mul (lgL, 0, mod);
+      success &= testcase_nuss_mul (lgL, 1, mod);
+      zn_mod_clear (mod);
+   }
+#endif
 
    return success;
 }
to not stop at the first test failure in nusstest.c. (The patches here are against the .p5, but that shouldn't matter if you just strip the first folder name with patch -p1.)
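The difference between the two loop variants in the patch above can be shown in isolation. The following is a minimal sketch, not zn_poly code: fake_testcase(), run_stop_on_first_failure() and run_all() are hypothetical names, and the test cases are assumed to return exactly 0 or 1 (as testcase_nuss_mul() does via !zn_array_cmp).

```c
/* Hypothetical stand-in for a test case: counts how often it was
   called and returns 1 on success, 0 on failure. Not part of zn_poly. */
static int runs = 0;

static int fake_testcase(int should_pass)
{
    runs++;
    return should_pass;
}

/* Stops at the first failure: the loop condition and the
   short-circuiting "&&" both skip later test cases once success is 0. */
int run_stop_on_first_failure(const int *outcomes, int n)
{
    int success = 1;
    for (int i = 0; i < n && success; i++)
        success = success && fake_testcase(outcomes[i]);
    return success;
}

/* Runs every test case: "&=" always evaluates its right-hand side,
   so all failures get a chance to be reported. */
int run_all(const int *outcomes, int n)
{
    int success = 1;
    for (int i = 0; i < n; i++)
        success &= fake_testcase(outcomes[i]);
    return success;
}
```

Both variants return the same overall verdict; only the number of test cases actually executed differs, which is what matters when hunting for all failing inputs.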
comment:23 in reply to: ↑ 12 ; followup: ↓ 24 Changed 8 years ago by
Replying to jpflori:
IIRC (and I'm quite sure I am) I got segfault as well during the tuning itself on Cygwin (64-bit Windows 7), mostly when issuing make with MAKE="make -j4" so the system must have been busy as well, but IIRC (less sure) it also happened when building zn_poly alone.
Just as a data point, I can confirm this, even with make -j2, on Cygwin Win 7.
comment:24 in reply to: ↑ 23 ; followup: ↓ 25 Changed 8 years ago by
Replying to kcrisman:
Confirm what exactly?
Tuning fails if the box is too busy? And if so, how?
Building itself (before and/or after tuning) can also fail?
Or does just the quick test after "successfully" building zn_poly fail (due to failing comparisons, as intended, or with a segfault or whatever)?
comment:25 in reply to: ↑ 24 Changed 8 years ago by
Confirm what exactly? Tuning fails if the box is too busy? And if so, how?
Correct; with one other spkg being built it was too much. Segfault during tuning in KS/FFT mul, repeatable. No problems during short selftest, though of course I couldn't try that without using just one thread in any case.
Building itself (before and/or after tuning) can also fail?
I guess not.
comment:26 Changed 8 years ago by
#14268 contains a patched zn_poly spkg for a totally unrelated problem. Just pointing this out in case somebody here plans to patch zn_poly.
comment:27 Changed 8 years ago by
 Cc klee added
comment:28 Changed 8 years ago by
ping
comment:29 Changed 8 years ago by
pong
comment:30 followup: ↓ 31 Changed 8 years ago by
 Keywords nuss_mul fail added
Tracebacks of segfaults, anyone?
(As mentioned, I can only reproduce failing comparisons  with faked tuning parameters.)
comment:31 in reply to: ↑ 30 Changed 8 years ago by
Replying to leif:
Tracebacks of segfaults, anyone?
(As mentioned, I can only reproduce failing comparisons  with faked tuning parameters.)
I consistently get this failure installing latest versions of Sage. Where (or how) can I get the tracebacks?
comment:32 Changed 8 years ago by
Hi, I am the original author.
I have debugged this issue outside of sage, using the version of zn_poly 0.9 on my web page. I can reproduce the issue with "test/test nuss_mul" on sage.math (or maybe I've logged into boxen?), using the tuning file provided by wdj above.
After some debugging, I have found a genuine bug in the test code. In nusstest.c, line 60 currently reads
ulong x = nuss_mul_fudge (lgL, 0, mod);
It should be
ulong x = nuss_mul_fudge (lgL, sqr, mod);
Basically what's happening is that nuss_mul returns its results multiplied by a fudge factor, and the test code has to undo that fudge factor to compare the results. The current version always uses the thresholds from the "multiplication" version of the code to figure out the fudge factor. But when sqr == 1, it should be using the "squaring" thresholds.
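To make the mechanism concrete, here is a toy model of the mismatch. It is only a sketch under stated assumptions: the modulus, the thresholds, the power-of-two scale factor and every toy_* name are invented for illustration and are not taken from zn_poly. The point is that a fudge lookup done with the wrong sqr flag only goes wrong when the two thresholds disagree for the given size, which would fit failures that depend on the tuning values.

```c
#include <stdint.h>

/* Toy model only; none of these names or values come from zn_poly. */
#define TOY_MODULUS 1000000007ULL   /* a prime, so 2 is invertible */

static const unsigned toy_mul_thresh = 8;  /* hypothetical tuned values */
static const unsigned toy_sqr_thresh = 4;

static uint64_t mulmod(uint64_t a, uint64_t b)
{
    return (a % TOY_MODULUS) * (b % TOY_MODULUS) % TOY_MODULUS;
}

static uint64_t powmod(uint64_t base, uint64_t exp)
{
    uint64_t result = 1;
    base %= TOY_MODULUS;
    while (exp > 0) {
        if (exp & 1)
            result = mulmod(result, base);
        base = mulmod(base, base);
        exp >>= 1;
    }
    return result;
}

/* Factor the toy routine leaves in its output: 2^lgL once lgL reaches
   the threshold for the chosen operation, otherwise 1. */
static uint64_t toy_scale(unsigned lgL, int sqr)
{
    unsigned thresh = sqr ? toy_sqr_thresh : toy_mul_thresh;
    return lgL >= thresh ? powmod(2, lgL) : 1;
}

/* Returns a*b (or a*a when sqr is set) times the scale factor. */
uint64_t toy_mul(uint64_t a, uint64_t b, unsigned lgL, int sqr)
{
    return mulmod(mulmod(a, sqr ? a : b), toy_scale(lgL, sqr));
}

/* Value the caller multiplies by to undo the scale factor
   (modular inverse via Fermat's little theorem). */
uint64_t toy_fudge(unsigned lgL, int sqr)
{
    return powmod(toy_scale(lgL, sqr), TOY_MODULUS - 2);
}

/* Mimics the test: fudge_sqr is the flag passed to the fudge lookup.
   Returns 1 if the corrected result matches the reference product. */
int toy_check(uint64_t a, unsigned lgL, int sqr, int fudge_sqr)
{
    uint64_t ref = mulmod(a, a);
    uint64_t res = toy_mul(a, a, lgL, sqr);
    return mulmod(res, toy_fudge(lgL, fudge_sqr)) == ref;
}
```

In this model, toy_check(a, 5, 1, 0) corresponds to the old test code (squaring, but fudge looked up with 0) and fails because lgL = 5 lies between the two thresholds, while toy_check(a, 5, 1, 1) corresponds to the corrected call and passes. For sizes below or above both thresholds the wrong flag goes unnoticed.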
Please let me know if that solves the problem.
comment:33 Changed 8 years ago by
So the problem is only in the testing code?
We should really try to fix this ASAP.
comment:34 Changed 8 years ago by
Thanks David.
Haven't tested the fix yet, but that wouldn't explain segfaults others mentioned (I couldn't reproduce myself).
Another issue is that the self-tuning under heavy load apparently yields unreasonable thresholds, on MacOS X and Cygwin at least.
comment:35 Changed 8 years ago by
Ok, I've created a quick-and-dirty spkg just for testing your patch:

src/test/nusstest.c
    // compare target implementation against reference implementation
    ref_zn_array_negamul (ref, buf1, buf2, n, mod);
    nuss_mul (res, buf1, buf2, vec1, vec2);
-   ulong x = nuss_mul_fudge (lgL, 0, mod);
+   ulong x = nuss_mul_fudge (lgL, sqr, mod);
    ref_zn_array_scalar_mul (res, res, n, x, mod);
    int success = !zn_array_cmp (ref, res, n);
 
http://boxen.math.washington.edu/home/leif/Sage/spkgs/zn_poly0.9.p11testing.spkg
(No update of SPKG.txt, nothing committed.)
comment:36 followup: ↓ 37 Changed 8 years ago by
P.S.: I'll probably update it later to allow conditional faking of thresholds... (as I don't get "appropriate" thresholds on Linux, and only rarely on the MacOS X box I have access to).
comment:37 in reply to: ↑ 36 Changed 8 years ago by
Replying to leif:
P.S.: I'll probably update it later to allow conditional faking of thresholds... (as I don't get "appropriate" thresholds on Linux, and only rarely on the MacOS X box I have access to).
Ok, did so.
You can now install the spkg with ZN_POLY_FAKE_THRESHOLDS set to something nonempty to set all mul_fft_threshs to 1, as I previously did to provoke failures.
(Even) with this, the "quick" test suite (still) passes for me now.
Further changes for debugging: Failures in test_nuss_mul() now get reported, and the test suite doesn't exit on the first failure, but continues testing. (Especially test_nuss_mul() now performs all tests regardless of failures.)
Feel free to change patches/conditionally_fake_mul_fft_threshs.patch to fake other tuning parameters as well; as the name says, I'm only changing tuning_info[2..64].mul_fft_thresh since doing so previously triggered failures for me.
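For readers who want to experiment along the same lines, a conditional override could look roughly like the sketch below. It is not the contents of the patch: toy_tuning_info_t and toy_fake_thresholds() are hypothetical names, the struct models only the one field discussed here, and passing the flag as a string (for instance the result of getenv("ZN_POLY_FAKE_THRESHOLDS")) is an assumption about how one might wire it up at run time.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for zn_poly's tuning table; only the
   field discussed in this ticket is modelled. */
typedef struct {
    unsigned long mul_fft_thresh;
} toy_tuning_info_t;

#define TOY_MAX_BITS 64

/* Overwrites mul_fft_thresh for bit sizes 2..64 with `value`, but only
   when `flag` is a nonempty string. Returns 1 if the thresholds were
   faked, 0 if the table was left untouched. */
int toy_fake_thresholds(toy_tuning_info_t info[TOY_MAX_BITS + 1],
                        const char *flag, unsigned long value)
{
    if (flag == NULL || flag[0] == '\0')
        return 0;
    for (int bits = 2; bits <= TOY_MAX_BITS; bits++)
        info[bits].mul_fft_thresh = value;
    return 1;
}
```

Keeping the override behind a flag means the normally tuned values stay in effect unless someone explicitly asks for the failure-provoking ones.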
comment:38 followup: ↓ 39 Changed 8 years ago by
 Priority changed from major to blocker
leif: can you make a proper spkg and put a link to that spkg in the ticket description?
comment:39 in reply to: ↑ 38 Changed 8 years ago by
Replying to jdemeyer:
leif: can you make a proper spkg and put a link to that spkg in the ticket description?
Should I just include David's patch or also add some debugging in case we still get failures?
comment:40 Changed 8 years ago by
(I could leave it as is and just update SPKG.txt accordingly, of course also committing the changes.)
comment:41 followup: ↓ 43 Changed 8 years ago by
I would just include the patch and see if people still report problems.
comment:42 Changed 8 years ago by
 Description modified (diff)
 Status changed from new to needs_review
comment:43 in reply to: ↑ 41 Changed 8 years ago by
Replying to jdemeyer:
I would just include the patch and see if people still report problems.
Ok, did so, see attached diff.
The testing spkg is still there, in case anybody wants to play with it.
comment:44 Changed 8 years ago by
Somebody should take a look at tuning on MacOS X and Cygwin though, as the failures were apparently triggered by "random" tuning parameters... (which the patch obviously doesn't affect).
comment:45 Changed 8 years ago by
@David  thank you so much for helping track this down in "stable" code! I hope this is the only one...
Somebody should take a look at tuning on MacOS X and Cygwin though
Agreed. I may be able to do this on OS X today, but not Cygwin until later.
comment:46 Changed 8 years ago by
I can't reproduce it on my Mac box, but I don't think I ever did. Maybe John can try it on bsd again...
comment:47 Changed 8 years ago by
I agree that tuning on heavily loaded OS X systems has not been addressed.
I also don't know about any segfaults.
Anyway, I tried building the old and new spkgs on a loaded OS X system. I could not reproduce any failures in the quick test suite, but the old spkg reliably failed its full testsuite, while the new spkg reliably passed its full test suite.
klee, can you test it out, too?
comment:48 Changed 8 years ago by
Interim report:
1) Making Sage 5.10.beta2 again, I checked that it fails at the same spot, zn_poly's quick test suite, with "nuss_mul()... FAIL!".
2) I installed the new spkg with the command "./sage -f http://boxen.math.washington.edu/home/leif/Sage/spkgs/zn_poly0.9.p11.spkg" and succeeded. It passed zn_poly's quick test suite smoothly!
3) Now my machine is making Sage 5.10.beta2.
4) Then I downloaded Sage 5.10.beta4, the latest, and replaced zn_poly0.9.p10.spkg with zn_poly0.9.p11.spkg in the directory spkg/standard. Then I started making Sage 5.10.beta4. So now my machine is making both beta2 and beta4 at the same time. Perhaps this makes sure the machine is quite loaded. Also the machine is running two virtual machines and some usual applications like the Chrome browser.
I will report the final result as soon as the machine finishes building!
By the way, thank you all so much.
comment:49 Changed 8 years ago by
Report on making Sage 5.10.beta2:
Built successfully. Tested successfully except for one failure, which seems unrelated to the current issue.
sage -t devel/sage/sage/calculus/calculus.py
**********************************************************************
File "devel/sage/sage/calculus/calculus.py", line 1309, in sage.calculus.calculus.laplace
Failed example:
    (p1+p2).save(os.path.join(SAGE_TMP, "de_plot.png"))
Expected nothing
Got:
    dyld: Library not loaded: /usr/X11/lib/libfreetype.6.dylib
      Referenced from: /usr/X11/bin/fc-list
      Reason: Incompatible library version: fc-list requires version 14.0.0 or later, but libfreetype.6.dylib provides version 10.0.0
    dyld: Library not loaded: /usr/X11/lib/libfreetype.6.dylib
      Referenced from: /usr/X11/bin/fc-list
      Reason: Incompatible library version: fc-list requires version 14.0.0 or later, but libfreetype.6.dylib provides version 10.0.0
**********************************************************************
comment:50 Changed 8 years ago by
Yes, this is a very occasional OS X error that I haven't been able to track down, and that has nothing to do with this ticket.
comment:51 Changed 8 years ago by
Report on making Sage 5.10.beta4:
Well... Building failed with "Error installing package sage5.10.beta4". So I tried "./sage -i spkg/standard/zn_poly0.9.p11.spkg", and it was installed successfully.
So my overall impression is that the patch corrects the issue, and the issue seems unrelated to how heavily loaded my machine is (Mac Pro, quad-core Intel Xeon, Mac OS X 10.7.5).
comment:52 followup: ↓ 53 Changed 8 years ago by
Report on making Sage 5.10.beta4: Well... Building failed with "Error installing package sage5.10.beta4". So I tried "./sage -i spkg/standard/zn_poly0.9.p11.spkg", and it was installed successfully.
But what was the failure in installing beta4? If it was still zn_poly then just installing this spkg wouldn't address the underlying issue.
To really test this, assuming the failure was zn_poly, can you unpack the beta4 tarball again, but replace spkg/standard/zn_poly<old>.spkg with this spkg *before compiling* and then compile under heavy load (maybe even just make -j4) and see if the problem persists?
comment:53 in reply to: ↑ 52 ; followup: ↓ 55 Changed 8 years ago by
Replying to kcrisman:
Report on making Sage 5.10.beta4: Well... Building failed with "Error installing package sage5.10.beta4". So I tried "./sage -i spkg/standard/zn_poly0.9.p11.spkg", and it was installed successfully.
But what was the failure in installing beta4? If it was still zn_poly then just installing this spkg wouldn't address the underlying issue.
package sage5.10.beta4 = the Sage library spkg failed to install, so apparently zn_poly did install successfully, as the former depends on the latter.
On the other hand, (re)installing zn_poly afterwards (with just sage -i ...) should have just told you that it's already installed, which you at least did not explicitly mention.
(But you said you copied the .p11 into spkg/standard/ before building Sage 5.10.beta4.)
comment:54 Changed 8 years ago by
... where "you" addresses Kwankyu, in case that wasn't clear.
comment:55 in reply to: ↑ 53 Changed 8 years ago by
Replying to leif:
Yes, I copied .p11 into spkg/standard/ and removed .p10 before I started building Sage 5.10.beta4.
Sorry that I don't remember the reason for the failure of beta4. The message was somewhat unclear to me, but seemed unrelated to zn_poly. Now I am building beta4 to reproduce the failure.
I used "sage -i" rather than "sage -f", and remember that the installation of the spkg started as if it had not been done before. On this point, I am not so confident of my own memory though. Anyway, the installation was successful.
comment:56 followup: ↓ 57 Changed 8 years ago by
Rebuilding beta4 now succeeded, but when I started the just-built Sage, I got
Athena:sage5.10.beta4$ ./sage
Sage Version 5.10.beta4, Release Date: 2013-05-20
Type "notebook()" for the browser-based notebook interface.
Type "help()" for help.
**********************************************************************
*                                                                    *
* Warning: this is a prerelease version, and it may be unstable.     *
*                                                                    *
**********************************************************************
ImportError                               Traceback (most recent call last)
<ipython-input-1-e91e614a7080> in <module>()
      2 sys.path.append(os.environ['HOME'] + '/Workplace/sage')
      3
----> 4 from lib import *

/Users/Kwankyu/Workplace/sage/lib/__init__.py in <module>()
----> 1 from curve.simple_curve import SimpleCurve

/Users/Kwankyu/Workplace/sage/lib/curve/simple_curve.py in <module>()
     22 from sage.matrix.constructor import matrix, vector
     23
---> 24 from lib.curve import affine_curve
     25 from affine_curve import AffinePlaneCurve, CoordinateRing
     26

/Users/Kwankyu/Workplace/sage/lib/curve/affine_curve.py in <module>()
      9 from sage.categories.morphism import Morphism
     10 from sage.categories.finite_fields import FiniteFields
---> 11 from sage.schemes.generic.projective_space import ProjectiveSpace
     12 from sage.rings.fraction_field import FractionField
     13 from sage.rings.infinity import infinity

ImportError: No module named projective_space
sage:
Still "./sage -f spkg/standard/zn_poly0.9.p11.spkg" succeeds.
comment:57 in reply to: ↑ 56 ; followup: ↓ 58 Changed 8 years ago by
Replying to klee:
This is both unrelated to zn_poly and hardly related to Sage 5.10.beta4.
Outdated init.sage? Cf. #14217, merged into Sage 5.10.beta3.
comment:58 in reply to: ↑ 57 Changed 8 years ago by
Replying to leif:
P.S.: The relevant "layout" change was announced (or suggested) on sage-devel a while ago.
comment:59 followup: ↓ 60 Changed 8 years ago by
 Merged in set to sage5.10.beta5
 Resolution set to fixed
 Reviewers set to Jeroen Demeyer
 Status changed from needs_review to closed
At least this spkg fixes some bug, so it's good to have.
comment:60 in reply to: ↑ 59 Changed 8 years ago by
At least this spkg fixes some bug, so it's good to have.
True! But did you open a new ticket for the original bug, which is probably not resolved by this? (JP, I assume that on a loaded Cygwin system we still get the original issue.)
comment:61 Changed 8 years ago by
I successfully installed Sage 5.10.rc0 without the zn_poly failure issue. (The error after starting Sage as reported in a previous comment was just because of my own outdated scripts, and is irrelevant to this ticket. Sorry for the noise.)
Thanks a lot!
Does it really segfault, and especially does the tuning segfault?
I thought zn_poly would just occasionally generate "unexpected™" values during tuning on MacOS X and Cygwin (presumably only under heavy system load), such that afterwards some tests (with zn_poly rebuilt, or more precisely, relinked with these parameters) would deterministically fail.
(This might still depend on the compiler as well, at least the way it fails.)