Opened 8 months ago
Closed 7 months ago
#29167 closed defect (duplicate)
test failures in sage.schemes.curves.zariski_vankampen with sirocco built with O2
Reported by:  embray  Owned by:  

Priority:  major  Milestone:  sageduplicate/invalid/wontfix 
Component:  algebraic topology  Keywords:  cygwin sirocco 
Cc:  mmarco  Merged in:  
Authors:  Erik Bray  Reviewers:  
Report Upstream:  Fixed upstream, but not in a stable release.  Work issues:  
Branch:  u/embray/ticket29167 (Commits)  Commit:  35d7703fff39cdd210ced0846c1e8693bb262aa6 
Dependencies:  Stopgaps: 
Description (last modified by )
#29149 fixes building Sage with the sirocco package installed on Cygwin. This required, for linking purposes, removing the forced O0
flag (this could probably be worked around by other meansbuilding with optimizations disabled should be possible).
However, as mmarco remarked below, there are some optimizations that cause numerical errors in sirocco such that it fails to converge or takes a longer path than necessary. This can be reproduced on Linux (although it was previously found on Cygwin).
In particular this causes test failures in sage.schemes.curves.zariski_vankampen
. First the following test fails:
File "src/sage/schemes/curves/zariski_vankampen.py", line 268, in sage.schemes.curves.zariski_vankampen.followstrand Failed example: followstrand(f, x0, x1, 1.0) # optional  sirocco # abs tol 1e15 Expected: [(0.0, 1.0, 0.0), (0.7500000000000001, 1.015090921153253, 0.24752813818386948), (1.0, 1.026166099551513, 0.32768940253604323)] Got: [(0.0, 1.0, 0.0), (0.04687500000000001, 1.0000610264662042, 0.015624364352899234), (0.09376287283569926, 1.0002440686687775, 0.031249206930640684), (0.14068935711887412, 1.00054911561723, 0.046879295645742315), (0.18768019150872053, 1.0009762160115023, 0.06251939538085316), (0.23476112890433726, 1.0015254781428102, 0.07817426805966325), (0.2819579506558838, 1.0021970697475961, 0.0938486727311868), (0.32683564123989106, 1.0029469806716729, 0.10873190673317454), (0.36890847616239786, 1.0037475631034418, 0.12266360371442413), (0.408351758902248, 1.0045828947575761, 0.13570358710012892), (0.4453298364708575, 1.0054396825086667, 0.14790834427708396), (0.5146637319120003, 1.007235298523974, 0.1707339873423019), (0.5753308904230002, 1.0090048273677124, 0.19063761622091374), (0.6284146541201252, 1.0107014101228857, 0.20799548368797782), (0.6748629473551095, 1.0122968271247523, 0.22313613838778654), (0.7155052039357208, 1.0137759106169235, 0.23634583220457633), (0.7866291529517906, 1.0165461226074612, 0.25937181657421426), (0.8399721147138429, 1.0187713227013186, 0.2765614548183696), (0.8799793360353821, 1.0205207374109584, 0.28940677345545085), (0.939990168017691, 1.0232704733075533, 0.30859664694102984), (1.0, 1.026166099551513, 0.3276894025360433)]
The first an last lines are correct, but none of the stuff in the middle.
Then a bit later it hangs, seemingly indefinitely at
Trying (line 339): B = zvk.braid_in_segment(g,CC(p1),CC(p2)) # optional  sirocco
which on Linux passes very quickly. However, the CPU is still very active during this hang so it must be some kind of busy loop.
I don't understand anything about this package except that it's using some kind of linear approximation techniques, so there might be some small numerical glitches being invoked that are causing some solutions to diverge or something.
It might be possible to work around this by identifying exactly which optimizations result in the problem, rather than disabling all optimizations.
Upstream PR: https://github.com/miguelmarco/SIROCCO2/pull/3
Change History (23)
comment:1 followup: ↓ 5 Changed 8 months ago by
comment:2 followup: ↓ 4 Changed 8 months ago by
I just tested under a linux box disabling the tweek we did to prevent the compiler optimization and I get the same error as you point, so it really looks that is the cuplrit.
In particular, the only line I had to modify is this one:
: ${CXXFLAGS=O0 g}
In the configure.ac
file [1]
Maybe that kind of autotools behaviour is platformdependent?
[1] https://github.com/miguelmarco/SIROCCO2/blob/master/configure.ac#L4
P.S. I noticed that you forked the old sirocco repo. The one we are using here is the version 2 one: https://github.com/miguelmarco/SIROCCO2
comment:3 Changed 8 months ago by
That's for the note about O0
. I have another ticket #29149 where I'm fixing building sirocco on Cygwin (though I haven't posted the patch yet). In particular I had to remove CXXFLAGS=O0
to even get it to build, because otherwise I had problems during linking with some templates; I'm not exactly sure why but it seems like they were being overspecialized, in such a way that resulted in multiple definitions for some functions. Removing O0
fixed it because then the compiler would optimize out unused specializations, but there is probably a better solution.
As you say, this is likely the culprit.
comment:4 in reply to: ↑ 2 Changed 8 months ago by
Replying to mmarco:
P.S. I noticed that you forked the old sirocco repo. The one we are using here is the version 2 one: https://github.com/miguelmarco/SIROCCO2
I see. When I noticed that my copy was not 2.0 I updated my fork of the repo to 2.0 as well, so I can confirm that I am building the correct version at least.
comment:5 in reply to: ↑ 1 Changed 8 months ago by
Replying to mmarco:
For example, we discovered when we moved to autotools that the optimization flag O2 can actually go against performance (and maybe even correctness) , so we had to make sure that it was compiled with no optimization options. Maybe the problem you are detecting on windows is related to that?
Did you ever figure out what specific optimizations were hurting performance and/or correctness, and where those optimizations were occurring? Because disabling all optimizations is a blunt hammer and probably hurts performance in other cases. I know figuring that out can be tricky of course, but I have to ask.
comment:6 Changed 8 months ago by
 Summary changed from Cygwin: test failures in sage.schemes.curves.zariski_vankampen with sirocco to test failures in sage.schemes.curves.zariski_vankampen with sirocco built with O2
I was able to reproduce this on Linux also by removing O0
and setting O2
instead.
comment:7 Changed 8 months ago by
 Description modified (diff)
comment:8 Changed 8 months ago by
I can also reproduce the problem enabling basic optimizations with O1
, so it could be any combination of
fautoincdec fbranchcountreg fcombinestackadjustments fcompareelim fcpropregisters fdce fdeferpop fdelayedbranch fdse fforwardpropagate fguessbranchprobability fifconversion fifconversion2 finlinefunctionscalledonce fipaprofile fipapureconst fipareference fipareferenceaddressable fmergeconstants fmoveloopinvariants fomitframepointer freorderblocks fshrinkwrap fshrinkwrapseparate fsplitwidetypes fssabackprop fssaphiopt ftreebitccp ftreeccp ftreech ftreecoalescevars ftreecopyprop ftreedce ftreedominatoropts ftreedse ftreeforwprop ftreefre ftreephiprop ftreepta ftreescevcprop ftreesink ftreeslsr ftreesra ftreeter funitatatime
comment:9 Changed 8 months ago by
I didn't dig into the details of what specific optimization was the problem. Sorry.
comment:10 Changed 8 months ago by
One problem I found (if not the problem) is that although there is a template specialization for evaluation polynomials on complex intervals:
template <> IComplex Polynomial<IComplex>::operator () (const IComplex &x, const IComplex &y) const{ IComplex rop, ropx, ropy; #ifdef DEVELOPER std::cerr << "using specialized polynomial eval for " << __PRETTY_FUNCTION__ << std::endl; #endif ropx = this>evalIPolHornerXY (x, y); ropy = this>evalIPolHornerYX (x, y); rop = this>evalPolClassic (x, y); rop.r.a = MAX(MAX(ropx.r.a, ropy.r.a), rop.r.a); rop.r.b = MIN(MIN(ropx.r.b, ropy.r.b), rop.r.b); rop.i.a = MAX(MAX(ropx.i.a, ropy.i.a), rop.i.a); rop.i.b = MIN(MIN(ropx.i.b, ropy.i.b), rop.i.b); return rop; }
when building with O1
that specialization seems to be being ignored, and instead it's using the unspecialized template:
template <class T> T Polynomial<T>::operator() (const T &x, const T &y) const { #ifdef DEVELOPER std::cerr << "using unspecialized polynomial eval for " << __PRETTY_FUNCTION__ << std::endl; #endif return this>evalPolClassic (x,y); }
I don't know why that would be but I'll keep looking...
comment:11 Changed 8 months ago by
This seems to fix it for me:
diff git a/include/polynomial.hpp b/include/polynomial.hpp index 88d1299..4596579 100644  a/include/polynomial.hpp +++ b/include/polynomial.hpp @@ 230,6 +230,10 @@ T Polynomial<T>::operator() (const T &x, const T &y) const { return this>evalPolClassic (x,y); } +// Defined in lib/polynomial.cpp +template <> IComplex Polynomial<IComplex>::operator() (const IComplex &x, const IComplex &y) const; +template <> MPIComplex Polynomial<MPIComplex>::operator () (const MPIComplex &x, const MPIComplex &y) const; + template <class T> T Polynomial<T>::diffX (const T &x, const T &y) const { // coef[(i*(i+1))/2 + j] is coeficient of monomial of degree 'i', @@ 311,6 +315,10 @@ T Polynomial<T>::diffY (const T &x, const T &y) const { return this>evalPolYClassic (x,y); } +// Defined in lib/polynomial.cpp +template<> IComplex Polynomial<IComplex>::diffY (const IComplex &x, const IComplex &y) const; +template<> MPIComplex Polynomial<MPIComplex>::diffY (const MPIComplex &x, const MPIComplex &y) const; + template <class T> T Polynomial<T>::diffXX (const T &x, const T &y) const {
The problem is that sirocco.cpp includes "polynomial.hpp" which only has the unspecialized operator()
defined inline, but no declaration for the specialized version, so it just use the unspecialized one. Meanwhile, when compiling polynomial.cpp
the unused specializations are just optimized out.
This would possibly explain the link errors I was having on Cygwin as wellI don't know why the problem only occurs on Windows but it must be generating an explicit specialization for Polynomial<IComplex>::operator()
in sirocco.o
and then conflicting with the one in polynomial.o
. Going to test that theory out now.
comment:12 Changed 8 months ago by
Yep, this is exactly the link error that I got which motivated me to try removing the O0
. This explains it exactly:
libtool: link: g++ std=gnu++11 std=gnu++11 shared nostdlib /usr/lib/gcc/x86_64pccygwin/7.4.0/crtbeginS.o .libs/icomplex.o .libs/interval.o .libs/mp_complex.o .libs/mp_icomplex.o .libs/mp_interval.o .libs/polynomial.o .libs/sirocco.o lmpfr lgmp L/home/embray/src/sagemath/sage/local/lib L/home/embray/src/sagemath/sage/local/lib/../lib L/usr/lib/gcc/x86_64pccygwin/7.4.0 L/usr/lib/gcc/x86_64pccygwin/7.4.0/../../../../x86_64pccygwin/lib/../lib L/usr/lib/gcc/x86_64pccygwin/7.4.0/../../../../lib L/lib/../lib L/usr/lib/../lib L/usr/lib/gcc/x86_64pccygwin/7.4.0/../../../../x86_64pccygwin/lib L/usr/lib/gcc/x86_64pccygwin/7.4.0/../../.. lstdc++ lgcc_s lgcc lcygwin ladvapi32 lshell32 luser32 lkernel32 lgcc_s lgcc /usr/lib/gcc/x86_64pccygwin/7.4.0/crtend.o O0 Wl,rpath Wl,/home/embray/src/sagemath/sage/local/lib o .libs/cygsirocco0.dll Wl,enableautoimagebase Xlinker outimplib Xlinker .libs/libsirocco.dll.a .libs/sirocco.o:sirocco.cpp:(.text$_ZNK10PolynomialI8IComplexEclERKS0_S3_[_ZNK10PolynomialI8IComplexEclERKS0_S3_]+0x0): multiple definition of `Polynomial<IComplex>::operator()(IComplex const&, IComplex const&) const' .libs/polynomial.o:polynomial.cpp:(.text+0x642): first defined here .libs/sirocco.o:sirocco.cpp:(.text$_ZNK10PolynomialI8IComplexE5diffYERKS0_S3_[_ZNK10PolynomialI8IComplexE5diffYERKS0_S3_]+0x0): multiple definition of `Polynomial<IComplex>::diffY(IComplex const&, IComplex const&) const' .libs/polynomial.o:polynomial.cpp:(.text+0xfbe): first defined here .libs/sirocco.o:sirocco.cpp:(.text$_ZNK10PolynomialI10MPIComplexEclERKS0_S3_[_ZNK10PolynomialI10MPIComplexEclERKS0_S3_]+0x0): multiple definition of `Polynomial<MPIComplex>::operator()(MPIComplex const&, MPIComplex const&) const' .libs/polynomial.o:polynomial.cpp:(.text+0x188a): first defined here .libs/sirocco.o:sirocco.cpp:(.text$_ZNK10PolynomialI10MPIComplexE5diffYERKS0_S3_[_ZNK10PolynomialI10MPIComplexE5diffYERKS0_S3_]+0x0): multiple definition of `Polynomial<MPIComplex>::diffY(MPIComplex const&, MPIComplex const&) const' .libs/polynomial.o:polynomial.cpp:(.text+0x22f8): first defined here collect2: error: ld returned 1 exit status
comment:13 Changed 8 months ago by
 Branch set to u/embray/ticket29167
 Commit set to 35d7703fff39cdd210ced0846c1e8693bb262aa6
 Description modified (diff)
 Report Upstream changed from N/A to Reported upstream. Developers acknowledge bug.
 Status changed from new to needs_review
New commits:
35d7703  Trac #29167: Fix incorrect results that occur when sirocco is build with

comment:14 followup: ↓ 15 Changed 8 months ago by
Thanks a lot. Great work.
I might need some time to review and test it though.
Did you check that all test pass with this patch?
Also, what would you consider the best way to go: just patch sirocco at install time inside Sage, or release a new version of Sirocco and using that in Sage?
comment:15 in reply to: ↑ 14 Changed 8 months ago by
 Report Upstream changed from Reported upstream. Developers acknowledge bug. to Fixed upstream, but not in a stable release.
Replying to mmarco:
Thanks a lot. Great work.
I might need some time to review and test it though.
Thank you for looking at it! There's no hurry.
Did you check that all test pass with this patch?
Per our discussion on GitHub, yes. I'm just reiterating here for the record.
Also, what would you consider the best way to go: just patch sirocco at install time inside Sage, or release a new version of Sirocco and using that in Sage?
It's up to you. There's no urgency on this either way, so whatever's easiest for you.
comment:16 Changed 8 months ago by
I just created a new release in github. The source code is available at https://github.com/miguelmarco/SIROCCO2/releases/download/v2.0.1/libsirocco2.0.1.tar.gz
comment:17 Changed 8 months ago by
Oh shoot, before you made the release I should have pointed out to you the fixes I made for Cygwin. I forgot to make a PR for them yet. I will do that now...
comment:18 Changed 8 months ago by
No problem, I can make another release (that is the good thing about numbers: we won't run out of them).
comment:19 Changed 8 months ago by
I made a new release on github.
The tarball is available at https://github.com/miguelmarco/SIROCCO2/releases/download/2.0.2/libsirocco2.0.2.tar.gz
comment:20 Changed 7 months ago by
Erik, are you going to rebase this ticket based on the new release?
comment:21 Changed 7 months ago by
Yes, I've been busy with other things, but I plan to replace this ticket with one to update the sirocco spkg to the new release.
comment:22 Changed 7 months ago by
 Milestone changed from sage9.1 to sageduplicate/invalid/wontfix
 Status changed from needs_review to positive_review
Superseded by #29284 which upgrades the SIROCCO SPKG to the new 2.0.2 release, rather than patching.
comment:23 Changed 7 months ago by
 Resolution set to duplicate
 Status changed from positive_review to closed
Thanks for looking into this.
Sadly, I don't have access to any windows system, so I never tested Sirocco there.
As you say, Sirocco does numerical aproximations of the movements of roots of a polynomial, but carefully testing that the step is small enough to guarantee the correctness. So it is indeed very sensitive to all the small subtleties of floating point computing.
For example, we discovered when we moved to autotools that the optimization flag O2 can actually go against performance (and maybe even correctness) , so we had to make sure that it was compiled with no optimization options. Maybe the problem you are detecting on windows is related to that?