Opened 10 years ago
Closed 10 years ago
#10430 closed defect (fixed)
Add some bugfixes to the PARI package
Reported by:  jdemeyer  Owned by:  tbd 

Priority:  blocker  Milestone:  sage4.6.2 
Component:  packages: standard  Keywords:  pari spkg bugs patches 
Cc:  drkirkby  Merged in:  sage4.6.2.alpha2 
Authors:  Jeroen Demeyer  Reviewers:  Leif Leonhardy, Volker Braun 
Report Upstream:  N/A  Work issues:  
Branch:  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
We should add bugfixes for
 http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1132 (see #10279)
 http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1144 (see #2329)
 http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1143 (see #2329)
 http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1084 (see #9620)
 http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1141 (see #10369)
 #10559: path to perl hardcoded in gphelp (GP/PARI)
New spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari2.4.3.alpha.p5.spkg
Attachments (4)
Change History (63)
comment:1 Changed 10 years ago by
 Description modified (diff)
 Summary changed from Add some bugfixes to PARI to Add some bugfixes to the PARI package
comment:2 followup: ↓ 3 Changed 10 years ago by
comment:3 in reply to: ↑ 2 ; followup: ↓ 4 Changed 10 years ago by
Replying to leif:
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to
O1
to work around obvious bugs in GCC 4.4.1 on these platforms.
Here's an idea: we first try to build with O3 and when that doesn't work, fall back to O2, then O1, then O0.
This way we don't have to find out exactly which versions of gcc are broken.
I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.
comment:4 in reply to: ↑ 3 ; followup: ↓ 5 Changed 10 years ago by
Replying to jdemeyer:
Replying to leif:
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to
O1
to work around obvious bugs in GCC 4.4.1 on these platforms.Here's an idea: we first try to build with O3 and when that doesn't work, fall back to O2, then O1, then O0.
"For reference: OpenSuse? 11.2 (gcc (SUSE Linux) 4.4.1 [gcc4_4branch revision 150839]) has the same problem when building PARI: on a machine with 64GB of RAM, it eventually fails after all memory is exhausted (takes hours). [...]"
So I don't think that's the way to go. (Other machines might start swapping, which effectively "freezes" some systems.)
Or should we do something like
(ulimit St 900; $MAKE) # Which value is appropriate?
?
I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.
They at least perhaps have better experience which files are most likely to trigger failures due to GCC bugs.
comment:5 in reply to: ↑ 4 ; followup: ↓ 6 Changed 10 years ago by
Replying to leif:
Or should we do something like
(ulimit St 900; $MAKE) # Which value is appropriate??
How about ulimiting the memory?
comment:6 in reply to: ↑ 5 Changed 10 years ago by
 Cc drkirkby added
 Keywords bugs patches added
Replying to jdemeyer:
Replying to leif:
Or should we do something like
(ulimit St 900; $MAKE) # Which value is appropriate?
?
How about ulimiting the memory?
Much harder to estimate, isn't it? (Feel free to test out adequate values, with O3
etc.; perhaps something Dave likes...)
Ok, if a process starts thrashing, it won't consume much (user) CPU time as well.
comment:7 followup: ↓ 9 Changed 10 years ago by
I don't think we should be changing ulimit
. Sage used to unset it at one point, and that was changed in a trac ticket.
Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough  they set the limit.
Once we start changing limits, we could cause other proceses to fail, which might be more important to someone.
Dave
comment:8 Changed 10 years ago by
David: we could check the current value of ulimit v
to make sure we are only decreasing the value, not increasing.
I quickly tested ulimit v
on a few systems, this is what I found for the minimal power of 2 for ulimit v
to have a successful build of the pari spkg:
 Gentoo Linux, kernel 2.6.32, x86_64, gcc 4.6.0: 128 MB
 Ubuntu Linux 8.04.4 LTS, kernel 2.6.24, x86_64, gcc 4.5.1: 128 MB
 Mac OS X 10.4 PPC, gcc 4.0.1: ulimit v doesn't seem to work
comment:9 in reply to: ↑ 7 Changed 10 years ago by
Replying to drkirkby:
I don't think we should be changing
ulimit
. Sage used to unset it at one point, and that was changed in a trac ticket.Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough  they set the limit.
Once we start changing limits, we could cause other proceses to fail, which might be more important to someone.
We would only set limits in (PARI's) spkginstall
.
Note that ulimit
only affects the current process and its subprocesses (i.e. gets inherited), therefore I also used the parentheses in the example above.
ulimit
is (also) a bash
builtin btw. We could also limit its use to Linux.
And ordinary users (i.e., their processes) cannot increase limits once they are set.
comment:10 followups: ↓ 11 ↓ 12 Changed 10 years ago by
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (SIGXCPU
+ 128) if we use a CPU time limit.
comment:11 in reply to: ↑ 10 Changed 10 years ago by
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (
SIGXCPU
+ 128) if we use a CPU time limit.
With ulimit v
I receive SIGKILL
on exhausted memory, which isn't very specific...
comment:12 in reply to: ↑ 10 ; followup: ↓ 13 Changed 10 years ago by
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization
The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.
comment:13 in reply to: ↑ 12 Changed 10 years ago by
Replying to jdemeyer:
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization
The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.
Of course.
I wonder if we then would get PARI build errors due to GCC bugs reported any longer... ;)
comment:14 followup: ↓ 26 Changed 10 years ago by
Got one more PARI flaw:
It installs three real copies of the shared library rather than one with two symbolic links to it.
Currently not sure if (but I believe) that's an upstream matter, or if we do that.
comment:15 Changed 10 years ago by
 Description modified (diff)
comment:16 Changed 10 years ago by
 Description modified (diff)
Very preliminary spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari2.4.3.alpha.p1.spkg (not yet tested properly)
comment:17 Changed 10 years ago by
 Priority changed from major to blocker
comment:18 Changed 10 years ago by
 Description modified (diff)
comment:19 Changed 10 years ago by
 Status changed from new to needs_review
comment:20 Changed 10 years ago by
 Status changed from needs_review to needs_work
 Work issues set to Don't use "make install" since that needs tex
comment:21 Changed 10 years ago by
 Status changed from needs_work to needs_review
 Work issues Don't use "make install" since that needs tex deleted
comment:22 followup: ↓ 30 Changed 10 years ago by
I wonder if we really need the make installdoc*
patch (TeX usage) since apparently all errors are ignored... ;)
I'd like to have ticket references also in SPKG.txt
(Changelog).
Trial building with O3
...O0
won't work if initial_CFLAGS
already contain some (higher) optimization level (which I think isn't unlikely), since $optflag
gets prepended.
I would start with CFLAGS
as is, and then append O2
etc. in case the previous build failed. Also, we IMHO shouldn't retry if configure
failed. (We could simply keep the exit
in these cases.)
I'm not sure if all platforms support ulimit v
, although Linuces (where GCC bugs showed up) certainly do. Perhaps we should test its exit status (and skip trial building if the platform does not).
test ... ne ...
(etc.) is for numerical comparison, not for comparing strings.
The changes aren't yet committed.
comment:23 followup: ↓ 24 Changed 10 years ago by
 Status changed from needs_review to needs_work
comment:24 in reply to: ↑ 23 Changed 10 years ago by
Replying to leif:
... ./spkginstall: line 274: [: yes: integer expression expected Installing PARI/GP... Making installlibsta in Olinuxx86_64 Making install in Olinuxx86_64 make[1]: Entering directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make[1]: warning: jN forced in submake: disabling jobserver mode. make[1]: Entering directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make[1]: warning: jN forced in submake: disabling jobserver mode. mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib" mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc "/home/leif/Sage/sage4.6.1.alpha3/local/bin" rm f "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3.0.0 "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3 "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.so rm f "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/pari.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp2.4.1 mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/bin" "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/gphelp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" ../config/install ../doc/gphelp "/home/leif/Sage/sage4.6.1.alpha3/local/bin" for i in paricfg.h mpinl.h; do \ ../config/install m 644 $i "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari; done mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib/pari" File ../src/funclist not changed. rm f libpari.a ar r libpari.a mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o mv: cannot stat `../src/desc/funclistlinuxx86_649212.tmp': No such file or directory make[1]: [../src/funclist] Error 1 (ignored) if test d ../data; then cd ../data; for d in `ls`; do mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/$d && for f in `ls $d`; do ../config/install m 644 $d/$f "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/$d; done >/dev/null; done; fi ../config/install m 644 pari.cfg "/home/leif/Sage/sage4.6.1.alpha3/local/lib/pari" ../config/install m 644 ../examples/EXPLAIN "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install ../misc/tex2mail "/home/leif/Sage/sage4.6.1.alpha3/local/bin" if test n "../src/funclist"; then mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/PARI; ../config/install m 644 ../src/desc/PARI/822.pm "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/PARI; ../config/install m 644 ../src/desc/pari.desc "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"; fi ../config/install m 644 ../examples/Inputrc "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/Makefile "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/bench.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp2.4.1 ln s gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/pari.1 ../config/install m 644 ../examples/cl.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ln s gp2.4.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp.1 cd ../src/desc && /usr/bin/perl merge_822 ../funclist > deflinuxx86_649212.tmp ../config/install m 644 ../examples/classno.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples cannot find ../funclist at merge_822 line 4. make[1]: *** [../src/desc/pari.desc] Error 2 make[1]: *** Waiting for unfinished jobs.... ../config/install m 644 ../examples/contfrac.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/Makefile "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../examples/lucas.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/extgcd.c "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/rho.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/squfof.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ar: creating libpari.a ../config/install m 644 ../examples/taylor.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples gcc4.5.1 o "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3.0.0 shared O3 Wall fnostrictaliasing fomitframepointer O3 g march=native O3 fnostrictaliasing fomitframepointer DHONORS_CFLAGS march=native O3 DHONORS_CPPFLAGS fPIC Wl,shared,soname=libparigmp2.4.so.3 mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o lc lm L/home/leif/Sage/sage4.6.1.alpha3/local/lib lgmp /usr/bin/ranlib libpari.a ../config/install m 644 ../doc/tex2mail.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" ../config/install m 644 ../misc/README "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/color.dft "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/gpalias "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install ../misc/gpflog "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/gprc.dft "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/pari.xpm "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install ../misc/xgp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib" rm f "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.a for i in paridecl paripriv pari paricast paricom parierr parigen pariinl parinf pariold paristio parisys paritune ; do \ ../config/install m 644 ../src/headers/$i.h "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari; done ../config/install m 644 libpari.a "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.a ../config/install m 644 ../doc/translations "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/appa.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc rm f "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari/genpari.h ln s pari.h "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari/genpari.h ../config/install m 644 ../doc/appb.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/appd.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/parimacro.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/pdfmacs.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc make[1]: Leaving directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' ../config/install m 644 ../doc/refcard.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/tutorial.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/users.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch1.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch2.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch3.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch4.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch5.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/paricfg.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/libpari.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/libpari.dvi': No such file or directory ../config/install m 644 ../doc/users.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/users.dvi': No such file or directory ../config/install m 644 ../doc/tutorial.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/tutorial.dvi': No such file or directory ../config/install m 644 ../doc/refcard.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/refcard.dvi': No such file or directory ../config/install m 644 ../doc/refcard.ps "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/refcard.ps': No such file or directory make[1]: Leaving directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make: *** [install] Error 2 Error installing PARI
(The line breaks are perhaps partially "suboptimal"...)
comment:25 Changed 10 years ago by
Suggested changes:

spkginstall
diff r d58606ef346e spkginstall
a b 174 174 175 175 if [ $? ne 0 ]; then 176 176 echo >&2 "Error: Configuring PARI with readline and GMP kernel failed." 177 return1177 exit 1 178 178 fi 179 179 180 180 if [ ! f Makefile ]; then 181 181 echo >&2 "Error: Unable to configure PARI: No Makefile generated!" 182 return1182 exit 1 183 183 fi 184 184 185 185 if [ "$UNAME" = "CYGWIN" ]; then … … 198 198 { 199 199 echo "Installing PARI/GP..." 200 200 201 $MAKE install installlibsta 201 # Parallel install is broken: 202 $MAKE j1 install installlibsta 202 203 if [ $? ne 0 ]; then 203 204 echo >&2 "Error installing PARI" 204 205 exit 1 … … 258 259 else 259 260 # First try O3, then O2 and so on until the build works. 260 261 # This is mainly meant to work around compiler bugs. 261 initial_CFLAGS="g $CFLAGS" 262 for optflag in O3 O2 O1 O0; do 263 CFLAGS="$optflag $initial_CFLAGS" 264 echo "===========================================" 265 echo "Building PARI/GP with optimization flag $optflag" 266 echo "===========================================" 267 if build; then 268 build_success=yes 269 break 270 fi 271 done 262 CFLAGS="O3 g $CFLAGS" 263 echo "===========================================" 264 echo "Building PARI/GP with default optimization" 265 echo "(O3 or userspecified optimization level)" 266 echo "===========================================" 267 if ! build; then 268 echo >&2 "Warning: Initial build of PARI/GP failed  retrying with" \ 269 "less optimization..." 270 for optflag in O2 O1 O0; do 271 CFLAGS="$CFLAGS $optflag" 272 echo "===========================================" 273 echo "Building PARI/GP with optimization flag $optflag" 274 echo "===========================================" 275 if build; then 276 build_success=yes 277 break 278 fi 279 done 280 else 281 build_success=yes 282 fi 272 283 fi 273 284 274 if [ $build_success neyes ]; then285 if [ $build_success != yes ]; then 275 286 echo >&2 "Error building PARI/GP" 276 287 exit 1 277 288 fi … … 280 291 install 281 292 282 293 283 # All (previous) errors are catched in build() , so we don't test $? here.284 # Although we perhaps should also check success of the numerous copy commands285 # inside build().294 # All (previous) errors are catched in build() and install(), so we don't test 295 # $? here. Although we perhaps should also check success of the numerous copy 296 # commands inside install(). 286 297 287 298 if [ "$UNAME" = "Darwin" ]; then 288 299 pari_shlib="libpari.dylib"
Of course we could handle modified CFLAGS
differently, such that we don't eventually end up with O3 g ... O2 O1 O0
, but that's IMHO a minor, cosmetic issue.
comment:26 in reply to: ↑ 14 Changed 10 years ago by
Replying to leif:
Got one more PARI flaw:
It installs three real copies of the shared library rather than one with two symbolic links to it.
Currently not sure if (but I believe) that's an upstream matter, or if we do that.
Upstream does fine (if we use make install
).
comment:27 followup: ↓ 28 Changed 10 years ago by
 Reviewers set to Leif Leonhardy
I essentially agree with your comments, *except* for not retrying when Configure
fails. It could very well be that some gcc
bug causes Configure
to fail and we should also catch that.
You're probably right that ulimit v
doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...
comment:28 in reply to: ↑ 27 Changed 10 years ago by
Replying to jdemeyer:
I essentially agree with your comments, *except* for not retrying when
Configure
fails. It could very well be that somegcc
bug causesConfigure
to fail and we should also catch that.
How / when would changing the O
level solve Configure
errors? I can't imagine such, at least not with gcc
...
You're probably right that
ulimit v
doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...
:) Never mind, though we could also set some CPU time limit.
Should we report the broken parallel make install
upstream?
I must admit I haven't looked close at it; it failed with 8 jobs in the first place.
comment:29 followup: ↓ 31 Changed 10 years ago by
P.S.: We could also do make k ...
to compile as much as possible with higher optimization.
comment:30 in reply to: ↑ 22 Changed 10 years ago by
Replying to leif:
I wonder if we really need the
make installdoc*
patch (TeX usage) since apparently all errors are ignored... ;)
True, but it also prevents tex
from hanging (you know the major misfeature of tex
when it prompts the user for input). We could probably solve this by redirecting the standard input from /dev/null
, but I think not building the documentation is a cleaner solution.
comment:31 in reply to: ↑ 29 Changed 10 years ago by
Replying to leif:
P.S.: We could also do
make k ...
to compile as much as possible with higher optimization.
Agreed.
comment:32 followup: ↓ 33 Changed 10 years ago by
Replying to jdemeyer:
Replying to leif:
I wonder if we really need the
make installdoc*
patch (TeX usage) since apparently all errors are ignored... ;)True, but it also prevents
tex
from hanging (you know the major misfeature oftex
when it prompts the user for input). We could probably solve this by redirecting the standard input from/dev/null
, but I think not building the documentation is a cleaner solution.
It won't prompt you if it isn't installed... ;) Other issues?
In general, I think if [La]TeX is installed, we should use it, at least unless there's also some equivalent HMTL documentation or alike.
Or ship prebuilt PDFs...
comment:33 in reply to: ↑ 32 Changed 10 years ago by
comment:34 Changed 10 years ago by
 Description modified (diff)
New spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari2.4.3.alpha.p3.spkg. make install
issue not yet fixed (I will investigate and if necessary, report upstream).
About the documentation: if we ship PDF documentation for PARI, are users going to find it?
comment:35 followups: ↓ 36 ↓ 37 Changed 10 years ago by
I doubt 5 minutes CPU time is enough for slow machines, at least with tune
. (Not sure at the moment how many processes are involved in that.)
I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (spkgcheck
is of course not affected.)
Cannot test it until you also do $MAKE j1 install ...
or fix the race condition... ;)
comment:36 in reply to: ↑ 35 Changed 10 years ago by
 Work issues set to Fix or work around race condition in `make install`; increase timeout if we tune PARI.
Replying to leif:
I doubt 5 minutes CPU time is enough for slow machines, at least with
tune
. (Not sure at the moment how many processes are involved in that.)
PARI's tune
definitely takes longer, even on fast(er) machines.
I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (
spkgcheck
is of course not affected.)
comment:37 in reply to: ↑ 35 Changed 10 years ago by
Replying to leif:
I doubt 5 minutes CPU time is enough for slow machines, at least with
tune
. (Not sure at the moment how many processes are involved in that.)
How about disabling the limits when tuning? I think we may assume that the people who tune PARI know what they are doing.
comment:38 Changed 10 years ago by
Race condition in make install
reported upstream (with patch): http://pari.math.ubordeaux.fr/cgibin/bugreport.cgi?bug=1148
comment:39 Changed 10 years ago by
 Status changed from needs_work to needs_review
 Work issues Fix or work around race condition in `make install`; increase timeout if we tune PARI. deleted
New spkg, same location: http://sage.math.washington.edu/home/jdemeyer/spkg/pari2.4.3.alpha.p3.spkg
comment:40 followups: ↓ 41 ↓ 42 Changed 10 years ago by
Well this is a spectacular bandaid to work around compiler breakage :)
 Having hardcoded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more ram, and people try this on a wider (slower) range of hardware.
 We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?
How about we disable optimization (or set a knowngood value, maybe O1
) if the compiler is gcc4.4.1. Unless you set SAGE_PARI_tune=yes
, in which case we'll still build it with all optimizations turned on. That way we are on the safe side and the workaround will become unnecessary over time as people upgrade to newer gcc releases.
And if you know what you are doing you can easily override it.
comment:41 in reply to: ↑ 40 Changed 10 years ago by
Replying to vbraun:
Well this is a spectacular bandaid to work around compiler breakage :)
 Having hardcoded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more ram, and people try this on a wider (slower) range of hardware.
Well, the chosen limits are very conservative, so I doubt we will run into this problem any time soon.
 We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?
There isn't just one single broken version, we want to catch all broken gcc's. See #9897 for example.
comment:42 in reply to: ↑ 40 ; followup: ↓ 43 Changed 10 years ago by
Replying to vbraun:
Well this is a spectacular bandaid to work around compiler breakage :)
I totally agree with this by the way, but in my opinion there are two things we can do:
 Use O3 always and leave the user with a noncompiling Sage ("it's not our fault, it's gcc's fault")
 Do the optimizationfallback as in this spkg.
I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.
comment:43 in reply to: ↑ 42 ; followup: ↓ 44 Changed 10 years ago by
I totally agree with this by the way, but in my opinion there are two things we can do:
 Use O3 always and leave the user with a noncompiling Sage ("it's not our fault, it's gcc's fault")
 Do the optimizationfallback as in this spkg.
 Show the error and then tell the user how to report the issue and continue the build without optimization. I think all thats needed is
CFLAGS=O0 ./sage f spkg/standard/pari* make
to build pari with no optimization and then build the rest?
I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.
But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.
comment:44 in reply to: ↑ 43 Changed 10 years ago by
Replying to vbraun:
But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.
I would prefer that users post the bug directly upstream (or not, gcc ignores most bug reports anyway). I certainly don't want to waste too much energy in making gcc bug reports for every user who can't compile Sage due to a gcc bug.
comment:45 followup: ↓ 46 Changed 10 years ago by
Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. to gcc bugs but if they don't get reported then they can't get fixed.
So far I've only seen
 general brokenness of
O3
on itanium (dying arch...), we should just default toO2
there.  gcc4.4.1 needs
O1
comment:46 in reply to: ↑ 45 Changed 10 years ago by
Replying to vbraun:
Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. to gcc bugs but if they don't get reported then they can't get fixed.
Maybe we should suggest to gcc to add PARI as a test suite (I'm mostly joking here). For some reason it has a fenomenal ability to expose gcc bugs.
comment:47 followup: ↓ 49 Changed 10 years ago by
One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessary mean it is a compiler bug.
comment:48 Changed 10 years ago by
 Status changed from needs_review to needs_work
True, but thats the case here:
 brokenness of O3 on itanium is discussed on the gcc bug ticket and definitely a compiler bug.
 gcc4.4.1 with >=
O2
just starts using all available memory until it dies.
My suggestion would be to just catch those two cases. And add a useful error message to the spkg in case pari fails to compile, explaining how to report this and how to work around it by compiling the pari spkg with different CFLAGS.
comment:49 in reply to: ↑ 47 Changed 10 years ago by
Replying to drkirkby:
One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessary mean it is a compiler bug.
Badly written code should never cause the compiler to crash or to use infinite memory. These things are certainly compiler bugs.
comment:50 Changed 10 years ago by
 Description modified (diff)
comment:51 Changed 10 years ago by
 Status changed from needs_work to needs_review
comment:52 Changed 10 years ago by
comment:53 Changed 10 years ago by
 Reviewers changed from Leif Leonhardy to Leif Leonhardy, Volker Braun
 Status changed from needs_review to positive_review
I'm happy with the current version, so I'll give this ticket a positive review. If any compiler bugs are still preventing pari from being built on some hardware then this should be reported to the gcc wrapper.
comment:54 Changed 10 years ago by
 Merged in set to sage4.6.2.alpha1
 Resolution set to fixed
 Status changed from positive_review to closed
comment:55 Changed 10 years ago by
 Merged in sage4.6.2.alpha1 deleted
 Resolution fixed deleted
 Status changed from closed to new
Testing on the buildbot seems to indicate there might still be some race conditions in parallel make install
. So maybe we should avoid doing that.
comment:56 Changed 10 years ago by
 Description modified (diff)
 Status changed from new to needs_review
Fixed race conditions in make install
by using j1
.
comment:57 followup: ↓ 58 Changed 10 years ago by
 Status changed from needs_review to positive_review
That'll get rid of potential races in installation. Perhaps we should disable parallel make for all spkgs that don't use a proven build system like autotools or SCons. Chances are that any handrolled makefile has concurrency issues...
I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.
comment:58 in reply to: ↑ 57 Changed 10 years ago by
Replying to vbraun:
I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.
Yes, done.
comment:59 Changed 10 years ago by
 Merged in set to sage4.6.2.alpha2
 Resolution set to fixed
 Status changed from positive_review to closed
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to
O1
to work around obvious bugs in GCC 4.4.1 on these platforms.Did someone report this to the PARI guys? Perhaps they could provide a patch such that we don't have to maintain it (that selectively changes the compiler flags for only some files).
Unfortunately(?), not all people building on e.g. openSUSE 11.2 run into these problems, apparently.