Opened 12 years ago
Closed 12 years ago
#10430 closed defect (fixed)
Add some bugfixes to the PARI package
Reported by:  Jeroen Demeyer  Owned by:  tbd 

Priority:  blocker  Milestone:  sage-4.6.2 
Component:  packages: standard  Keywords:  pari spkg bugs patches 
Cc:  David Kirkby  Merged in:  sage-4.6.2.alpha2 
Authors:  Jeroen Demeyer  Reviewers:  Leif Leonhardy, Volker Braun 
Report Upstream:  N/A  Work issues:  
Branch:  Commit:  
Dependencies:  Stopgaps: 
Description
We should add bugfixes for
 http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1132 (see #10279)
 http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1144 (see #2329)
 http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1143 (see #2329)
 http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1084 (see #9620)
 http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1141 (see #10369)
 #10559: path to perl hardcoded in gphelp (GP/PARI)
New spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari-2.4.3.alpha.p5.spkg
Attachments (4)
Change History (63)
comment:1 Changed 12 years ago by
Description:  modified (diff) 

Summary:  Add some bugfixes to PARI → Add some bugfixes to the PARI package 
comment:2 followup: 3 Changed 12 years ago by
comment:3 followup: 4 Changed 12 years ago by
Replying to leif:
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.
Here's an idea: we first try to build with -O3 and when that doesn't work, fall back to -O2, then -O1, then -O0.
This way we don't have to find out exactly which versions of gcc are broken.
I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.
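A minimal sketch of the fallback idea above, assuming a hypothetical helper named build_with_fallback (it is not part of the actual spkg-install). It runs the given build command once per optimization level and prints the first level that succeeded.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: build_with_fallback <build-command> [args...]
# Tries -O3, -O2, -O1, -O0 in that order and stops at the first success.
build_with_fallback() {
    local base_cflags="${CFLAGS:-}"
    local optflag
    for optflag in -O3 -O2 -O1 -O0; do
        echo "Trying build with $optflag" >&2
        # The assignment prefix makes CFLAGS visible to this one command only.
        if CFLAGS="$optflag $base_cflags" "$@"; then
            echo "$optflag"   # report which level worked
            return 0
        fi
    done
    return 1                  # every level failed
}
```

A call such as `build_with_fallback $MAKE gp` would then retry the same target with progressively less optimization, without needing to know which gcc versions are broken.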
comment:4 followup: 5 Changed 12 years ago by
Replying to jdemeyer:
Replying to leif:
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.
Here's an idea: we first try to build with -O3 and when that doesn't work, fall back to -O2, then -O1, then -O0.
"For reference: OpenSuse 11.2 (gcc (SUSE Linux) 4.4.1 [gcc-4_4-branch revision 150839]) has the same problem when building PARI: on a machine with 64GB of RAM, it eventually fails after all memory is exhausted (takes hours). [...]"
So I don't think that's the way to go. (Other machines might start swapping, which effectively "freezes" some systems.)
Or should we do something like
(ulimit -St 900; $MAKE) # Which value is appropriate?
?
I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.
They at least perhaps have better experience which files are most likely to trigger failures due to GCC bugs.
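A small sketch of the subshell approach suggested above, assuming a hypothetical wrapper named run_with_cpu_limit. The parentheses confine the soft CPU-time limit to the subshell and its children, so the calling shell keeps its own limit. The exit code 125 for "could not set the limit" is an arbitrary choice made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, for illustration only.
# Usage: run_with_cpu_limit <seconds> <command> [args...]
run_with_cpu_limit() {
    local seconds="$1"
    shift
    (
        # Soft CPU-time limit in seconds; affects only this subshell
        # and the processes started from it.
        ulimit -St "$seconds" || exit 125
        "$@"
    )
}
```

For example, `run_with_cpu_limit 900 $MAKE` mirrors the one-liner from the comment; which value is appropriate remains the open question.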
comment:5 followup: 6 Changed 12 years ago by
Replying to leif:
Or should we do something like
(ulimit -St 900; $MAKE) # Which value is appropriate?
?
How about ulimiting the memory?
comment:6 Changed 12 years ago by
Cc:  David Kirkby added 

Keywords:  bugs patches added 
Replying to jdemeyer:
Replying to leif:
Or should we do something like
(ulimit -St 900; $MAKE) # Which value is appropriate?
?
How about ulimiting the memory?
Much harder to estimate, isn't it? (Feel free to test out adequate values, with -O3 etc.; perhaps something Dave likes...)
OK, if a process starts thrashing, it won't consume much (user) CPU time either.
comment:7 followup: 9 Changed 12 years ago by
I don't think we should be changing ulimit. Sage used to unset it at one point, and that was changed in a trac ticket.
Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough - they set the limit.
Once we start changing limits, we could cause other processes to fail, which might be more important to someone.
Dave
comment:8 Changed 12 years ago by
David: we could check the current value of ulimit -v to make sure we are only decreasing the value, not increasing.
I quickly tested ulimit -v on a few systems; this is what I found for the minimal power of 2 for ulimit -v to have a successful build of the pari spkg:
 Gentoo Linux, kernel 2.6.32, x86_64, gcc 4.6.0: 128 MB
 Ubuntu Linux 8.04.4 LTS, kernel 2.6.24, x86_64, gcc 4.5.1: 128 MB
 Mac OS X 10.4 PPC, gcc 4.0.1: ulimit -v doesn't seem to work
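The "only decrease, never increase" check could look roughly like the sketch below. The helper name pick_lower_limit and the value used in the usage note are assumptions for illustration; the function only does the comparison and leaves the actual ulimit call to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: pick_lower_limit <current-limit> <wanted-limit>
# Prints the limit to apply: the wanted value if it is lower than the
# current one (or the current one is "unlimited"), else the current value.
pick_lower_limit() {
    local current="$1" wanted="$2"
    if [ "$current" = "unlimited" ] || [ "$current" -gt "$wanted" ]; then
        echo "$wanted"
    else
        echo "$current"
    fi
}
```

A caller might then write `(ulimit -Sv "$(pick_lower_limit "$(ulimit -Sv)" 1048576)"; $MAKE)`, where 1048576 is a made-up value (bash expresses `ulimit -v` in kilobytes), not a tested recommendation.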
comment:9 Changed 12 years ago by
Replying to drkirkby:
I don't think we should be changing ulimit. Sage used to unset it at one point, and that was changed in a trac ticket.
Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough - they set the limit.
Once we start changing limits, we could cause other processes to fail, which might be more important to someone.
We would only set limits in (PARI's) spkg-install.
Note that ulimit only affects the current process and its subprocesses (i.e. gets inherited), therefore I also used the parentheses in the example above.
ulimit is (also) a bash builtin btw. We could also limit its use to Linux.
And ordinary users (i.e., their processes) cannot increase limits once they are set.
comment:10 followups: 11 12 Changed 12 years ago by
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (SIGXCPU + 128) if we use a CPU time limit.
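As a rough sketch of that check, a hypothetical classify_exit function could map an exit status to a reason. The value 152 comes from the comment above (128 + SIGXCPU, which is signal 24 on Linux), and 137 is 128 + SIGKILL (signal 9); both numbers are platform assumptions and may differ elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: classify_exit <exit-status>
classify_exit() {
    local status="$1"
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -eq 152 ]; then
        echo "cpu-limit"      # 128 + SIGXCPU (24 on Linux)
    elif [ "$status" -eq 137 ]; then
        echo "killed"         # 128 + SIGKILL (9): not specific to any limit
    else
        echo "other"
    fi
}
```

A retry loop could call `classify_exit $?` right after the build and fall back to a lower optimization level only for the cases it considers limit-related.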
comment:11 Changed 12 years ago by
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (SIGXCPU + 128) if we use a CPU time limit.
With ulimit -v I receive SIGKILL on exhausted memory, which isn't very specific...
comment:12 followup: 13 Changed 12 years ago by
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization
The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.
comment:13 Changed 12 years ago by
Replying to jdemeyer:
Replying to leif:
P.S.:
If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization
The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.
Of course.
I wonder if we then would get PARI build errors due to GCC bugs reported any longer... ;)
comment:14 followup: 26 Changed 12 years ago by
Got one more PARI flaw:
It installs three real copies of the shared library rather than one with two symbolic links to it.
Currently not sure if (but I believe) that's an upstream matter, or if we do that.
comment:15 Changed 12 years ago by
Authors:  → Jeroen Demeyer 

Description:  modified (diff) 
comment:16 Changed 12 years ago by
Description:  modified (diff) 

Very preliminary spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari-2.4.3.alpha.p1.spkg (not yet tested properly)
comment:17 Changed 12 years ago by
Priority:  major → blocker 

comment:18 Changed 12 years ago by
Description:  modified (diff) 

comment:19 Changed 12 years ago by
Status:  new → needs_review 

comment:20 Changed 12 years ago by
Status:  needs_review → needs_work 

Work issues:  → Don't use "make install" since that needs tex 
comment:21 Changed 12 years ago by
Status:  needs_work → needs_review 

Work issues:  Don't use "make install" since that needs tex 
Changed 12 years ago by
Attachment:  pari-2.4.3.alpha.p2.diff added 

spkg patch .p0 to .p2, for reference
comment:22 followup: 30 Changed 12 years ago by
I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;)
I'd like to have ticket references also in SPKG.txt (Changelog).
Trial building with -O3...-O0 won't work if initial_CFLAGS already contain some (higher) optimization level (which I think isn't unlikely), since $optflag gets prepended.
I would start with CFLAGS as is, and then append -O2 etc. in case the previous build failed. Also, we IMHO shouldn't retry if configure failed. (We could simply keep the exit in these cases.)
I'm not sure if all platforms support ulimit -v, although Linuces (where GCC bugs showed up) certainly do. Perhaps we should test its exit status (and skip trial building if the platform does not).
test ... -ne ... (etc.) is for numerical comparison, not for comparing strings.
The changes aren't yet committed.
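To illustrate the last point about test, here is a short sketch with two hypothetical helpers: one compares a yes/no flag as a string with !=, the other compares an exit status numerically with -ne. Passing a word such as yes to -ne makes test report "integer expression expected" and return status 2.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.

# String comparison: true when the flag is anything other than "yes".
build_failed() {
    [ "$1" != yes ]
}

# Numeric comparison: true when the given exit status is non-zero.
status_failed() {
    [ "$1" -ne 0 ]
}
```

With these, `build_failed "$build_success"` works for the text values yes and no, while `status_failed $?` is the right form for exit codes.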
comment:23 followup: 24 Changed 12 years ago by
Status:  needs_review → needs_work 

comment:24 Changed 12 years ago by
Replying to leif:
... ./spkginstall: line 274: [: yes: integer expression expected Installing PARI/GP... Making installlibsta in Olinuxx86_64 Making install in Olinuxx86_64 make[1]: Entering directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make[1]: warning: jN forced in submake: disabling jobserver mode. make[1]: Entering directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make[1]: warning: jN forced in submake: disabling jobserver mode. mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib" mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc "/home/leif/Sage/sage4.6.1.alpha3/local/bin" rm f "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3.0.0 "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3 "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.so rm f "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/pari.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp2.4.1 mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/bin" "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/gphelp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" ../config/install ../doc/gphelp "/home/leif/Sage/sage4.6.1.alpha3/local/bin" for i in paricfg.h mpinl.h; do \ ../config/install m 644 $i "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari; done mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib/pari" File ../src/funclist not changed. 
rm f libpari.a ar r libpari.a mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o mv: cannot stat `../src/desc/funclistlinuxx86_649212.tmp': No such file or directory make[1]: [../src/funclist] Error 1 (ignored) if test d ../data; then cd ../data; for d in `ls`; do mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/$d && for f in `ls $d`; do ../config/install m 644 $d/$f "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/$d; done >/dev/null; done; fi ../config/install m 644 pari.cfg "/home/leif/Sage/sage4.6.1.alpha3/local/lib/pari" ../config/install m 644 ../examples/EXPLAIN "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install ../misc/tex2mail "/home/leif/Sage/sage4.6.1.alpha3/local/bin" if test n "../src/funclist"; then mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/PARI; ../config/install m 644 ../src/desc/PARI/822.pm "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/PARI; ../config/install m 644 ../src/desc/pari.desc "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"; fi ../config/install m 644 ../examples/Inputrc "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/Makefile "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/bench.gp 
"/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp2.4.1 ln s gp.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/pari.1 ../config/install m 644 ../examples/cl.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ln s gp2.4.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1"/gp.1 cd ../src/desc && /usr/bin/perl merge_822 ../funclist > deflinuxx86_649212.tmp ../config/install m 644 ../examples/classno.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples cannot find ../funclist at merge_822 line 4. make[1]: *** [../src/desc/pari.desc] Error 2 make[1]: *** Waiting for unfinished jobs.... ../config/install m 644 ../examples/contfrac.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../doc/Makefile "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../examples/lucas.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/extgcd.c "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/rho.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ../config/install m 644 ../examples/squfof.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples ar: creating libpari.a ../config/install m 644 ../examples/taylor.gp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/examples gcc4.5.1 o "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libparigmp2.4.so.3.0.0 shared O3 Wall fnostrictaliasing fomitframepointer O3 g march=native O3 fnostrictaliasing fomitframepointer DHONORS_CFLAGS march=native O3 DHONORS_CPPFLAGS fPIC Wl,shared,soname=libparigmp2.4.so.3 mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o 
ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o lc lm L/home/leif/Sage/sage4.6.1.alpha3/local/lib lgmp /usr/bin/ranlib libpari.a ../config/install m 644 ../doc/tex2mail.1 "/home/leif/Sage/sage4.6.1.alpha3/local/share/man/man1" ../config/install m 644 ../misc/README "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/color.dft "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/gpalias "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install ../misc/gpflog "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/gprc.dft "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install m 644 ../misc/pari.xpm "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc ../config/install ../misc/xgp "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/misc mkdir p "/home/leif/Sage/sage4.6.1.alpha3/local/lib" rm f "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.a for i in paridecl paripriv pari paricast paricom parierr parigen pariinl parinf pariold paristio parisys paritune ; do \ ../config/install m 644 ../src/headers/$i.h "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari; done ../config/install m 644 libpari.a "/home/leif/Sage/sage4.6.1.alpha3/local/lib"/libpari.a ../config/install m 644 ../doc/translations "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/appa.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc rm f 
"/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari/genpari.h ln s pari.h "/home/leif/Sage/sage4.6.1.alpha3/local/include"/pari/genpari.h ../config/install m 644 ../doc/appb.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/appd.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/parimacro.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/pdfmacs.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc make[1]: Leaving directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' ../config/install m 644 ../doc/refcard.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/tutorial.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/users.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch1.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch2.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch3.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch4.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/usersch5.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/paricfg.tex "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc ../config/install m 644 ../doc/libpari.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/libpari.dvi': No such file or directory ../config/install m 644 ../doc/users.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/users.dvi': No such file or directory ../config/install m 644 ../doc/tutorial.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/tutorial.dvi': No 
such file or directory ../config/install m 644 ../doc/refcard.dvi "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/refcard.dvi': No such file or directory ../config/install m 644 ../doc/refcard.ps "/home/leif/Sage/sage4.6.1.alpha3/local/share/pari"/doc cp: cannot stat `../doc/refcard.ps': No such file or directory make[1]: Leaving directory `/home/leif/Sage/sage4.6.1.alpha3/spkg/build/pari2.4.3.alpha.p2/src/Olinuxx86_64' make: *** [install] Error 2 Error installing PARI
(The line breaks are perhaps partially "suboptimal"...)
comment:25 Changed 12 years ago by
Suggested changes:

spkg-install
diff -r d58606ef346e spkg-install
--- a/spkg-install
+++ b/spkg-install
@@ -174,12 +174,12 @@
 
     if [ $? -ne 0 ]; then
         echo >&2 "Error: Configuring PARI with readline and GMP kernel failed."
-        return 1
+        exit 1
     fi
 
     if [ ! -f Makefile ]; then
         echo >&2 "Error: Unable to configure PARI: No Makefile generated!"
-        return 1
+        exit 1
     fi
 
     if [ "$UNAME" = "CYGWIN" ]; then
@@ -198,7 +198,8 @@
 {
     echo "Installing PARI/GP..."
 
-    $MAKE install install-lib-sta
+    # Parallel install is broken:
+    $MAKE -j1 install install-lib-sta
     if [ $? -ne 0 ]; then
         echo >&2 "Error installing PARI"
         exit 1
@@ -258,20 +259,30 @@
 else
     # First try -O3, then -O2 and so on until the build works.
     # This is mainly meant to work around compiler bugs.
-    initial_CFLAGS="-g $CFLAGS"
-    for optflag in -O3 -O2 -O1 -O0; do
-        CFLAGS="$optflag $initial_CFLAGS"
-        echo "==========================================="
-        echo "Building PARI/GP with optimization flag $optflag"
-        echo "==========================================="
-        if build; then
-            build_success=yes
-            break
-        fi
-    done
+    CFLAGS="-O3 -g $CFLAGS"
+    echo "==========================================="
+    echo "Building PARI/GP with default optimization"
+    echo "(-O3 or user-specified optimization level)"
+    echo "==========================================="
+    if ! build; then
+        echo >&2 "Warning: Initial build of PARI/GP failed - retrying with" \
+            "less optimization..."
+        for optflag in -O2 -O1 -O0; do
+            CFLAGS="$CFLAGS $optflag"
+            echo "==========================================="
+            echo "Building PARI/GP with optimization flag $optflag"
+            echo "==========================================="
+            if build; then
+                build_success=yes
+                break
+            fi
+        done
+    else
+        build_success=yes
+    fi
 fi
 
-if [ $build_success -ne yes ]; then
+if [ $build_success != yes ]; then
     echo >&2 "Error building PARI/GP"
     exit 1
 fi
@@ -280,9 +291,9 @@
 install
 
 
-# All (previous) errors are catched in build(), so we don't test $? here.
-# Although we perhaps should also check success of the numerous copy commands
-# inside build().
+# All (previous) errors are catched in build() and install(), so we don't test
+# $? here. Although we perhaps should also check success of the numerous copy
+# commands inside install().
 
 if [ "$UNAME" = "Darwin" ]; then
     pari_shlib="libpari.dylib"
Of course we could handle modified CFLAGS differently, such that we don't eventually end up with -O3 -g ... -O2 -O1 -O0, but that's IMHO a minor, cosmetic issue.
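One way to avoid the accumulating flags, sketched with a hypothetical replace_optflag helper: drop every existing -O... word before adding the new level. This is not what the suggested patch does (the patch appends and relies on the last -O flag winning), and it also discards an optimization level the user set on purpose.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: replace_optflag <new-optflag> <flags...>
# Prints the flags with all -O* words removed and the new level in front.
replace_optflag() {
    local optflag="$1"
    shift
    local word result=""
    # Unquoted $* is intentional: split the flags into single words.
    for word in $*; do
        case "$word" in
            -O*) ;;                          # drop earlier optimization levels
            *)   result="$result $word" ;;   # keep everything else
        esac
    done
    echo "$optflag$result"
}
```

Inside a retry loop this could be used as `CFLAGS=$(replace_optflag "$optflag" $CFLAGS)`.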
comment:26 Changed 12 years ago by
Replying to leif:
Got one more PARI flaw:
It installs three real copies of the shared library rather than one with two symbolic links to it.
Currently not sure if (but I believe) that's an upstream matter, or if we do that.
Upstream does fine (if we use make install).
comment:27 followup: 28 Changed 12 years ago by
Reviewers:  → Leif Leonhardy 

I essentially agree with your comments, *except* for not retrying when Configure fails. It could very well be that some gcc bug causes Configure to fail and we should also catch that.
You're probably right that ulimit -v doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...
comment:28 Changed 12 years ago by
Replying to jdemeyer:
I essentially agree with your comments, *except* for not retrying when Configure fails. It could very well be that some gcc bug causes Configure to fail and we should also catch that.
How / when would changing the -O level solve Configure errors? I can't imagine such, at least not with gcc...
You're probably right that ulimit -v doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...
:) Never mind, though we could also set some CPU time limit.
Should we report the broken parallel make install upstream?
I must admit I haven't looked closely at it; it failed with 8 jobs in the first place.
comment:29 followup: 31 Changed 12 years ago by
P.S.: We could also do make -k ... to compile as much as possible with higher optimization.
comment:30 Changed 12 years ago by
Replying to leif:
I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;)
True, but it also prevents tex from hanging (you know the major misfeature of tex when it prompts the user for input). We could probably solve this by redirecting the standard input from /dev/null, but I think not building the documentation is a cleaner solution.
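The redirection mentioned above could be sketched as a tiny wrapper; run_noninteractive is a hypothetical name. Any program started through it sees end-of-file as soon as it tries to read from standard input, so it cannot wait for a user. Whether that makes tex fail cleanly in every case is untested here.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, for illustration only.
# Usage: run_noninteractive <command> [args...]
# Runs the command with standard input connected to /dev/null.
run_noninteractive() {
    "$@" </dev/null
}
```

For example, `run_noninteractive $MAKE install` would keep a documentation build from blocking on a prompt, at the cost of still running it.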
comment:31 Changed 12 years ago by
Replying to leif:
P.S.: We could also do make -k ... to compile as much as possible with higher optimization.
Agreed.
comment:32 followup: 33 Changed 12 years ago by
Replying to jdemeyer:
Replying to leif:
I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;)
True, but it also prevents tex from hanging (you know the major misfeature of tex when it prompts the user for input). We could probably solve this by redirecting the standard input from /dev/null, but I think not building the documentation is a cleaner solution.
It won't prompt you if it isn't installed... ;) Other issues?
In general, I think if [La]TeX is installed, we should use it, at least unless there's also some equivalent HTML documentation or the like.
Or ship prebuilt PDFs...
comment:33 Changed 12 years ago by
comment:34 Changed 12 years ago by
Description:  modified (diff) 

New spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari-2.4.3.alpha.p3.spkg. The make install issue is not yet fixed (I will investigate and, if necessary, report upstream).
About the documentation: if we ship PDF documentation for PARI, are users going to find it?
comment:35 followups: 36 37 Changed 12 years ago by
I doubt 5 minutes CPU time is enough for slow machines, at least with tune. (Not sure at the moment how many processes are involved in that.)
I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (spkg-check is of course not affected.)
Cannot test it until you also do $MAKE -j1 install ... or fix the race condition... ;)
comment:36 Changed 12 years ago by
Work issues:  → Fix or work around race condition in `make install`; increase timeout if we tune PARI. 

Replying to leif:
I doubt 5 minutes CPU time is enough for slow machines, at least with tune. (Not sure at the moment how many processes are involved in that.)
PARI's tune definitely takes longer, even on fast(er) machines.
I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (spkg-check is of course not affected.)
comment:37 Changed 12 years ago by
Replying to leif:
I doubt 5 minutes CPU time is enough for slow machines, at least with tune. (Not sure at the moment how many processes are involved in that.)
How about disabling the limits when tuning? I think we may assume that the people who tune PARI know what they are doing.
comment:38 Changed 12 years ago by
Race condition in make install reported upstream (with patch): http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1148
Changed 12 years ago by
Attachment:  pari-2.4.3.alpha.p2-p3.diff added 

spkg patch .p2 to .p3, for reference
comment:39 Changed 12 years ago by
Status:  needs_work → needs_review 

Work issues:  Fix or work around race condition in `make install`; increase timeout if we tune PARI. 
New spkg, same location: http://sage.math.washington.edu/home/jdemeyer/spkg/pari-2.4.3.alpha.p3.spkg
comment:40 followups: 41 42 Changed 12 years ago by
Well this is a spectacular band-aid to work around compiler breakage :)
 Having hardcoded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more RAM, and people try this on a wider (slower) range of hardware.
 We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?
How about we disable optimization (or set a known-good value, maybe -O1) if the compiler is gcc-4.4.1. Unless you set SAGE_PARI_tune=yes, in which case we'll still build it with all optimizations turned on. That way we are on the safe side and the workaround will become unnecessary over time as people upgrade to newer gcc releases.
And if you know what you are doing you can easily override it.
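The version-based alternative proposed above might be sketched as follows. The function name default_optflag_for is hypothetical, and the mapping (gcc 4.4.1 gets -O1, everything else -O3) only restates the proposal from this comment; it is not a verified list of broken compilers.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: default_optflag_for <gcc-version-string>
# Prints the optimization flag to use for that compiler version.
default_optflag_for() {
    case "$1" in
        4.4.1) echo "-O1" ;;   # reported in this ticket to exhaust memory at -O2 and above
        *)     echo "-O3" ;;   # default for all other versions
    esac
}
```

It could be driven by the compiler itself, e.g. `default_optflag_for "$(gcc -dumpversion)"`, with a user-supplied CFLAGS still taking precedence.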
comment:41 Changed 12 years ago by
Replying to vbraun:
Well this is a spectacular band-aid to work around compiler breakage :)
 Having hardcoded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more RAM, and people try this on a wider (slower) range of hardware.
Well, the chosen limits are very conservative, so I doubt we will run into this problem any time soon.
 We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?
There isn't just one single broken version, we want to catch all broken gcc's. See #9897 for example.
comment:42 followup: 43 Changed 12 years ago by
Replying to vbraun:
Well this is a spectacular band-aid to work around compiler breakage :)
I totally agree with this by the way, but in my opinion there are two things we can do:
 Use -O3 always and leave the user with a non-compiling Sage ("it's not our fault, it's gcc's fault")
 Do the optimization fallback as in this spkg.
I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.
comment:43 followup: 44 Changed 12 years ago by
I totally agree with this by the way, but in my opinion there are two things we can do:
 Use -O3 always and leave the user with a non-compiling Sage ("it's not our fault, it's gcc's fault")
 Do the optimization fallback as in this spkg.
 Show the error and then tell the user how to report the issue and continue the build without optimization. I think all that's needed is
CFLAGS=-O0 ./sage -f spkg/standard/pari*
make
to build pari with no optimization and then build the rest?
I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.
But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.
comment:44 Changed 12 years ago by
Replying to vbraun:
But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.
I would prefer that users post the bug directly upstream (or not, gcc ignores most bug reports anyway). I certainly don't want to waste too much energy in making gcc bug reports for every user who can't compile Sage due to a gcc bug.
comment:45 followup: 46 Changed 12 years ago by
Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. gcc bugs but if they don't get reported then they can't get fixed.
So far I've only seen
 general brokenness of -O3 on itanium (dying arch...), we should just default to -O2 there.
 gcc-4.4.1 needs -O1
comment:46 Changed 12 years ago by
Replying to vbraun:
Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. gcc bugs but if they don't get reported then they can't get fixed.
Maybe we should suggest to gcc to add PARI as a test suite (I'm mostly joking here). For some reason it has a phenomenal ability to expose gcc bugs.
comment:47 followup: 49 Changed 12 years ago by
One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessarily mean it is a compiler bug.
comment:48 Changed 12 years ago by
Status:  needs_review → needs_work 

True, but that's the case here:
 brokenness of -O3 on itanium is discussed on the gcc bug ticket and definitely a compiler bug.
 gcc-4.4.1 with >= -O2 just starts using all available memory until it dies.
My suggestion would be to just catch those two cases. And add a useful error message to the spkg in case pari fails to compile, explaining how to report this and how to work around it by compiling the pari spkg with different CFLAGS.
comment:49 Changed 12 years ago by
Replying to drkirkby:
One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessarily mean it is a compiler bug.
Badly written code should never cause the compiler to crash or to use infinite memory. These things are certainly compiler bugs.
comment:50 Changed 12 years ago by
Description:  modified (diff) 

comment:51 Changed 12 years ago by
Status:  needs_work → needs_review 

comment:52 Changed 12 years ago by
comment:53 Changed 12 years ago by
Reviewers:  Leif Leonhardy → Leif Leonhardy, Volker Braun 

Status:  needs_review → positive_review 
I'm happy with the current version, so I'll give this ticket a positive review. If any compiler bugs are still preventing pari from being built on some hardware then this should be reported to the gcc wrapper.
comment:54 Changed 12 years ago by
Merged in:  → sage-4.6.2.alpha1 

Resolution:  → fixed 
Status:  positive_review → closed 
comment:55 Changed 12 years ago by
Merged in:  sage-4.6.2.alpha1 

Resolution:  fixed 
Status:  closed → new 
Testing on the buildbot seems to indicate there might still be some race conditions in parallel make install. So maybe we should avoid doing that.
comment:56 Changed 12 years ago by
Description:  modified (diff) 

Status:  new → needs_review 
Fixed race conditions in make install by using -j1.
comment:57 followup: 58 Changed 12 years ago by
Status:  needs_review → positive_review 

That'll get rid of potential races in installation. Perhaps we should disable parallel make for all spkgs that don't use a proven build system like autotools or SCons. Chances are that any hand-rolled makefile has concurrency issues...
I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.
comment:58 Changed 12 years ago by
Replying to vbraun:
I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.
Yes, done.
comment:59 Changed 12 years ago by
Merged in:  → sage-4.6.2.alpha2 

Resolution:  → fixed 
Status:  positive_review → closed 
Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.
Did someone report this to the PARI guys? Perhaps they could provide a patch such that we don't have to maintain it (that selectively changes the compiler flags for only some files).
Unfortunately(?), not all people building on e.g. openSUSE 11.2 run into these problems, apparently.