Opened 9 years ago

Closed 8 years ago

#10430 closed defect (fixed)

Add some bugfixes to the PARI package

Reported by: jdemeyer Owned by: tbd
Priority: blocker Milestone: sage-4.6.2
Component: packages: standard Keywords: pari spkg bugs patches
Cc: drkirkby Merged in: sage-4.6.2.alpha2
Authors: Jeroen Demeyer Reviewers: Leif Leonhardy, Volker Braun
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Attachments (4)

10430_branch_cut.patch (3.1 KB) - added by jdemeyer 9 years ago.
Doctest fixes
pari-2.4.3.alpha.p1.diff (40.0 KB) - added by jdemeyer 9 years ago.
spkg patch for reference
pari-2.4.3.alpha.p2.diff (43.5 KB) - added by jdemeyer 9 years ago.
spkg patch .p0 to .p2, for reference
pari-2.4.3.alpha.p2-p3.diff (8.2 KB) - added by jdemeyer 9 years ago.
spkg patch .p2 to .p3, for reference

Download all attachments as: .zip

Change History (63)

comment:1 Changed 9 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from Add some bugfixes to PARI to Add some bugfixes to the PARI package

comment:2 follow-up: Changed 9 years ago by leif

Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.

Did someone report this to the PARI guys? Perhaps they could provide a patch such that we don't have to maintain it (that selectively changes the compiler flags for only some files).

Unfortunately(?), not all people building on e.g. openSUSE 11.2 run into these problems, apparently.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 9 years ago by jdemeyer

Replying to leif:

Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.

Here's an idea: we first try to build with -O3 and when that doesn't work, fall back to -O2, then -O1, then -O0.

This way we don't have to find out exactly which versions of gcc are broken.

I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 9 years ago by leif

Replying to jdemeyer:

Replying to leif:

Perhaps we should really also address #10120, as more systems than originally reported seem to be affected, i.e. reduce (perhaps partially) optimization to -O1 to work around obvious bugs in GCC 4.4.1 on these platforms.

Here's an idea: we first try to build with -O3 and when that doesn't work, fall back to -O2, then -O1, then -O0.

Koen reported:

"For reference: OpenSuse? 11.2 (gcc (SUSE Linux) 4.4.1 [gcc-4_4-branch revision 150839]) has the same problem when building PARI: on a machine with 64GB of RAM, it eventually fails after all memory is exhausted (takes hours). [...]"

So I don't think that's the way to go. (Other machines might start swapping, which effectively "freezes" some systems.)

Or should we do something like

    (ulimit -St 900; $MAKE) # Which value is appropriate?

?

I think reporting this to PARI is pointless, because they can't help (and probably won't care about) a broken gcc.

They at least perhaps have better experience which files are most likely to trigger failures due to GCC bugs.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 9 years ago by jdemeyer

Replying to leif:

Or should we do something like

    (ulimit -St 900; $MAKE) # Which value is appropriate?

?

How about ulimiting the memory?

comment:6 in reply to: ↑ 5 Changed 9 years ago by leif

  • Cc drkirkby added
  • Keywords bugs patches added

Replying to jdemeyer:

Replying to leif:

Or should we do something like

    (ulimit -St 900; $MAKE) # Which value is appropriate?

?

How about ulimiting the memory?

Much harder to estimate, isn't it? (Feel free to test out adequate values, with -O3 etc.; perhaps something Dave likes...)

Ok, if a process starts thrashing, it won't consume much (user) CPU time as well.

comment:7 follow-up: Changed 9 years ago by drkirkby

I don't think we should be changing ulimit. Sage used to unset it at one point, and that was changed in a trac ticket.

Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough - they set the limit.

Once we start changing limits, we could cause other proceses to fail, which might be more important to someone.

Dave

comment:8 Changed 9 years ago by jdemeyer

David: we could check the current value of ulimit -v to make sure we are only decreasing the value, not increasing.

I quickly tested ulimit -v on a few systems, this is what I found for the minimal power of 2 for ulimit -v to have a successful build of the pari spkg:

  1. Gentoo Linux, kernel 2.6.32, x86_64, gcc 4.6.0: 128 MB
  2. Ubuntu Linux 8.04.4 LTS, kernel 2.6.24, x86_64, gcc 4.5.1: 128 MB
  3. Mac OS X 10.4 PPC, gcc 4.0.1: ulimit -v doesn't seem to work

comment:9 in reply to: ↑ 7 Changed 9 years ago by leif

Replying to drkirkby:

I don't think we should be changing ulimit. Sage used to unset it at one point, and that was changed in a trac ticket.

Changing it could cause all sorts of problems for someone. If Sage fails with the limit they set, then tough - they set the limit.

Once we start changing limits, we could cause other proceses to fail, which might be more important to someone.

We would only set limits in (PARI's) spkg-install.

Note that ulimit only affects the current process and its subprocesses (i.e. gets inherited), therefore I also used the parentheses in the example above.

ulimit is (also) a bash built-in btw. We could also limit its use to Linux.

And ordinary users (i.e., their processes) cannot increase limits once they are set.

comment:10 follow-ups: Changed 9 years ago by leif

P.S.:

If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (SIGXCPU + 128) if we use a CPU time limit.

comment:11 in reply to: ↑ 10 Changed 9 years ago by leif

Replying to leif:

P.S.:

If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization, e.g. check that the exit code was 152 (SIGXCPU + 128) if we use a CPU time limit.

With ulimit -v I receive SIGKILL on exhausted memory, which isn't very specific...

comment:12 in reply to: ↑ 10 ; follow-up: Changed 9 years ago by jdemeyer

Replying to leif:

P.S.:

If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization

The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.

comment:13 in reply to: ↑ 12 Changed 9 years ago by leif

Replying to jdemeyer:

Replying to leif:

P.S.:

If we do "trial building" with some limit(s), we should also make sure that the build actually failed due to a resource limit before retrying with less optimization

The build could fail for many various reasons, including but not limited to allocating too much memory. There are various other tickets where a PARI build fails because of a broken gcc. All these should be caught, not only the cases where we run out of memory.

Of course.

I wonder if we then would get PARI build errors due to GCC bugs reported any longer... ;-)

comment:14 follow-up: Changed 9 years ago by leif

Got one more PARI flaw:

It installs three real copies of the shared library rather than one with two symbolic links to it.

Currently not sure if (but I believe) that's an upstream matter, or if we do that.

comment:15 Changed 9 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Description modified (diff)

comment:16 Changed 9 years ago by jdemeyer

  • Description modified (diff)

Changed 9 years ago by jdemeyer

Doctest fixes

Changed 9 years ago by jdemeyer

spkg patch for reference

comment:17 Changed 9 years ago by jdemeyer

  • Priority changed from major to blocker

comment:18 Changed 9 years ago by jdemeyer

  • Description modified (diff)

comment:19 Changed 9 years ago by jdemeyer

  • Status changed from new to needs_review

comment:20 Changed 9 years ago by jdemeyer

  • Status changed from needs_review to needs_work
  • Work issues set to Don't use "make install" since that needs tex

comment:21 Changed 9 years ago by jdemeyer

  • Status changed from needs_work to needs_review
  • Work issues Don't use "make install" since that needs tex deleted

Changed 9 years ago by jdemeyer

spkg patch .p0 to .p2, for reference

comment:22 follow-up: Changed 9 years ago by leif

I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;-)

I'd like to have ticket references also in SPKG.txt (Changelog).

Trial building with -O3...-O0 won't work if initial_CFLAGS already contain some (higher) optimization level (which I think isn't unlikely), since $optflag gets prepended. I would start with CFLAGS as is, and then append -O2 etc. in case the previous build failed. Also, we IMHO shouldn't retry if configure failed. (We could simply keep the exit in these cases.)

I'm not sure if all platforms support ulimit -v, although Linuces (where GCC bugs showed up) certainly do. Perhaps we should test its exit status (and skip trial building if the platform does not).

test ... -ne ... (etc.) is for numerical comparison, not for comparing strings.

The changes aren't yet committed.

comment:23 follow-up: Changed 9 years ago by leif

  • Status changed from needs_review to needs_work

comment:24 in reply to: ↑ 23 Changed 9 years ago by leif

Replying to leif:

...
./spkg-install: line 274: [: yes: integer expression expected
Installing PARI/GP...
Making install-lib-sta in Olinux-x86_64
Making install in Olinux-x86_64
make[1]: Entering directory `/home/leif/Sage/sage-4.6.1.alpha3/spkg/build/pari-2.4.3.alpha.p2/src/Olinux-x86_64'
make[1]: warning: -jN forced in submake: disabling jobserver mode.
make[1]: Entering directory `/home/leif/Sage/sage-4.6.1.alpha3/spkg/build/pari-2.4.3.alpha.p2/src/Olinux-x86_64'
make[1]: warning: -jN forced in submake: disabling jobserver mode.
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/include"/pari
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc "/home/leif/Sage/sage-4.6.1.alpha3/local/bin"
rm -f "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari-gmp-2.4.so.3.0.0 "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari-gmp-2.4.so.3 "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari.so
rm -f "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/pari.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/gp.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/gp-2.4.1
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/bin" "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../doc/gphelp.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"
../config/install ../doc/gphelp    "/home/leif/Sage/sage-4.6.1.alpha3/local/bin"
for i in paricfg.h mpinl.h; do \
	  ../config/install -m 644 $i "/home/leif/Sage/sage-4.6.1.alpha3/local/include"/pari; done
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/lib/pari"
File ../src/funclist not changed.
rm -f libpari.a
ar r libpari.a mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o
mv: cannot stat `../src/desc/funclist-linux-x86_64-9212.tmp': No such file or directory
make[1]: [../src/funclist] Error 1 (ignored)
if test -d ../data; then cd ../data; 	   for d in `ls`; do 	     mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/$d && 	     for f in `ls $d`; do 	       ../config/install -m 644 $d/$f "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/$d; 	     done >/dev/null;	   done; 	 fi
../config/install -m 644 pari.cfg "/home/leif/Sage/sage-4.6.1.alpha3/local/lib/pari"
../config/install -m 644 ../examples/EXPLAIN     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install ../misc/tex2mail "/home/leif/Sage/sage-4.6.1.alpha3/local/bin"
if test -n "../src/funclist"; then	   mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/PARI;	   ../config/install -m 644 ../src/desc/PARI/822.pm "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/PARI;	   ../config/install -m 644 ../src/desc/pari.desc "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"; fi
../config/install -m 644 ../examples/Inputrc     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../examples/Makefile    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../examples/bench.gp    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../doc/gp.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/gp-2.4.1
ln -s gp.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/pari.1
../config/install -m 644 ../examples/cl.gp       "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
ln -s gp-2.4.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"/gp.1
cd ../src/desc && /usr/bin/perl merge_822 ../funclist > def-linux-x86_64-9212.tmp
../config/install -m 644 ../examples/classno.gp  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
cannot find ../funclist at merge_822 line 4.
make[1]: *** [../src/desc/pari.desc] Error 2
make[1]: *** Waiting for unfinished jobs....
../config/install -m 644 ../examples/contfrac.gp "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../doc/Makefile     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../examples/lucas.gp    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../examples/extgcd.c    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../examples/rho.gp      "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
../config/install -m 644 ../examples/squfof.gp   "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
ar: creating libpari.a
../config/install -m 644 ../examples/taylor.gp   "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/examples
gcc-4.5.1  -o "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari-gmp-2.4.so.3.0.0 -shared -O3 -Wall -fno-strict-aliasing -fomit-frame-pointer  -O3 -g -march=native -O3 -fno-strict-aliasing -fomit-frame-pointer -DHONORS_CFLAGS -march=native -O3 -DHONORS_CPPFLAGS -fPIC -Wl,-shared,-soname=libpari-gmp-2.4.so.3  mp.o mpinl.o F2x.o FF.o Flx.o FpE.o FpV.o FpX.o Qfb.o RgV.o RgX.o ZV.o ZX.o alglin1.o alglin2.o arith1.o arith2.o base1.o base2.o base3.o base4.o base5.o bb_group.o bibli1.o bibli2.o bit.o buch1.o buch2.o buch3.o buch4.o concat.o ellanal.o elliptic.o galconj.o gen1.o gen2.o gen3.o hnf_snf.o ifactor1.o lll.o perm.o polarit1.o
 polarit2.o polarit3.o prime.o random.o rootpol.o subcyclo.o subgroup.o trans1.o trans2.o trans3.o anal.o compat.o compile.o default.o errmsg.o es.o eval.o hash.o init.o intnum.o members.o pariinl.o parse.o sumiter.o DedekZeta.o Hensel.o QX_factor.o aprcl.o elldata.o ellsea.o galois.o galpol.o groupid.o krasner.o kummer.o mpqs.o nffactor.o part.o stark.o subfield.o thue.o -lc -lm -L/home/leif/Sage/sage-4.6.1.alpha3/local/lib -lgmp 
/usr/bin/ranlib libpari.a
../config/install -m 644 ../doc/tex2mail.1 "/home/leif/Sage/sage-4.6.1.alpha3/local/share/man/man1"
../config/install -m 644 ../misc/README    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install -m 644 ../misc/color.dft "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install -m 644 ../misc/gpalias   "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install ../misc/gpflog "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install -m 644 ../misc/gprc.dft  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install -m 644 ../misc/pari.xpm  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
../config/install ../misc/xgp    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/misc
mkdir -p "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"
rm -f "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari.a
for i in paridecl paripriv pari paricast paricom parierr parigen pariinl parinf pariold paristio parisys paritune ; do \
	   ../config/install -m 644 ../src/headers/$i.h  "/home/leif/Sage/sage-4.6.1.alpha3/local/include"/pari; done
../config/install -m 644 libpari.a "/home/leif/Sage/sage-4.6.1.alpha3/local/lib"/libpari.a
../config/install -m 644 ../doc/translations "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/appa.tex     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
rm -f "/home/leif/Sage/sage-4.6.1.alpha3/local/include"/pari/genpari.h
ln -s pari.h "/home/leif/Sage/sage-4.6.1.alpha3/local/include"/pari/genpari.h
../config/install -m 644 ../doc/appb.tex     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/appd.tex     "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/parimacro.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/pdfmacs.tex  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
make[1]: Leaving directory `/home/leif/Sage/sage-4.6.1.alpha3/spkg/build/pari-2.4.3.alpha.p2/src/Olinux-x86_64'
../config/install -m 644 ../doc/refcard.tex  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/tutorial.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/users.tex    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/usersch1.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/usersch2.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/usersch3.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/usersch4.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/usersch5.tex "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/paricfg.tex  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
../config/install -m 644 ../doc/libpari.dvi    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
cp: cannot stat `../doc/libpari.dvi': No such file or directory
../config/install -m 644 ../doc/users.dvi    "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
cp: cannot stat `../doc/users.dvi': No such file or directory
../config/install -m 644 ../doc/tutorial.dvi "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
cp: cannot stat `../doc/tutorial.dvi': No such file or directory
../config/install -m 644 ../doc/refcard.dvi  "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
cp: cannot stat `../doc/refcard.dvi': No such file or directory
../config/install -m 644 ../doc/refcard.ps   "/home/leif/Sage/sage-4.6.1.alpha3/local/share/pari"/doc
cp: cannot stat `../doc/refcard.ps': No such file or directory
make[1]: Leaving directory `/home/leif/Sage/sage-4.6.1.alpha3/spkg/build/pari-2.4.3.alpha.p2/src/Olinux-x86_64'
make: *** [install] Error 2
Error installing PARI

(The line breaks are perhaps partially "suboptimal"...)

comment:25 Changed 9 years ago by leif

Suggested changes:

  • spkg-install

    diff -r d58606ef346e spkg-install
    a b  
    174174
    175175    if [ $? -ne 0 ]; then
    176176        echo >&2 "Error: Configuring PARI with readline and GMP kernel failed."
    177         return 1
     177        exit 1
    178178    fi
    179179
    180180    if [ ! -f Makefile ]; then
    181181        echo >&2 "Error: Unable to configure PARI: No Makefile generated!"
    182         return 1
     182        exit 1
    183183    fi
    184184
    185185    if [ "$UNAME" = "CYGWIN" ]; then
     
    198198{
    199199    echo "Installing PARI/GP..."
    200200
    201     $MAKE install install-lib-sta
     201    # Parallel install is broken:
     202    $MAKE -j1 install install-lib-sta
    202203    if [ $? -ne 0 ]; then
    203204        echo >&2 "Error installing PARI"
    204205        exit 1
     
    258259else
    259260    # First try -O3, then -O2 and so on until the build works.
    260261    # This is mainly meant to work around compiler bugs.
    261     initial_CFLAGS="-g $CFLAGS"
    262     for optflag in -O3 -O2 -O1 -O0; do
    263         CFLAGS="$optflag $initial_CFLAGS"
    264         echo "==========================================="
    265         echo "Building PARI/GP with optimization flag $optflag"
    266         echo "==========================================="
    267         if build; then
    268             build_success=yes
    269             break
    270         fi
    271     done
     262    CFLAGS="-O3 -g $CFLAGS"
     263    echo "==========================================="
     264    echo "Building PARI/GP with default optimization"
     265    echo "(-O3 or user-specified optimization level)"
     266    echo "==========================================="
     267    if ! build; then
     268        echo >&2 "Warning: Initial build of PARI/GP failed - retrying with" \
     269            "less optimization..."
     270        for optflag in -O2 -O1 -O0; do
     271            CFLAGS="$CFLAGS $optflag"
     272            echo "==========================================="
     273            echo "Building PARI/GP with optimization flag $optflag"
     274            echo "==========================================="
     275            if build; then
     276                build_success=yes
     277                break
     278            fi
     279        done
     280    else
     281        build_success=yes
     282    fi
    272283fi
    273284
    274 if [ $build_success -ne yes ]; then
     285if [ $build_success != yes ]; then
    275286    echo >&2 "Error building PARI/GP"
    276287    exit 1
    277288fi
     
    280291install
    281292
    282293
    283 # All (previous) errors are catched in build(), so we don't test $? here.
    284 # Although we perhaps should also check success of the numerous copy commands
    285 # inside build().
     294# All (previous) errors are catched in build() and install(), so we don't test
     295# $? here. Although we perhaps should also check success of the numerous copy
     296# commands inside install().
    286297
    287298if [ "$UNAME" = "Darwin" ]; then
    288299    pari_shlib="libpari.dylib"

Of course we could handle modified CFLAGS differently, such that we don't eventually end up with -O3 -g ... -O2 -O1 -O0, but that's IMHO a minor, cosmetic issue.

comment:26 in reply to: ↑ 14 Changed 9 years ago by leif

Replying to leif:

Got one more PARI flaw:

It installs three real copies of the shared library rather than one with two symbolic links to it.

Currently not sure if (but I believe) that's an upstream matter, or if we do that.

Upstream does fine (if we use make install).

comment:27 follow-up: Changed 9 years ago by jdemeyer

  • Reviewers set to Leif Leonhardy

I essentially agree with your comments, *except* for not retrying when Configure fails. It could very well be that some gcc bug causes Configure to fail and we should also catch that.

You're probably right that ulimit -v doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...

comment:28 in reply to: ↑ 27 Changed 9 years ago by leif

Replying to jdemeyer:

I essentially agree with your comments, *except* for not retrying when Configure fails. It could very well be that some gcc bug causes Configure to fail and we should also catch that.

How / when would changing the -O level solve Configure errors? I can't imagine such, at least not with gcc...

You're probably right that ulimit -v doesn't work everywhere (on OS X 10.4, the command succeeds but doesn't actually limit anything), but I don't think that's an issue. If it doesn't work, so be it...

:-) Never mind, though we could also set some CPU time limit.


Should we report the broken parallel make install upstream?

I must admit I haven't looked close at it; it failed with 8 jobs in the first place.

comment:29 follow-up: Changed 9 years ago by leif

P.S.: We could also do make -k ... to compile as much as possible with higher optimization.

comment:30 in reply to: ↑ 22 Changed 9 years ago by jdemeyer

Replying to leif:

I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;-)

True, but it also prevents tex from hanging (you know the major misfeature of tex when it prompts the user for input). We could probably solve this by redirecting the standard input from /dev/null, but I think not building the documentation is a cleaner solution.

comment:31 in reply to: ↑ 29 Changed 9 years ago by jdemeyer

Replying to leif:

P.S.: We could also do make -k ... to compile as much as possible with higher optimization.

Agreed.

comment:32 follow-up: Changed 9 years ago by leif

Replying to jdemeyer:

Replying to leif:

I wonder if we really need the make install-doc* patch (TeX usage) since apparently all errors are ignored... ;-)

True, but it also prevents tex from hanging (you know the major misfeature of tex when it prompts the user for input). We could probably solve this by redirecting the standard input from /dev/null, but I think not building the documentation is a cleaner solution.

It won't prompt you if it isn't installed... ;-) Other issues?

In general, I think if [La]TeX is installed, we should use it, at least unless there's also some equivalent HMTL documentation or alike.

Or ship pre-built PDFs...

comment:33 in reply to: ↑ 32 Changed 9 years ago by jdemeyer

Replying to leif:

Or ship pre-built PDFs...

This is probably the easiest solution.

comment:34 Changed 9 years ago by jdemeyer

  • Description modified (diff)

New spkg: http://sage.math.washington.edu/home/jdemeyer/spkg/pari-2.4.3.alpha.p3.spkg. make install issue not yet fixed (I will investigate and if necessary, report upstream).

About the documentation: if we ship PDF documentation for PARI, are users going to find it?

comment:35 follow-ups: Changed 9 years ago by leif

I doubt 5 minutes CPU time is enough for slow machines, at least with --tune. (Not sure at the moment how many processes are involved in that.)

I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (spkg-check is of course not affected.)

Cannot test it until you also do $MAKE -j1 install ... or fix the race condition... ;-)

comment:36 in reply to: ↑ 35 Changed 9 years ago by leif

  • Work issues set to Fix or work around race condition in `make install`; increase time-out if we tune PARI.

Replying to leif:

I doubt 5 minutes CPU time is enough for slow machines, at least with --tune. (Not sure at the moment how many processes are involved in that.)

PARI's tune definitely takes longer, even on fast(er) machines.


I'd set the limits for the specific tasks only anyway, but you could increase the [time] limit in case we do tune. (spkg-check is of course not affected.)

comment:37 in reply to: ↑ 35 Changed 9 years ago by jdemeyer

Replying to leif:

I doubt 5 minutes CPU time is enough for slow machines, at least with --tune. (Not sure at the moment how many processes are involved in that.)

How about disabling the limits when tuning? I think we may assume that the people who tune PARI know what they are doing.

comment:38 Changed 9 years ago by jdemeyer

Race condition in make install reported upstream (with patch): http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1148

Changed 9 years ago by jdemeyer

spkg patch .p2 to .p3, for reference

comment:39 Changed 9 years ago by jdemeyer

  • Status changed from needs_work to needs_review
  • Work issues Fix or work around race condition in `make install`; increase time-out if we tune PARI. deleted

comment:40 follow-ups: Changed 8 years ago by vbraun

Well this is a spectacular bandaid to work around compiler breakage :-)

  • Having hard-coded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more ram, and people try this on a wider (slower) range of hardware.
  • We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?

How about we disable optimization (or set a known-good value, maybe -O1) if the compiler is gcc-4.4.1. Unless you set SAGE_PARI_tune=yes, in which case we'll still build it with all optimizations turned on. That way we are on the safe side and the workaround will become unnecessary over time as people upgrade to newer gcc releases. And if you know what you are doing you can easily override it.

comment:41 in reply to: ↑ 40 Changed 8 years ago by jdemeyer

Replying to vbraun:

Well this is a spectacular bandaid to work around compiler breakage :-)

  • Having hard-coded memory and time limits will just create problems down the road as pari is bound to get bigger, gcc is going to use more ram, and people try this on a wider (slower) range of hardware.

Well, the chosen limits are very conservative, so I doubt we will run into this problem any time soon.

  • We apparently know that optimization of Pari is not working correctly on gcc 4.4.1 yet we still build it with optimization and hope for the best? What could possibly go wrong?

There isn't just one single broken version, we want to catch all broken gcc's. See #9897 for example.

comment:42 in reply to: ↑ 40 ; follow-up: Changed 8 years ago by jdemeyer

Replying to vbraun:

Well this is a spectacular bandaid to work around compiler breakage :-)

I totally agree with this by the way, but in my opinion there are two things we can do:

  1. Use -O3 always and leave the user with a non-compiling Sage ("it's not our fault, it's gcc's fault")
  2. Do the optimization-fallback as in this spkg.

I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.

comment:43 in reply to: ↑ 42 ; follow-up: Changed 8 years ago by vbraun

I totally agree with this by the way, but in my opinion there are two things we can do:

  1. Use -O3 always and leave the user with a non-compiling Sage ("it's not our fault, it's gcc's fault")
  2. Do the optimization-fallback as in this spkg.
  1. Show the error and then tell the user how to report the issue and continue the build without optimization. I think all thats needed is
    CFLAGS=-O0 ./sage -f spkg/standard/pari*
    make
    

to build pari with no optimization and then build the rest?

I believe making a blacklist of versions known to fail is not a good solution because there will always be systems with a broken gcc that we don't know of.

But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.

comment:44 in reply to: ↑ 43 Changed 8 years ago by jdemeyer

Replying to vbraun:

But if we silently try some workarounds then we will never find out about broken compilers either. The user should file a trac ticket to document the issue so it can be fixed in Sage and reported upstream.

I would prefer that users post the bug directly upstream (or not, gcc ignores most bug reports anyway). I certainly don't want to waste too much energy in making gcc bug reports for every user who can't compile Sage due to a gcc bug.

comment:45 follow-up: Changed 8 years ago by vbraun

Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. to gcc bugs but if they don't get reported then they can't get fixed.

So far I've only seen

  1. general brokenness of -O3 on itanium (dying arch...), we should just default to -O2 there.
  2. gcc-4.4.1 needs -O1

comment:46 in reply to: ↑ 45 Changed 8 years ago by jdemeyer

Replying to vbraun:

Well nothing is going to be reported if we just silently compile away without optimization. I feel your pain wrt. to gcc bugs but if they don't get reported then they can't get fixed.

Maybe we should suggest to gcc to add PARI as a test suite (I'm mostly joking here). For some reason it has a fenomenal ability to expose gcc bugs.

comment:47 follow-up: Changed 8 years ago by drkirkby

One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessary mean it is a compiler bug.

comment:48 Changed 8 years ago by vbraun

  • Status changed from needs_review to needs_work

True, but thats the case here:

  1. brokenness of -O3 on itanium is discussed on the gcc bug ticket and definitely a compiler bug.
  2. gcc-4.4.1 with >= -O2 just starts using all available memory until it dies.

My suggestion would be to just catch those two cases. And add a useful error message to the spkg in case pari fails to compile, explaining how to report this and how to work around it by compiling the pari spkg with different CFLAGS.

comment:49 in reply to: ↑ 47 Changed 8 years ago by jdemeyer

Replying to drkirkby:

One would need to ascertain if this is a gcc bug or a Pari bug. Badly written code will cause more aggressive optimisations to fail. It does not necessary mean it is a compiler bug.

Badly written code should never cause the compiler to crash or to use infinite memory. These things are certainly compiler bugs.

comment:50 Changed 8 years ago by jdemeyer

  • Description modified (diff)

comment:51 Changed 8 years ago by jdemeyer

  • Status changed from needs_work to needs_review

comment:52 Changed 8 years ago by jdemeyer

I removed all the trial building code from the spkg in light of #10572. Also added patches for #10559. New spkg needs review.

comment:53 Changed 8 years ago by vbraun

  • Reviewers changed from Leif Leonhardy to Leif Leonhardy, Volker Braun
  • Status changed from needs_review to positive_review

I'm happy with the current version, so I'll give this ticket a positive review. If any compiler bugs are still preventing pari from being built on some hardware then this should be reported to the gcc wrapper.

comment:54 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.6.2.alpha1
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:55 Changed 8 years ago by jdemeyer

  • Merged in sage-4.6.2.alpha1 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

Testing on the buildbot seems to indicate there might still be some race conditions in parallel make install. So maybe we should avoid doing that.

comment:56 Changed 8 years ago by jdemeyer

  • Description modified (diff)
  • Status changed from new to needs_review

Fixed race conditions in make install by using -j1.

comment:57 follow-up: Changed 8 years ago by vbraun

  • Status changed from needs_review to positive_review

That'll get rid of potential races in installation. Perhaps we should disable parallel make for all spkgs that don't use a proven build system like autotools or SCons. Chances are that any hand-rolled makefile has concurrency issues...

I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.

comment:58 in reply to: ↑ 57 Changed 8 years ago by jdemeyer

Replying to vbraun:

I'll take it that you are going to commit the changes to the included repository before adding the spkg to the next Sage release, because right now they are not.

Yes, done.

comment:59 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.6.2.alpha2
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.