Opened 8 years ago

Closed 8 years ago

#10979 closed defect (fixed)

GNU patch fails to build on AIX 5.3

Reported by: drkirkby
Owned by: drkirkby
Priority: major
Milestone: sage-4.7
Component: porting: AIX or HP-UX
Keywords:
Cc: fbissey, weger@…
Merged in: sage-4.7.alpha3
Authors: David Kirkby
Reviewers: François Bissey
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by drkirkby)

When trying to build Sage on AIX 5.3 with the following hardware:

  • IBM RS/6000 7025 F50
  • 4 x 332 MHz 32-bit PowerPC CPUs
  • 3 GB RAM
  • A fairly wide mixture of disk sizes (3 x 9 GB, 1 x 18 GB, 2 x 36 GB and 1 x 73 GB)
  • DDS-4 tape drive
  • AIX 5.3 (a POSIX-certified operating system), updated with Technology Level 12, Service Pack 2, released in the 36th week of 2010, i.e. running 5300-12-02-1036.

the build of Sage sage-4.7.alpha1 fails with:

gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g -O2 quotearg.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g -O2 quotesys.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g -O2 util.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g -O2 version.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g -O2 xmalloc.c
gcc -o patch -g -O2  error.o malloc.o realloc.o addext.o argmatch.o backupfile.o basename.o dirname.o getopt.o getopt1.o inp.o maketime.o partime.o patch.o pch.o quote.o quotearg.o quotesys.o util.o version.o xmalloc.o 
ld: 0711-593 SEVERE ERROR: Symbol C_BSTAT (entry 408) in object error.o:
        The symbol refers to a csect with symbol number 0, which was not
        found. The new symbol cannot be associated with a csect and
        is being ignored.
ld: 0711-593 SEVERE ERROR: Symbol C_BSTAT (entry 411) in object error.o:
        The symbol refers to a csect with symbol number 0, which was not
        found. The new symbol cannot be associated with a csect and
        is being ignored.
ld: 0711-593 SEVERE ERROR: Symbol C_BSTAT (entry 416) in object error.o:
        The symbol refers to a csect with symbol number 0, which was not
        found. The new symbol cannot be associated with a csect and
        is being ignored.
ld: 0711-593 SEVERE ERROR: Symbol C_BSTAT (entry 419) in object error.o:
        The symbol refers to a csect with symbol number 0, which was not
        found. The new symbol cannot be associated with a csect and
        is being ignored.

<snip 100 or so similar error messages >

ld: 0711-593 SEVERE ERROR: Symbol C_BSTAT (entry 1187) in object util.o:
        The symbol refers to a csect with symbol number 0, which was not
        found. The new symbol cannot be associated with a csect and
        is being ignored.
collect2: ld returned 12 exit status
make[2]: *** [patch] Error 1
make[2]: Leaving directory `/home/users/drkirkby/sage-4.7.alpha1/spkg/build/patch-2.5.9/src'
Error building GNU patch

real    2m35.220s
user    1m44.674s
sys     0m32.137s
sage: An error occurred while installing patch-2.5.9

This appears to be the result of an update to AIX, as numerous people have reported the same error when compiling many different pieces of software with gcc.

http://www.ibm.com/developerworks/forums/thread.jspa?threadID=348558

In the following gcc bug report, some say it is a result of updating AIX, as versions of gcc which worked before suddenly stopped working.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46072

There is a tip there about how to get around this by one of the following two methods.

  • Ensure all static variables are initialized to a value
  • Remove debugging information, by not adding -g, or using -g0. (Unfortunately, it seems that GNU patch adds the -g option - it is not something Sage has specifically done.)

I don't know if Alan Weger from IBM has any ideas on this. I've seen this error message many times now, and it really is a major obstacle to getting Sage to build on AIX.

For GNU patch at least, simply adding "-g0" to CFLAGS to disable debugging information solves it. A new package can be found at:

http://boxen.math.washington.edu/home/kirkby/patches/patch-2.5.9.p0.spkg

There are no library patches. The Mercurial patch attached is for review purposes only. All changes have been committed to the Mercurial repository.

Dave

Attachments (1)

Fix-for-GNU-patch-on-AIX.patch (1.4 KB) - added by drkirkby 8 years ago.
Mercurial patch - only for review purposes - the changes are committed to the repository in the .spkg


Change History (7)

comment:1 Changed 8 years ago by drkirkby

  • Authors set to David Kirkby
  • Description modified (diff)
  • Status changed from new to needs_review

The attached patch adds "-g0" to CFLAGS only on AIX. This removes the debugging information and solves the problem.
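
As a rough illustration of that approach, the logic could look something like the shell sketch below. This is not the actual code from the attached patch: the function name aix_cflags is hypothetical, and only the idea of appending -g0 when uname reports AIX is taken from this ticket.

```shell
# Hypothetical sketch; not the real spkg-install from patch-2.5.9.p0.
# aix_cflags SYSTEM_NAME EXISTING_CFLAGS
# Prints EXISTING_CFLAGS with -g0 appended when SYSTEM_NAME is AIX,
# and prints EXISTING_CFLAGS unchanged on any other system.
aix_cflags() {
    if [ "$1" = "AIX" ]; then
        # ${2:+$2 } expands to "EXISTING_CFLAGS " only when $2 is non-empty,
        # so an empty CFLAGS yields just "-g0" with no leading space.
        printf '%s\n' "${2:+$2 }-g0"
    else
        printf '%s\n' "$2"
    fi
}

# Apply it to the current environment before running configure.
CFLAGS=$(aix_cflags "$(uname)" "${CFLAGS:-}")
export CFLAGS
```

In the build log that follows, the compile lines carry only -g0, which is consistent with configure falling back to its default -g -O2 only when CFLAGS is not already set.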

GNU patch now builds OK on AIX.

-bash-4.1$ ./sage -f patch-2.5.9.p0
Force installing patch-2.5.9.p0
Calling sage-spkg on patch-2.5.9.p0
Warning: Attempted to overwrite SAGE_ROOT environment variable
patch-2.5.9.p0
Machine:
AIX aixbox 3 5 000245984C00
Deleting directories from past builds of previous/current versions of patch-2.5.9.p0
Extracting package /home/users/drkirkby/sage-4.7.alpha1/spkg/standard/patch-2.5.9.p0.spkg ...
-rw-r--r--    1 drkirkby staff        169427 23 Mar 2011  /home/users/drkirkby/sage-4.7.alpha1/spkg/standard/patch-2.5.9.p0.spkg
Finished extraction
****************************************************
Host system
uname -a:
AIX aixbox 3 5 000245984C00
****************************************************
****************************************************
CC Version
gcc -v
Using built-in specs.
Target: powerpc-ibm-aix5.3.0.0
Configured with: ../stage/gcc-4.2.4/configure --disable-shared --enable-threads=posix --prefix=/opt/pware --with-long-double-128 --with-mpfr=/opt/pware --with-gmp=/opt/pware
Thread model: aix
gcc version 4.2.4
****************************************************
checking for gcc... gcc
checking for C compiler default output... a.out

<snip lots of irrelevant messages>

gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g0 quotesys.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g0 util.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g0 version.c
gcc -c  -DHAVE_CONFIG_H -Ded_PROGRAM=\"/usr/bin/ed\" -I. -I. -g0 xmalloc.c
gcc -o patch -g0  error.o malloc.o realloc.o addext.o argmatch.o backupfile.o basename.o dirname.o getopt.o getopt1.o inp.o maketime.o partime.o patch.o pch.o quote.o quotearg.o quotesys.o util.o version.o xmalloc.o 
/bin/sh ./mkinstalldirs /home/users/drkirkby/sage-4.7.alpha1/local/bin /home/users/drkirkby/sage-4.7.alpha1/local/man/man1
./install-sh -c patch /home/users/drkirkby/sage-4.7.alpha1/local/bin/`echo patch | sed 's,x,x,'`
./install-sh -c -m 644 ./patch.man /home/users/drkirkby/sage-4.7.alpha1/local/man/man1/`echo patch | sed 's,x,x,'`.1

real    2m17.218s
user    1m10.198s
sys     0m32.483s
Successfully installed patch-2.5.9.p0
Now cleaning up tmp files.
Making Sage/Python scripts relocatable...
python: A file or directory in the path name does not exist.
Finished installing patch-2.5.9.p0.spkg
-bash-4.1$ uname -a
AIX aixbox 3 5 000245984C00

Changed 8 years ago by drkirkby

Mercurial patch - only for review purposes - the changes are committed to the repository in the .spkg

comment:2 follow-up: Changed 8 years ago by fbissey

  • Status changed from needs_review to positive_review

I probably don't have the update in question installed, and of course I am not currently set up to build Sage on AIX either.

IBM aix is starting to scare me.

Anyway the patch is trivial and indeed should solve the problem. Of course we'll probably hit the issue again in some other packages (polybori comes off the top of my head as building with -g by default, I could be wrong).

comment:3 in reply to: ↑ 2 ; follow-up: Changed 8 years ago by drkirkby

  • Description modified (diff)

Replying to fbissey:

I probably don't have the update in question installed, and of course I am not currently set up to build Sage on AIX either.

It appears the bug is seen on at least AIX 5.3 (which I run) and AIX 6.1. I don't know about the latest 7.1, which was released in September 2010. (There never was an AIX 7 or AIX 7.0. IBM went from AIX 6.1 to AIX 7.1).

Anyway the patch is trivial and indeed should solve the problem. Of course we'll probably hit the issue again in some other packages (polybori comes off the top of my head as building with -g by default, I could be wrong).

The bug only occurs if the source code has uninitialised static variables, so PolyBoRi might not be affected.

But the issue does get hit elsewhere. Some packages are not so easy to fix. The GSL is one such package (#10000). Whilst GSL has some other issues on AIX (which I can fix easily with patches which have been accepted upstream), the debugging information one is not so easy to fix. Setting CFLAGS to include -g0 just results in

gcc -g0 -g foo.c

so the GSL configure script adds the "-g" after any attempt to put "-g0". So one can't easily build GSL without debugging information. No doubt I could hack the configure script.

IBM aix is starting to scare me.

The problem with the OS is that it is not very popular, so most people don't test their open-source code on AIX. But IBM make some really fast hardware, with clock speeds of at least 5 GHz - not that I personally own anything in that league.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 8 years ago by kcrisman

  • Reviewers set to François Bissey

Replying to drkirkby:

Replying to fbissey:

I probably don't have the update in question installed, and of course I am not currently set up to build Sage on AIX either.

It appears the bug is seen on at least AIX 5.3 (which I run) and AIX 6.1. I don't know about the latest 7.1, which was released in September 2010. (There never was an AIX 7 or AIX 7.0. IBM went from AIX 6.1 to AIX 7.1).

Anyway the patch is trivial and indeed should solve the problem. Of course we'll probably hit the issue again in some other packages (polybori comes off the top of my head as building with -g by default, I could be wrong).

The bug only occurs if the source code has uninitialised static variables, so PolyBoRi might not be affected.

But the issue does get hit elsewhere. Some packages are not so easy to fix. The GSL is one such package (#10000). Whilst GSL has some other issues on AIX (which I can fix easily with patches which have been accepted upstream), the debugging information one is not so easy to fix. Setting CFLAGS to include -g0 just results in

gcc -g0 -g foo.c

so the GSL configure script adds the "-g" after any attempt to put "-g0". So one can't easily build GSL without debugging information. No doubt I could hack the configure script.

Well, we've certainly done that before!

Hey, fbissey, I finally figured out how to get a cedille!

comment:5 in reply to: ↑ 4 Changed 8 years ago by drkirkby

Replying to kcrisman:

Replying to drkirkby:

But the issue does get hit elsewhere. Some packages are not so easy to fix. The GSL is one such package (#10000). Whilst GSL has some other issues on AIX (which I can fix easily with patches which have been accepted upstream), the debugging information one is not so easy to fix. Setting CFLAGS to include -g0 just results in

gcc -g0 -g foo.c

so the GSL configure script adds the "-g" after any attempt to put "-g0". So one can't easily build GSL without debugging information. No doubt I could hack the configure script.

Well, we've certainly done that before!

I realised my assumption about GSL was incorrect. We have a bit of code in spkg-install which adds "-g". I'd stuck my code to add "-g0" before that. So it was spkg-install of GSL that was adding the -g, overriding my -g0. As such, #10000, which is the fix for GSL, is ready for review.
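
The ordering matters because, when gcc is given several -g options, the last one is the one that takes effect. A minimal sketch of that rule follows. The helper name effective_debug_flag is hypothetical and is not part of Sage, GSL or gcc, and it only recognises plain -g and -g followed by a single digit, which is a simplification.

```shell
# Hypothetical helper; a simplified model of "the last -g option wins".
# effective_debug_flag "FLAGS"
# Prints the last plain -g or -gN (N a single digit) found in FLAGS,
# or an empty line when FLAGS contains no such option.
effective_debug_flag() {
    last=""
    # $1 is deliberately unquoted so the flag string splits into words.
    for flag in $1; do
        case "$flag" in
            -g|-g[0-9]) last=$flag ;;
        esac
    done
    printf '%s\n' "$last"
}

# The failing order from this ticket: -g comes after -g0, so -g is effective.
effective_debug_flag "-g0 -g"

# Appending -g0 after everything else makes -g0 the effective option.
effective_debug_flag "-g -O2 -g0"
```

So any code that appends -g0 has to run after the code that appends -g, not before it.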

I could fix these issues - it depends whether I take the trouble to work around a bug which will probably be fixed at some point. (However, the other issues on the GSL ticket will not be solved by other means, as the GCC developers do not accept that this is a bug in gcc, when quite clearly it is. gcc does not use the IBM header file float.h, but creates its own, failing to include things that are in the system header file. How the $!£" the gcc developers do not consider that a bug I will never know, but they don't.) Hence GSL will require a couple of patches on AIX until such time as the next version of GSL is released.

AIX is not high on my priority list, though Alan Weger from IBM is keen to build Sage on AIX.

My biggest obstacle is the lack of fast AIX hardware. As you know, building Sage is not a fast task, but for comparison on my Sun Ultra 27 under OpenSolaris, the time to build MPIR is:

real	1m2.937s
user	1m12.051s
sys	1m1.321s
Successfully installed mpir-1.2.2.p2

and the same bit of Sage on AIX:

real    31m0.395s
user    15m24.370s
sys     7m23.774s
Successfully installed mpir-1.2.2.p2

So my OpenSolaris workstation is about 29.5x faster building MPIR than my AIX server. I've not bothered checking anything other than MPIR, but the basic problem is my AIX server is too old and slow for Sage development.
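
That ratio can be checked from the two "real" times quoted above. The sketch below uses hypothetical helper names (to_seconds and speedup) and assumes awk is available. It only handles values in the NmS.SSSs form shown in these logs.

```shell
# Hypothetical helpers for comparing `time` output; not part of Sage.

# to_seconds VALUE
# Converts a value such as 31m0.395s into seconds with three decimals.
to_seconds() {
    # Splitting on "m" and "s" leaves minutes in $1 and seconds in $2.
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# speedup SLOW FAST
# Prints SLOW divided by FAST, both given in the NmS.SSSs form.
speedup() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", slow / fast }'
}

# Wall-clock times for building MPIR: AIX server versus Sun Ultra 27.
speedup 31m0.395s 1m2.937s   # prints 29.56
```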

Dave

comment:6 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.7.alpha3
  • Resolution set to fixed
  • Status changed from positive_review to closed