#12954 closed defect (fixed)
Make MPIR support SAGE_FAT_BINARY on all systems
Reported by: | vbraun | Owned by: | tbd |
---|---|---|---|
Priority: | blocker | Milestone: | sage-5.0.1 |
Component: | packages: standard | Keywords: | sd40.5 |
Cc: | jdemeyer, leif, jpflori, mhansen, kcrisman, vbraun | Merged in: | sage-5.0.1.rc0 |
Authors: | Jeroen Demeyer | Reviewers: | William Stein, Karl-Dieter Crisman |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The OSX binary sage-5.0-OSX-64bit-10.6-x86_64-Darwin.dmg dies in MPFR on Core 2 Duo processors. This is because the MPIR spkg only checks SAGE_FAT_BINARY
on Linux systems.
Reported on https://groups.google.com/d/topic/sage-support/IfJCisKo7Ao/discussion
spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/mpir-2.4.0.p4.spkg
mpir-2.4.0.p4 (Jeroen Demeyer, 26 May 2012)
- Trac #12954: when SAGE_FAT_BINARY=yes on systems which don't support --enable-fat, compile a generic binary (not assuming any particular CPU) by passing the output of configfsf.guess to the --build option of ./configure.
- Disable SAGE_FAT_BINARY when bootstrapping GCC.
- Rename MPIR_EXTRA_OPTS to MPIR_CONFIGURE.
- Add configure options directly to MPIR_CONFIGURE, no longer use SAGE_CONF_OPTS for this.
- When user specifies CFLAGS, append them to MPIR's flags instead of completely replacing MPIR's flags.
- Remove $default_cflags and get_processor_specific_cflags().
Attachments (1)
Change History (37)
comment:1 Changed 10 years ago by
- Cc jdemeyer leif jpflori mhansen added
comment:2 Changed 10 years ago by
- Cc kcrisman added
comment:3 follow-up: ↓ 4 Changed 10 years ago by
comment:4 in reply to: ↑ 3 Changed 10 years ago by
Can you disassemble to see which instruction failed?
If you tell me how to do that (explicit commands) I have the offending binary on the right CPU and could at least get you a dump of the info that came out.
comment:5 Changed 10 years ago by
- Component changed from build to packages
- Owner changed from GeorgSWeber to tbd
- Priority changed from major to blocker
comment:6 Changed 10 years ago by
- Description modified (diff)
- Summary changed from MPFR illegal instruction in OSX binaries to Make MPIR support SAGE_FAT_BINARY on all systems
The problem is with MPIR (as MPFR simply uses MPIR's flags). The current MPIR spkg only checks SAGE_FAT_BINARY
on Linux systems.
comment:7 Changed 10 years ago by
- Description modified (diff)
comment:8 Changed 10 years ago by
- Description modified (diff)
comment:9 Changed 10 years ago by
- Description modified (diff)
comment:10 Changed 10 years ago by
- Description modified (diff)
- Milestone changed from sage-5.1 to sage-5.0.1
- Status changed from new to needs_review
comment:11 follow-ups: ↓ 12 ↓ 26 Changed 10 years ago by
SAGE_FAT_BINARY
should be synonymous with SAGE_GENERIC_BINARY
, though for historical reasons it isn't named well. From http://www.sagemath.org/doc/installation/source.html:
SAGE_FAT_BINARY - to prepare a binary distribution that will run on the widest range of target machines, set this variable to "yes"
I think right now you don't remove advanced CFLAGS
on linux platfroms if SAGE_FAT_BINARY
is set.
comment:12 in reply to: ↑ 11 ; follow-up: ↓ 16 Changed 10 years ago by
Replying to vbraun:
SAGE_FAT_BINARY
should be synonymous withSAGE_GENERIC_BINARY
, though for historical reasons it isn't named well. From http://www.sagemath.org/doc/installation/source.html:SAGE_FAT_BINARY - to prepare a binary distribution that will run on the widest range of target machines, set this variable to "yes"
I think right now you don't remove advanced
CFLAGS
on linux platfroms ifSAGE_FAT_BINARY
is set.
MPIR actually does support "fat binaries" in the true sense on x86 systems: building a library with code for different processors, determining at run-time which one to use.
comment:13 follow-up: ↓ 14 Changed 10 years ago by
I can confirm that installing this spkg into the sage-5.0-OSX-64bit-10.6-x86_64-Darwin.dmg binary release
on Core 2 Duo (OSX 10.6.8) fixes the problem, apparently. I'll run more tests, just in case.
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 10 years ago by
Replying to dimpase:
I can confirm that installing this spkg into the sage-5.0-OSX-64bit-10.6-x86_64-Darwin.dmg binary release
on Core 2 Duo (OSX 10.6.8) fixes the problem, apparently. I'll run more tests, just in case.
I would think that any spkg installed (i.e., built from source on the machine it's running on) would work. Isn't the issue when it's built on another machine (with newer processor) and then the resulting binary used on an older machine?
comment:15 in reply to: ↑ 14 Changed 10 years ago by
Replying to mhansen:
Replying to dimpase:
I can confirm that installing this spkg into the sage-5.0-OSX-64bit-10.6-x86_64-Darwin.dmg binary release
on Core 2 Duo (OSX 10.6.8) fixes the problem, apparently. I'll run more tests, just in case.
I would think that any spkg installed (i.e., built from source on the machine it's running on) would work. Isn't the issue when it's built on another machine (with newer processor) and then the resulting binary used on an older machine?
it's a good point. Well, I don't have newer hardware to test this.
comment:16 in reply to: ↑ 12 ; follow-up: ↓ 17 Changed 10 years ago by
Replying to jdemeyer:
MPIR actually does support "fat binaries" in the true sense on x86 systems: building a library with code for different processors, determining at run-time which one to use.
Why do we have to set CFLAGS at all, then?
Also, is this really true? With all CFLAGS turned on won't the compiler potentially emit advanced instructions in the surrounding library code, outside of the time-critical algorithms?
comment:17 in reply to: ↑ 16 Changed 10 years ago by
Replying to vbraun:
With all CFLAGS turned on won't the compiler potentially emit advanced instructions in the surrounding library code, outside of the time-critical algorithms?
The surrounding library code is compiled with basic CFLAGS, without setting -march
at all. But the critical routines are compiled several times for various processors.
comment:19 follow-up: ↓ 20 Changed 10 years ago by
- Status changed from needs_review to needs_work
I tried the (non-app) binary from 5.0.1.rc0 on on Core 2 Duo (OSX 10.6.8), and got the usual crash:
sage: int(2.75) ------------------------------------------------------------------------ Unhandled SIGILL: An illegal instruction occurred in Sage. This probably occurred because a *compiled* component of Sage has a bug in it and is not properly wrapped with sig_on(), sig_off(). You might want to run Sage under gdb with 'sage -gdb' to debug this. Sage will now terminate. ------------------------------------------------------------------------ /Users/dima/Desktop/sage/spkg/bin/sage: line 312: 63629 Illegal instruction sage-ipython "$@" -i
with ./sage -gdb, I get
sage: int(2.75) Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand. 0x00000001016e3ed9 in case1 () (gdb) bt #0 0x00000001016e3ed9 in case1 () #1 0x0000000103e6dba4 in parsed_string_to_mpfr () #2 0x0000000103e6e7fb in mpfr_strtofr () Previous frame inner to this frame (gdb could not unwind past this frame) (gdb)
comment:20 in reply to: ↑ 19 ; follow-up: ↓ 21 Changed 10 years ago by
Can you send the output of
(gdb) disas
when getting a SIGILL
?
comment:21 in reply to: ↑ 20 Changed 10 years ago by
Replying to jdemeyer:
Can you send the output of
(gdb) disaswhen getting a
SIGILL
?
sage: int(2.75) Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand. 0x00000001016e3ed9 in case1 () (gdb) disas Dump of assembler code for function case1: 0x00000001016e3ed9 <case1+0>: popcnt 0x20(%rdi),%r8 0x00000001016e3edf <case1+6>: add %r8,%rax End of assembler dump. (gdb)
comment:22 follow-up: ↓ 23 Changed 10 years ago by
popcnt is SSE4.2, so Core 2 Duo don't have it. Btw on Linux I can do
[~]$ objdump -d sage/local/lib/libmpir.so | grep popcnt 45200: f3 4c 0f b8 04 cf popcnt (%rdi,%rcx,8),%r8 45206: f3 4c 0f b8 4c cf 08 popcnt 0x8(%rdi,%rcx,8),%r9 [...]
and I believe the equivalent command on OSX would be otool -tV
. Shouldn't we add some script to sage -bdist
that checks for the absence of, say, SSE >= 4 instructions?
comment:23 in reply to: ↑ 22 ; follow-up: ↓ 24 Changed 10 years ago by
Replying to vbraun:
popcnt is SSE4.2, so Core 2 Duo don't have it. Btw on Linux I can do
[~]$ objdump -d sage/local/lib/libmpir.so | grep popcnt 45200: f3 4c 0f b8 04 cf popcnt (%rdi,%rcx,8),%r8 45206: f3 4c 0f b8 4c cf 08 popcnt 0x8(%rdi,%rcx,8),%r9 [...]and I believe the equivalent command on OSX would be
otool -tV
.
yes, that's right (except it's not .so, but .dylb)
$ otool -tV local/lib/libmpir.dylib | grep popcnt 000000000003be60 popcnt (%rdi,%rcx,8),%r8 000000000003be66 popcnt 0x08(%rdi,%rcx,8),%r9
Shouldn't we add some script to
sage -bdist
that checks for the absence of, say, SSE >= 4 instructions?
how would it help? The problem seems to be in the inability of non-Apple gcc to build fat, a.k.a. universal, dynamic libs and executables. So I think we should either resort to making binary releases for OSX targeted to a particular architecture, or provide some post-install script that basically would do ./sage -f mpir (and anything else that needs similar fix) followed by ./sage -b. It's time-consuming to run, but it needs to be done only once...
comment:24 in reply to: ↑ 23 ; follow-up: ↓ 25 Changed 10 years ago by
Replying to dimpase:
how would it help?
It would help to ensure that we don't accidentally ship binaries that require advanced instruction set extensions. Thats all I meant.
comment:25 in reply to: ↑ 24 ; follow-up: ↓ 29 Changed 10 years ago by
comment:26 in reply to: ↑ 11 ; follow-up: ↓ 31 Changed 10 years ago by
Replying to vbraun:
SAGE_FAT_BINARY
should be synonymous withSAGE_GENERIC_BINARY
, though for historical reasons it isn't named well. From http://www.sagemath.org/doc/installation/source.html:SAGE_FAT_BINARY - to prepare a binary distribution that will run on the widest range of target machines, set this variable to "yes"
That description is certainly insufficient (or, to be more precise, one shouldn't read it from right to left). One should in general not prepare "fat" binaries on relatively new (e.g. Core i7) CPUs -- unless one knows what one's doing... ;-)
IMHO the main purpose of SAGE_FAT_BINARY
is to avoid the use of processor-specific ("hand-written") assembly code (and prevent Sage from using things like -march=native
); it's still up to the distributor to make sure the compiler is targeted to the desired CPU family (whichever that is).
Doing some "sanity check" upon sage -bdist
isn't all bad, but of course heavily depends on what (limited) architecture a specific binary distribution is supposed to be run on.
I never liked MPIR_EXTRA_OPTS
(for additional options to be passed to MPIR's configure
), but it's named like this for consistency with other spkgs, i.e., historical reasons. A couple of spkgs (or every spkg that supports such) use ${PKG_NAME}_EXTRA_OPTS
.
comment:27 Changed 10 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
comment:28 Changed 10 years ago by
- Keywords sd40.5 added
comment:29 in reply to: ↑ 25 Changed 10 years ago by
Replying to dimpase:
Replying to vbraun:
Replying to dimpase:
how would it help?
It would help to ensure that we don't accidentally ship binaries that require advanced instruction set extensions. Thats all I meant.
well, we certainly should ship such binaries
I don't think we should ship such binaries, otherwise you're just going to confuse people.
comment:30 Changed 10 years ago by
- Cc vbraun added
New spkg should be ready for review, changes have not yet been committed.
comment:31 in reply to: ↑ 26 Changed 10 years ago by
Replying to leif:
I never liked
MPIR_EXTRA_OPTS
(for additional options to be passed to MPIR'sconfigure
), but it's named like this for consistency with other spkgs, i.e., historical reasons. A couple of spkgs (or every spkg that supports such) use${PKG_NAME}_EXTRA_OPTS
.
So let's change it to {PKG}_CONFIGURE
. Some of these packages (mpfr, ecm, pari) need to be changed anyway.
comment:32 Changed 10 years ago by
- Reviewers set to William Stein
- Status changed from needs_review to positive_review
I read through your patch twice, tried building with SAGE_FAT_BINARY="yes", and this all looks great to me. Of course, the real test is the buildbot and whether or not this really solves people's problems. So for now I give this a positive review.
comment:33 Changed 10 years ago by
- Reviewers changed from William Stein to William Stein, Karl-Dieter Crisman
I can confirm that this seems to work on an Intel Core 2 Duo which previously could not use the buildbot's binary.
comment:34 Changed 10 years ago by
- Merged in set to sage-5.0.1
- Resolution set to fixed
- Status changed from positive_review to closed
comment:35 Changed 10 years ago by
- Merged in changed from sage-5.0.1 to sage-5.0.1.rc0
comment:36 Changed 10 years ago by
Just for reference, this also appears to have solved a similar issue on older PPC Macs using binaries built on (still old, but not as old) less old PPC Macs which was new with 5.0. Good work.
Can you disassemble to see which instruction failed?
This MPFR binary was built with