Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#12970 closed defect (fixed)

MPIR fails to build when CPU's architecture name doesn't match its actual capabilities

Reported by: kini
Owned by: tbd
Priority: major
Milestone: sage-5.1
Component: packages: standard
Keywords: mpir virtualbox yasm cflags sd40.5
Cc: was, jdemeyer, leif
Merged in: sage-5.0.1.rc0
Authors: Jeroen Demeyer
Reviewers: Volker Braun, William Stein
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies: #12954
Stopgaps:

Description (last modified by jdemeyer)

Specifically, on a VirtualBox virtual machine running on an Intel i5-2500K CPU, which is of the architecture GCC calls "corei7-avx", the virtual CPU visible to programs running inside the virtual machine does not actually support all the instructions that a real physical i5-2500K CPU does.

The MPIR SPKG currently builds its own yasm with CFLAGS containing -march=corei7-avx (after detecting the name of the CPU architecture in the VM). The resulting yasm apparently produces instructions that are not understood by the virtual CPU. If one builds yasm with CFLAGS containing -march=native instead, the problem does not occur. So a solution seems to be to force MPIR to build its yasm with -march=native, but I don't know how that will impact platforms other than my Virtualbox VM.
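
To see the mismatch on a Linux guest, one can compare the feature flags the virtual CPU advertises with what the architecture name implies (corei7-avx implies AVX). The following is only a sketch: has_cpu_flag and cpu_flags are hypothetical helper names, not part of the spkg, and reading /proc/cpuinfo is an assumption that holds only on Linux.

```shell
#!/bin/sh
# has_cpu_flag FLAG "FLAGS": succeed if FLAG appears as a whole word in a
# space-separated flags string (e.g. the "flags" line of /proc/cpuinfo).
has_cpu_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the flags of the first CPU listed in /proc/cpuinfo (Linux only;
# prints nothing when the file or the "flags" line is missing).
cpu_flags() {
    [ -r /proc/cpuinfo ] || return 0
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo | head -n 1
}

if has_cpu_flag avx "$(cpu_flags)"; then
    echo "AVX is advertised by this CPU"
else
    echo "AVX is not advertised: code built with -march=corei7-avx may die with SIGILL"
fi
```

The whole-word match matters here: a plain substring search for "avx" would also match "avx2".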

untested spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/mpir-2.4.0.p5.spkg

Attachments (2)

configure.patch (163.9 KB) - added by jdemeyer 9 years ago.
Patch for configure.in and related files inside the spkg, for review only
mpir-2.4.0.p5.diff (169.5 KB) - added by jdemeyer 9 years ago.
Diff between the 2.4.0.p4 and 2.4.0.p5 spkgs. For reference / review only.


Change History (22)

comment:1 Changed 9 years ago by jdemeyer

Does building with SAGE_FAT_BINARY=yes solve your problem?

comment:2 Changed 9 years ago by kini

Yup. It's not really any less of a workaround than what I've been doing, though:

$ make
# build fails on mpir
$ CFLAGS="-O2 -march=native" ./sage -i mpir
# mpir builds successfully
$ make
# build continues

comment:3 Changed 9 years ago by jdemeyer

Well,

SAGE_FAT_BINARY=yes make

is simpler than what you describe.

The big question now is: how do we detect that we're running inside VirtualBox?
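
One possible way to answer this on a Linux guest, sketched under assumptions that do not come from this ticket: VirtualBox guests usually report "VirtualBox" as the DMI product name, which Linux exposes in /sys/class/dmi/id/product_name. The function name looks_like_virtualbox is hypothetical. This only identifies the VM; it says nothing about which instructions the virtual CPU lacks.

```shell
#!/bin/sh
# looks_like_virtualbox [FILE]: succeed if the DMI product name mentions
# VirtualBox. FILE defaults to the Linux sysfs path; the optional argument
# exists only so the check can be tried against a sample file.
looks_like_virtualbox() {
    dmi_file=${1:-/sys/class/dmi/id/product_name}
    [ -r "$dmi_file" ] || return 1
    grep -qi 'virtualbox' "$dmi_file"
}

if looks_like_virtualbox; then
    echo "DMI product name suggests a VirtualBox guest"
else
    echo "no sign of VirtualBox in the DMI product name"
fi
```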

comment:4 follow-up: Changed 9 years ago by kini

Is it really a bug in VirtualBox? Isn't it a bug in MPIR's detection of CPU capabilities? Is it guaranteed that VirtualBox VMs are the only places where MPIR will fail to correctly determine the instruction set of the CPU? (I doubt it...)

comment:5 in reply to: ↑ 4 Changed 9 years ago by jdemeyer

Replying to kini:

Is it really a bug in Virtualbox?

Depends on your definition of "bug" I suppose. It is at least a "missing feature" in VirtualBox.

Is it guaranteed that Virtualbox VMs are the only places where MPIR will fail to correctly determine the instruction set of the CPU?

It's not guaranteed but I don't see another scenario where this could happen, i.e. where the actual CPU features do not correspond with the CPU hardware.

comment:6 Changed 9 years ago by kini

Well, it seems likely, or at least possible, on other virtualization platforms. There's also the possibility of weird knockoff processors that pretend to be certain architectures, but I doubt anyone is trying to run Sage on those :P

comment:7 Changed 9 years ago by jdemeyer

Concerning the success reports with -march=native and SAGE_FAT_BINARY=yes: did you actually try to run Sage? Because building is one thing, running another.

comment:8 Changed 9 years ago by kini

With SAGE_FAT_BINARY=yes, no; with -march=native, yes, I was even running a patchbot with it for a while (until I decided to put one on arando instead since it's idle most of the time anyway). I'll run make ptestlong on my VM, stand by for results :)

comment:9 Changed 9 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Description modified (diff)

Please test the spkg http://boxen.math.washington.edu/home/jdemeyer/spkg/mpir-2.4.0.p5.spkg and attach the MPIR log file, whether it succeeds or not.

comment:10 Changed 9 years ago by kini

Sorry, I can't test it - I don't have access to the machine on which I encountered this problem until July when I go back to Singapore (ssh tunnel failed for some reason...).

comment:11 Changed 9 years ago by leif

  • Cc leif added

comment:12 Changed 9 years ago by vbraun

In a 32-bit VirtualBox VM, the spkg from comment:9 picked

Finally using the following settings:
  CC=gcc
  CFLAGS=-m32 -O2 -fomit-frame-pointer -mtune=corei7-avx -march=corei7-avx  -g 
  CPP=
  CPPFLAGS=
  CXX=g++
  CXXFLAGS=
  LDFLAGS= -Wl,-z,noexecstack 
  ABI=32

and dies in the vmovd AVX instruction:

[sage@sagevm sage]$ ./sage -gdb
----------------------------------------------------------------------
| Sage Version 5.0, Release Date: 2012-05-14                         |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
/home/sage/sage-5.0/local/bin/sage-ipython
GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/sage/sage-5.0/local/bin/python...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Python 2.7.2 (default, May 21 2012, 00:38:32) 
[GCC 4.6.3 20120306 (Red Hat 4.6.3-2)] on linux3
Type "help", "copyright", "credits" or "license" for more information.

Program received signal SIGILL, Illegal instruction.
0x00470191 in __gmpz_set_str (x=0x1523ab4, str=<optimized out>, base=10) at set_str.c:110
110	set_str.c: No such file or directory.
	in set_str.c
Missing separate debuginfos, use: debuginfo-install atlas-3.8.4-1.fc16.i686 glibc-2.14.90-24.fc16.6.i686 keyutils-libs-1.5.2-1.fc16.i686 krb5-libs-1.9.3-1.fc16.i686 libcom_err-1.41.14-2.fc15.i686 libgcc-4.6.3-2.fc16.i686 libselinux-2.1.6-6.fc16.i686 libstdc++-4.6.3-2.fc16.i686 ncurses-libs-5.9-2.20110716.fc16.i686 openssl-1.0.0i-1.fc16.i686
(gdb) dis
(gdb) disassemble
Dump of assembler code for function __gmpz_set_str:
   0x00470000 <+0>:	push   %ebp
   0x00470001 <+1>:	mov    %esp,%ebp
   0x00470003 <+3>:	push   %edi
   0x00470004 <+4>:	push   %esi
   0x00470005 <+5>:	push   %ebx
   0x00470006 <+6>:	call   0x457b89 <__i686.get_pc_thunk.bx>
   0x0047000b <+11>:	add    $0x4a4b5,%ebx
   0x00470011 <+17>:	sub    $0x5c,%esp
   0x00470014 <+20>:	cmpl   $0x24,0x10(%ebp)
   0x00470018 <+24>:	mov    0xc(%ebp),%edx
   0x0047001b <+27>:	mov    -0x54(%ebx),%ecx
   0x00470021 <+33>:	mov    %ecx,-0x40(%ebp)
   0x00470024 <+36>:	jle    0x470040 <__gmpz_set_str+64>
   0x00470026 <+38>:	cmpl   $0x3e,0x10(%ebp)
   0x0047002a <+42>:	movl   $0xffffffff,-0x38(%ebp)
   0x00470031 <+49>:	jg     0x470212 <__gmpz_set_str+530>
   0x00470037 <+55>:	add    $0xe0,%ecx
   0x0047003d <+61>:	mov    %ecx,-0x40(%ebp)
   0x00470040 <+64>:	mov    %edx,-0x50(%ebp)
   0x00470043 <+67>:	call   0x457a20 <__ctype_b_loc@plt>
   0x00470048 <+72>:	mov    -0x50(%ebp),%edx
   0x0047004b <+75>:	mov    %eax,-0x48(%ebp)
   0x0047004e <+78>:	mov    (%eax),%eax
   0x00470050 <+80>:	jmp    0x47005a <__gmpz_set_str+90>
   0x00470052 <+82>:	lea    0x0(%esi),%esi
   0x00470058 <+88>:	mov    %edi,%edx
   0x0047005a <+90>:	movzbl (%edx),%esi
   0x0047005d <+93>:	lea    0x1(%edx),%edi
   0x00470060 <+96>:	testb  $0x20,0x1(%eax,%esi,2)
   0x00470065 <+101>:	jne    0x470058 <__gmpz_set_str+88>
   0x00470067 <+103>:	cmp    $0x2d,%esi
   0x0047006a <+106>:	movl   $0x0,-0x4c(%ebp)
   0x00470071 <+113>:	je     0x470240 <__gmpz_set_str+576>
   0x00470077 <+119>:	mov    -0x40(%ebp),%edx
   0x0047007a <+122>:	movl   $0xffffffff,-0x38(%ebp)
   0x00470081 <+129>:	cmpl   $0x0,0x10(%ebp)
   0x00470085 <+133>:	movzbl (%edx,%esi,1),%ecx
   0x00470089 <+137>:	mov    $0xa,%edx
   0x0047008e <+142>:	cmovne 0x10(%ebp),%edx
   0x00470092 <+146>:	cmp    %ecx,%edx
   0x00470094 <+148>:	jle    0x470212 <__gmpz_set_str+530>
   0x0047009a <+154>:	mov    0x10(%ebp),%edx
   0x0047009d <+157>:	test   %edx,%edx
   0x0047009f <+159>:	jne    0x4700ee <__gmpz_set_str+238>
   0x004700a1 <+161>:	cmp    $0x30,%esi
   0x004700a4 <+164>:	movl   $0xa,0x10(%ebp)
   0x004700ab <+171>:	jne    0x4700ee <__gmpz_set_str+238>
   0x004700ad <+173>:	movzbl (%edi),%esi
   0x004700b0 <+176>:	cmp    $0x58,%esi
   0x004700b3 <+179>:	je     0x470253 <__gmpz_set_str+595>
   0x004700b9 <+185>:	cmp    $0x78,%esi
   0x004700bc <+188>:	je     0x470253 <__gmpz_set_str+595>
   0x004700c2 <+194>:	cmp    $0x42,%esi
   0x004700c5 <+197>:	je     0x470289 <__gmpz_set_str+649>
   0x004700cb <+203>:	cmp    $0x62,%esi
   0x004700ce <+206>:	xchg   %ax,%ax
   0x004700d0 <+208>:	je     0x470289 <__gmpz_set_str+649>
   0x004700d6 <+214>:	add    $0x1,%edi
   0x004700d9 <+217>:	movl   $0x8,0x10(%ebp)
   0x004700e0 <+224>:	jmp    0x4700ee <__gmpz_set_str+238>
   0x004700e2 <+226>:	lea    0x0(%esi),%esi
   0x004700e8 <+232>:	movzbl (%edi),%esi
   0x004700eb <+235>:	add    $0x1,%edi
   0x004700ee <+238>:	cmp    $0x30,%esi
   0x004700f1 <+241>:	je     0x4700e8 <__gmpz_set_str+232>
   0x004700f3 <+243>:	testb  $0x20,0x1(%eax,%esi,2)
   0x004700f8 <+248>:	jne    0x4700e8 <__gmpz_set_str+232>
   0x004700fa <+250>:	test   %esi,%esi
   0x004700fc <+252>:	je     0x470220 <__gmpz_set_str+544>
   0x00470102 <+258>:	lea    -0x1(%edi),%eax
   0x00470105 <+261>:	movl   $0x0,-0x1c(%ebp)
   0x0047010c <+268>:	mov    %eax,(%esp)
   0x0047010f <+271>:	call   0x4570f0 <strlen@plt>
   0x00470114 <+276>:	mov    %eax,-0x38(%ebp)
   0x00470117 <+279>:	add    $0x1,%eax
   0x0047011a <+282>:	cmp    $0xffff,%eax
   0x0047011f <+287>:	ja     0x4702e4 <__gmpz_set_str+740>
   0x00470125 <+293>:	mov    -0x38(%ebp),%eax
   0x00470128 <+296>:	add    $0x2f,%eax
   0x0047012b <+299>:	and    $0xfffffff0,%eax
   0x0047012e <+302>:	sub    %eax,%esp
   0x00470130 <+304>:	lea    0x2f(%esp),%eax
   0x00470134 <+308>:	and    $0xffffffe0,%eax
   0x00470137 <+311>:	mov    %eax,-0x44(%ebp)
   0x0047013a <+314>:	mov    %eax,%ecx
   0x0047013c <+316>:	mov    -0x38(%ebp),%eax
   0x0047013f <+319>:	test   %eax,%eax
   0x00470141 <+321>:	je     0x47029c <__gmpz_set_str+668>
   0x00470147 <+327>:	mov    %edi,-0x3c(%ebp)
   0x0047014a <+330>:	mov    -0x48(%ebp),%edi
   0x0047014d <+333>:	xor    %eax,%eax
   0x0047014f <+335>:	mov    %ecx,-0x54(%ebp)
   0x00470152 <+338>:	lea    0x0(%esi),%esi
   0x00470158 <+344>:	mov    (%edi),%edx
   0x0047015a <+346>:	testb  $0x20,0x1(%edx,%esi,2)
   0x0047015f <+351>:	jne    0x47017c <__gmpz_set_str+380>
   0x00470161 <+353>:	mov    -0x40(%ebp),%ecx
   0x00470164 <+356>:	movzbl (%ecx,%esi,1),%edx
   0x00470168 <+360>:	cmp    0x10(%ebp),%edx
   0x0047016b <+363>:	jge    0x470270 <__gmpz_set_str+624>
   0x00470171 <+369>:	mov    -0x54(%ebp),%ecx
   0x00470174 <+372>:	mov    %dl,(%ecx)
   0x00470176 <+374>:	add    $0x1,%ecx
   0x00470179 <+377>:	mov    %ecx,-0x54(%ebp)
   0x0047017c <+380>:	mov    -0x3c(%ebp),%edx
   0x0047017f <+383>:	movzbl (%edx,%eax,1),%esi
   0x00470183 <+387>:	add    $0x1,%eax
   0x00470186 <+390>:	cmp    -0x38(%ebp),%eax
   0x00470189 <+393>:	jne    0x470158 <__gmpz_set_str+344>
   0x0047018b <+395>:	mov    -0x54(%ebp),%ecx
   0x0047018e <+398>:	sub    -0x44(%ebp),%ecx
=> 0x00470191 <+401>:	vmovd  %ecx,%xmm0
   0x00470195 <+405>:	vmovq  %xmm0,-0x30(%ebp)
   0x0047019a <+410>:	fildll -0x30(%ebp)
   0x0047019d <+413>:	mov    0x10(%ebp),%edx
   0x004701a0 <+416>:	mov    -0x50(%ebx),%esi
   0x004701a6 <+422>:	lea    (%edx,%edx,4),%eax
   0x004701a9 <+425>:	lea    (%esi,%eax,4),%eax
   0x004701ac <+428>:	fdivl  0x4(%eax)
   0x004701af <+431>:	fisttpl -0x34(%ebp)
   0x004701b2 <+434>:	mov    -0x34(%ebp),%eax
   0x004701b5 <+437>:	lea    0x1f(%eax),%edx
   0x004701b8 <+440>:	test   %eax,%eax
   0x004701ba <+442>:	cmovs  %edx,%eax
   0x004701bd <+445>:	mov    0x8(%ebp),%edx
   0x004701c0 <+448>:	sar    $0x5,%eax
   0x004701c3 <+451>:	add    $0x2,%eax
   0x004701c6 <+454>:	cmp    (%edx),%eax
   0x004701c8 <+456>:	jg     0x4702a5 <__gmpz_set_str+677>
   0x004701ce <+462>:	mov    0x10(%ebp),%edx
   0x004701d1 <+465>:	mov    %ecx,0x8(%esp)
   0x004701d5 <+469>:	mov    -0x44(%ebp),%eax
   0x004701d8 <+472>:	mov    %edx,0xc(%esp)
   0x004701dc <+476>:	mov    0x8(%ebp),%edx
   0x004701df <+479>:	mov    %eax,0x4(%esp)
   0x004701e3 <+483>:	mov    0x8(%edx),%eax
   0x004701e6 <+486>:	mov    %eax,(%esp)
   0x004701e9 <+489>:	call   0x456800 <__gmpn_set_str@plt>
   0x004701ee <+494>:	mov    -0x4c(%ebp),%ecx
   0x004701f1 <+497>:	mov    %eax,%edx
   0x004701f3 <+499>:	neg    %edx
   0x004701f5 <+501>:	test   %ecx,%ecx
   0x004701f7 <+503>:	mov    0x8(%ebp),%ecx
   0x004701fa <+506>:	cmovne %edx,%eax
   0x004701fd <+509>:	mov    %eax,0x4(%ecx)
   0x00470200 <+512>:	mov    -0x1c(%ebp),%eax
   0x00470203 <+515>:	test   %eax,%eax
   0x00470205 <+517>:	jne    0x4702d0 <__gmpz_set_str+720>
   0x0047020b <+523>:	movl   $0x0,-0x38(%ebp)
   0x00470212 <+530>:	mov    -0x38(%ebp),%eax
   0x00470215 <+533>:	lea    -0xc(%ebp),%esp
   0x00470218 <+536>:	pop    %ebx
   0x00470219 <+537>:	pop    %esi
   0x0047021a <+538>:	pop    %edi
   0x0047021b <+539>:	pop    %ebp
   0x0047021c <+540>:	ret    
   0x0047021d <+541>:	lea    0x0(%esi),%esi
   0x00470220 <+544>:	mov    0x8(%ebp),%ecx
   0x00470223 <+547>:	movl   $0x0,-0x38(%ebp)
   0x0047022a <+554>:	mov    -0x38(%ebp),%eax
   0x0047022d <+557>:	movl   $0x0,0x4(%ecx)
   0x00470234 <+564>:	lea    -0xc(%ebp),%esp
   0x00470237 <+567>:	pop    %ebx
   0x00470238 <+568>:	pop    %esi
   0x00470239 <+569>:	pop    %edi
   0x0047023a <+570>:	pop    %ebp
   0x0047023b <+571>:	ret    
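
In a dump this long, the faulting instruction is the line gdb marks with "=>". As a small sketch (faulting_mnemonic is a hypothetical helper, and the field position assumes the gdb output layout shown above), the mnemonic can be pulled out with awk:

```shell
#!/bin/sh
# faulting_mnemonic: read gdb "disassemble" output on stdin and print the
# mnemonic of the instruction marked "=>" (the current program counter).
# Fields on that line: "=>", address, "<+offset>:", mnemonic, operands.
faulting_mnemonic() {
    awk '$1 == "=>" { print $4; exit }'
}

# Three lines copied from the dump above.
printf '%s\n' \
    '   0x0047018e <+398>: sub    -0x44(%ebp),%ecx' \
    '=> 0x00470191 <+401>: vmovd  %ecx,%xmm0' \
    '   0x00470195 <+405>: vmovq  %xmm0,-0x30(%ebp)' | faulting_mnemonic
# prints: vmovd
```

vmovd is the VEX-encoded (AVX) form of movd, so a CPU that does not expose AVX raises SIGILL on it.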

comment:13 Changed 9 years ago by vbraun

The new version passes all tests except one in the yasm testsuite. This is a known bug in yasm that was recently fixed upstream: http://tortall.lighthouseapp.com/projects/78676/tickets/220-make-check-failure

comment:14 Changed 9 years ago by jdemeyer

  • Dependencies set to #12954
  • Status changed from new to needs_review

comment:15 Changed 9 years ago by was

The same test fails on my laptop (newest released VirtualBox + Ubuntu 12.04 in 64-bit mode):

PASS: modules/preprocs/raw/tests/rawpp_test.sh
Test dwarf2_gen64_test: O +0-1/1 0%
 ** O: dwarf64_pathname did not match object file!
FAIL: modules/dbgfmts/dwarf2/tests/gen64/dwarf2_gen64_test.sh
Test dwarf2_pass32_test: .. +2-0/2 100%

Everything else passes.

comment:16 Changed 9 years ago by was

FWIW, mpfr and ecm both work with SAGE_CHECK="yes" on my 64-bit VM.

comment:17 Changed 9 years ago by vbraun

  • Status changed from needs_review to positive_review

Everything worked on my 32-bit VM too! I've looked at the diffs and everything looks good.

comment:18 Changed 9 years ago by vbraun

  • Keywords sd40.5 added
  • Reviewers set to Volker Braun, William Stein

comment:19 Changed 9 years ago by jdemeyer

  • Merged in set to sage-5.0.1
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:20 Changed 9 years ago by jdemeyer

  • Merged in changed from sage-5.0.1 to sage-5.0.1.rc0