Opened 10 years ago

Closed 9 years ago

#11844 closed defect (duplicate)

Race condition in building MPIR/yasm

Reported by: vbraun Owned by: tbd
Priority: critical Milestone: sage-duplicate/invalid/wontfix
Component: packages: standard Keywords:
Cc: wbhart, leif Merged in:
Authors: Reviewers: Leif Leonhardy
Report Upstream: Fixed upstream, but not in a stable release. Work issues:
Branch: Commit:
Dependencies: #11964, #11616 Stopgaps:

Status badges

Description (last modified by jdemeyer)

The included Yasm includes re2c which has a race condition during parallel build. See the upstream ticket for details:

http://tortall.lighthouseapp.com/projects/78676-yasm/tickets/238-parallel-make-fails

Fixed by #11616.

Change History (16)

comment:1 follow-up: Changed 10 years ago by leif

  • Cc leif added
  • Priority changed from major to minor

Nice.

Which version of MPIR are you referring to?

(I don't know whether the yasm version in our "recent" MPIR packages ever changed at all, or which one we currently use.)

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

comment:2 Changed 10 years ago by vbraun

  • Priority changed from minor to major

This is the mpir-1.2.2.p2.spkg from sage-4.7.2.alpha2. All versions of Yasm have the bug.

You probably didn't trip over the yasm bug because you were using a file system that allows you to open the same file multiple times for writing. Then you don't get the segfault, but you may get corrupted output. If you are extremely lucky gcc might even compile the clobbered output from re2c, and you'd just get mathematically wrong results for integer arithmetic.

comment:3 in reply to: ↑ 1 ; follow-up: Changed 10 years ago by leif

Replying to leif:

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

The latter is apparently fixed in MPIR 2.1.4.

comment:4 Changed 10 years ago by leif

  • Priority changed from major to critical

If it doesn't necessarily lead to a build error... :)

(I've also added a reference to this ticket on #11616.)

comment:5 in reply to: ↑ 3 Changed 10 years ago by leif

Replying to leif:

Replying to leif:

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

The latter is apparently fixed in MPIR 2.1.4.

Not the latter, I meant MPIR's make install race condition. ;-)

comment:6 Changed 10 years ago by vbraun

  • Report Upstream changed from Workaround found; Bug reported upstream. to Fixed upstream, but not in a stable release.
  • Status changed from new to needs_review

comment:7 Changed 10 years ago by leif

Wonder whether they can include the fix in MPIR 2.5.0 (which should already be right before the door).

I'll provide updated MPIR spkgs (at least an MPIR 2.1.3.p5 or 2.1.4.pX spkg, the others for #11616) fixing the need to specify ABI=32 on 32-bit operating systems running on 64-bit CPUs anyway, then I can include such a fix by a patch to upstream.

Note that the yasm guys do not check whether the call to tmpfile() succeeded, nor is the file opened in text mode as it was before. (The latter should be irrelevant on our supported platforms -- including Cygwin AFAIK. Microsoft deprecated tmpfile() anyway... ;-) )

comment:8 follow-up: Changed 10 years ago by vbraun

Note that the yasm guys do not check whether the call to tmpfile() succeeded

In the unlikely case that tmpfile() can't find a unique file name, it'll return NULL so at least the yasm build will fail with a clean segfault. There are probably many other things that would go wrong at build time if tmpfile() fails...

comment:9 in reply to: ↑ 8 ; follow-up: Changed 10 years ago by leif

Replying to vbraun:

Note that the yasm guys do not check whether the call to tmpfile() succeeded

In the unlikely case that tmpfile() can't find a unique file name, it'll return NULL

ERRORS
       EACCES Search permission denied for directory in file's path prefix.

       EEXIST Unable to generate a unique filename.

       EINTR  The call was interrupted by a signal.

       EMFILE Too many file descriptors in use by the process.

       ENFILE Too many files open in the system.

       ENOSPC There was no room in the directory to add the new filename.

       EROFS  Read-only file system.

So there are a couple of reasons why tmpfile()could fail (and these may be temporary, i.e., later calls at least from other processes may well succeed).

Letting re2c segfault is of course the pythonic way... ;-)

comment:10 in reply to: ↑ 9 Changed 10 years ago by vbraun

Replying to leif:

So there are a couple of reasons why tmpfile()could fail

And all of them will break the build, whether or not re2c catches the error.

comment:11 Changed 10 years ago by leif

  • Dependencies set to #11896

Just in case...

comment:12 Changed 10 years ago by jdemeyer

  • Dependencies changed from #11896 to #11964
  • Status changed from needs_review to needs_work

Nothing to review here...

comment:13 Changed 9 years ago by leif

This is now fixed by the MPIR 2.4.0.p2 spkg at #11616 (patch to upstream added).

In case that gets merged, this ticket can be closed as a duplicate (or more precisely, as fixed / superseded by #11616).

comment:14 Changed 9 years ago by leif

  • Milestone changed from sage-5.0 to sage-duplicate/invalid/wontfix
  • Reviewers set to Leif Leonhardy
  • Status changed from needs_work to positive_review

Positive review w.r.t. that #11616 fixes this (by a patch from Sage, not MPIR upstream).

But we IMHO shouldn't close this ticket until #11616 got merged.

comment:15 Changed 9 years ago by jdemeyer

  • Dependencies changed from #11964 to #11964, #11616

comment:16 Changed 9 years ago by jdemeyer

  • Description modified (diff)
  • Resolution set to duplicate
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.