#31565 closed defect (fixed)

Build still non-portable despite SAGE_FAT_BINARY=yes because of numpy

Reported by: Matthias Köppe
Owned by:
Priority: blocker
Milestone: sage-9.4
Component: build
Keywords: sdl
Cc: Erik Bray, gh-kliem, Dima Pasechnik, Volker Braun, François Bissey, Samuel Lelièvre
Merged in:
Authors: Jonathan Kliem
Reviewers: Thierry Monteil
Report Upstream: N/A
Work issues:
Branch: 49e531d
Commit: 49e531d3c24ca5bf3f6dff54ce54687256cc429f
Dependencies: #32257
Stopgaps:

Description (last modified by Matthias Köppe)

Follow-up from #29537, #31521.

Observed on cygwin-standard but likely also affects the Docker images and the Sage binary distribution.

With the upgrade to numpy 1.20.x (#31008), the non-portability shows as an error message instead of a crash:

  [sagelib-9.4.beta0]       from numpy.core._multiarray_umath import (
  [sagelib-9.4.beta0]   RuntimeError: NumPy was built with baseline optimizations: 
  [sagelib-9.4.beta0]   (SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_SKX) but your machine doesn't support:
  [sagelib-9.4.beta0]   (AVX512F).
  [sagelib-9.4.beta0]   ************************************************************************
  [sagelib-9.4.beta0]   Error building the Sage library
  [sagelib-9.4.beta0]   ************************************************************************
  [sagelib-9.4.beta0] Full log file: /cygdrive/d/a/sage/sage/logs/pkgs/sagelib-9.4.beta0.log
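When triaging logs like the one above, it can help to pull the two feature lists out of the message mechanically. The following is a minimal sketch, not part of Sage or NumPy: the function name `parse_baseline_error` is hypothetical, and the regular expression assumes the message wording shown in this log.

```python
import re

# Pattern for the two parenthesised feature lists in NumPy's startup error.
# The wording is copied from the log above; other NumPy versions may differ.
_FEATURES = re.compile(
    r"built with baseline optimizations:\s*\((?P<baseline>[^)]*)\)"
    r".*?doesn't support:\s*\((?P<missing>[^)]*)\)",
    re.DOTALL,
)


def parse_baseline_error(log_text):
    """Return (baseline, missing) feature lists, or None if no match."""
    # Drop the "[sagelib-...]" prefix that the Sage build adds to each line.
    cleaned = "\n".join(
        re.sub(r"^\s*\[[^\]]*\]\s*", "", line) for line in log_text.splitlines()
    )
    match = _FEATURES.search(cleaned)
    if match is None:
        return None
    return match.group("baseline").split(), match.group("missing").split()
```

The second list names the instructions that the build machine had and the running machine lacks, which is exactly the portability gap this ticket is about.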

Report with the 9.3 Linux binary:

Change History (38)

comment:1 Changed 18 months ago by Matthias Köppe

Cc: François Bissey added

comment:2 Changed 18 months ago by François Bissey

What are the symptoms? I don't really do cygwin but I may do docker one day.

comment:3 Changed 18 months ago by Matthias Köppe

The symptom is that something you build on one machine does not run on another machine, aborting with SIGILL.
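One way to check for this symptom on a target machine is to import the module in a fresh interpreter and look at how the child process ended. The sketch below is only an illustration with hypothetical helper names (`fatal_signal_name`, `check_import`); it relies on the POSIX convention that `subprocess` reports death by signal N as return code -N, so it does not apply to native Windows Python.

```python
import signal
import subprocess
import sys


def fatal_signal_name(returncode):
    """Name of the signal that killed a child process, or None."""
    # On POSIX, subprocess reports death by signal N as returncode -N.
    if returncode is not None and returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return None


def check_import(module, python=sys.executable):
    """Import `module` in a fresh interpreter; report a fatal signal, if any."""
    result = subprocess.run(
        [python, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return fatal_signal_name(result.returncode)
```

A call such as `check_import("numpy")` would return `"SIGILL"` if the interpreter hit an unsupported instruction during the import, and `None` if the import finished or failed with an ordinary exception.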

comment:4 Changed 18 months ago by Matthias Köppe

Description: modified (diff)

comment:5 Changed 17 months ago by Matthias Köppe

Milestone: sage-9.3 → sage-9.4

Moving to 9.4, as 9.3 has been released.

comment:6 Changed 16 months ago by Matthias Köppe

Dependencies: #31008
Description: modified (diff)
Priority: critical → blocker

comment:7 Changed 16 months ago by gh-kliem

I guess we need to unwind #31521 now??

comment:8 Changed 16 months ago by Matthias Köppe

Ah, that's right!

comment:9 Changed 16 months ago by Matthias Köppe

Description: modified (diff)

comment:10 Changed 16 months ago by Matthias Köppe

Description: modified (diff)

comment:11 Changed 15 months ago by Matthias Köppe

Branch: u/mkoeppe/build_still_non_portable_despite_sage_fat_binary_yes_because_of_numpy

comment:12 Changed 15 months ago by Matthias Köppe

Authors: Jonathan Kliem
Commit: b249c4750f510ba18a36b0306a15914325059398
Status: new → needs_review

New commits:

b249c47 Revert "Revert "do not allow numpy intrinsics when building fat binary""

comment:13 Changed 15 months ago by Matthias Köppe

Reviewers: https://github.com/mkoeppe/sage/actions/runs/952966309

comment:14 Changed 15 months ago by gh-kliem

Thanks for taking care of this.

comment:15 Changed 15 months ago by Matthias Köppe

The cygwin-standard build looked rather promising (no "baseline optimizations" message, no crash when just importing numpy), but I am getting crashes again when the doctests do any plotting: https://github.com/mkoeppe/sage/runs/2867645468

comment:16 Changed 15 months ago by Matthias Köppe

Reviewers: https://github.com/mkoeppe/sage/actions/runs/952966309 → Matthias Koeppe
Status: needs_review → positive_review

The other runs at https://github.com/mkoeppe/sage/runs/2865933225?check_suite_focus=true look clean.

So I consider this ticket already an improvement. We'll have to chase the crash on cygwin when switching CPUs between build stages in ... yet another ... follow-up ticket.

comment:17 Changed 15 months ago by Dima Pasechnik

More numpy-related trouble, see e.g. https://groups.google.com/d/msgid/sage-devel/763f8650-9803-4bba-a0ec-46744204fc22n%40googlegroups.com (there are more reports like this):

[sagelib-9.4.beta2]   File "/home/chapoton/sage/local/lib/python3.8/site-packages/numpy/core/overrides.py", line 7, in <module>
[sagelib-9.4.beta2]     from numpy.core._multiarray_umath import (
[sagelib-9.4.beta2] RuntimeError: NumPy was built with baseline optimizations: 
[sagelib-9.4.beta2] (SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42) but your machine doesn't support:
[sagelib-9.4.beta2] (POPCNT).

So somehow CPU changes its state, and refuses to say yes to features tested as yes during the Numpy build?! Is it due to some CPU flags manipulations happening during sagelib build?

comment:18 Changed 15 months ago by Dima Pasechnik

I've opened #32021 to deal with that RuntimeError: NumPy was built with baseline optimizations: thing.

comment:19 Changed 15 months ago by gh-kliem

Actually the definition of cpu-dispatch in NUMPY_FCONFIG isn't clean. However, only the value will be passed and not the variable NUMPY_FCONFIG.

comment:20 Changed 15 months ago by gh-kliem

Status: positive_review → needs_work

This doesn't work. See https://trac.sagemath.org/ticket/32021#comment:10.

It's a build option not a configure option.

comment:21 Changed 15 months ago by gh-kliem

Actually the place to do it is correct however:

The command arguments are available in build, build_clib, and build_ext. if build_clib or build_ext are not specified by the user, the arguments of build will be used instead, which also holds the default values.

So this does not work for bdist_wheel and this is what we use.

comment:22 Changed 15 months ago by Matthias Köppe

You can do setup.py bdist_wheel build [...build-options...], see for example build/pkgs/jupyter_jsmol/spkg-install.in
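To make the argument order concrete, here is a small sketch of how such a command line could be assembled. It is an illustration only: the helper name `numpy_wheel_argv` is hypothetical, and `--cpu-baseline=none` is an assumed option value chosen for the example, not a quote from the branch on this ticket.

```python
import os


def numpy_wheel_argv(environ=os.environ):
    """Build a hypothetical setup.py argument list for the NumPy wheel."""
    argv = ["setup.py", "bdist_wheel"]
    if environ.get("SAGE_FAT_BINARY") == "yes":
        # Options placed after "build" belong to the build command, whose
        # values build_clib and build_ext fall back on.
        # "--cpu-baseline=none" is an assumed value, not taken from the branch.
        argv += ["build", "--cpu-baseline=none"]
    return argv
```

With `SAGE_FAT_BINARY=yes` in the environment this yields `setup.py bdist_wheel build --cpu-baseline=none`; otherwise the build options are left at their defaults.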

comment:23 Changed 15 months ago by gh-kliem

Thanks. This seems to do the trick.

Actually this itself might even work for this ticket here.

comment:24 Changed 15 months ago by gh-kliem

Branch: u/mkoeppe/build_still_non_portable_despite_sage_fat_binary_yes_because_of_numpy → public/31565
Commit: b249c4750f510ba18a36b0306a15914325059398 → 82ae48525b377032296f7f83cdc70e7acf5cd533
Status: needs_work → needs_review

New commits:

82ae485 disable baseline in case of SAGE_FAT_BINARY

comment:26 Changed 15 months ago by Matthias Köppe

Status: needs_review → needs_info

It is unclear whether this is still needed now that #32021 is merged in 9.4.beta5.

comment:27 Changed 14 months ago by Matthias Köppe

Dependencies: #31008 → #32257

comment:28 Changed 14 months ago by Matthias Köppe

With #32021 and the pynac build failure fixed by #32257, we are back to being able to build and run the testsuite on Cygwin. https://github.com/mkoeppe/sage/runs/3126612760?check_suite_focus=true

Numerous SIGSEGVs whenever plotting is involved point to more trouble with numpy.

I'll try out the branch of the present ticket on top of #32257.

comment:29 Changed 14 months ago by git

Commit: 82ae48525b377032296f7f83cdc70e7acf5cd533 → 49e531d3c24ca5bf3f6dff54ce54687256cc429f

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d4156f7 build/pkgs/singular/patches/0001-factory-canonicalform.h-Add-more-FACTORY_PUBLIC.patch: New
49e531d disable baseline in case of SAGE_FAT_BINARY

comment:30 Changed 14 months ago by Matthias Köppe

Reviewers: Matthias Koeppe → https://github.com/mkoeppe/sage/actions/runs/1054288431

comment:31 in reply to:  28 Changed 14 months ago by Matthias Köppe

Replying to mkoeppe:

> Numerous SIGSEGVs whenever plotting is involved point to more trouble with numpy.
>
> I'll try out the branch of the present ticket on top of #32257.

Same issues as before.

comment:32 Changed 14 months ago by Matthias Köppe

Reviewers: https://github.com/mkoeppe/sage/actions/runs/1054288431

comment:33 Changed 14 months ago by Matthias Köppe

Cc: Samuel Lelièvre added
Status: needs_info → needs_work

No change with #32080 either. Segfaults on every plot.

Next step would be to try to reproduce this in a local installation in Cygwin.

comment:34 Changed 14 months ago by Matthias Köppe

Priority: blocker → critical

comment:36 Changed 14 months ago by Thierry Monteil

Keywords: sdl added
Reviewers: Thierry Monteil

I confirm that this branch fixes the SSE2 bug for numpy in a QEMU-emulated Pentium 3. Note, however, that the error appeared at run time, not build time.

As this will allow me to rebuild 32-bit patchbots and a new SDL for the bullseye release (which I have not done for a while), I am +1 for making this ticket a blocker and getting it merged in 9.4.

comment:37 Changed 14 months ago by Thierry Monteil

Priority: critical → blocker
Status: needs_work → positive_review

If nobody complains.

comment:38 Changed 14 months ago by Volker Braun

Branch: public/31565 → 49e531d3c24ca5bf3f6dff54ce54687256cc429f
Resolution: fixed
Status: positive_review → closed