Opened 3 years ago
Last modified 3 years ago
#24575 needs_work defect
on Arch make+guile is broken
Reported by: | vdelecroix | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | sage-8.2 |
Component: | packages: standard | Keywords: | |
Cc: | embray, vbraun, charpent, defeo, jpflori | Merged in: | |
Authors: | Erik Bray, Vincent Delecroix | Reviewers: | Erik Bray, Vincent Delecroix |
Report Upstream: | Reported upstream. No feedback yet. | Work issues: | |
Branch: | u/vdelecroix/24575 | Commit: | 0d8a4a954cee8958b24ece2067bf5191dcbcd28e |
Dependencies: | #24885 | Stopgaps: |
Description
Under conditions that are not completely clear, likely a version mismatch among system libraries, the Guile plugin in make can cause a number of Sage packages to fail to build. For example, building Sage 8.2.beta3 on the latest Arch Linux gives
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
because libguile searches LD_LIBRARY_PATH, which contains a different version of libgc.so than the one it needs.
See also this report on sage-devel.
After deactivating the gc package, the compilation went fine.
However, this cannot be reproduced on other Linux systems.
The workaround in the branch sets the environment variable LD_PRELOAD so that make uses the system gc. The workaround has to be applied to the two standard packages R and rpy2.
Three other packages, flint, arb and deformation, fail to build for the same reason; for these we apply a small patch that stops them from redefining LD_LIBRARY_PATH.
Upstream issues
- flint: upstream issue at https://github.com/wbhart/flint2/issues/447
- arb: upstream issue at https://github.com/fredrik-johansson/arb/issues/213
- deformation: https://github.com/jpflori/pydeformation/issues/5 (for the pie trouble from #24902)
Attachments (7)
Change History (142)
comment:1 Changed 3 years ago by
- Description modified (diff)
comment:2 Changed 3 years ago by
- Description modified (diff)
comment:3 Changed 3 years ago by
- Cc charpent added
comment:4 Changed 3 years ago by
At what point do you get this error? While building R? (guessing the latter from the subject of sage-devel post)
comment:5 follow-up: ↓ 23 Changed 3 years ago by
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
    linux-vdso.so.1 (0x00007ffe861ce000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
comment:6 Changed 3 years ago by
By the way, there is #23700, which would give you the same major gc version as you apparently need (although I fail to see why).
comment:7 follow-up: ↓ 12 Changed 3 years ago by
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
comment:8 Changed 3 years ago by
I think the "symbol lookup error" is just being echoed by make; it's not from make failing to find a symbol. This happens while R is compiling the MASS package, and the output is clearly being filtered.
Apparently R sets LD_LIBRARY_PATH while compiling packages so Sage's libraries take precedence over system ones. Which inevitably leads to problems, which is why we removed that from the Sage build system a while ago.
comment:9 Changed 3 years ago by
No, R is not setting LD_LIBRARY_PATH, it is merely respecting it. I think we have a case of Sage being built with LD_LIBRARY_PATH set to something, with guile (which indeed has nothing to do with Sage AFAIK) somehow involved in the environment; and guile (perhaps invoked from .bashrc?) is made to use the wrong gc version from Sage.
To reproduce this one needs to have libgc.so.X have the same X in $SAGE_LOCAL/lib and in /usr/lib. On my system X=1 in the former and X=2 in the latter, and libguile is linked to libgc.so.2. So I went and made
$ ln -sf libgc.so.1 libgc.so.2
in $SAGE_LOCAL/lib. After this I duly get
$ LD_LIBRARY_PATH=./local/lib guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
Needless to say, R still builds just fine for me after this hack.
comment:10 Changed 3 years ago by
To be precise, my $LD_LIBRARY_PATH is empty. It should not have anything to do with it. What about charpent's proposal in comment:3?
comment:11 Changed 3 years ago by
You have a strange setup on your system, which involves guile in building Sage. Perhaps something in your shell configuration, I do not know. Or something wrong with your linker settings or its cache. Guile is a system library which you can only make fail this way by setting LD_LIBRARY_PATH. But Sage does not do it, something else does.
comment:12 in reply to: ↑ 7 ; follow-up: ↓ 13 Changed 3 years ago by
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
Anyhow, there is no Sage bug to fix here, that's what I am trying to say all along.
Unless I see a meaningful explanation of how libguile is relevant to building Sage, I'd tend to set this to wontfix.
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 3 years ago by
Replying to dimpase:
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
I...don't see any evidence that that's happening here.
comment:15 in reply to: ↑ 14 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
I...don't see any evidence that that's happening here.
I have not said I see this, either. What I see is an unexplained attempt to invoke (lib)guile during the Sage build.
comment:16 follow-up: ↓ 18 Changed 3 years ago by
Somebody who can reproduce the original problem should make the R build more verbose and try again...
comment:17 Changed 3 years ago by
according to Vincent this can happen on his system while building Flint:
I restarted a build from scratch and I don't believe that R is responsible in any way. This new build stopped on flint, pointing at the same library issue:
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
comment:18 in reply to: ↑ 16 Changed 3 years ago by
Replying to vbraun:
Somebody who can reproduce the original problem should make the R build more verbose and try again...
on it
comment:19 Changed 3 years ago by
Failed on flint. But debug mode is not very helpful (run in the flint source dir):
(sage-sh)$ make --debug
GNU Make 4.2.1
Built for x86_64-unknown-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
Updating makefiles....
Updating goal targets....
File 'all' does not exist.
File 'library' does not exist.
Must remake target 'library'.
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
make: *** [Makefile:173: library] Error 127
comment:20 follow-up: ↓ 21 Changed 3 years ago by
Can you try starting guile at the (sage-sh)$ prompt?
comment:21 in reply to: ↑ 20 Changed 3 years ago by
Replying to dimpase:
Can you try starting guile at the (sage-sh)$ prompt?
It works fine
(sage-sh) $ guile
GNU Guile 2.2.3
Copyright (C) 1995-2017 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guile-user)> quit()
$1 = #<procedure quit args>
While compiling expression:
Syntax error:
unknown location: unexpected syntax in form ()
scheme@(guile-user)> ()
comment:22 Changed 3 years ago by
Still with flint: without the --with-gmp option to ./configure it succeeds
(sage-sh)$ ./configure --disable-static --prefix="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
mkdir -p build
make[1]: Entering directory '/opt/sage-bis/local/var/tmp/sage/build/flint-2.5.2.p1/src'
CC build/printf.lo
CC build/fprintf.lo
CC build/sprintf.lo
CC build/scanf.lo
CC build/fscanf.lo
CC build/sscanf.lo
CC build/clz_tab.lo
CC build/memory_manager.lo
CC build/version.lo
CC build/profiler.lo
CC build/thread_support.lo
...
But with --with-gmp set it fails
(sage-sh) $ ./configure --disable-static --prefix="$SAGE_LOCAL" --with-gmp="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
make: *** [Makefile:173: library] Error 127
comment:23 in reply to: ↑ 5 ; follow-up: ↓ 24 Changed 3 years ago by
Replying to dimpase:
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
    linux-vdso.so.1 (0x00007ffe861ce000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
What is gmake? I got
(sage-sh) $ ldd `which make`
    linux-vdso.so.1 (0x00007fff3ccf6000)
    libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)
    libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
    libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000)
    libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000)
    libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000)
    libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000)
    libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)
comment:24 in reply to: ↑ 23 ; follow-ups: ↓ 25 ↓ 26 Changed 3 years ago by
Replying to vdelecroix:
Replying to dimpase:
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
    linux-vdso.so.1 (0x00007ffe861ce000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
What is gmake?
for me make is a link to gmake, but it's not important.
What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I never heard of it, although it is not crazy, see https://www.gnu.org/software/make/manual/html_node/Guile-Integration.html)
I got
(sage-sh) $ ldd `which make`
    linux-vdso.so.1 (0x00007fff3ccf6000)
    libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)
    libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
    libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000)
    libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000)
    libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000)
    libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000)
    libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)
So we see that at this point make appears to be correctly linked.
What do you see if in that directory (at the sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's Flint's generated Makefile that breaks it, or just make itself.)
And what does ldd /usr/lib/libguile-2.2.so.1 show?
comment:25 in reply to: ↑ 24 Changed 3 years ago by
Replying to dimpase:
Replying to vdelecroix:
Replying to dimpase:
What do you see if in that directory (at sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's the generated Flint's Makefile that breaks it, or it's just make itself)
(sage-sh) $ make -v
GNU Make 4.2.1
Built for x86_64-unknown-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Please also read comment:22: make does not look broken when I do not configure gmp.
And what does ldd /usr/lib/libguile-2.2.so.1 show?
(sage-sh) $ ldd /usr/lib/libguile-2.2.so.1
    linux-vdso.so.1 (0x00007fff17387000)
    libgc.so.1 => /usr/lib/libgc.so.1 (0x00007fed43ae3000)
    libffi.so.6 => /usr/lib/libffi.so.6 (0x00007fed438da000)
    libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007fed43558000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fed432c5000)
    libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007fed430bb000)
    libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007fed42e83000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007fed42b37000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fed42919000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fed42562000)
    /usr/lib64/ld-linux-x86-64.so.2 (0x00007fed4407a000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fed4235e000)
    libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007fed4215c000)
comment:26 in reply to: ↑ 24 Changed 3 years ago by
Replying to dimpase:
for me make is a link to gmake, but it's not important. What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I never heard of it---
Since GNU make version 4 you can extend make with guile bindings. Building make with this extension is a configuration option, and whether to enable it is a choice usually made by the distro. Binary distros usually include all possible options unless they have "reservations". On Gentoo it is an option that is off by default.
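As a rough illustration of checking for this build option: GNU make lists its compiled-in capabilities in the .FEATURES variable, and the word guile appears there when Guile support was enabled. The sketch below only tests a feature string; the function name and both sample strings are made up for illustration.

```shell
# Hypothetical helper: succeed if the whitespace-separated feature list
# passed as $1 contains the exact word "guile".
features_include_guile() {
    case " $1 " in
        *" guile "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative feature strings (assumed); the real value comes from
# make's .FEATURES variable on the machine in question.
with_guile="target-specific order-only second-expansion guile load"
without_guile="target-specific order-only second-expansion"

if features_include_guile "$with_guile"; then echo "first: guile"; fi
if ! features_include_guile "$without_guile"; then echo "second: no guile"; fi
```

On a real system the string could be obtained by printing $(.FEATURES) from a small makefile, for instance with $(info ...).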
comment:27 Changed 3 years ago by
I've built make from source with --with-guile, guile version 2.2 (which required changing one character in line 171 of configure.ac,
[ PKG_CHECK_MODULES([GUILE], [guile-2.2], [have_guile=yes],
2.2 instead of 2.0; this probably explains why I was unable to build it the gentoo way?), getting
$ ldd `which make`
    linux-vdso.so.1 (0x00007ffc6d3c3000)
    libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f932105f000)
    libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f9320de6000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f9320be2000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f93209c2000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f93205fe000)
    libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f93203f5000)
    libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f932007c000)
    libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f931fdf3000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f931fbe9000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f931f9b1000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f931f66f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f93213b0000)
but I cannot reproduce this.
It might be the version difference, but the produced make happily builds Flint even if I do $ export LD_LIBRARY_PATH=$SAGE_LOCAL/lib; make.
Needless to say, this export breaks guile:
$ guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
Or it might be that the Makefile generated by Flint does not trigger the guile extension in my case, and does trigger it in Vincent's case?
comment:28 follow-up: ↓ 29 Changed 3 years ago by
The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:
- for the problem to happen, the soname of the libgc in Sage and of the one on the system need to be the same
- although the sonames are the same, the libgc in Sage does not have the same symbols as the one on the system
So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in Sage and on the system.
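The second condition can be pictured as a set difference between the symbols a consumer needs and the symbols a library provides. The sketch below is only an illustration: the helper name and the two symbol lists are made up, and on a real system the lists would have to be extracted from the libraries themselves (for example with a tool such as nm).

```shell
# Hypothetical helper: print symbols listed in file $1 (needed by a consumer)
# that are absent from file $2 (provided by a library). One symbol per line.
missing_symbols() {
    # -v: keep non-matching lines; -x: whole-line match;
    # -F: fixed strings; -f: read the patterns from file $2
    grep -vxFf "$2" "$1"
}

tmp="$(mktemp -d)"
# Illustrative lists only, not real output from either libgc build.
printf '%s\n' GC_malloc GC_move_disappearing_link > "$tmp/needed"
printf '%s\n' GC_malloc GC_free > "$tmp/provided"

missing="$(missing_symbols "$tmp/needed" "$tmp/provided")"
echo "missing: $missing"
rm -rf "$tmp"
```

With these sample lists the only symbol reported is the one from the error message in this ticket.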
comment:29 in reply to: ↑ 28 Changed 3 years ago by
Replying to fbissey:
The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:
- for the problem to happen, the soname of the libgc in Sage and of the one on the system need to be the same
- although the sonames are the same, the libgc in Sage does not have the same symbols as the one on the system
So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in Sage and on the system.
This does crash guile:
$ ldd `which guile`
    linux-vdso.so.1 (0x00007ffd5e36a000)
    libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f4c66c70000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4c66a50000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f4c6668c000)
    libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f4c66413000)
    libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f4c6620a000)
    libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f4c65e91000)
    libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f4c65c08000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f4c659fe000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f4c657fa000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f4c655c2000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f4c65280000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f4c66fc1000)
(sage-sh) dima@hilbert:sage-dev$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
as I created a link to the wrong libgc (see comment 9):
$ ls -l $SAGE_LOCAL/lib/libgc*
-rw-r--r-- 1 dima dima 946784 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.a
lrwxrwxrwx 1 dima dima 14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so -> libgc.so.1.0.3
lrwxrwxrwx 1 dima dima 14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1 -> libgc.so.1.0.3
-rwxr-xr-x 1 dima dima 702568 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1.0.3
lrwxrwxrwx 1 dima dima 10 Jan 20 22:54 /home/dima/Sage/sage-dev/local/lib/libgc.so.2 -> libgc.so.1
(libgc.so.1 is wrong (Sage's gc 7.2)) But make with guile works just fine:
$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib /home/dima/bin/make -v
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
even though it is linked to the same libguile:
$ ldd /home/dima/bin/make
    linux-vdso.so.1 (0x00007ffd2959e000)
    libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007fb4e1418000)
    libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007fb4e119f000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fb4e0f9b000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb4e0d7b000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb4e09b7000)
    libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007fb4e07ae000)
    libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007fb4e0435000)
    libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007fb4e01ac000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007fb4dffa2000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fb4dfd6a000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fb4dfa28000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fb4e1769000)
just as a sanity check:
$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib ldd /home/dima/bin/make
    linux-vdso.so.1 (0x00007ffccbbfe000)
    libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007ff033858000)
    libgc.so.2 => /home/dima/Sage/sage-dev/local/lib/libgc.so.2 (0x00007ff0334f9000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007ff0332f5000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff0330d5000)
    libc.so.6 => /lib64/libc.so.6 (0x00007ff032d11000)
    libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007ff032b08000)
    libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007ff03278f000)
    libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007ff032506000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007ff0322fc000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007ff0320c4000)
    libm.so.6 => /lib64/libm.so.6 (0x00007ff031d82000)
    /lib64/ld-linux-x86-64.so.2 (0x00007ff033ba9000)
So even though the link ought to be resolved by the linker, it is not done (I can also run an actual build, not just -v, with this setup).
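The sanity check above can be automated by scanning ldd output for libraries that resolve under a given prefix. This is a sketch under stated assumptions: the helper name is hypothetical, and the sample input is three lines copied from the ldd listing in this comment rather than a live ldd call.

```shell
# Hypothetical helper: read `ldd` output on stdin and print every
# "name => path" entry whose path lies under the prefix given as $1.
libs_under_prefix() {
    awk -v prefix="$1" \
        '$2 == "=>" && index($3, prefix "/") == 1 { print $1 " => " $3 }'
}

# Three lines taken from the ldd output shown in this comment.
sample='    libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007ff033858000)
    libgc.so.2 => /home/dima/Sage/sage-dev/local/lib/libgc.so.2 (0x00007ff0334f9000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007ff0332f5000)'

shadowed="$(printf '%s\n' "$sample" | libs_under_prefix /home/dima/Sage/sage-dev/local)"
echo "$shadowed"
```

Any line printed names a library that the dynamic linker would take from the Sage tree instead of the system location.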
comment:30 follow-up: ↓ 32 Changed 3 years ago by
- Cc defeo added
Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.
Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.
comment:31 Changed 3 years ago by
One trivial way out is to upgrade our gc, see #23700
comment:32 in reply to: ↑ 30 Changed 3 years ago by
Replying to defeo:
Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.
Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.
I think that Arch guys forgot to bump up the version of libgc, for it is still libgc.so.1 (while on gentoo the same libgc is named libgc.so.2) cf comments 25 and 27 above.
Note that Arch most probably uses system libgc in its build of Sage, as it's not listed here https://www.archlinux.org/packages/community/x86_64/sagemath/
comment:33 Changed 3 years ago by
Could anyone who can reproduce this check whether #23000 fixes the problem?
comment:34 follow-up: ↓ 40 Changed 3 years ago by
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
comment:35 Changed 3 years ago by
- Report Upstream changed from N/A to Reported upstream. No feedback yet.
I've asked on bug-make@gnu.org whether this is a GNU make bug.
comment:36 follow-ups: ↓ 37 ↓ 39 Changed 3 years ago by
- Report Upstream changed from Reported upstream. No feedback yet. to Reported upstream. Developers deny it's a bug.
Well, I am not convinced, and am still waiting for an answer to this.
comment:37 in reply to: ↑ 36 ; follow-up: ↓ 38 Changed 3 years ago by
comment:38 in reply to: ↑ 37 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
Well, I am not convinced, and am still waiting for an answer to this.
Your comment about maybe statically linking libguile makes some sense, but then you'd have to also statically link any of its dependencies as well, including libgc or else it wouldn't solve the problem.
They could also load the Guile extension only if they need it. (And/or have a configuration option of turning it off).
comment:39 in reply to: ↑ 36 Changed 3 years ago by
That would make sense too.
Anyways, I'm increasingly convinced that the problem here is in the affected distros. I'm gonna try an Arch VM and see if I can reproduce...
comment:40 in reply to: ↑ 34 ; follow-ups: ↓ 41 ↓ 42 Changed 3 years ago by
Replying to dimpase:
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
Not for me.
- I checked out the ticket and ran make. Same failure.
- I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5]   File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5]     run()
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5]     unpack_archive(archive, dirname)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5]     archive.extractall(members=archive.names)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5]     members=members)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************
comment:41 in reply to: ↑ 40 Changed 3 years ago by
a duplicate comment, sorry.
comment:42 in reply to: ↑ 40 Changed 3 years ago by
Replying to defeo:
Replying to dimpase:
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
Not for me.
- I checked out the ticket and ran make. Same failure.
- I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5]   File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5]     run()
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5]     unpack_archive(archive, dirname)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5]     archive.extractall(members=archive.names)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5]     members=members)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************
this looks like system's Python is nuked too. Do you have funky stuff in your LD_LIBRARY_PATH or in PATH? Nothing to do with gc, that's certain.
comment:43 Changed 3 years ago by
Or perhaps it's simply due to your python being python3 (or a very new python3, which has not been tested...)
comment:44 follow-up: ↓ 45 Changed 3 years ago by
yep, I have this error if I set my system Python to python3.5, too.
Thus, set python to python2, and repeat please.
comment:45 in reply to: ↑ 44 Changed 3 years ago by
comment:46 Changed 3 years ago by
Ok, it compiled with Python2. Now, it might be thanks to #23700, or thanks to Python2... who knows? :)
comment:47 Changed 3 years ago by
- Dependencies set to #23700
- Status changed from new to needs_review
#23700 is reported to fix this issue.
(As well as using a guile-less make, I presume.)
comment:48 Changed 3 years ago by
Neat, I was able to reproduce this in an Arch Linux Docker image. So at least there's that.
comment:49 Changed 3 years ago by
Does #23700 cure it?
comment:50 Changed 3 years ago by
I haven't tried. But a workaround that did work was to add LD_PRELOAD=/usr/lib/libgc.so. So a full workaround might look something like:
if [ "$UNAME" = "Linux" ]; then
    LIBGC="$(ldd $(which make) | sed -n 's/\s*libgc\.so.* => \(.\+\) .*/\1/p')"
    if [ -n "$LIBGC" ]; then
        export LD_PRELOAD="$LIBGC"
    fi
fi
This finds the libgc that is needed by libguile (and by extension make) and ensures it's the one that's used, not the one from Sage. Sucks, but it works, and is kind of necessary.
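The extraction step of this workaround can be tried in isolation on captured ldd output. The following is a sketch, not the code from the branch: the function name is hypothetical, it uses awk instead of the sed expression above, and the sample input is three lines copied from the Arch ldd listing earlier in this ticket.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the path that
# the first libgc.so* entry resolves to (prints nothing if there is none).
libgc_path_from_ldd() {
    awk '$1 ~ /^libgc\.so/ && $2 == "=>" { print $3; exit }'
}

# Sample lines in the format shown earlier in this ticket.
sample='    libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
    libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)'

LIBGC="$(printf '%s\n' "$sample" | libgc_path_from_ldd)"
echo "would preload: ${LIBGC:-<none>}"
```

In an actual build script the sample text would be replaced by the output of running ldd on the make binary, and the result exported as LD_PRELOAD only when it is non-empty.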
A similar LD_PRELOAD trick might be able to solve #24605 as well, but I haven't tested that yet.
comment:51 follow-up: ↓ 52 Changed 3 years ago by
- Branch set to u/embray/build/ticket-24575
- Commit set to 71c63fd0d9043a134568ad2a018f65c69888c52a
- Reviewers set to Erik Bray
I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.
In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...
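One possible shape for such a helper, shown only as a sketch: the function name is hypothetical and nothing like it is confirmed to exist in the Sage build system. It builds a new LD_PRELOAD value without clobbering entries that are already set; LD_PRELOAD entries may be separated by colons or spaces, and colons are used here.

```shell
# Hypothetical helper: print a new LD_PRELOAD value with library $2 prepended
# to the existing value $1, unless $2 is empty or already listed in $1.
preload_prepend() {
    current="$1"
    lib="$2"
    if [ -z "$lib" ]; then
        printf '%s\n' "$current"
        return
    fi
    case ":$current:" in
        *":$lib:"*) printf '%s\n' "$current" ;;       # already present
        "::")       printf '%s\n' "$lib" ;;           # nothing set yet
        *)          printf '%s\n' "$lib:$current" ;;  # prepend
    esac
}

# /usr/lib/libfoo.so is a made-up path used only for this demonstration.
preload_prepend "" /usr/lib/libgc.so.1
preload_prepend /usr/lib/libfoo.so /usr/lib/libgc.so.1
```

A package's install script could then set LD_PRELOAD from the printed value just for the duration of its make invocation.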
New commits:
71c63fd | Add the workaround to https://trac.sagemath.org/ticket/24575
comment:52 in reply to: ↑ 51 Changed 3 years ago by
- Dependencies #23700 deleted
Replying to embray:
I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.
In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...
Thanks Erik for analyzing the problem and providing the workaround! I definitely did not want to consider #23700 as a solution. (I am now compiling from scratch to check.)
Note that your fix targets libgc only, so the same kind of trouble might appear with another library in the future. But I consider this fine for now. Wouldn't it be possible to exclude libgc from the list of packages to install when it is already present (and up to date) on the system?
Changed 3 years ago by
comment:53 Changed 3 years ago by
flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?
Changed 3 years ago by
comment:54 Changed 3 years ago by
Replying to vdelecroix:
flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?
Same also with arb and the Python package rpy2 (rpy2-2.8.2.p0.log). After adding the workaround to the three spkg-install scripts the build completes.
Though I did not check the optional packages.
comment:55 Changed 3 years ago by
- Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
- Commit changed from 71c63fd0d9043a134568ad2a018f65c69888c52a to f33c5e60c6655d2af7d97047dcba996600f37193
New commits:
f33c5e6 | Same workaround for arb, flint and rpy2
comment:56 Changed 3 years ago by
- Description modified (diff)
comment:57 follow-up: ↓ 58 Changed 3 years ago by
- Status changed from needs_review to needs_work
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
comment:58 in reply to: ↑ 57 ; follow-up: ↓ 59 Changed 3 years ago by
Replying to dimpase:
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
I don't think so. This workaround takes care of fragile makefiles until a better solution is found. Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.
comment:59 in reply to: ↑ 58 Changed 3 years ago by
Replying to vdelecroix:
Replying to dimpase:
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
I don't think so. This workaround takes care of fragile makefiles until a better solution is found.
A better solution is not to use Guile-enabled make, at least not until it is built in a way ensuring one can use it for hacking on Guile dependencies.
Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.
There is nothing to communicate to package upstream here, I think. You cannot ban their use of LD_LIBRARY_PATH. Then, the LD_PRELOAD is a pretty standard way to deal with these issues. It has cured so far all these issues; why do you want to keep getting reports on such and such package mysteriously breaking while Guile-enabled make is used?
comment:60 follow-up: ↓ 62 Changed 3 years ago by
I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.
FLINT seems to be getting a CMake build system to replace its handwritten one, which will likely eliminate this problem.
comment:61 Changed 3 years ago by
Over in #24885 I already implemented a more generic solution for this, but I didn't push the branch yet. That should be used instead.
comment:62 in reply to: ↑ 60 Changed 3 years ago by
Replying to mkoeppe:
I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.
The fact that they use LD_LIBRARY_PATH is not a bug IMO, though it would be better, at least in some cases, if they used LD_PRELOAD instead for specific libraries.
comment:63 Changed 3 years ago by
- Branch changed from u/vdelecroix/24575 to u/embray/build/ticket-24575
- Commit changed from f33c5e60c6655d2af7d97047dcba996600f37193 to 454221ac40ec282c469ff2043465811b7364313b
- Dependencies set to #24885
Reworked on top of #24885
New commits:
ba1b5ee | Add helper function that implements the workaround from https://trac.sagemath.org/ticket/24575 more generically.
6103df0 | Add the workaround to https://trac.sagemath.org/ticket/24575
2ecaa74 | Replace this with sdh_preload_lib
454221a | Same issue applies to arb, flint, and rpy2
comment:64 Changed 3 years ago by
- Status changed from needs_work to needs_review
comment:65 Changed 3 years ago by
- Reviewers changed from Erik Bray to Erik Bray, Vincent Delecroix
- Status changed from needs_review to needs_work
I am currently testing optional packages; at least deformation has the same symptoms (I will provide a proper commit with all of them once finished).
comment:66 Changed 3 years ago by
All right, for deformation, after setting the sdh_preload_lib I got a different error that is unrelated:
    [deformation-d05941b] CC ../build/perm/../perm.lo
    [deformation-d05941b] /usr/bin/ld: -r and -pie may not be used together
    [deformation-d05941b] collect2: error: ld returned 1 exit status
    [deformation-d05941b] make[4]: *** [../Makefile.subdirs:55: ../build/perm/../perm.lo] Error 1
comment:67 Changed 3 years ago by
- Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
- Commit changed from 454221ac40ec282c469ff2043465811b7364313b to 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9
Concerning optional packages, only deformation needs the workaround (It appears that I also have some unrelated build failures #23533, #24901, #24902 and #24903).
Erik, Dima, Matthias: I consider the branch ready to be positively reviewed. As I added a commit on top of the branch, I will let somebody else finish the review.
New commits:
1ac3afa | Same issue applies to optional package deformation
comment:68 Changed 3 years ago by
- Status changed from needs_work to needs_review
comment:69 Changed 3 years ago by
Our package perl_term_readline_gnu also has some LD_LIBRARY_PATH stuff...
comment:70 follow-up: ↓ 72 Changed 3 years ago by
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
comment:71 Changed 3 years ago by
- Cc jpflori added
comment:72 in reply to: ↑ 70 ; follow-ups: ↓ 73 ↓ 75 Changed 3 years ago by
Replying to mkoeppe:
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?
comment:73 in reply to: ↑ 72 Changed 3 years ago by
Replying to vdelecroix:
Replying to mkoeppe:
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?
As well as arb.
comment:74 Changed 3 years ago by
flint, arb and deformation share almost the same build system indeed.
Except I did not push the -r/pie fix to deformation.
comment:75 in reply to: ↑ 72 Changed 3 years ago by
Replying to vdelecroix:
would you suggest that the same operation should be applied there?
Yes, probably.
comment:76 Changed 3 years ago by
- Description modified (diff)
- Report Upstream changed from Reported upstream. Developers deny it's a bug. to Reported upstream. No feedback yet.
comment:77 follow-up: ↓ 82 Changed 3 years ago by
And for R, it may be enough to remove the bottom lines of etc/ldpaths.in.
comment:78 Changed 3 years ago by
- Description modified (diff)
comment:79 follow-up: ↓ 80 Changed 3 years ago by
Note for all these packages, it is to be seen whether upstream would accept these changes: Some of these libraries may in fact have valid reasons for adjusting LD_LIBRARY_PATH in the context of their build system quirks. But in our setup, since we make sure that all libraries are installed with a full rpath, none of these LD_LIBRARY_PATH things are necessary.
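Whether a given binary or library really carries an rpath can be checked from the dynamic section printed by readelf -d (part of binutils). The helper below is a hypothetical sketch; the name rpath_entries is made up for this example, and it only parses text on stdin so that it can also be tried on saved output.

```shell
# Hypothetical helper: read the output of "readelf -d FILE" on stdin and print
# the value of every RPATH or RUNPATH entry (the text between the brackets).
rpath_entries() {
    awk '/\((RPATH|RUNPATH)\)/ {
        if (match($0, /\[.*\]/))
            print substr($0, RSTART + 1, RLENGTH - 2)
    }'
}

# Example (the library path is an assumption about the local install):
# readelf -d "$SAGE_LOCAL/lib/libflint.so" | rpath_entries
```

If the command prints a directory such as $SAGE_LOCAL/lib, the loader can find the Sage libraries without any help from LD_LIBRARY_PATH.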
comment:80 in reply to: ↑ 79 Changed 3 years ago by
Replying to mkoeppe:
Note for all these packages, it is to be seen whether upstream would accept these changes: Some of these libraries may in fact have valid reasons for adjusting LD_LIBRARY_PATH in the context of their build system quirks. But in our setup, since we make sure that all libraries are installed with a full rpath, none of these LD_LIBRARY_PATH things are necessary.
At least I asked for flint/arb (see ticket description). Even if upstream remains unchanged we would have two options:
- adding a patch removing the 5 initial lines of Makefile.in (compiles fine on my computer)
- using Erik's workaround with LD_PRELOAD
Does anybody have a preference?
comment:81 Changed 3 years ago by
I (clearly) have a strong preference for patching.
comment:82 in reply to: ↑ 77 ; follow-up: ↓ 83 Changed 3 years ago by
Replying to mkoeppe:
And for R, it may be enough to remove the bottom lines of etc/ldpaths.in.
@charpent: I see that the sage R package contains various patches from you, in particular regarding directories. Would this change, removing the set up of LD_LIBRARY_PATH (and DYLD_FALLBACK_LIBRARY_PATH on macOS), make sense to you?
comment:83 in reply to: ↑ 82 Changed 3 years ago by
Replying to mkoeppe:
Replying to mkoeppe:
And for R, it may be enough to remove the bottom lines of etc/ldpaths.in.
@charpent: I see that the sage R package contains various patches from you, in particular regarding directories. Would this change, removing the set up of LD_LIBRARY_PATH (and DYLD_FALLBACK_LIBRARY_PATH on macOS), make sense to you?
IMHO this would break the Java support in R (if Java is installed in a non-standard place, which is not very unusual), and probably other R packages that might install or use shared libs.
comment:84 Changed 3 years ago by
- Commit changed from 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9 to f4697df10175ba42a821f848942d3c174c550d76
comment:85 Changed 3 years ago by
- Description modified (diff)
comment:86 Changed 3 years ago by
comment:87 Changed 3 years ago by
- Description modified (diff)
comment:88 Changed 3 years ago by
- Commit changed from f4697df10175ba42a821f848942d3c174c550d76 to f5f25dc71fa05c4ac81d18392880a3358a9c9957
Branch pushed to git repo; I updated commit sha1. New commits:
f5f25dc | specify upstream issues in patches
comment:89 follow-ups: ↓ 91 ↓ 95 Changed 3 years ago by
I'm -1 on patching things out if it isn't necessary to, or if the reason to do so hasn't been fully understood. Are any of these patches even necessary? And if so, why are these packages manipulating LD_LIBRARY_PATH, and how are you sure that it isn't a necessary and valid thing to do in this case?
comment:90 Changed 3 years ago by
In particular, since I already provided a workaround, why not just use that same workaround for those packages as well?
comment:91 in reply to: ↑ 89 Changed 3 years ago by
Replying to embray:
I'm -1 on patching things out if it isn't necessary to, or if the reason to do so hasn't been fully understood. Are any of these patches even necessary? And if so, why are these packages manipulating LD_LIBRARY_PATH, and how are you sure that it isn't a necessary and valid thing to do in this case?
Matthias might be better placed to answer (see 81). We might also wait for upstream answers (see ticket description for the links). To my mind, keeping spkg-install as simple as possible is better (assuming the patch is correct).
I would also like to find a solution so that a fix for all these build troubles is merged soon (even if the fix has to change later on). The problem at the origin of this ticket affects all archlinux users.
comment:92 follow-ups: ↓ 93 ↓ 99 Changed 3 years ago by
Hell, we can provide a make spkg. To me, make is a tool that should work, no matter what. Otherwise, patching Sage packages is akin to trying to use a slot screwdriver on French recess screws, and then proceeding to saw slots in screws instead of picking up a correct tool...(sorry, my undergrad was in engineering :-))
comment:93 in reply to: ↑ 92 Changed 3 years ago by
Replying to dimpase:
Hell, we can provide a make spkg. To me, make is a tool that should work, no matter what. Otherwise, patching Sage packages is akin to trying to use a slot screwdriver on French recess screws, and then proceeding to saw slots in screws instead of picking up a correct tool...(sorry, my undergrad was in engineering :-))
No problem. After all, my undergrad was even more remote...
comment:94 follow-up: ↓ 96 Changed 3 years ago by
For flint and friends another workaround I found was to simply pass DLPATH_ADD= when calling make in its spkg-install.
comment:95 in reply to: ↑ 89 Changed 3 years ago by
Replying to embray:
why are these packages manipulating LD_LIBRARY_PATH, and how are you sure that it isn't a necessary and valid thing to do in this case?
We can be sure simply by testing that the build works, just like we do with any other package.
comment:96 in reply to: ↑ 94 Changed 3 years ago by
Replying to embray:
For flint and friends another workaround I found was to simply pass DLPATH_ADD= when calling make in its spkg-install.
This is a great solution, which I prefer over patching.
comment:97 Changed 3 years ago by
- Commit changed from f5f25dc71fa05c4ac81d18392880a3358a9c9957 to 21442be9d659d74e29ce2a3a5dd85f55b7c90475
comment:98 Changed 3 years ago by
Thank you!
Next I'd suggest that we also see how exactly $SAGE_LOCAL/lib ends up in R's LD_LIBRARY_PATH. The relevant file seems to be $SAGE_LOCAL/lib/R/etc/ldpaths. But I can't investigate this here on macOS.
comment:99 in reply to: ↑ 92 Changed 3 years ago by
comment:100 follow-up: ↓ 112 Changed 3 years ago by
[EDITED]
At commit 21442be flint and arb do not pass their testsuite on my computer (when doing $ sage -f -c flint or same with arb/deformation)
- without anything new in spkg-check I end up with the same libguile business
- with DLPATH_ADD= it ends with the tests failing to load libflint/libarb/libdeformation (as I guess they are not installed at the time the tests are run).
- with sdh_preload_lib it works fine
I will add a commit in a minute.
comment:101 Changed 3 years ago by
- Commit changed from 21442be9d659d74e29ce2a3a5dd85f55b7c90475 to 0d8a4a954cee8958b24ece2067bf5191dcbcd28e
Branch pushed to git repo; I updated commit sha1. New commits:
0d8a4a9 | use sdh_preload_lib for flint/arb/deformation test-suites
comment:102 Changed 3 years ago by
Interesting; normally make check is supposed to test against the non-installed library, rather than the installed library. (The distinction does not matter to us because we run spkg-check after spkg-install.)
comment:103 Changed 3 years ago by
Vincent, could you try whether passing
    $MAKE check DLPATH_ADD=`pwd`
works (without using sdh_preload_lib)?
Changed 3 years ago by
Changed 3 years ago by
Changed 3 years ago by
Changed 3 years ago by
comment:104 Changed 3 years ago by
See the attachments for the full logs of sage -f -c flint with the various configurations:
- spkg-check as in develop: flint-2.5.2-check_on_base.log.gz
- with DLPATH_ADD=: flint-2.5.2-check_with_DLPATH_ADD_empty.log.gz
- with DLPATH_ADD=`pwd` as suggested in 103: flint-2.5.2-check_with_DLPATH_ADD_pwd.log.gz
- with sdh_preload_lib as in 0d8a4a9: flint-2.5.2-check_with_sdh_preload.log.gz
comment:105 Changed 3 years ago by
Thank you! Seems like FLINT forgets to pass the rpath linker option when it builds its test programs, relying on LD_LIBRARY_PATH instead.
comment:106 Changed 3 years ago by
Our patch use_ldflags_in_tests.patch (for FLINT) does not go far enough; it forgets to patch Makefile.subdirs.
comment:107 follow-up: ↓ 111 Changed 3 years ago by
Fix for FLINT is here: https://github.com/mkoeppe/flint2/commit/bd2684891b6da6791ae2f52482a02f5b4cc56bd1
comment:108 follow-up: ↓ 113 Changed 3 years ago by
- Description modified (diff)
- Priority changed from blocker to critical
- Summary changed from conflicts with gc to on Arch make+guile is broken
The ticket description and the title were misleading. It's a bug in Arch that we are dealing with here, and it should not be a blocker. A broken make is nothing new, e.g. while in theory BSD Make should be able to build Sage, in practice it does not work, and one has to use GNU Make.
If you must work on Arch, install make without guile support. Certainly, improving various upstream build systems is a noble goal, but getting totally carried away with this is not a good idea.
comment:109 follow-up: ↓ 110 Changed 3 years ago by
I agree that it's probably not a blocker because it only happens on a relatively obscure configuration. But I don't think your edit to the description was an improvement.
comment:110 in reply to: ↑ 109 Changed 3 years ago by
- Description modified (diff)
Replying to mkoeppe:
I agree that it's probably not a blocker because it only happens on a relatively obscure configuration. But I don't think your edit to the description was an improvement.
I've added a bit more detail pointing at the root cause of trouble. As well, #23700 will provide libgc compatible with the one needed by libguile, and the main problem will go away after it is merged.
comment:111 in reply to: ↑ 107 ; follow-up: ↓ 119 Changed 3 years ago by
Replying to mkoeppe:
Fix for FLINT is here: https://github.com/mkoeppe/flint2/commit/bd2684891b6da6791ae2f52482a02f5b4cc56bd1
With the patch applied, tests run with DLPATH_ADD= in make check. Could you make it a proper upstream pull request?
comment:112 in reply to: ↑ 100 Changed 3 years ago by
Replying to vdelecroix:
[EDITED]
At commit 21442be flint and arb do not pass their testsuite on my computer (when doing $ sage -f -c flint or same with arb/deformation)
- without anything new in spkg-check I end up with the same libguile business
- with DLPATH_ADD= it ends with the tests failing to load libflint/libarb/libdeformation (as I guess they are not installed at the time the tests are run).
I didn't get an opportunity to comment on this last night, but FYI you don't need to add DLPATH_ADD= to the make install call. Just the first one that says make verbose. That works for me, and the make check tests pass.
comment:113 in reply to: ↑ 108 ; follow-up: ↓ 114 Changed 3 years ago by
Replying to gh-dimpase:
The ticket description and the title were misleading. It's a bug in Arch that we are dealing with here, and it should not be a blocker. A broken make is nothing new, e.g. while in theory BSD Make should be able to build Sage, in practice it does not work, and one has to use GNU Make.
It's not a bug in Arch. If anything it's a bug in Sage...
comment:114 in reply to: ↑ 113 ; follow-up: ↓ 115 Changed 3 years ago by
Replying to embray:
Replying to gh-dimpase:
The ticket description and the title were misleading. It's a bug in Arch that we are dealing with here, and it should not be a blocker. A broken make is nothing new, e.g. while in theory BSD Make should be able to build Sage, in practice it does not work, and one has to use GNU Make.
It's not a bug in Arch. If anything it's a bug in Sage...
As you cannot reproduce it on anything but the latest Arch, I don't really see how you can say this. The make they ship is not backwards-compatible, and they have not made any announcements to that extent, have not bumped the version up, have they? At least if they insist on this make, they should also provide a guile-less make in another package.
comment:115 in reply to: ↑ 114 ; follow-up: ↓ 118 Changed 3 years ago by
Replying to dimpase:
Replying to embray:
Replying to gh-dimpase:
The ticket description and the title were misleading. It's a bug in Arch that we are dealing with here, and it should not be a blocker. A broken make is nothing new, e.g. while in theory BSD Make should be able to build Sage, in practice it does not work, and one has to use GNU Make.
It's not a bug in Arch. If anything it's a bug in Sage...
As you cannot reproduce it on anything but the latest Arch, I don't really see how you can say this. The make they ship is not backwards-compatible, and they have not made any announcements to that extent, have not bumped the version up, have they? At least if they insist on this make, they should also provide a guile-less make in another package.
I think you're being a tad Sage-centric. The kind of problem we're encountering here is a normal problem when you have two different versions of a library and you load the wrong version in an executable that expects the different version. There's certainly nothing wrong with Arch shipping a feature-complete version of GNU make (while it's a rare feature, it's probably used by at least some packages), and I can't blame them for not expecting that somebody might end up with their own copy of libgc on their shared library path, which is a highly unusual thing to be doing.
You do have a point that being a fundamental build tool, a make with additional dependencies is more likely to encounter a problem like this than many other tools, but this same sort of problem can happen with any other part of the build toolchain. The real problem is that Sage is insisting on using too many of its own packages for low-level dependencies :)
comment:116 follow-ups: ↓ 117 ↓ 120 Changed 3 years ago by
Another example where this kind of problem can occur (but by luck doesn't seem to), which has nothing to do with make or guile or gc: Sage ships its own libz for some reason. Well, libpython has libz as a dependency, and when building Python it also manipulates LD_LIBRARY_PATH. In this case it isn't really a problem, but if you had $SAGE_LOCAL/lib on LD_LIBRARY_PATH, and Sage's libz were incompatible with the system's libz, you would also have a problem.
comment:117 in reply to: ↑ 116 ; follow-up: ↓ 123 Changed 3 years ago by
Replying to embray:
Another example where this kind of problem can occur (but by luck doesn't seem to), which has nothing to do with make or guile or gc: Sage ships its own libz for some reason. Well, libpython has libz as a dependency, and when building Python it also manipulates LD_LIBRARY_PATH. In this case it isn't really a problem, but if you had $SAGE_LOCAL/lib on LD_LIBRARY_PATH, and Sage's libz were incompatible with the system's libz, you would also have a problem.
Please have a look at comments 27-29 above. The only conclusion I can draw from them is that Arch has done something dodgy and very hard to reproduce: they probably screwed up the versioning of libgc, by not bumping it up while upgrading, or something similar. (Or perhaps gentoo has a different build setup for make+libguile, so that the result is not fragile...)
TLDR; manipulating LD_LIBRARY_PATH while building with make+guile on gentoo does not lead to a problem, while on arch it does.
comment:118 in reply to: ↑ 115 ; follow-ups: ↓ 121 ↓ 125 Changed 3 years ago by
Replying to embray:
[ Snip... ]
The real problem is that Sage is insisting on using too many of its own packages for low-level dependencies :)
Hear, hear !
This, IMNSHO, is a capital point, that is involved in a *lot* of other parts of Sage. It stems from our insistence to have "known good" versions of almost everything we use, hence "our" version of Maxima, "our" version of R, "our" version of Sympy ,etc... and even "our" version(s) of Python (!).
This modus operandi greatly simplifies the maintenance of the consistency of Sage with (sometimes wildly) varying versions of other people's software. But the drawback is that Sage ends up being a distribution of mathematics-related software and underlying utilities, which converges to a (not so small) Unix-like distribution.
An alternative would be to push the version-related variability of interface in specialized interface packages, presenting Sage with a uniform interface and adapting to the "other side" variability.
As far as I can tell, this alternative isn't used because it requires maintenance of the interface for each and every version of the interfaced software that can be met "in the wild". Which would be more work than maintaining one "Sage's own" version, it seems...
But at the point where we need "our" make, "our" gcc, "our" python(s), etc..., I wonder if this point of view shouldn't be re-assessed.
This ticket is probably not the right place to discuss it ; however, I think it should be opened on sage-devel.
Opinions ? Advice ?
comment:119 in reply to: ↑ 111 ; follow-up: ↓ 124 Changed 3 years ago by
Replying to vdelecroix:
Replying to mkoeppe:
Fix for FLINT is here: https://github.com/mkoeppe/flint2/commit/bd2684891b6da6791ae2f52482a02f5b4cc56bd1
With the patch applied, tests run with DLPATH_ADD= in make check. Could you make it a proper upstream pull request?
comment:120 in reply to: ↑ 116 Changed 3 years ago by
Replying to embray:
Another example where this kind of problem can occur (but by luck doesn't seem to) [...] libz [...]
That's why I would recommend to use DLPATH_ADD= in all places where make is used in the spkg scripts for the FLINT-like packages, not just the minimal list of those where the current symptom (make-guile-gc-on-arch) is observed.
comment:121 in reply to: ↑ 118 Changed 3 years ago by
Replying to charpent:
Opinions ? Advice ?
The convenience for users of having a self contained sage distribution (in particular those who are stuck on a system without root access, or distributions without a comprehensive and well-maintained set of mathematical software packages) is important.
Using the distribution's installed packages instead of our own whenever we can is of course desirable, and in fact there is an ongoing effort to do so. See for example Erik's #24919.
Whether we can use some specific version is often times not a question of "interfacing", but to avoid critical bugs; and it seems problematic to build an interface with the purpose of working around bugs. When it is about features rather than bugs, again there is an ongoing effort already, #20382.
These tickets (and related ones) need technical discussion.
Changed 3 years ago by
comment:122 follow-up: ↓ 126 Changed 3 years ago by
Replying to embray:
Replying to vdelecroix:
[EDITED]
At commit 21442be flint and arb do not pass their testsuite on my computer (when doing $ sage -f -c flint or same with arb/deformation)
- without anything new in spkg-check I end up with the same libguile business
- with DLPATH_ADD= it ends with the tests failing to load libflint/libarb/libdeformation (as I guess they are not installed at the time the tests are run).
I didn't get an opportunity to comment on this last night, but FYI you don't need to add DLPATH_ADD= to the make install call. Just the first one that says make verbose. That works for me, and the make check tests pass.
Perhaps I misunderstood your suggestion but the following does not work on my computer
    diff --git a/build/pkgs/flint/spkg-install b/build/pkgs/flint/spkg-install
    index f6c185433f..29cdbee8f0 100644
    --- a/build/pkgs/flint/spkg-install
    +++ b/build/pkgs/flint/spkg-install
    @@ -38,7 +38,7 @@ if [ $? -ne 0 ]; then
     fi
     
     echo "Building FLINT shared library."
    -$MAKE verbose
    +$MAKE verbose DLPATH_ADD=
     if [ $? -ne 0 ]; then
         echo >&2 "Error: Failed to build FLINT shared library."
         exit 1
The log of the command sage -f flint is flint-2.5.2-check_with_DLPATH_ADD_empty_but_not_on_make_install.log.gz.
comment:123 in reply to: ↑ 117 Changed 3 years ago by
Replying to dimpase:
The only conclusion I can draw from them is that Arch has done something dodgy and very hard to reproduce: they probably screwed up the versioning of libgc, by not bumping it up while upgrading, or something similar. (Or perhaps gentoo has a different build setup for make+libguile, so that the result is not fragile...)
Shared library versioning is certainly a powerful mechanism for a distribution to keep consistency, but it cannot protect against shadowing system libraries by user-installed libraries via LD_LIBRARY_PATH.
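To check whether such shadowing is possible in a given environment, one can list the directories on LD_LIBRARY_PATH that contain their own copy of a library. This is a rough sketch; shadowing_dirs is a hypothetical name, and the function only looks at file names, not at sonames or symbol versions.

```shell
# Hypothetical helper: print every directory on a colon-separated search path
# that contains a file whose name starts with the given library prefix.
# $1: library name prefix, e.g. "libgc.so"
# $2: search path (optional, defaults to $LD_LIBRARY_PATH)
# Plain POSIX sh, so the variables used here are not local to the function.
shadowing_dirs() {
    lib=$1
    rest=${2-$LD_LIBRARY_PATH}:
    while [ -n "$rest" ]; do
        dir=${rest%%:*}     # first path component
        rest=${rest#*:}     # everything after the first colon
        [ -n "$dir" ] || continue
        for f in "$dir"/"$lib"*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}

# Example: does anything on LD_LIBRARY_PATH provide its own libgc?
# shadowing_dirs libgc.so
```

Any directory printed here is searched before the system library directories, so a copy found there is the one a program like make ends up loading.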
comment:124 in reply to: ↑ 119 Changed 3 years ago by
Replying to mkoeppe:
Replying to vdelecroix:
Replying to mkoeppe:
Fix for FLINT is here: https://github.com/mkoeppe/flint2/commit/bd2684891b6da6791ae2f52482a02f5b4cc56bd1
With the patch applied, tests run with DLPATH_ADD= in make check. Could you make it a proper upstream pull request?
Superseded by https://github.com/wbhart/flint2/pull/450 (sorry).
comment:125 in reply to: ↑ 118 Changed 3 years ago by
Replying to charpent:
This ticket is probably not the right place to discuss it ; however, I think it should be opened on sage-devel.
Opinions ? Advice ?
While I agree with you 100% this is not a new discussion, nor does a discussion really need to be opened on sage-devel (been there done that). There is already lots of work being done on that from different directions, much of it funded by OpenDreamKit.
comment:126 in reply to: ↑ 122 ; follow-up: ↓ 128 Changed 3 years ago by
Replying to vdelecroix:
Replying to embray:
Replying to vdelecroix:
[EDITED]
At commit 21442be flint and arb do not pass their testsuite on my computer (when doing $ sage -f -c flint or same with arb/deformation)
- without anything new in spkg-check I end up with the same libguile business
- with DLPATH_ADD= it ends with the tests failing to load libflint/libarb/libdeformation (as I guess they are not installed at the time the tests are run).
I didn't get an opportunity to comment on this last night, but FYI you don't need to add DLPATH_ADD= to the make install call. Just the first one that says make verbose. That works for me, and the make check tests pass.
Perhaps I misunderstood your suggestion but the following does not work on my computer
    diff --git a/build/pkgs/flint/spkg-install b/build/pkgs/flint/spkg-install
    index f6c185433f..29cdbee8f0 100644
    --- a/build/pkgs/flint/spkg-install
    +++ b/build/pkgs/flint/spkg-install
    @@ -38,7 +38,7 @@ if [ $? -ne 0 ]; then
     fi
     
     echo "Building FLINT shared library."
    -$MAKE verbose
    +$MAKE verbose DLPATH_ADD=
     if [ $? -ne 0 ]; then
         echo >&2 "Error: Failed to build FLINT shared library."
         exit 1
The log of the command sage -f flint is flint-2.5.2-check_with_DLPATH_ADD_empty_but_not_on_make_install.log.gz.
Exactly this is all I needed for the build to work on arch.
comment:127 Changed 3 years ago by
Do you actually need any of this, with the new beta merging #23700 ? I'm not saying that this branch is not needed, it's merely for understanding the root causes...
comment:128 in reply to: ↑ 126 Changed 3 years ago by
Replying to embray:
Replying to vdelecroix:
Perhaps I misunderstood your suggestion but the following does not work on my computer
Exactly this is all I needed for the build to work on arch.
Scratch that--it's not working for me either. I must have made a mistake when I last tested it (perhaps I forgot to ensure that the gc package was installed in sage first). Now I am getting the same result.
comment:129 Changed 3 years ago by
This is going to conflict with #25035: Erik, Vincent, in which order do you think we should handle the two tickets?
comment:130 Changed 3 years ago by
I don't have a strong preference. If there is a conflict I can resolve it later.
comment:131 Changed 3 years ago by
Let's finish this ticket by removing the unnecessary sdh_preload_lib from the packages with FLINT-like build systems.
comment:132 Changed 3 years ago by
I have created a follow-up ticket for a solution without the sdh_preload_lib with R (and rpy2) at #25170.
comment:133 follow-up: ↓ 134 Changed 3 years ago by
By the way, #24919 provides a prototype for an easy to use (I think) generic mechanism for adding configure-time checks for system packages to use in favor of building copies of those packages for sage (just as we currently do for packages like gcc, curl, etc...).
In the long-term the best solution to this particular issue would be for Sage to not be installing its own libgc unless it absolutely has to (which for a modern Arch Linux where this problem is occurring, it probably wouldn't have to, though that also might depend on whether or not one has the develop package installed). We could also maybe enable header-only installs of some packages.
I'll experiment with adding a configure-time check for libgc on top of #24919.
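For illustration, such a check might be sketched with pkg-config, assuming pkg-config may be used at that point in the build. Everything here is hypothetical and not the mechanism from #24919: the function name have_system_module is made up, and the version number in the example is a placeholder rather than a requirement taken from Sage.

```shell
# Hypothetical sketch of a configure-time check.
# $1: pkg-config module name
# $2: minimum version required
# Returns 0 if the system provides the module at that version or newer.
# The PKG_CONFIG variable may name an alternative pkg-config executable.
have_system_module() {
    pkgconf=${PKG_CONFIG:-pkg-config}
    command -v "$pkgconf" >/dev/null 2>&1 || return 1
    "$pkgconf" --atleast-version="$2" "$1"
}

# Example: bdw-gc is the pkg-config module name of the Boehm GC (libgc).
# if have_system_module bdw-gc 7.6; then echo "use system libgc"; fi
```

If the check succeeds, Sage would skip building its own copy, so nothing in $SAGE_LOCAL/lib could shadow the library that the system make depends on.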
comment:134 in reply to: ↑ 133 Changed 3 years ago by
comment:135 Changed 3 years ago by
- Status changed from needs_review to needs_work
A (possibly naive) suggestion is to do (almost) what we already do for gcc and related, but simpler :
Drawback : a newer version might break compatibility. A code review of its use is necessary.
BTW : can one use pkg-config in the main configure file ? I think not (alas...).