Opened 3 years ago
Last modified 3 years ago
#24575 needs_work defect
conflicts with gc — at Version 78
Reported by: | vdelecroix | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | sage-8.2 |
Component: | packages: standard | Keywords: | |
Cc: | embray, vbraun, charpent, defeo, jpflori | Merged in: | |
Authors: | Erik Bray | Reviewers: | Erik Bray, Vincent Delecroix |
Report Upstream: | Reported upstream. No feedback yet. | Work issues: | |
Branch: | u/vdelecroix/24575 (Commits, GitHub, GitLab) | Commit: | 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9 |
Dependencies: | #24885 | Stopgaps: |
Description (last modified by )
Sage has a standard package gc that conflicts with programs that are prerequisites for building Sage, such as make. For example, when building Sage 8.2.beta3 on Arch Linux one gets
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
See also this report on sage-devel.
After deactivating the gc package, the compilation went fine.
The workaround in the branch consists of setting the environment variable LD_PRELOAD so that make uses the system gc. The workaround has to be applied to 4 standard packages:
- R
- rpy2
- flint: upstream issue at https://github.com/wbhart/flint2/issues/447
- arb: upstream issue at https://github.com/fredrik-johansson/arb/issues/213
And also to some optional packages:
- deformation
Change History (80)
comment:1 Changed 3 years ago by
- Description modified (diff)
comment:2 Changed 3 years ago by
- Description modified (diff)
comment:3 Changed 3 years ago by
- Cc charpent added
comment:4 Changed 3 years ago by
At what point do you get this error? While building R? (guessing the latter from the subject of sage-devel post)
comment:5 follow-up: ↓ 23 Changed 3 years ago by
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
        linux-vdso.so.1 (0x00007ffe861ce000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
comment:6 Changed 3 years ago by
By the way, there is #23700, which would give you the same major gc version as you apparently need (although I fail to see why).
comment:7 follow-up: ↓ 12 Changed 3 years ago by
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
comment:8 Changed 3 years ago by
I think the "symbol lookup error" is just being echoed by make; it's not from make failing to find a symbol. This happens while R is compiling the MASS package, and the output is clearly being filtered.
Apparently R sets LD_LIBRARY_PATH while compiling packages so Sage's libraries take precedence over system ones. Which inevitably leads to problems, which is why we removed that from the Sage build system a while ago.
comment:9 Changed 3 years ago by
No, R is not setting LD_LIBRARY_PATH, it is merely respecting it. I think we have a case of Sage being built with LD_LIBRARY_PATH set to something, and also guile (indeed, it has nothing to do with Sage AFAIK) involved in the environment somehow; and guile (perhaps invoked from .bashrc?) made to use the wrong gc version from Sage.
To reproduce this one needs to have libgc.so.X have the same X in $SAGE_LOCAL/lib and in /usr/lib. On my system X=1 in the former and X=2 in the latter, and libguile is linked to libgc.so.2. So I went and made
$ ln -sf libgc.so.1 libgc.so.2
in $SAGE_LOCAL/lib. After this I duly get
$ LD_LIBRARY_PATH=./local/lib guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
Needless to say, R still builds just fine for me after this hack.
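The precondition described above (the same libgc.so.X present in both $SAGE_LOCAL/lib and the system library directory) can be checked with a small script. This is only a sketch: the function name common_sonames is hypothetical, and it compares file names of the form libNAME.so.MAJOR as a proxy for the real soname recorded in each ELF file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sage): print every libNAME.so.MAJOR
# file name that exists in both directories. File names are used as a
# stand-in for the ELF soname, which is an assumption.
common_sonames() {
    dir_a=$1
    dir_b=$2
    for path in "$dir_a"/lib*.so.*; do
        [ -e "$path" ] || continue      # the glob matched nothing
        name=${path##*/}                # strip the directory part
        case $name in
            *.so.*.*) continue ;;       # skip full versions like libgc.so.1.0.3
        esac
        if [ -e "$dir_b/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, `common_sonames "$SAGE_LOCAL/lib" /usr/lib` would list the candidates for this kind of clash; an empty result suggests that the conflict cannot occur for that pair of directories.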
comment:10 Changed 3 years ago by
For precision, my $LD_LIBRARY_PATH is empty. It should not have anything to do with it. What about the proposition of charpent comment:3?
comment:11 Changed 3 years ago by
You have a strange setup on your system, which involves guile in building Sage. Perhaps something in the shell configuration, I do not know. Or something wrong with your linker settings or its cache. Guile is a system library which you can only make fail this way by setting LD_LIBRARY_PATH. But Sage does not do it, something else does.
comment:12 in reply to: ↑ 7 ; follow-up: ↓ 13 Changed 3 years ago by
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
Anyhow, there is no Sage bug to fix here, that's what I am trying to say all along.
Unless I see a meaningful explanation of how libguile is relevant to building Sage, I'd tend to set this to wontfix.
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 3 years ago by
Replying to dimpase:
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
I...don't see any evidence that that's happening here.
comment:15 in reply to: ↑ 14 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
Replying to embray:
Replying to dimpase:
IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.
That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).
One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.
I...don't see any evidence that that's happening here.
I have not said I see this, either. What I see is an unexplained attempt to invoke (lib)guile during the Sage build.
comment:16 follow-up: ↓ 18 Changed 3 years ago by
Somebody who can reproduce the original problem should make the R build more verbose and try again...
comment:17 Changed 3 years ago by
according to Vincent this can happen on his system while building Flint:
I restarted a build from scratch and I don't believe that R is responsible in any way. This new build stopped on flint, pointing at the same library issue:
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
comment:18 in reply to: ↑ 16 Changed 3 years ago by
Replying to vbraun:
Somebody who can reproduce the original problem should make the R build more verbose and try again...
on it
comment:19 Changed 3 years ago by
Failed on flint. But debug mode not very helpful (made in flint source dir)
(sage-sh)$ make --debug GNU Make 4.2.1 Built for x86_64-unknown-linux-gnu Copyright (C) 1988-2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Reading makefiles... Updating makefiles.... Updating goal targets.... File 'all' does not exist. File 'library' does not exist. Must remake target 'library'. make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link make: *** [Makefile:173: library] Error 127
comment:20 follow-up: ↓ 21 Changed 3 years ago by
Can you try starting guile at the (sage-sh)$ prompt?
comment:21 in reply to: ↑ 20 Changed 3 years ago by
Replying to dimpase:
Can you try starting guile at the (sage-sh)$ prompt?
It works fine
(sage-sh) $ guile GNU Guile 2.2.3 Copyright (C) 1995-2017 Free Software Foundation, Inc. Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'. This program is free software, and you are welcome to redistribute it under certain conditions; type `,show c' for details. Enter `,help' for help. scheme@(guile-user)> quit() $1 = #<procedure quit args> While compiling expression: Syntax error: unknown location: unexpected syntax in form () scheme@(guile-user)> ()
comment:22 Changed 3 years ago by
Still with flint: without any extra option to ./configure it succeeds
(sage-sh)$ ./configure --disable-static --prefix="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
mkdir -p build
make[1]: Entering directory '/opt/sage-bis/local/var/tmp/sage/build/flint-2.5.2.p1/src'
CC build/printf.lo
CC build/fprintf.lo
CC build/sprintf.lo
CC build/scanf.lo
CC build/fscanf.lo
CC build/sscanf.lo
CC build/clz_tab.lo
CC build/memory_manager.lo
CC build/version.lo
CC build/profiler.lo
CC build/thread_support.lo
...
But when setting --with-gmp it fails
(sage-sh) $ ./configure --disable-static --prefix="$SAGE_LOCAL" --with-gmp="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
make: *** [Makefile:173: library] Error 127
comment:23 in reply to: ↑ 5 ; follow-up: ↓ 24 Changed 3 years ago by
Replying to dimpase:
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
        linux-vdso.so.1 (0x00007ffe861ce000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
What is gmake? I got
(sage-sh) $ ldd `which make`
        linux-vdso.so.1 (0x00007fff3ccf6000)
        libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)
        libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
        libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000)
        libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000)
        libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000)
        libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000)
        libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000)
        libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)
comment:24 in reply to: ↑ 23 ; follow-ups: ↓ 25 ↓ 26 Changed 3 years ago by
Replying to vdelecroix:
Replying to dimpase:
how come make depends on guile for you? I don't see it.
$ ldd `which gmake`
        linux-vdso.so.1 (0x00007ffe861ce000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)
I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.
What is gmake?
For me make is a link to gmake, but it's not important.
What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I had never heard of it, although it is not crazy, see https://www.gnu.org/software/make/manual/html_node/Guile-Integration.html)
I got
(sage-sh) $ ldd `which make` linux-vdso.so.1 (0x00007fff3ccf6000) libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000) libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000) libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000) libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000) libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000) libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000) libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000) libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000) libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000) libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000) libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000) /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000) libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)
So we see that at this point make appears to be correctly linked.
What do you see if in that directory (at the sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's Flint's generated Makefile that breaks it, or just make itself.)
And what does ldd /usr/lib/libguile-2.2.so.1 show?
comment:25 in reply to: ↑ 24 Changed 3 years ago by
Replying to dimpase:
Replying to vdelecroix:
Replying to dimpase:
What do you see if in that directory (at the sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's Flint's generated Makefile that breaks it, or just make itself.)
(sage-sh) $ make -v GNU Make 4.2.1 Construit pour x86_64-unknown-linux-gnu Copyright (C) 1988-2016 Free Software Foundation, Inc. Licence GPLv3+ : GNU GPL version 3 ou ultérieure <http://gnu.org/licenses/gpl.html> Ceci est un logiciel libre : vous êtes autorisé à le modifier et à la redistribuer. Il ne comporte AUCUNE GARANTIE, dans la mesure de ce que permet la loi.
Please also read comment:22: make does not look broken when I do not configure gmp.
And what does ldd /usr/lib/libguile-2.2.so.1 show?
(sage-sh) $ ldd /usr/lib/libguile-2.2.so.1 linux-vdso.so.1 (0x00007fff17387000) libgc.so.1 => /usr/lib/libgc.so.1 (0x00007fed43ae3000) libffi.so.6 => /usr/lib/libffi.so.6 (0x00007fed438da000) libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007fed43558000) libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fed432c5000) libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007fed430bb000) libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007fed42e83000) libm.so.6 => /usr/lib/libm.so.6 (0x00007fed42b37000) libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fed42919000) libc.so.6 => /usr/lib/libc.so.6 (0x00007fed42562000) /usr/lib64/ld-linux-x86-64.so.2 (0x00007fed4407a000) libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fed4235e000) libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007fed4215c000)
comment:26 in reply to: ↑ 24 Changed 3 years ago by
Replying to dimpase:
for me make is a link to gmake, but it's not important. What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I never heard of it---
Since GNU make version 4 you can extend make with guile bindings. Building make with this extension is a configuration option, and whether to enable it is usually the distro's choice. Binary distros usually enable all possible options unless they have "reservations". On Gentoo it is an option that is off by default.
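One way to tell whether a given make binary was built with this option, without reading ldd output, is GNU make's special variable .FEATURES, which lists "guile" when Guile is embedded. The sketch below assumes GNU make; the function names make_features and has_feature are hypothetical.

```shell
#!/bin/sh
# Print the .FEATURES list of the make found on PATH (GNU make only).
# The makefile is fed on stdin; the dummy target just runs the no-op ":".
make_features() {
    printf '$(info $(.FEATURES))\nall: ; @:\n' | make -s -f - 2>/dev/null
}

# Succeed if the space-separated list in $1 contains the word $2.
has_feature() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (not run here, since it needs GNU make on PATH):
#   if has_feature "$(make_features)" guile; then
#       echo "this make embeds Guile, so it loads libguile and libgc"
#   fi
```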
comment:27 Changed 3 years ago by
I've built make from source with --with-guile, guile version 2.2 (which required changing one character in line 171 of configure.ac,
[ PKG_CHECK_MODULES([GUILE], [guile-2.2], [have_guile=yes],
that is, 2.2 instead of 2.0; this probably explains why I was unable to build it the Gentoo way?), getting
$ ldd `which make` linux-vdso.so.1 (0x00007ffc6d3c3000) libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f932105f000) libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f9320de6000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f9320be2000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f93209c2000) libc.so.6 => /lib64/libc.so.6 (0x00007f93205fe000) libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f93203f5000) libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f932007c000) libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f931fdf3000) libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f931fbe9000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f931f9b1000) libm.so.6 => /lib64/libm.so.6 (0x00007f931f66f000) /lib64/ld-linux-x86-64.so.2 (0x00007f93213b0000)
but I cannot reproduce this.
It might be the version difference, but the produced make happily builds Flint even if I do export LD_LIBRARY_PATH=$SAGE_LOCAL/lib; make.
Needless to say, this export breaks guile:
$ guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
Or it might be that the Makefile generated by Flint does not trigger the guile extension in my case, and does trigger it in Vincent's case?
comment:28 follow-up: ↓ 29 Changed 3 years ago by
The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:
- for the problem to happen, the soname of libgc in Sage and on the system needs to be the same
- while the sonames are the same, libgc in Sage doesn't have the same symbols as the one on the system
So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in Sage and on the system.
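The second condition can be tested directly by asking each copy of libgc whether it defines the symbol that libguile wants. The following is a sketch only: defines_symbol is a hypothetical helper that reads the output of nm -D --defined-only on stdin, and it assumes unversioned symbol names in the last column.

```shell
#!/bin/sh
# Hypothetical helper: read `nm -D --defined-only LIB` output on stdin and
# succeed if the symbol named in $1 appears as the last field of some line.
defines_symbol() {
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit !found }'
}

# Usage (paths taken from the ldd output in this ticket; not run here):
#   nm -D --defined-only /usr/lib/libgc.so.1 \
#       | defines_symbol GC_move_disappearing_link && echo "system libgc: ok"
#   nm -D --defined-only "$SAGE_LOCAL/lib/libgc.so.1" \
#       | defines_symbol GC_move_disappearing_link || echo "Sage libgc: missing"
```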
comment:29 in reply to: ↑ 28 Changed 3 years ago by
Replying to fbissey:
The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:
- for the problem to happen, the soname of libgc in Sage and on the system needs to be the same
- while the sonames are the same, libgc in Sage doesn't have the same symbols as the one on the system
So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in Sage and on the system.
This does crash guile:
$ ldd `which guile` linux-vdso.so.1 (0x00007ffd5e36a000) libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f4c66c70000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4c66a50000) libc.so.6 => /lib64/libc.so.6 (0x00007f4c6668c000) libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f4c66413000) libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f4c6620a000) libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f4c65e91000) libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f4c65c08000) libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f4c659fe000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f4c657fa000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f4c655c2000) libm.so.6 => /lib64/libm.so.6 (0x00007f4c65280000) /lib64/ld-linux-x86-64.so.2 (0x00007f4c66fc1000) (sage-sh) dima@hilbert:sage-dev$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib guile guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
as I created a link to the wrong libgc (see comment 9):
$ ls -l $SAGE_LOCAL/lib/libgc* -rw-r--r-- 1 dima dima 946784 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.a lrwxrwxrwx 1 dima dima 14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so -> libgc.so.1.0.3 lrwxrwxrwx 1 dima dima 14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1 -> libgc.so.1.0.3 -rwxr-xr-x 1 dima dima 702568 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1.0.3 lrwxrwxrwx 1 dima dima 10 Jan 20 22:54 /home/dima/Sage/sage-dev/local/lib/libgc.so.2 -> libgc.so.1
(libgc.so.1 is wrong (Sage's gc 7.2)) But make with guile works just fine:
$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib /home/dima/bin/make -v GNU Make 4.2.1 Built for x86_64-pc-linux-gnu Copyright (C) 1988-2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
even though it is linked to the same libguile:
$ ldd /home/dima/bin/make linux-vdso.so.1 (0x00007ffd2959e000) libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007fb4e1418000) libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007fb4e119f000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fb4e0f9b000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb4e0d7b000) libc.so.6 => /lib64/libc.so.6 (0x00007fb4e09b7000) libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007fb4e07ae000) libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007fb4e0435000) libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007fb4e01ac000) libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007fb4dffa2000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fb4dfd6a000) libm.so.6 => /lib64/libm.so.6 (0x00007fb4dfa28000) /lib64/ld-linux-x86-64.so.2 (0x00007fb4e1769000)
just as a sanity check:
$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib ldd /home/dima/bin/make linux-vdso.so.1 (0x00007ffccbbfe000) libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007ff033858000) libgc.so.2 => /home/dima/Sage/sage-dev/local/lib/libgc.so.2 (0x00007ff0334f9000) libdl.so.2 => /lib64/libdl.so.2 (0x00007ff0332f5000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff0330d5000) libc.so.6 => /lib64/libc.so.6 (0x00007ff032d11000) libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007ff032b08000) libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007ff03278f000) libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007ff032506000) libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007ff0322fc000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007ff0320c4000) libm.so.6 => /lib64/libm.so.6 (0x00007ff031d82000) /lib64/ld-linux-x86-64.so.2 (0x00007ff033ba9000)
So even though the link ought to be resolved by the linker, it is not done (I can also run an actual build, not just -v, with this setup).
comment:30 follow-up: ↓ 32 Changed 3 years ago by
- Cc defeo added
Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.
Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.
comment:31 Changed 3 years ago by
One trivial way out is to upgrade our gc, see #23700
comment:32 in reply to: ↑ 30 Changed 3 years ago by
Replying to defeo:
Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.
Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.
I think that Arch guys forgot to bump up the version of libgc, for it is still libgc.so.1 (while on gentoo the same libgc is named libgc.so.2) cf comments 25 and 27 above.
Note that Arch most probably uses system libgc in its build of Sage, as it's not listed here https://www.archlinux.org/packages/community/x86_64/sagemath/
comment:33 Changed 3 years ago by
Could anyone who can reproduce this check whether #23000 fixes the problem?
comment:34 follow-up: ↓ 40 Changed 3 years ago by
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
comment:35 Changed 3 years ago by
- Report Upstream changed from N/A to Reported upstream. No feedback yet.
I've asked on bug-make@gnu.org whether this is a GNU make bug.
comment:36 follow-ups: ↓ 37 ↓ 39 Changed 3 years ago by
- Report Upstream changed from Reported upstream. No feedback yet. to Reported upstream. Developers deny it's a bug.
Well, I am not convinced, and am still waiting for an answer to this.
comment:37 in reply to: ↑ 36 ; follow-up: ↓ 38 Changed 3 years ago by
comment:38 in reply to: ↑ 37 Changed 3 years ago by
Replying to embray:
Replying to dimpase:
Well, I am not convinced, and am still waiting for an answer to this.
Your comment about maybe statically linking libguile makes some sense, but then you'd have to also statically link any of its dependencies as well, including libgc or else it wouldn't solve the problem.
They could also load the Guile extension only if they need it. (And/or have a configuration option of turning it off).
comment:39 in reply to: ↑ 36 Changed 3 years ago by
That would make sense too.
Anyways, I'm increasingly convinced that the problem here is in the affected distros. I'm gonna try an Arch VM and see if I can reproduce...
comment:40 in reply to: ↑ 34 ; follow-ups: ↓ 41 ↓ 42 Changed 3 years ago by
Replying to dimpase:
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
Not for me.
- I checked out the ticket and ran make. Same failure.
- I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5] File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5] run()
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5] unpack_archive(archive, dirname)
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5] archive.extractall(members=archive.names)
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5] members=members)
[patch-2.7.5] File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5] numeric_owner=numeric_owner)
[patch-2.7.5] File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5] numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************
comment:41 in reply to: ↑ 40 Changed 3 years ago by
a duplicate comment, sorry.
comment:42 in reply to: ↑ 40 Changed 3 years ago by
Replying to defeo:
Replying to dimpase:
oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"
Not for me.
- I checked out the ticket and ran make. Same failure.
- I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5] File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5] run()
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5] unpack_archive(archive, dirname)
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5] archive.extractall(members=archive.names)
[patch-2.7.5] File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5] members=members)
[patch-2.7.5] File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5] numeric_owner=numeric_owner)
[patch-2.7.5] File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5] numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************
this looks like system's Python is nuked too. Do you have funky stuff in your LD_LIBRARY_PATH or in PATH? Nothing to do with gc, that's certain.
comment:43 Changed 3 years ago by
Or perhaps it's simply due to your python being python3 (or a very new python3, which has not been tested...)
comment:44 follow-up: ↓ 45 Changed 3 years ago by
yep, I have this error if I set my system Python to python3.5, too.
Thus, set python to python2, and repeat please.
comment:45 in reply to: ↑ 44 Changed 3 years ago by
comment:46 Changed 3 years ago by
Ok, it compiled with Python2. Now, it might be thanks to #23700, or thanks to Python2... who knows? :)
comment:47 Changed 3 years ago by
- Dependencies set to #23700
- Status changed from new to needs_review
#23700 is reported to fix this issue.
(As well as using a guile-less make, I presume.)
comment:48 Changed 3 years ago by
Neat, I was able to reproduce this in an Arch Linux Docker image. So at least there's that.
comment:49 Changed 3 years ago by
Does #23700 cure it?
comment:50 Changed 3 years ago by
I haven't tried. But a workaround that did work was to add LD_PRELOAD=/usr/bin/libgc.so. So a full workaround might look something like:
if [ "$UNAME" = "Linux" ]; then
    LIBGC="$(ldd $(which make) | sed -n 's/\s*libgc\.so.* => \(.\+\) .*/\1/p')"
    if [ -n "$LIBGC" ]; then
        export LD_PRELOAD="$LIBGC"
    fi
fi
This finds the libgc that is needed by libguile (and by extension make) and ensures it's the one that's used, not the one from Sage. Sucks, but it works, and is kind of necessary.
A similar LD_PRELOAD trick might be able to solve #24605 as well, but I haven't tested that yet.
comment:51 follow-up: ↓ 52 Changed 3 years ago by
- Branch set to u/embray/build/ticket-24575
- Commit set to 71c63fd0d9043a134568ad2a018f65c69888c52a
- Reviewers set to Erik Bray
I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.
In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...
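A helper of the kind suggested here could look roughly like the sketch below. It is not the code in the branch: the names resolved_lib_path and preload_system_lib are hypothetical, and running ldd with LD_LIBRARY_PATH unset is an added assumption meant to make sure the system copy is the one found.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the resolved path of the first
# library whose file name starts with the prefix in $1 (e.g. "libgc.so").
# Lines such as "libfoo.so => not found" are skipped.
resolved_lib_path() {
    awk -v prefix="$1" '
        index($1, prefix) == 1 && $2 == "=>" && $3 ~ /^\// { print $3; exit }
    '
}

# Hypothetical helper: prepend to LD_PRELOAD the copy of library $2 that
# executable $1 links against when LD_LIBRARY_PATH is not set.
preload_system_lib() {
    exe=$(command -v "$1") || return 1
    lib=$( (unset LD_LIBRARY_PATH; ldd "$exe") | resolved_lib_path "$2")
    [ -n "$lib" ] || return 0           # executable does not use the library
    LD_PRELOAD="$lib${LD_PRELOAD:+ $LD_PRELOAD}"
    export LD_PRELOAD
}

# Usage in a build script (not run here): preload_system_lib make libgc.so
```

Keeping the parsing separate from the ldd call makes the text-processing part easy to check against sample output, and the same helper could in principle be reused for other libraries that clash in this way.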
New commits:
71c63fd | Add the workaround to https://trac.sagemath.org/ticket/24575
comment:52 in reply to: ↑ 51 Changed 3 years ago by
- Dependencies #23700 deleted
Replying to embray:
I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.
In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...
Thanks Erik for analyzing the problem and providing the workaround! I definitely did not want to consider #23700 as a solution. (I am now compiling from scratch to check.)
Note that your fix is specific to libgc, so the same kind of trouble might appear with another library in the future. But I consider this fine for now. Wouldn't it be possible to exclude libgc from the list of packages to install when it is already present (and up to date) on the system?
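One possible shape for such a check, as a rough sketch only: ask pkg-config whether a system Boehm GC is installed and recent enough before deciding to build Sage's copy. The function name, the use of the PKG_CONFIG variable, and the version threshold passed in are assumptions for illustration. bdw-gc is the module name under which the Boehm collector usually ships its pkg-config file, but a given distribution may differ.

```shell
# Hypothetical check: succeed if pkg-config reports a system Boehm GC
# ("bdw-gc" module) at least as new as the given version.
# PKG_CONFIG may be set to point at a specific pkg-config binary.
system_gc_is_usable() {
    _min_version=$1
    "${PKG_CONFIG:-pkg-config}" --atleast-version="$_min_version" bdw-gc 2>/dev/null
}
```

A build script could then guard the gc package with something like `if system_gc_is_usable 7.6; then ...; fi`, where 7.6 is a placeholder and not a tested minimum. If pkg-config itself is missing, the function fails and Sage's own gc would be built as before.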
Changed 3 years ago by
comment:53 Changed 3 years ago by
flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?
Changed 3 years ago by
comment:54 Changed 3 years ago by
Replying to vdelecroix:
flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?
Same also with arb and the Python package rpy2 (rpy2-2.8.2.p0.log). After adding the workaround to the three spkg-install scripts the build completes.
Though I did not check the optional packages.
comment:55 Changed 3 years ago by
- Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
- Commit changed from 71c63fd0d9043a134568ad2a018f65c69888c52a to f33c5e60c6655d2af7d97047dcba996600f37193
New commits:
f33c5e6 | Same workaround for arb, flint and rpy2
comment:56 Changed 3 years ago by
- Description modified (diff)
comment:57 follow-up: ↓ 58 Changed 3 years ago by
- Status changed from needs_review to needs_work
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
comment:58 in reply to: ↑ 57 ; follow-up: ↓ 59 Changed 3 years ago by
Replying to dimpase:
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
I don't think so. This workaround takes care of fragile makefiles until a better solution is found. Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.
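If the workaround is meant to stay local to a few fragile build steps, one option is to set LD_PRELOAD only for the single make invocation instead of exporting it for the rest of the script. The wrapper below is a sketch under that assumption; run_with_preload is a made-up name and nothing like it is claimed to exist in the branch.

```shell
# Hypothetical wrapper: run one external command with LD_PRELOAD extended
# by a single library, without exporting the variable in the caller.
# Usage: run_with_preload /path/to/lib.so command [args...]
run_with_preload() {
    _lib=$1
    shift
    if [ -n "$_lib" ]; then
        # env changes the environment of the child process only.
        env LD_PRELOAD="${_lib}${LD_PRELOAD:+:$LD_PRELOAD}" "$@"
    else
        # No library given: run the command unchanged.
        "$@"
    fi
}
```

With this, `run_with_preload "$LIBGC" make` would affect that make and its children, while later steps in the same script keep their original environment. That makes it easier to remove the workaround package by package.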
comment:59 in reply to: ↑ 58 Changed 3 years ago by
Replying to vdelecroix:
Replying to dimpase:
Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).
This would also take care of all the non-standard packages.
I don't think so. This workaround takes care of fragile makefiles until a better solution is found.
A better solution is not to use Guile-enabled make, at least not until it is built in a way ensuring one can use it for hacking on Guile dependencies.
Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.
There is nothing to communicate to package upstream here, I think. You cannot ban their use of LD_LIBRARY_PATH. Then, LD_PRELOAD is a pretty standard way to deal with these issues. It has cured all of these issues so far, so why do you want to keep getting reports of such and such package mysteriously breaking while Guile-enabled make is used?
comment:60 follow-up: ↓ 62 Changed 3 years ago by
I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.
FLINT seems to be getting a CMake build system to replace its handwritten one, which will likely eliminate this problem.
comment:61 Changed 3 years ago by
Over in #24885 I already implemented a more generic solution for this, but I didn't push the branch yet. That should be used instead.
comment:62 in reply to: ↑ 60 Changed 3 years ago by
Replying to mkoeppe:
I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.
The fact that they use LD_LIBRARY_PATH is not a bug IMO, though it would be better, at least in some cases, if they used LD_PRELOAD instead for specific libraries.
comment:63 Changed 3 years ago by
- Branch changed from u/vdelecroix/24575 to u/embray/build/ticket-24575
- Commit changed from f33c5e60c6655d2af7d97047dcba996600f37193 to 454221ac40ec282c469ff2043465811b7364313b
- Dependencies set to #24885
Reworked on top of #24885
New commits:
ba1b5ee | Add helper function that implements the workaround from https://trac.sagemath.org/ticket/24575 more generically.
6103df0 | Add the workaround to https://trac.sagemath.org/ticket/24575
2ecaa74 | Replace this with sdh_preload_lib
454221a | Same issue applies to arb, flint, and rpy2
comment:64 Changed 3 years ago by
- Status changed from needs_work to needs_review
comment:65 Changed 3 years ago by
- Reviewers changed from Erik Bray to Erik Bray, Vincent Delecroix
- Status changed from needs_review to needs_work
I am currently testing optional packages; at least deformation has the same symptoms (I will provide a proper commit with all of them once finished).
comment:66 Changed 3 years ago by
All right, for deformation, after setting the sdh_preload_lib I got a different error that is unrelated:
    [deformation-d05941b] CC ../build/perm/../perm.lo
    [deformation-d05941b] /usr/bin/ld: -r and -pie may not be used together
    [deformation-d05941b] collect2: error: ld returned 1 exit status
    [deformation-d05941b] make[4]: *** [../Makefile.subdirs:55: ../build/perm/../perm.lo] Error 1
comment:67 Changed 3 years ago by
- Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
- Commit changed from 454221ac40ec282c469ff2043465811b7364313b to 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9
Concerning optional packages, only deformation needs the workaround (It appears that I also have some unrelated build failures #23533, #24901, #24902 and #24903).
Erik, Dima, Matthias: I consider the branch ready for positive review. As I added a commit on top of the branch, I will let somebody else finish the review.
New commits:
1ac3afa | Same issue applies to optional package deformation
comment:68 Changed 3 years ago by
- Status changed from needs_work to needs_review
comment:69 Changed 3 years ago by
Our package perl_term_readline_gnu also has some LD_LIBRARY_PATH stuff...
comment:70 follow-up: ↓ 72 Changed 3 years ago by
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
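As a rough illustration of what "patching out" could mean mechanically, the filter below drops Makefile lines that assign or export LD_LIBRARY_PATH. It is not a reproduction of the linked commit, which is not quoted here; the function name is hypothetical, and it assumes the assignments sit on single lines of their own, which may not hold for the real Makefiles.

```shell
# Hypothetical filter: drop Makefile lines that assign or export
# LD_LIBRARY_PATH, passing everything else through unchanged.
strip_ld_library_path() {
    sed -e '/^[[:space:]]*export[[:space:]]\{1,\}LD_LIBRARY_PATH/d' \
        -e '/^[[:space:]]*LD_LIBRARY_PATH[[:space:]]*[+:?]\{0,1\}=/d'
}
```

It would be used as `strip_ld_library_path < Makefile.in > Makefile.in.new`. For a package actually shipped with Sage, a reviewed patch file is probably preferable to a blanket filter like this, since conditionals or multi-line assignments would need handling by hand.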
comment:71 Changed 3 years ago by
- Cc jpflori added
comment:72 in reply to: ↑ 70 ; follow-ups: ↓ 73 ↓ 75 Changed 3 years ago by
Replying to mkoeppe:
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?
comment:73 in reply to: ↑ 72 Changed 3 years ago by
Replying to vdelecroix:
Replying to mkoeppe:
I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?
As well as arb.
comment:74 Changed 3 years ago by
flint, arb and deformation share almost the same build system indeed.
Except I did not push the -r/pie fix to deformation.
comment:75 in reply to: ↑ 72 Changed 3 years ago by
Replying to vdelecroix:
would you suggest that the same operation should be applied there?
Yes, probably.
comment:76 Changed 3 years ago by
- Description modified (diff)
- Report Upstream changed from Reported upstream. Developers deny it's a bug. to Reported upstream. No feedback yet.
comment:77 Changed 3 years ago by
And for R, it may be enough to remove the bottom lines of etc/ldpaths.in.
comment:78 Changed 3 years ago by
- Description modified (diff)
A (possibly naive) suggestion is to do (almost) what we already do for gcc and related, but simpler:
Drawback: a newer version might break compatibility. A code review of its use is necessary.
BTW: can one use pkg-config in the main configure file? I think not (alas...).