Opened 4 years ago

Last modified 3 years ago

#24575 needs_work defect

conflicts with gc — at Version 76

Reported by: vdelecroix
Owned by:
Priority: critical
Milestone: sage-8.2
Component: packages: standard
Keywords:
Cc: embray, vbraun, charpent, defeo, jpflori
Merged in:
Authors: Erik Bray
Reviewers: Erik Bray, Vincent Delecroix
Report Upstream: Reported upstream. No feedback yet.
Work issues:
Branch: u/vdelecroix/24575
Commit: 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9
Dependencies: #24885
Stopgaps:

Description (last modified by vdelecroix)

Sage ships a standard package, gc, that conflicts with programs that are prerequisites for building Sage, such as make. For example, building Sage 8.2.beta3 on Arch Linux fails with:

make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link

See also this report on sage-devel.

After deactivating the gc package, the compilation went fine.

The workaround in the branch sets the environment variable LD_PRELOAD so that make uses the system gc. The workaround has to be applied to 4 standard packages:

And also to some optional packages:

  • deformation

Change History (78)

comment:1 Changed 4 years ago by vdelecroix

  • Description modified

comment:2 Changed 4 years ago by vdelecroix

  • Description modified

comment:3 Changed 4 years ago by charpent

  • Cc charpent added

A (possibly naive) suggestion is to do (almost) what we already do for gcc and related packages, but simpler:

  • Depend on gc
  • Test for it (and its version) in the main configure file
    • If found (and sufficient): use that (possibly symlinking the relevant header/library files)
    • Else: install "our" version.

Drawback: a newer version might break compatibility. A code review of its use is necessary.

BTW: can one use pkg-config in the main configure file? I think not (alas...).

comment:4 Changed 4 years ago by dimpase

At what point do you get this error? While building R? (guessing the latter from the subject of sage-devel post)

comment:5 follow-up: Changed 4 years ago by dimpase

how come make depends on guile for you? I don't see it.

$ ldd `which gmake`
	linux-vdso.so.1 (0x00007ffe861ce000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)

I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.

comment:6 Changed 4 years ago by dimpase

By the way, there is #23700, which would give you the same major gc version as you apparently need (although I fail to see why).

comment:7 follow-up: Changed 4 years ago by dimpase

IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.

comment:8 Changed 4 years ago by vbraun

I think the "symbol lookup error" is just being echoed by make; it's not from make failing to find a symbol. This happens while R is compiling the MASS package, and the output is clearly being filtered.

Apparently R sets LD_LIBRARY_PATH while compiling packages so Sage's libraries take precedence over system ones. Which inevitably leads to problems, which is why we removed that from the Sage build system a while ago.

comment:9 Changed 4 years ago by dimpase

No, R is not setting LD_LIBRARY_PATH, it is merely respecting it. I think we have a case of Sage being built with LD_LIBRARY_PATH set to something, and also guile (indeed, it has nothing to do with Sage AFAIK) involved in the environment somehow; and guile (perhaps invoked from .bashrc?) made to use wrong gc version from Sage.

To reproduce this one needs to have libgc.so.X have the same X in $SAGE_LOCAL/lib and in /usr/lib. On my system X=1 in the former and X=2 in the latter, and libguile is linked to libgc.so.2. So I went and made

$ ln -sf libgc.so.1 libgc.so.2

in $SAGE_LOCAL/lib. After this I duly get

$ LD_LIBRARY_PATH=./local/lib guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link

Needless to say, R still builds just fine for me after this hack.

comment:10 Changed 4 years ago by vdelecroix

To be precise, my $LD_LIBRARY_PATH is empty, so it should not have anything to do with it. What about charpent's proposal in comment:3?

comment:11 Changed 4 years ago by dimpase

You have a strange setup on your system, which gets guile involved in building Sage. Perhaps it is something in your shell configuration, I do not know. Or something is wrong with your linker settings or its cache. Guile is a system library which you can only make fail this way by setting LD_LIBRARY_PATH. But Sage does not do that; something else does.

Last edited 4 years ago by dimpase

comment:12 in reply to: ↑ 7 ; follow-up: Changed 4 years ago by embray

Replying to dimpase:

IMHO, if you can nuke a system utility by installing a library with normal user privileges, then you have a huge security hole. Thus I don't think it's something to fix in Sage.

That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).

comment:13 in reply to: ↑ 12 ; follow-up: Changed 4 years ago by dimpase

Replying to embray:

That's not what's going on here so please don't mischaracterize it as a "huge security hole". It's quite normal to have a broken setup where one executable is linking at runtime with the wrong version of some shared library. This is the Linux version of "DLL hell" (albeit less severe).

One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.

Anyhow, there is no Sage bug to fix here; that's what I have been trying to say all along. Unless I see a meaningful explanation of how libguile is relevant to building Sage, I'd tend to set this to wontfix.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by embray

Replying to dimpase:

One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.

I...don't see any evidence that that's happening here.

comment:15 in reply to: ↑ 14 Changed 4 years ago by dimpase

Replying to embray:

Replying to dimpase:

One needs to set LD_LIBRARY_PATH for this to happen. If on the other hand you succeed in replacing the system library with one at your account *for all the users*, regardless of the environment, then yes, you have hacked the system via a security hole.

I...don't see any evidence that that's happening here.

I have not said I see this, either. What I see is an unexplained attempt to invoke (lib)guile during the Sage build.

comment:16 follow-up: Changed 4 years ago by vbraun

Somebody who can reproduce the original problem should make the R build more verbose and try again...

comment:17 Changed 4 years ago by dimpase

according to Vincent this can happen on his system while building Flint:

I restarted a build from scratch and I don't believe that R is responsible in any way. This new build stopped on flint, pointing at the same library issue:

make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link

comment:18 in reply to: ↑ 16 Changed 4 years ago by vdelecroix

Replying to vbraun:

Somebody who can reproduce the original problem should make the R build more verbose and try again...

on it

comment:19 Changed 4 years ago by vdelecroix

It failed on flint, but debug mode was not very helpful (run in the flint source directory):

(sage-sh)$ make --debug
GNU Make 4.2.1
Built for x86_64-unknown-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
Updating makefiles....
Updating goal targets....
 File 'all' does not exist.
   File 'library' does not exist.
  Must remake target 'library'.
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
make: *** [Makefile:173: library] Error 127

comment:20 follow-up: Changed 4 years ago by dimpase

Can you try starting guile at (sage-sh)$ prompt?

comment:21 in reply to: ↑ 20 Changed 4 years ago by vdelecroix

Replying to dimpase:

Can you try starting guile at (sage-sh)$ prompt?

It works fine

(sage-sh) $ guile
GNU Guile 2.2.3
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> quit()
$1 = #<procedure quit args>
While compiling expression:
Syntax error:
unknown location: unexpected syntax in form ()
scheme@(guile-user)> ()
Last edited 4 years ago by vdelecroix

comment:22 Changed 4 years ago by vdelecroix

Still with flint: without the --with-gmp option to ./configure, it succeeds

(sage-sh)$ ./configure --disable-static --prefix="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
mkdir -p build
make[1]: Entering directory '/opt/sage-bis/local/var/tmp/sage/build/flint-2.5.2.p1/src'
    CC   build/printf.lo
    CC   build/fprintf.lo
    CC   build/sprintf.lo
    CC   build/scanf.lo
    CC   build/fscanf.lo
    CC   build/sscanf.lo
    CC   build/clz_tab.lo
    CC   build/memory_manager.lo
    CC   build/version.lo
    CC   build/profiler.lo
    CC   build/thread_support.lo
...

But with --with-gmp set, it fails

(sage-sh) $ ./configure --disable-static --prefix="$SAGE_LOCAL" --with-gmp="$SAGE_LOCAL"
Configuring...x86_64-Linux
Testing __builtin_popcountl...yes
Testing native popcount...yes
Testing __thread...yes
Testing fenv...yes
FLINT was successfully configured.
(sage-sh) $ make
make: symbol lookup error: /usr/lib/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link
make: *** [Makefile:173: library] Error 127

comment:23 in reply to: ↑ 5 ; follow-up: Changed 4 years ago by vdelecroix

Replying to dimpase:

how come make depends on guile for you? I don't see it.

$ ldd `which gmake`
	linux-vdso.so.1 (0x00007ffe861ce000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e5b6d3000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f0e5b30f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f0e5b8d7000)

I've just installed gc-7.6.2 systemwide, and things work for me with Sage, so far.

What is gmake? I got

(sage-sh) $ ldd `which make`
        linux-vdso.so.1 (0x00007fff3ccf6000)
        libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)
        libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
        libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000)
        libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000)
        libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000)
        libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000)
        libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000)
        libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)

comment:24 in reply to: ↑ 23 ; follow-ups: Changed 4 years ago by dimpase

Replying to vdelecroix:

What is gmake?

for me make is a link to gmake, but it's not important. What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I never heard of it--- although it is not crazy, see https://www.gnu.org/software/make/manual/html_node/Guile-Integration.html)

I got

(sage-sh) $ ldd `which make`
        linux-vdso.so.1 (0x00007fff3ccf6000)
        libguile-2.2.so.1 => /usr/lib/libguile-2.2.so.1 (0x00007f88df74d000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f88df549000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f88df32b000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f88def74000)
        libgc.so.1 => /usr/lib/libgc.so.1 (0x00007f88ded0a000)
        libffi.so.6 => /usr/lib/libffi.so.6 (0x00007f88deb01000)
        libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007f88de77f000)
        libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007f88de4ec000)
        libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007f88de2e2000)
        libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f88de0aa000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f88ddd5e000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f88dfa7a000)
        libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007f88ddb5c000)

So we see that at this point make appears to be correctly linked.

What do you see if in that directory (at sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's the generated Flint's Makefile that breaks it, or it's just make itself)

And what does ldd /usr/lib/libguile-2.2.so.1 show?

comment:25 in reply to: ↑ 24 Changed 4 years ago by vdelecroix

Replying to dimpase:

Replying to vdelecroix:

Replying to dimpase:

What do you see if in that directory (at sage-sh prompt) you run make -v rather than make? (More precisely, I'd like to understand whether it's the generated Flint's Makefile that breaks it, or it's just make itself)

(sage-sh) $ make -v
GNU Make 4.2.1
Built for x86_64-unknown-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please also read comment:22: make does not look broken when I do not configure gmp.

And what does ldd /usr/lib/libguile-2.2.so.1 show?

(sage-sh) $ ldd /usr/lib/libguile-2.2.so.1
        linux-vdso.so.1 (0x00007fff17387000)
        libgc.so.1 => /usr/lib/libgc.so.1 (0x00007fed43ae3000)
        libffi.so.6 => /usr/lib/libffi.so.6 (0x00007fed438da000)
        libunistring.so.2 => /usr/lib/libunistring.so.2 (0x00007fed43558000)
        libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fed432c5000)
        libltdl.so.7 => /usr/lib/libltdl.so.7 (0x00007fed430bb000)
        libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007fed42e83000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fed42b37000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fed42919000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fed42562000)
        /usr/lib64/ld-linux-x86-64.so.2 (0x00007fed4407a000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fed4235e000)
        libatomic_ops.so.1 => /usr/lib/libatomic_ops.so.1 (0x00007fed4215c000)

comment:26 in reply to: ↑ 24 Changed 4 years ago by fbissey

Replying to dimpase:

for me make is a link to gmake, but it's not important. What's important is that your make is linked with libguile (and a slew of its dependencies, including libgc), and this is not usual (I never heard of it---

Since GNU make version 4, you can extend make with Guile bindings. Building make with this extension is a configure-time option, and whether to enable it is usually the distribution's choice. Binary distributions usually enable all possible options unless they have "reservations". On Gentoo it is an option that is off by default.
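
Whether a given make binary was built with this Guile support can be read from GNU make's built-in .FEATURES variable, which lists guile when the extension is available. The following is only a minimal sketch and not part of the ticket; the helper names make_features and has_feature are made up for illustration.

```shell
# Hypothetical helper: print the .FEATURES list of the make found on PATH.
# Relies on GNU make reading a makefile from stdin via "-f -".
make_features() {
    printf '%s\n' 'all: ; @echo $(.FEATURES)' | make -s -f -
}

# Hypothetical helper: succeed if word $1 occurs in the space-separated list $2.
has_feature() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use (not executed here):
#   if has_feature guile "$(make_features)"; then
#       echo "this make has Guile support"
#   fi
```

On a system like the one in comment:23, where make is linked against libguile, the check would be expected to succeed; on a make built without Guile it should fail.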

comment:27 Changed 4 years ago by dimpase

I've built make from source with --with-guile against guile version 2.2, which required changing one character in line 171 of configure.ac:

[ PKG_CHECK_MODULES([GUILE], [guile-2.2], [have_guile=yes],

(that is, 2.2 instead of 2.0; this probably explains why I was unable to build it the Gentoo way?). I get:

$ ldd `which make`
	linux-vdso.so.1 (0x00007ffc6d3c3000)
	libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f932105f000)
	libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f9320de6000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f9320be2000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f93209c2000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f93205fe000)
	libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f93203f5000)
	libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f932007c000)
	libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f931fdf3000)
	libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f931fbe9000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f931f9b1000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f931f66f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f93213b0000)

but I cannot reproduce this. It might be the version difference, but the resulting make happily builds Flint even if I do export LD_LIBRARY_PATH=$SAGE_LOCAL/lib; make.

Needless to say, this export breaks guile:

$ guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link

Or it might be that the Makefile generated by Flint does not trigger the guile extension in my case, but does trigger it in Vincent's case?

comment:28 follow-up: Changed 4 years ago by fbissey

The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:

  • for the problem to happen, the soname of the libgc in Sage and of the one on the system need to be the same
  • although the sonames are the same, the libgc in Sage doesn't have the same symbols as the one on the system

So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in sage and on the system.
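
The second condition can be checked directly by listing the dynamic symbols each library defines. This is a minimal sketch, assuming binutils' nm is installed; defines_symbol is a hypothetical helper name, and the paths in the comments are the ones reported in this ticket and may differ elsewhere.

```shell
# Hypothetical helper: read an `nm -D --defined-only` listing on stdin and
# succeed if it defines the symbol named in $1. A symbol-version suffix such
# as "@@VERSION" on the last field is ignored.
defines_symbol() {
    awk -v sym="$1" '
        { sub(/@.*/, "", $NF) }
        $NF == sym { found = 1 }
        END { exit !found }
    '
}

# Example use on a real system (not executed here):
#   nm -D --defined-only /usr/lib/libgc.so.1 \
#       | defines_symbol GC_move_disappearing_link && echo "system libgc: present"
#   nm -D --defined-only "$SAGE_LOCAL/lib/libgc.so.1" \
#       | defines_symbol GC_move_disappearing_link || echo "Sage libgc: missing"
```

If the two libraries share a soname but only the system one defines GC_move_disappearing_link, that matches the failure described in this ticket.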

comment:29 in reply to: ↑ 28 Changed 4 years ago by dimpase

Replying to fbissey:

The trigger is just execution. If the symbol is not resolved you get this. There are a couple of things to remember:

  • for the problem to happen, the soname of the libgc in Sage and of the one on the system need to be the same
  • although the sonames are the same, the libgc in Sage doesn't have the same symbols as the one on the system

So either libgc shouldn't have the same soname (upstream not bumping the number properly) or libgc is not configured with the same features in sage and on the system.

This does crash guile:

$ ldd `which guile`
	linux-vdso.so.1 (0x00007ffd5e36a000)
	libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007f4c66c70000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4c66a50000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f4c6668c000)
	libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007f4c66413000)
	libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007f4c6620a000)
	libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f4c65e91000)
	libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f4c65c08000)
	libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007f4c659fe000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f4c657fa000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f4c655c2000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f4c65280000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4c66fc1000)
(sage-sh) dima@hilbert:sage-dev$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib guile
guile: symbol lookup error: /usr/lib64/libguile-2.2.so.1: undefined symbol: GC_move_disappearing_link

as I created a link to the wrong libgc (see comment 9):

$ ls -l $SAGE_LOCAL/lib/libgc*
-rw-r--r-- 1 dima dima 946784 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.a
lrwxrwxrwx 1 dima dima     14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so -> libgc.so.1.0.3
lrwxrwxrwx 1 dima dima     14 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1 -> libgc.so.1.0.3
-rwxr-xr-x 1 dima dima 702568 Dec 30 09:59 /home/dima/Sage/sage-dev/local/lib/libgc.so.1.0.3
lrwxrwxrwx 1 dima dima     10 Jan 20 22:54 /home/dima/Sage/sage-dev/local/lib/libgc.so.2 -> libgc.so.1

(libgc.so.1 is the wrong one, namely Sage's gc 7.2.) But make with guile works just fine:

$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib /home/dima/bin/make -v
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

even though it is linked to the same libguile:

$ ldd /home/dima/bin/make 
	linux-vdso.so.1 (0x00007ffd2959e000)
	libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007fb4e1418000)
	libgc.so.2 => /usr/lib64/libgc.so.2 (0x00007fb4e119f000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fb4e0f9b000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb4e0d7b000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fb4e09b7000)
	libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007fb4e07ae000)
	libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007fb4e0435000)
	libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007fb4e01ac000)
	libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007fb4dffa2000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fb4dfd6a000)
	libm.so.6 => /lib64/libm.so.6 (0x00007fb4dfa28000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb4e1769000)

just as a sanity check:

$ LD_LIBRARY_PATH=$SAGE_LOCAL/lib ldd /home/dima/bin/make 
	linux-vdso.so.1 (0x00007ffccbbfe000)
	libguile-2.2.so.1 => /usr/lib64/libguile-2.2.so.1 (0x00007ff033858000)
	libgc.so.2 => /home/dima/Sage/sage-dev/local/lib/libgc.so.2 (0x00007ff0334f9000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007ff0332f5000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff0330d5000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ff032d11000)
	libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007ff032b08000)
	libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007ff03278f000)
	libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007ff032506000)
	libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x00007ff0322fc000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007ff0320c4000)
	libm.so.6 => /lib64/libm.so.6 (0x00007ff031d82000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff033ba9000)

So even though the link ought to be resolved by the linker, this is not done (I can also run an actual build, not just -v, with this setup).

comment:30 follow-up: Changed 4 years ago by defeo

  • Cc defeo added

Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.

Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.

comment:31 Changed 4 years ago by dimpase

One trivial way out is to upgrade our gc, see #23700

comment:32 in reply to: ↑ 30 Changed 4 years ago by dimpase

Replying to defeo:

Just wanted to confirm I'm experiencing the same problem on Arch. I have no more insights than you guys.

Has Antonio Rojas popped up in the discussion yet? He might have already seen this error while packaging for Arch.

I think that Arch guys forgot to bump up the version of libgc, for it is still libgc.so.1 (while on gentoo the same libgc is named libgc.so.2) cf comments 25 and 27 above.

Note that Arch most probably uses system libgc in its build of Sage, as it's not listed here https://www.archlinux.org/packages/community/x86_64/sagemath/

Last edited 4 years ago by dimpase

comment:33 Changed 4 years ago by dimpase

Could anyone who can reproduce this check whether #23000 fixes the problem?

comment:34 follow-up: Changed 4 years ago by dimpase

oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"

comment:35 Changed 4 years ago by dimpase

  • Report Upstream changed from N/A to Reported upstream. No feedback yet.

I've asked on bug-make@gnu.org whether this is a GNU make bug.

Last edited 4 years ago by dimpase

comment:36 follow-ups: Changed 4 years ago by dimpase

  • Report Upstream changed from Reported upstream. No feedback yet. to Reported upstream. Developers deny it's a bug.

Well, I am not convinced, and am still waiting for an answer to this.

comment:37 in reply to: ↑ 36 ; follow-up: Changed 4 years ago by embray

Replying to dimpase:

Well, I am not convinced, and am still waiting for an answer to this.

Your comment about maybe statically linking libguile makes some sense, but then you'd have to also statically link any of its dependencies as well, including libgc or else it wouldn't solve the problem.

comment:38 in reply to: ↑ 37 Changed 4 years ago by dimpase

Replying to embray:

Replying to dimpase:

Well, I am not convinced, and am still waiting for an answer to this.

Your comment about maybe statically linking libguile makes some sense, but then you'd have to also statically link any of its dependencies as well, including libgc or else it wouldn't solve the problem.

They could also load the Guile extension only if they need it. (And/or have a configuration option of turning it off).

comment:39 in reply to: ↑ 36 Changed 4 years ago by embray

That would make sense too.

Anyways, I'm increasingly convinced that the problem here is in the affected distros. I'm gonna try an Arch VM and see if I can reproduce...

comment:40 in reply to: ↑ 34 ; follow-ups: Changed 4 years ago by defeo

Replying to dimpase:

oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"

Not for me.

  1. I checked out the ticket and ran make. Same failure.
  2. I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5]   File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5]     run()
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5]     unpack_archive(archive, dirname)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5]     archive.extractall(members=archive.names)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5]     members=members)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************

comment:41 in reply to: ↑ 40 Changed 4 years ago by dimpase

a duplicate comment, sorry.

Last edited 4 years ago by dimpase

comment:42 in reply to: ↑ 40 Changed 4 years ago by dimpase

Replying to defeo:

Replying to dimpase:

oops, typo, it should be "Could anyone who can reproduce this check whether #23700 fixes the problem?"

Not for me.

  1. I checked out the ticket and ran make. Same failure.
  2. I ran make distclean, then make again. I got this failure:
[patch-2.7.5] Using cached file /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] patch-2.7.5
[patch-2.7.5] ====================================================
[patch-2.7.5] Setting up build directory for patch-2.7.5
[patch-2.7.5] Traceback (most recent call last):
[patch-2.7.5]   File "/home/defeo/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
[patch-2.7.5]     run()
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
[patch-2.7.5]     unpack_archive(archive, dirname)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
[patch-2.7.5]     archive.extractall(members=archive.names)
[patch-2.7.5]   File "/home/defeo/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 90, in extractall
[patch-2.7.5]     members=members)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2007, in extractall
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5]   File "/usr/lib/python3.6/tarfile.py", line 2049, in extract
[patch-2.7.5]     numeric_owner=numeric_owner)
[patch-2.7.5] TypeError: _extract_member() got an unexpected keyword argument 'set_attrs'
[patch-2.7.5] ************************************************************************
[patch-2.7.5] Error: failed to extract /home/defeo/sage/upstream/patch-2.7.5.tar.gz
[patch-2.7.5] ************************************************************************

this looks like system's Python is nuked too. Do you have funky stuff in your LD_LIBRARY_PATH or in PATH? Nothing to do with gc, that's certain.

comment:43 Changed 4 years ago by dimpase

Or perhaps it's simply due to your python being python3 (or a very new python3, which has not been tested...)

comment:44 follow-up: Changed 4 years ago by dimpase

yep, I have this error if I set my system Python to python3.5, too.

Thus, set python to python2, and repeat please.

comment:45 in reply to: ↑ 44 Changed 4 years ago by dimpase

Replying to dimpase:

yep, I have this error if I set my system Python to python3.5, too.

Thus, set python to python2, and repeat please.

This tar_file py3 problem is now #24830 (which has nothing to do with the current ticket)

comment:46 Changed 4 years ago by defeo

Ok, it compiled with Python2. Now, it might be thanks to #23700, or thanks to Python2... who knows? :)

comment:47 Changed 4 years ago by dimpase

  • Dependencies set to #23700
  • Status changed from new to needs_review

#23700 is reported to fix this issue.

(As well as using a guile-less make, I presume.)

comment:48 Changed 4 years ago by embray

Neat, I was able to reproduce this in an Arch Linux Docker image. So at least there's that.

comment:49 Changed 4 years ago by dimpase

Does #23700 cure it?

comment:50 Changed 4 years ago by embray

I haven't tried. But a workaround that did work was to add LD_PRELOAD=/usr/lib/libgc.so. So a full workaround might look something like:

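# Only relevant on Linux, where the system make may be linked against libguile.
if [ "$UNAME" = "Linux" ]; then
    # Ask ldd which libgc the system make resolves to.
    LIBGC="$(ldd "$(which make)" | sed -n 's/\s*libgc\.so.* => \(.\+\) .*/\1/p')"
    if [ -n "$LIBGC" ]; then
        # Load that libgc first, ahead of any copy found via LD_LIBRARY_PATH.
        export LD_PRELOAD="$LIBGC"
    fi
fi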

This finds the libgc that is needed by libguile (and by extension make) and ensures it's the one that's used, not the one from Sage. Sucks, but it works, and is kind of necessary.
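
To check which libgc the dynamic linker would pick for make before applying the workaround, the lookup step can be isolated as below. This is only a minimal sketch and not code from the branch: the function name resolved_lib_path is hypothetical, and it reads ldd-style text on stdin so that it can also be tried on saved output.

```shell
# Hypothetical helper (not part of Sage): print the resolved path of the
# first library whose name starts with "$1.so" in ldd-style output on stdin.
resolved_lib_path() {
    # ldd lines look like: "<name> => <absolute path> (<address>)"
    awk -v name="$1" '
        index($1, name ".so") == 1 && $2 == "=>" && substr($3, 1, 1) == "/" {
            print $3
            exit
        }
    '
}
```

It could be fed with `ldd "$(command -v make)" | resolved_lib_path libgc`; empty output would mean that make does not pull in libgc at all, in which case no preload is needed.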

A similar LD_PRELOAD trick might be able to solve #24605 as well, but I haven't tested that yet.

Last edited 4 years ago by embray (previous) (diff)

comment:51 follow-up: Changed 4 years ago by embray

  • Authors set to Erik Bray
  • Branch set to u/embray/build/ticket-24575
  • Commit set to 71c63fd0d9043a134568ad2a018f65c69888c52a
  • Reviewers set to Erik Bray

I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.

In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...
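
As a rough illustration of what such a helper might have to handle, the sketch below builds an LD_PRELOAD value without clobbering entries that are already there (the snippet in comment:50 overwrites the variable). The function name preload_list_add is hypothetical and not taken from any Sage branch; it relies only on the fact that ld.so accepts a colon-separated list in LD_PRELOAD.

```shell
# Hypothetical helper (not Sage's API): print the preload list "$1" with
# library "$2" appended, unless "$2" is empty or already in the list.
preload_list_add() {
    current="$1"
    lib="$2"
    if [ -z "$lib" ]; then
        # Nothing to add: return the list unchanged.
        printf '%s\n' "$current"
        return 0
    fi
    case ":$current:" in
        *":$lib:"*) printf '%s\n' "$current" ;;               # already listed
        *)          printf '%s\n' "${current:+$current:}$lib" ;; # append
    esac
}
```

A caller would then do something like `LD_PRELOAD="$(preload_list_add "$LD_PRELOAD" "$LIBGC")"` before exporting, so the same call works for libgc or for any other library that turns out to need the same treatment.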


New commits:

71c63fd Add the workaround to https://trac.sagemath.org/ticket/24575

comment:52 in reply to: ↑ 51 Changed 4 years ago by vdelecroix

  • Dependencies #23700 deleted

Replying to embray:

I've gone ahead and added my workaround. I would recommend using this even with #23700, just because really we should always be using the libgc from the system when invoking make (where applicable), even if the libgc in Sage happens, by some luck, to be compatible with the system's version.

In principle this workaround is needed for any build process that adds $SAGE_LOCAL/lib to $LD_LIBRARY_PATH. In general this should not be done at all, but there is at least one other case I know of in Sage: python. So this might also be worth extracting into a helper function for pre-loading certain libraries when needed...

Thanks Erik for analyzing the problem and providing the workaround! I definitely did not want to consider #23700 as a solution. (I am now compiling from scratch for checking)

Note that your fix is specific to libgc, so the same kind of trouble might appear with another library in the future. But I consider this fine for now. Wouldn't it be possible to exclude libgc from the list of packages to install when it is already present (and up to date) on the system?

Changed 4 years ago by vdelecroix

comment:53 Changed 4 years ago by vdelecroix

flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?

Changed 4 years ago by vdelecroix

comment:54 Changed 4 years ago by vdelecroix

Replying to vdelecroix:

flint build is failing (for the same reason as R did), see flint-2.5.2.p2.log. Should we apply the same strategy here?

The same happens with arb and the Python package rpy2 (rpy2-2.8.2.p0.log). After adding the workaround to the three spkg-install scripts, the build completes.

Though I did not check the optional packages.

comment:55 Changed 4 years ago by vdelecroix

  • Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
  • Commit changed from 71c63fd0d9043a134568ad2a018f65c69888c52a to f33c5e60c6655d2af7d97047dcba996600f37193

New commits:

f33c5e6 Same workaround for arb, flint and rpy2

comment:56 Changed 4 years ago by vdelecroix

  • Description modified (diff)

comment:57 follow-up: Changed 4 years ago by dimpase

  • Status changed from needs_review to needs_work

Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).

This would also take care of all the non-standard packages.

comment:58 in reply to: ↑ 57 ; follow-up: Changed 4 years ago by vdelecroix

Replying to dimpase:

Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).

This would also take care of all the non-standard packages.

I don't think so. This workaround takes care of fragile makefiles until a better solution is found. Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.
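
One way to keep the workaround narrowly scoped, in the spirit of the objection above, is to set LD_PRELOAD only for the single command that needs it instead of exporting it for the whole script. A minimal sketch; the function name with_preload is hypothetical and not part of Sage's build helpers.

```shell
# Hypothetical helper: run a command with LD_PRELOAD set to "$1" for that
# command only. If "$1" is empty or not an existing file, run it unchanged.
with_preload() {
    lib="$1"
    shift
    if [ -n "$lib" ] && [ -e "$lib" ]; then
        # env sets the variable for the child process only; the calling
        # shell's environment is left untouched.
        env LD_PRELOAD="$lib" "$@"
    else
        "$@"
    fi
}
```

Usage would be along the lines of `with_preload "$LIBGC" make`. Because the preload branch goes through env, the command has to be an external program there, not a shell function.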

comment:59 in reply to: ↑ 58 Changed 4 years ago by dimpase

Replying to vdelecroix:

Replying to dimpase:

Shouldn't we do the LD_PRELOAD in the script that calls spkg-install, rather than repeat this boilerplate? (And the same for spkg-check, by the way).

This would also take care of all the non-standard packages.

I don't think so. This workaround takes care of fragile makefiles until a better solution is found.

A better solution is not to use Guile-enabled make, at least not until it is built in a way ensuring one can use it for hacking on Guile dependencies.

Having it globally applied would be a nightmare for debugging as well as upstream communication. It is also likely that the workarounds will be removed one by one.

There is nothing to communicate to package upstream here, I think. You cannot ban their use of LD_LIBRARY_PATH. Besides, LD_PRELOAD is a pretty standard way to deal with these issues. It has cured all of them so far, so why would you want to keep getting reports of such and such package mysteriously breaking while a Guile-enabled make is used?

comment:60 follow-up: Changed 4 years ago by mkoeppe

I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.

FLINT seems to be getting a CMake build system to replace its handwritten one, which will likely eliminate this problem.

comment:61 Changed 4 years ago by embray

Over in #24885 I already implemented a more generic solution for this, but I didn't push the branch yet. That should be used instead.

comment:62 in reply to: ↑ 60 Changed 4 years ago by embray

Replying to mkoeppe:

I agree with Vincent here. This workaround should only be used for the known packages with an LD_LIBRARY_PATH problem, and this should be reported as a bug upstream.

The fact that they use LD_LIBRARY_PATH is not a bug IMO, though it would be better, at least in some cases, if they used LD_PRELOAD instead for specific libraries.

comment:63 Changed 4 years ago by embray

  • Branch changed from u/vdelecroix/24575 to u/embray/build/ticket-24575
  • Commit changed from f33c5e60c6655d2af7d97047dcba996600f37193 to 454221ac40ec282c469ff2043465811b7364313b
  • Dependencies set to #24885

Reworked on top of #24885


New commits:

ba1b5ee Add helper function that implements the workaround from https://trac.sagemath.org/ticket/24575 more generically.
6103df0 Add the workaround to https://trac.sagemath.org/ticket/24575
2ecaa74 Replace this with sdh_preload_lib
454221a Same issue applies to arb, flint, and rpy2

comment:64 Changed 4 years ago by embray

  • Status changed from needs_work to needs_review

comment:65 Changed 4 years ago by vdelecroix

  • Reviewers changed from Erik Bray to Erik Bray, Vincent Delecroix
  • Status changed from needs_review to needs_work

I am currently testing optional packages. At least deformation shows the same symptoms (I will provide a proper commit with all of them once finished).

comment:66 Changed 4 years ago by vdelecroix

All right, for deformation, after setting sdh_preload_lib I got a different, unrelated error:

[deformation-d05941b]     CC   ../build/perm/../perm.lo
[deformation-d05941b] /usr/bin/ld: -r and -pie may not be used together
[deformation-d05941b] collect2: error: ld returned 1 exit status
[deformation-d05941b] make[4]: *** [../Makefile.subdirs:55: ../build/perm/../perm.lo] Error 1

comment:67 Changed 4 years ago by vdelecroix

  • Branch changed from u/embray/build/ticket-24575 to u/vdelecroix/24575
  • Commit changed from 454221ac40ec282c469ff2043465811b7364313b to 1ac3afaefe81fb21c43be7964c07f7cd7d529cb9

Concerning optional packages, only deformation needs the workaround (It appears that I also have some unrelated build failures #23533, #24901, #24902 and #24903).

Erik, Dima, Matthias: I consider the branch ready for positive review. As I added a commit on top of the branch, I will let somebody else finish the review.


New commits:

1ac3afa Same issue applies to optional package deformation

comment:68 Changed 4 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:69 Changed 4 years ago by mkoeppe

Our package perl_term_readline_gnu also has some LD_LIBRARY_PATH stuff...

comment:70 follow-up: Changed 4 years ago by mkoeppe

I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf
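
To see how much there would be to patch, one could first list every Makefile line that touches LD_LIBRARY_PATH in an unpacked source tree. A rough sketch, assuming the build files are named Makefile* (as with the Makefile.subdirs seen in the deformation log); the function name is hypothetical.

```shell
# Hypothetical helper: print "file:line:text" for each line mentioning
# LD_LIBRARY_PATH in files named Makefile* under directory "$1" (default: .).
find_ld_library_path_uses() {
    # The extra /dev/null operand makes grep print the file name even
    # when find passes it only a single file.
    find "${1:-.}" -type f -name 'Makefile*' \
        -exec grep -n 'LD_LIBRARY_PATH' /dev/null {} +
}
```

The exit status is non-zero when nothing matches, so the output is the more reliable thing to look at. Each reported line is then a candidate for a patch of the kind linked above.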

comment:71 Changed 4 years ago by mkoeppe

  • Cc jpflori added

comment:72 in reply to: ↑ 70 ; follow-ups: Changed 4 years ago by vdelecroix

Replying to mkoeppe:

I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf

Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?

comment:73 in reply to: ↑ 72 Changed 4 years ago by vdelecroix

Replying to vdelecroix:

Replying to mkoeppe:

I think it's better to patch out this LD_LIBRARY_PATH stuff from the packages' Makefiles. Like this: https://github.com/mkoeppe/deformation/commit/0d732b13e901b777aca000ff502a5d5aa8d690bf

Note that flint Makefile contains the very same lines... would you suggest that the same operation should be applied there?

As well as arb.

comment:74 Changed 4 years ago by jpflori

flint, arb and deformation share almost the same build system indeed. Except I did not push the -r/pie fix to deformation.

comment:75 in reply to: ↑ 72 Changed 4 years ago by mkoeppe

Replying to vdelecroix:

would you suggest that the same operation should be applied there?

Yes, probably.

comment:76 Changed 4 years ago by vdelecroix

  • Description modified (diff)
  • Report Upstream changed from Reported upstream. Developers deny it's a bug. to Reported upstream. No feedback yet.