Opened 10 years ago

Closed 9 years ago

#11708 closed defect (worksforme)

maxima doesn't build on Linux ppc64 (silius on skynet)

Reported by: was
Owned by: drkirkby
Priority: major
Milestone: sage-duplicate/invalid/wontfix
Component: porting
Keywords:
Cc: mhansen
Merged in:
Authors:
Reviewers: Jeroen Demeyer
Report Upstream: Not yet reported upstream; Will do shortly.
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description

The maxima spkg in sage-4.7.1 fails to build almost instantly with:

...
;;; Emitting code for UNARY.

Internal or unrecoverable error in:
not a lisp data object
  [2: No such file or directory]

Change History (66)

comment:1 Changed 10 years ago by was

NOTE: A naive attempt with maxima-5.25 fails in exactly the same way.

comment:2 follow-up: Changed 10 years ago by leif

Nice. I didn't know we have any Linux PPC in our build farm. :)

comment:3 in reply to: ↑ 2 Changed 10 years ago by kcrisman

Replying to leif:

Nice. I didn't know we have any Linux PPC in our build farm. :)

I'll ask about this on the Maxima devel list. Maybe there is some little configuration thing that needs to be set.

comment:4 Changed 10 years ago by fbissey

Could we see more of that log? I'd also like to see the ecl build log, if possible.

comment:5 Changed 10 years ago by was

Here's a huge log file

http://sage.math.washington.edu/home/wstein/days/32/silius/install.log-20110818.txt

It is huge due to building ATLAS, etc. So, just search in it.

It looks like maybe *everything* ends up building except Maxima (and Tachyon, which is fixed by another ticket).

wstein@silius:~/silius/sage-4.7.1> ./sage
----------------------------------------------------------------------
| Sage Version 4.7.1, Release Date: 2011-08-11                       |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
sage: time n = factorial(10^6)
Time: CPU 0.64 s, Wall: 0.65 s
sage: time n = factorial(10^7)
Time: CPU 11.81 s, Wall: 11.81 s
sage: !uname -a
Linux silius 2.6.32.43-0.4-ppc64 #1 SMP 2011-07-14 14:47:44 +0200 ppc64 ppc64 ppc64 GNU/Linux

comment:7 Changed 10 years ago by fbissey

Hum... nothing jumps out at me, especially if sage compiled without problems including sage/libs/ecl.pyx. Is there a maxima rpm for sles 11? If so it may be worth looking in the spec file.

comment:8 Changed 10 years ago by was

Yes, sage/libs/ecl.pyx compiles and passes all tests:

wstein@silius:~/silius/sage-4.7.1> ./sage -t devel/sage/sage/libs/ecl.pyx
sage -t  "devel/sage/sage/libs/ecl.pyx"                     
         [5.1 s]
 
----------------------------------------------------------------------
All tests passed!

comment:9 follow-ups: Changed 10 years ago by leif

Hmmm, there are a couple of warnings already, including

make[3]: warning:  Clock skew detected.  Your build may be incomplete.

but comparing the log to what I have doesn't look very different.

How about installing the Maxima spkg with strace -f [...] ./sage -i ..., to see which file ecl apparently doesn't find?

It would also perhaps be better to first try to build on a local filesystem (if that wasn't the case, which I assume).
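
As a rough sketch of how such an strace log could be searched for the file ecl fails to find, the following filters the log for system calls that failed with ENOENT. The regular expression and the function name `missing_files` are illustrative assumptions, not part of strace or Sage; the `[pid N]` prefix is only present when strace is run with -f.

```python
import re

# Matches strace lines such as:
#   [pid 13982] stat("/some/path", 0x7ffe) = -1 ENOENT (No such file or directory)
ENOENT_LINE = re.compile(
    r'^(?:\[pid\s+(?P<pid>\d+)\]\s+)?'   # optional pid prefix (strace -f)
    r'(?P<syscall>\w+)\('                # syscall name
    r'[^"]*?"(?P<path>[^"]*)"'           # first quoted argument, taken as the path
    r'.*=\s+-1\s+ENOENT'                 # call failed with ENOENT
)

def missing_files(lines):
    """Return (pid, syscall, path) for every syscall that failed with ENOENT."""
    hits = []
    for line in lines:
        m = ENOENT_LINE.match(line)
        if m:
            pid = int(m.group('pid')) if m.group('pid') else None
            hits.append((pid, m.group('syscall'), m.group('path')))
    return hits
```

Feeding it the lines of the log (for example `missing_files(open(logfile))`, with `logfile` being wherever the strace output was saved) gives a short list of candidates to inspect by hand.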

comment:10 Changed 10 years ago by kcrisman

Here is a response from one Maxima developer.

Maxima probably isn't supported on this platform because probably no
developer normally works on PPC Linux.  I might be wrong about that.

On the other hand, I have built maxima just fine using a 64-bit version
of ecl on Linux (x86) and Solaris (sparc).  So I would guess that the
problem you are seeing is due to ecl.

If there are any other responses, I'll update here.

comment:11 Changed 10 years ago by leif

Did Dave ever manage to build Sage in 64-bit mode on SPARC?

As far as I know, the only big-endian machines we build and test on are 32-bit (or are only used in 32-bit mode).

I also noticed that silius doesn't have libffi, but I don't think that's relevant here.

comment:12 in reply to: ↑ 9 ; follow-up: Changed 10 years ago by was

Replying to leif:

How about installing the Maxima spkg with strace -f [...] ./sage -i ..., to see which file ecl apparently doesn't find?

It would also perhaps be better to first try to build on a local filesystem (if that wasn't the case, which I assume).

Do you want an account on the machine? Then you can try all this and maybe fix the problem!

Write me offlist at wstein@… for an account.

comment:13 Changed 10 years ago by kcrisman

By the way, this may be on skynet, but it's not listed at the wiki for Sage on skynet.

comment:14 Changed 10 years ago by was

The machine was added to skynet 2 or 3 days ago.

The page http://wiki.sagemath.org/skynet is interesting. Note that it is surely out of date, since the compilers are always being updated, operating systems upgraded, etc.

comment:15 Changed 10 years ago by was

I've added some info about silius stamped "today" to that wiki.

comment:16 Changed 10 years ago by kcrisman

The machine was added to skynet 2 or 3 days ago.

Cool.

The page http://wiki.sagemath.org/skynet is interesting. Note that it is surely out of date, since the compilers are always being updated, operating systems upgraded, etc.

John P. reminded me of it on another ticket. Even out-of-date, it seems useful.

comment:17 in reply to: ↑ 9 Changed 10 years ago by benjaminfjones

Replying to leif:

I've been banging my head on this for a while now at Sage Days 32. I've tried the following so far:

  • building ECL 11.1.1 independent of Sage and then building Maxima 5.25.0 (latest) on top of that (failed: same "redefining NIL" error that mhansen reported)
  • building latest CVS version of ECL, then Maxima 5.25.0 on top of that (failed: same error as above)
  • making a new .spkg for the latest ECL, then using that to try building the maxima .spkg (failed with "not a lisp data object" error)

How about installing the Maxima spkg with strace -f [...] ./sage -i ..., to see which file ecl apparently doesn't find?

Just tried this on silius, here is the part of the log just before the error message is printed:

[pid 13982] write(3, "\tif(!((V1)==(VV[26]))){\n", 24) = 24
[pid 13982] write(3, "\tgoto L21;}\n", 12) = 12
[pid 13982] write(3, "\tT0= LC9next(lex0)              "..., 67) = 67
[pid 13982] write(3, "\tcl_env_copy->nvalues=2;\n", 25) = 25
[pid 13982] write(3, "\tcl_env_copy->values[1]=T0;\n", 28) = 28
[pid 13982] write(3, "\tcl_env_copy->values[0]=V1;\n", 28) = 28
[pid 13982] write(3, "\treturn cl_env_copy->values[0];\n", 32) = 32
[pid 13982] write(3, "L21:;\n", 6)      = 6
[pid 13982] write(3, "\tif(!((V1)==(VV[30]))){\n", 24) = 24
[pid 13982] write(3, "\tgoto L25;}\n", 12) = 12
[pid 13982] write(3, "\tV1= LC9next(lex0)              "..., 67) = 67
[pid 13982] write(2, "\nInternal or unrecoverable error"..., 60) = 60
[pid 12588] <... read resumed> "\nInternal or unrecoverable error"..., 8192) = 60
[pid 12588] write(1, "\nInternal or unrecoverable error"..., 60
Internal or unrecoverable error in:
not a lisp data object
 <unfinished ...>
[pid 13982] write(2, "  [2: No such file or directory]"..., 33 <unfinished ...>

The full log is here: http://sage.math.washington.edu/home/bjones/strace_maxima.log.gz (It is VERY large)

It would also perhaps be better to first try to build on a local filesystem (if that wasn't the case, which I assume).

I've tried building the maxima .spkg (and Sage overall) both on my NFS mounted home folder and on the local /tmp on silius with no difference.

Other things I'm trying:

  • running the ECL test suite and comparing the results with other architectures (sage.math and skynet/eno)

comment:18 Changed 10 years ago by fbissey

I am not convinced that the problem is with ecl at all. I would try installing another lisp (unfortunately, as far as I can see, there are no lisps at all on the SuSE Linux 11 SP1 DVDs) and compiling maxima against that. If it still fails, that would point towards maxima rather than ecl. If it builds, that would point to ecl. clisp or sbcl would be appropriate. Unfortunately I don't have access to sagemath or boxen, so I cannot log in to silius right now.

comment:19 Changed 10 years ago by benjaminfjones

I've been trying to build clisp on skynet/silius without success. I've tried with both GCC version 4.6.1 and GCC version 4.3. Both build processes end with:

./lisp.run -B . -N locale -E UTF-8 -Epathname 1:1 -Emisc 1:1 -norc -m 2MW -lp ../src/ -x '(and (load "../src/init.lisp") (sys::%saveinitmem) (ext::exit)) (ext::exit t)'
make: *** [interpreted.mem] Segmentation fault

I'm going to move on and try compiling SBCL..

comment:20 Changed 10 years ago by benjaminfjones

My SBCL build (using ecl-11.1.1 as the cross compilation host) also failed:

;;; Emitting code for LIST-OF-LENGTH-AT-LEAST-P.

Internal or unrecoverable error in:
not a lisp data object
  [2: No such file or directory]

;;; ECL C Backtrace

Same vague error as I get trying to build maxima.

I've searched through the pre-built package lists for SUSE SLE 11 SP 1 looking for "clisp", "sbcl", "gcl", and "lisp" and didn't find anything.

comment:21 Changed 10 years ago by fbissey

I couldn't find any lisp on the SLES DVDs either. I tried looking at Red Hat because they also provide ppc64, but couldn't find anything so far. A source rpm of maxima or of a lisp on ppc64, from any distro, would have been helpful.

comment:22 Changed 10 years ago by benjaminfjones

  • Cc mhansen added

I found that the Clozure CL project distributes binary images for ppc64 Linux. I tried this out, but ran into yet more problems. Maybe this is going off on a tangent...

bjones@silius:/tmp/bjones/ccl-1.7> ./ppccl64
remap spjump: Invalid argument

Now I read through the platform notes here and see that there is an issue with 16-bit memory references for functions like NIL (that sounds familiar...) not being accessible by non-root processes. This has to do with a parameter set by the OS: mmap_min_addr.

The parameter in question is called mmap_min_addr; one can cat the file /proc/sys/vm/mmap_min_addr to see what the current setting is.

On skynet/silius,

bjones@silius:/tmp/bjones/ccl-1.7> cat /proc/sys/vm/mmap_min_addr
65536

and according to the platform notes, it *should* be 4096.

In light of this, it might be worth changing this parameter on silius (which requires a reboot, see the platform notes above) and then try Clozure CL (run test suites, etc) and THEN try building Maxima. Also, it would be worth trying to reproduce Mike Hansen's reported error about redefining NIL here after the mmap_min_addr parameter has been changed.
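
A minimal sketch of the check described above, assuming only the /proc path and the two values quoted in this comment (65536 on silius, 4096 per the platform notes). The helper names are made up for illustration.

```python
from pathlib import Path

# Path quoted in this comment; 4096 is the value the CCL platform notes
# are reported to recommend.
MMAP_MIN_ADDR = Path("/proc/sys/vm/mmap_min_addr")
RECOMMENDED = 4096

def read_mmap_min_addr(path=MMAP_MIN_ADDR):
    """Return the kernel's current mmap_min_addr setting as an int."""
    return int(Path(path).read_text().strip())

def low_mapping_allowed(current, wanted=RECOMMENDED):
    """True if unprivileged mappings at addresses down to `wanted` are permitted."""
    return current <= wanted
```

On silius, `low_mapping_allowed(read_mmap_min_addr())` would be expected to come out false while the setting is 65536.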

comment:23 Changed 10 years ago by was

Quick comment: Even if you could build Maxima with another Lisp, that's not a solution, since Sage uses the C library interface to Maxima, which is *only* available with ECL.

comment:24 Changed 10 years ago by fbissey

@benjaminfjones I think you are bang on. That is probably what is happening and what Mike Hansen was referring to. We should try the recommendation from the CCL web page and rebuild ecl and then maxima. From their comments, doing the work in lisp to avoid the issue is not very appealing.

@was My suggestion was meant as a fault isolation procedure. It happened to bear fruit in an unexpected way.

comment:25 in reply to: ↑ 12 Changed 10 years ago by leif

So what's the current status?

Replying to was:

Do you want an account on the machine? Then you can try all this and maybe fix the problem!

Write me offlist at wstein@… for an account.

William, did you receive my mail from last Friday ("Re: Maxima/ECL on Silius (#11708)")?

I could try debugging / fixing this during the weekend.

comment:26 follow-up: Changed 10 years ago by benjaminfjones

Current status is the same as it was in comment 22. I'd like to try building ECL + Maxima with the low memory access threshold changed. This requires root on the machine in question (but it's a simple change and easy to make permanent on reboot). I haven't asked Mariah about the possibility of doing this yet.

comment:27 in reply to: ↑ 26 ; follow-up: Changed 10 years ago by leif

Replying to benjaminfjones:

I'd like to try building ECL + Maxima with the low memory access threshold changed. This requires root on the machine in question (but it's a simple change and easy to make permanent on reboot). I haven't asked Mariah about the possibility of doing this yet.

Well, this would probably prevent us from testing other, perhaps better solutions... ;-)

Similar machines will likely have the same problem, and I'm not sure whether everybody would lower the mmap min address since this is considered a security hole, at least in theory. And not every Sage user having access to such a machine will be able to do that, or to persuade a sysadmin to do so.

If mmap() is really the problem, one should be able to perhaps tweak / patch configure, or convince ECL in a different way to not (try to) use such low addresses. Furthermore, if that's really the cause, it is certainly a bug how (or when) ECL fails.

comment:28 in reply to: ↑ 27 Changed 10 years ago by benjaminfjones

Replying to leif:

If mmap() is really the problem, one should be able to perhaps tweak / patch configure, or convince ECL in a different way to not (try to) use such low addresses. Furthermore, if that's really the cause, it is certainly a bug how (or when) ECL fails.

I agree, but I think we should at least determine whether or not this is the current problem. I'm not advocating that Sage users on ppc64 machines fundamentally alter their OS in order to get Sage to compile.

comment:29 follow-up: Changed 10 years ago by fbissey

I think we should test whether it is the problem too. I am currently installing some power 7 gear at my home university (University of Canterbury in New Zealand), so we will have an alternative testing platform soon if that's a concern.

comment:30 in reply to: ↑ 29 Changed 10 years ago by leif

Replying to fbissey:

I think we should test whether it is the problem too. I am currently installing some power 7 gear at my home university (University of Canterbury in New Zealand), so we will have an alternative testing platform soon if that's a concern.

I don't think ECL uses fixed mappings (in that range) at all.

Even if it did, mmap() should return MAP_FAILED, which is, as far as I can see, always caught, and the returned address is used rather than the one passed. (You can of course test whether the kernel is buggy... :) )
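
To illustrate the point about using the returned address rather than the one passed, here is a small sketch that calls the C library's mmap() through ctypes with a low address as a hint only (no MAP_FIXED). It says nothing about what ECL actually does; the helper names are hypothetical and the `c_long` offset type assumes a 64-bit Unix system.

```python
import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value  # (void *) -1

def map_with_hint(hint, length=mmap.PAGESIZE):
    """Request an anonymous private mapping near `hint`.

    Returns the address the kernel actually chose, or None on MAP_FAILED.
    """
    addr = libc.mmap(hint, length,
                     mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     -1, 0)
    if addr is None or addr == MAP_FAILED:
        return None
    return addr

def unmap(addr, length=mmap.PAGESIZE):
    """Release a mapping obtained from map_with_hint(); returns 0 on success."""
    return libc.munmap(addr, length)
```

Calling `map_with_hint(0x1000)` and comparing the result with `0x1000` shows whether the kernel honoured the low hint or silently moved the mapping elsewhere.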

comment:31 Changed 10 years ago by fbissey

I tried a newer version of maxima just in case, http://spkg-upload.googlecode.com/files/maxima-5.25.1.spkg, but, not surprisingly, the result is the same.

Leif, how did you get such verbose output out of ecl in #11786?

comment:32 Changed 10 years ago by fbissey

Found a way of stracing ecl, not that it is helpful

write(1, ";;; Emitting code for UNARY", 27;;; Emitting code for UNARY) = 27
write(1, ".\n", 2.
)                      = 2
write(3, "}}\n", 3)                     = 3
write(3, "/*\tlocal function UNARY         "..., 68) = 68
write(4, "#define STCK10\n", 15)        = 15
write(3, "/*\toptimize speed 3, debug 2, sp"..., 68) = 68
write(3, "static cl_object LC21unary(volat"..., 67) = 67
write(3, "{ VT11 VLEX11 CLSR11 STCK11\n", 28) = 28
write(3, "\tconst cl_env_ptr cl_env_copy = "..., 51) = 51
write(3, "\tcl_object value0;\n", 19)   = 19
write(3, "\tecl_cs_check(cl_env_copy,value0"..., 35) = 35
write(3, "\t{\n", 3)                    = 3
write(3, "TTL:\n", 5)                   = 5
write(3, "\t{cl_object V2;                 "..., 66) = 66
write(3, "\tV2= Cnil;\n", 11)           = 11
write(3, "\tif(!((V1)==(VV[28]))){\n", 24) = 24
write(3, "\tgoto L3;}\n", 11)           = 11
write(3, "\tT0= LC9next(lex0)              "..., 67) = 67
write(3, "\tcl_env_copy->values[0]=LC10cond"..., 69) = 69
write(3, "\t{int V3=cl_env_copy->nvalues-0;"..., 33) = 33
write(3, "\tif (V3--<=0) goto L8;\n", 23) = 23
write(3, "\tV2= cl_env_copy->values[0];\n", 29) = 29
write(3, "\tif (V3--<=0) goto L9;\n", 23) = 23
write(3, "\tV1= cl_env_copy->values[1];\n", 29) = 29
write(3, "\tgoto L10;}\n", 12)          = 12
write(3, "L8:;\n", 5)                   = 5
write(3, "\tV2= Cnil;\n", 11)           = 11
write(3, "L9:;\n", 5)                   = 5
write(3, "\tV1= Cnil;\n", 11)           = 11
write(3, "L10:;\n", 6)                  = 6
write(3, "\tif((V1)==(VV[29])){\n", 21) = 21
write(3, "\tgoto L13;}\n", 12)          = 12
write(3, "\t(void)cl_error(1,_ecl_static_17"..., 67) = 67
write(3, "\tgoto L11;\n", 11)           = 11
write(3, "L13:;\n", 6)                  = 6
write(3, "\tgoto L11;\n", 11)           = 11
write(3, "L11:;\n", 6)                  = 6
write(3, "\tT0= LC9next(lex0)              "..., 67) = 67
write(3, "\tcl_env_copy->nvalues=2;\n", 25) = 25
write(3, "\tcl_env_copy->values[1]=T0;\n", 28) = 28
write(3, "\tcl_env_copy->values[0]=V2;\n", 28) = 28
write(3, "\treturn cl_env_copy->values[0];\n", 32) = 32
write(3, "L3:;\n", 5)                   = 5
write(3, "\tif(!(ecl_numberp(V1))){\n", 25) = 25
write(3, "\tgoto L17;}\n", 12)          = 12
write(3, "\tT0= LC9next(lex0)              "..., 67) = 67
write(3, "\tcl_env_copy->nvalues=2;\n", 25) = 25
write(3, "\tcl_env_copy->values[1]=T0;\n", 28) = 28
write(3, "\tcl_env_copy->values[0]=V1;\n", 28) = 28
write(3, "\treturn cl_env_copy->values[0];\n", 32) = 32
write(3, "L17:;\n", 6)                  = 6
write(3, "\tif(!((V1)==(VV[26]))){\n", 24) = 24
write(3, "\tgoto L21;}\n", 12)          = 12
write(3, "\tT0= LC9next(lex0)              "..., 67) = 67
write(3, "\tcl_env_copy->nvalues=2;\n", 25) = 25
write(3, "\tcl_env_copy->values[1]=T0;\n", 28) = 28
write(3, "\tcl_env_copy->values[0]=V1;\n", 28) = 28
write(3, "\treturn cl_env_copy->values[0];\n", 32) = 32
write(3, "L21:;\n", 6)                  = 6
write(3, "\tif(!((V1)==(VV[30]))){\n", 24) = 24
write(3, "\tgoto L25;}\n", 12)          = 12
write(3, "\tV1= LC9next(lex0)              "..., 67) = 67
write(2, "\nInternal or unrecoverable error"..., 60
Internal or unrecoverable error in:
not a lisp data object
) = 60
write(2, "  [2: No such file or directory]"..., 33  [2: No such file or directory]
) = 33
write(2, "\n;;; ECL C Backtrace\n", 21
;;; ECL C Backtrace
) = 21
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 106;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(si_dump_c_backtrace-0xb4cb4) [0x400001d28d4]
) = 106
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 105;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(ecl_internal_error-0xc2854) [0x400001c38c4]
) = 105
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 88;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(+0x17a3bc) [0x400001da3bc]
) = 88
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 97;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_type_of-0xadda4) [0x400001dab94]
) = 97
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 96;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_typep-0x192498) [0x400000e7d10]
) = 96
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 96;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_typep-0x192550) [0x400000e7c58]
) = 96
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x37950) [0x40000797950]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x37d60) [0x40000797d60]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x4e57c) [0x400007ae57c]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 88;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(+0x16caa0) [0x400001ccaa0]
) = 88
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x52024) [0x400007b2024]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 92;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(APPLY-0x87d68) [0x40000204200]
) = 92
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 113;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(ecl_apply_from_stack_frame-0xdfee8) [0x400001a2a68]
) = 113
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 95;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_apply-0xdfb50) [0x400001a2e48]
) = 95
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x4cd00) [0x400007acd00]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 88;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(+0x16cb0c) [0x400001ccb0c]
) = 88
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x55aa0) [0x400007b5aa0]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 98;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(APPLY_fixed-0x83d38) [0x40000208248]
) = 98
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 113;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(ecl_apply_from_stack_frame-0xdfeb8) [0x400001a2a98]
) = 113
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 95;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_apply-0xdfb50) [0x400001a2e48]
) = 95
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x4cd00) [0x400007acd00]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 88;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(+0x16cb0c) [0x400001ccb0c]
) = 88
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x55b30) [0x400007b5b30]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 98;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(APPLY_fixed-0x83d38) [0x40000208248]
) = 98
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 113;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(ecl_apply_from_stack_frame-0xdfeb8) [0x400001a2a98]
) = 113
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 95;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_apply-0xdfb50) [0x400001a2e48]
) = 95
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x4cd00) [0x400007acd00]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 88;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(+0x16cb0c) [0x400001ccb0c]
) = 88
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 91;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl-11.1.1/cmp.fas(+0x55b30) [0x400007b5b30]
) = 91
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 98;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(APPLY_fixed-0x83d38) [0x40000208248]
) = 98
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 113;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(ecl_apply_from_stack_frame-0xdfeb8) [0x400001a2a98]
) = 113
write(2, ";;; /home/fbissey/sage-4.7.2.alp"..., 95;;; /home/fbissey/sage-4.7.2.alpha2/local/lib/libecl.so.11.1(cl_apply-0xdfb50) [0x400001a2e48]
) = 95

comment:33 Changed 10 years ago by fbissey

OK, I hacked something in ecl and now maxima is compiling... Fingers crossed that it will be successful.

src/c/main.d:   16*1024,        /* ECL_OPT_C_STACK_SAFETY_AREA */

I changed 16 to 64, ecl compiled and....

;;;   gcc -o lisp-cache/home/fbissey/sage-4.7.2.alpha2/spkg/build/maxima-5.25.1/src/src/maxima.fasb -L/home/fbissey/sage-4.7.2.alpha2/local/lib/ /tmp/eclinit8QaoTI.o lisp-cache/home/fbissey/sage-4.7.2.alpha2/spkg/build/maxima-5.25.1/src/src/libmaxima.a -Wl,--rpath,/home/fbissey/sage-4.7.2.alpha2/local/lib/ -shared -L/home/fbissey/sage-4.7.2.alpha2/local/lib -L/home/fbissey/sage-4.7.2.alpha2/local/lib -lecl -lgmp -lgc -ldl -lm 
installing Maxima library as /home/fbissey/sage-4.7.2.alpha2/local/lib/ecl//maxima.fas

real    6m36.665s
user    5m44.422s
sys     0m30.499s
Successfully installed maxima-5.25.1
Now cleaning up tmp files.
Making Sage/Python scripts relocatable...
Making script relocatable
Finished installing maxima-5.25.1.spkg
fbissey@silius:~/sage-4.7.2.alpha2>

:) I'll have to downgrade maxima and run the test later but you can take over while real life is calling me.

comment:34 follow-up: Changed 10 years ago by kcrisman

  • Report Upstream changed from N/A to Not yet reported upstream; Will do shortly.

Nice. No idea how you guys figure this stuff out :)

Just be sure to report this potential fix upstream. Juanjo should be able to incorporate something reasonable.

comment:35 in reply to: ↑ 34 Changed 10 years ago by fbissey

Replying to kcrisman:

Nice. No idea how you guys figure this stuff out :)

To be honest, I have no idea myself sometimes :) I would need to be filmed in action. It all started with reading the bits about porting ecl to a new platform. Then I inspected some of the files that page mentioned, and the variable name jumped out at me... I looked for it, remembering previous comments on this thread. You could guess where 64 came from.
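
One plausible reading of "where 64 came from", offered only as a guess: 64*1024 is exactly the mmap_min_addr value reported for silius in comment 22. The constant names below are made up for the illustration.

```python
# Values quoted earlier in this ticket.
ORIGINAL_SAFETY_AREA = 16 * 1024   # src/c/main.d, ECL_OPT_C_STACK_SAFETY_AREA
PATCHED_SAFETY_AREA = 64 * 1024    # after changing 16 to 64
MMAP_MIN_ADDR_ON_SILIUS = 65536    # cat /proc/sys/vm/mmap_min_addr (comment 22)

def matches_mmap_min_addr(safety_area, min_addr=MMAP_MIN_ADDR_ON_SILIUS):
    """True if the safety area is exactly the kernel's minimum mappable address."""
    return safety_area == min_addr
```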

comment:36 follow-up: Changed 10 years ago by jhpalmieri

I'm not having any luck with this. I took Sage 4.7.2.alpha2, put in the new tachyon spkg from #11706, and modified ecl as described above (just one change in main.d). Neither the included version of maxima nor 5.25 (referenced above) compiles. Did I miss something, or do I just not have the magic touch?

comment:37 in reply to: ↑ 36 ; follow-up: Changed 10 years ago by fbissey

Replying to jhpalmieri:

I'm not having any luck with this. I took Sage 4.7.2.alpha2, put in the new tachyon spkg from #11706, and modified ecl as described above (just one change in main.d). Neither the included version of maxima nor 5.25 (referenced above) compiles. Did I miss something, or do I just not have the magic touch?

Do you get the same error on UNARY or do you get something new?

comment:38 in reply to: ↑ 37 Changed 10 years ago by jhpalmieri

Replying to fbissey:

Replying to jhpalmieri:

I'm not having any luck with this. I took Sage 4.7.2.alpha2, put in the new tachyon spkg from #11706, and modified ecl as described above (just one change in main.d). Neither the included version of maxima nor 5.25 (referenced above) compiles. Did I miss something, or do I just not have the magic touch?

Do you get the same error on UNARY or do you get something new?

It looks like the same error to me: it looks just like the message in the ticket description.

comment:39 Changed 10 years ago by benjaminfjones

I also tried your fix and ran into the same old error at the UNARY part of Maxima's build. Here's exactly what I did.

  1. starting with sage-4.7.2.alpha2 on skynet/silius
  2. unpacked ecl-11.1.1.p1.spkg in a tmp directory
  3. modified the line in src/c/main.d, changing the 16 to a 64
  4. repacked the modified ecl-11.1.1.p1.spkg and put it in SAGE_ROOT/spkg/standard
  5. then ran make from SAGE_ROOT

The ECL build finishes without error, but the Maxima build fails at the same point as before.

So you were saying that you were building a new spkg of maxima (or maybe of ECL). That could be the difference that John and I are seeing.
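
Step 3 above can be scripted so that everyone applies exactly the same edit. This is a sketch only: the pattern is written against the single line quoted in comment 33, and `set_stack_safety_factor` is a hypothetical helper, not something shipped with ECL or Sage.

```python
import re

# The line quoted in comment 33 looks like:
#   16*1024,        /* ECL_OPT_C_STACK_SAFETY_AREA */
SAFETY_AREA_LINE = re.compile(
    r'^(?P<indent>[ \t]*)(?P<factor>\d+)'
    r'(?P<rest>\*1024,\s*/\* ECL_OPT_C_STACK_SAFETY_AREA \*/)',
    re.MULTILINE,
)

def set_stack_safety_factor(source, factor):
    """Return (new_source, count) with the safety-area multiplier replaced."""
    def repl(m):
        return f"{m.group('indent')}{factor}{m.group('rest')}"
    return SAFETY_AREA_LINE.subn(repl, source)
```

Applied to the text of src/c/main.d with `factor=64`, a count of exactly 1 confirms that the expected line was found and changed before the spkg is repacked.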

comment:40 Changed 10 years ago by fbissey

That's weird; I definitely have maxima installed on silius now:

fbissey@silius:~/sage-4.7.2.alpha2> export LD_LIBRARY_PATH=/usr/local/gcc-4.6.1/ppc64-Linux-power7-suse/lib64
fbissey@silius:~/sage-4.7.2.alpha2> ./sage
----------------------------------------------------------------------
| Sage Version 4.7.2.alpha2, Release Date: 2011-08-24                |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
**********************************************************************
*                                                                    *
* Warning: this is a prerelease version, and it may be unstable.     *
*                                                                    *
**********************************************************************
sage: quit
Exiting Sage (CPU time 0m0.07s, Wall time 0m16.60s).
fbissey@silius:~/sage-4.7.2.alpha2> ./sage -t -long  -force_lib devel/sage-main/sage/interfaces/maxima.py
init.sage does not exist ... creating
sage -t -long -force_lib "devel/sage-main/sage/interfaces/maxima.py"
**********************************************************************
File "/home/fbissey/sage-4.7.2.alpha2/devel/sage-main/sage/interfaces/maxima.py", line 124:
    sage: a.expand()
Expected:
    3*2^(7/2)+5*sqrt(2)+41
Got:
    29*sqrt(2)+41
**********************************************************************
1 items had failures:
   1 of  97 in __main__.example_0
***Test Failed*** 1 failures.
For whitespace errors, see the file /home/fbissey/.sage//tmp/.doctest_maxima.py
         [30.9 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t -long -force_lib "devel/sage-main/sage/interfaces/maxima.py"
Total time for all tests: 31.0 seconds

That's an expected failure with maxima-5.25.{0,1}.

I'll try to rebuild from scratch and make an ecl spkg that is not a hack.
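
The doctest mismatch above is a difference in form only, not in value: 2^(7/2) = 8*sqrt(2), so 3*2^(7/2) + 5*sqrt(2) = 24*sqrt(2) + 5*sqrt(2) = 29*sqrt(2). A quick numerical check in plain floating point (the helper name is made up):

```python
import math

expected = 3 * 2 ** (7 / 2) + 5 * math.sqrt(2) + 41   # doctest "Expected"
got = 29 * math.sqrt(2) + 41                           # doctest "Got"

def same_value(a, b):
    """Compare two floats up to rounding error."""
    return math.isclose(a, b, rel_tol=1e-12)
```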

comment:41 Changed 10 years ago by fbissey

I have made my home folder accessible on skynet so anyone can have a look at my install.

comment:42 Changed 10 years ago by fbissey

The plot thickens. I am not sure yet that *my fix* actually does anything. We have a gcc problem... I couldn't reproduce the build for quite a while and even destroyed my working install. Maxima will install if ecl has been compiled with the gcc-4.3.4 that comes with SLES, but not with gcc-4.6.1. I managed to compile it the first time only because I forgot to put gcc-4.6.1 back into my path after my ssh connection broke one time too many.

Compiling ecl-11.1.1.p1 with gcc-4.3.4 to check if the fix does anything at all now...

comment:43 Changed 10 years ago by fbissey

Confirmed. The fix does nothing; the gcc version used is what matters.

comment:44 Changed 10 years ago by kcrisman

Is it ok, though, if Linux PPC 64-bit is a relatively unusual platform and we could tell people to use a certain gcc? On Cygwin this would certainly be ok, since it's likely to only be used as a binary; here, maybe not.

comment:45 Changed 10 years ago by fbissey

Well, the SLES install we have does not include gfortran, so at the moment we have to get it from gcc-4.6.1. I have had some experience building gcc on Power (power5 so far, power7 any day now; OK, I had a G4 too) and it is a bit dodgy at times. I am not sure whether it is dodgier on Linux or AIX. On AIX I haven't had a problem-free g++, for example. I suspect that the gcc on silius could use a bit of massaging; the question is where and how. And finally, it could very well be an obscure gcc bug that only shows up on Power.

comment:46 Changed 10 years ago by fbissey

Note that the SLES gcc has been built for power4 to ensure compatibility with a wide range of hardware:

gcc -v
Using built-in specs.
Target: powerpc64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3 --enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --enable-version-specific-runtime-libs --program-suffix=-4.3 --enable-linux-futex --without-system-libunwind --with-cpu=power4 --enable-secureplt --with-long-double-128 --build=powerpc64-suse-linux
Thread model: posix
gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux)

Not sure about the locally compiled gcc:

/usr/local/gcc-4.6.1/ppc64-Linux-power7-suse/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/gcc-4.6.1/ppc64-Linux-power7-suse/bin/gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc-4.6.1/ppc64-Linux-power7-suse/libexec/gcc/powerpc64-unknown-linux-gnu/4.6.1/lto-wrapper
Target: powerpc64-unknown-linux-gnu
Configured with: /usr/local/gcc-4.6.1/src/gcc-4.6.1/configure --enable-languages=c,c++,fortran --with-gnu-as --with-gnu-as=/usr/local/binutils-2.21/ppc64-Linux-power7-suse-gcc-4.3.4-suse/bin/as --with-gnu-ld --with-ld=/usr/local/binutils-2.21/ppc64-Linux-power7-suse-gcc-4.3.4-suse/bin/ld --with-gmp=/usr/local/mpir-2.4.0/ppc64-Linux-power7-suse-gcc-4.3.4-suse --with-mpfr=/usr/local/mpfr-3.0.1/ppc64-Linux-power7-suse-mpir-2.4.0-gcc-4.3.4-suse --with-mpc=/usr/local/mpc-0.9/ppc64-Linux-power7-suse-mpir-2.4.0-mpfr-3.0.1-gcc-4.3.4-suse --prefix=/usr/local/gcc-4.6.1/ppc64-Linux-power7-suse
Thread model: posix
gcc version 4.6.1 (GCC) 

A number of elements in these two configurations could play a role here.

comment:47 follow-ups: Changed 10 years ago by leif

Funny that you incidentally compiled ECL with some other GCC version...

Perhaps also try some version from the 4.5 series.

Karl-Dieter, which upstream do you have in mind (now)? ;-)


Btw., the "native" SLES GCC is configured with --enable-languages=...,fortran,..., so in principle gfortran should be available, perhaps packaged separately though. (It's only a wrapper anyway, so should be replaceable by some simple shell script.)

comment:48 Changed 10 years ago by benjaminfjones

I added CC=gcc-4.3; export CC to the ECL's spkg-install script and after that change the maxima-5.23.2.p0 does indeed build successfully! I'm doing some tests now, will report back.
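
For anyone wanting to reproduce this, the change could look roughly like the sketch below. It is not the actual patch: the fallback branch for machines without gcc-4.3 is an assumption added for illustration, and the exact place in spkg-install where it belongs is not stated in this comment.

```shell
# Hypothetical excerpt for ECL's spkg-install (placement is an assumption).
# Prefer gcc-4.3 when it is on PATH; otherwise keep whatever CC was set,
# falling back to plain "gcc". The fallback is illustrative only.
if command -v gcc-4.3 >/dev/null 2>&1; then
    CC=gcc-4.3
else
    CC="${CC:-gcc}"
fi
export CC
echo "Building ECL with CC=$CC"
```

Since Maxima is compiled through ECL, only ECL's compiler is pinned here; the rest of Sage would still use the default compiler.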

comment:49 in reply to: ↑ 47 Changed 10 years ago by kcrisman

Karl-Dieter, which upstream do you have in mind (now)? ;-)

I was wondering that myself, but didn't have anything germane to contribute.

comment:50 in reply to: ↑ 47 ; follow-up: Changed 10 years ago by fbissey

Replying to leif:

Funny that you incidentally compiled ECL with some other GCC version...

Yes, it gave me a false sense of victory for a while. Lucky I figured out what the recipe for success was. Imagine: we have a working install and no one knows why or is able to reproduce it!

Perhaps also try some version from the 4.5 series.

Definitely something to try.

Btw., the "native" SLES GCC is configured with --enable-languages=...,fortran,..., so in principle gfortran should be available, perhaps packaged separately though. (It's only a wrapper anyway, so should be replaceable by some simple shell script.)

You would think so. However, I cannot find anything on the sles11 dvds (for either x86_64 or ppc64) that explicitly mentions fortran. It could be part of some other gcc rpm; I need to have a closer look.

comment:51 in reply to: ↑ 50 ; follow-up: Changed 10 years ago by leif

Replying to fbissey:

Replying to leif:

Funny that you incidentally compiled ECL with some other GCC version...

Yes, it gave me a false sense of victory for a while. Lucky I figured out what the recipe for success was. Imagine: we have a working install and no one knows why or is able to reproduce it!

I was just wondering nobody (else) tried, intentionally... B)

Unfortunately I now also have an account there, but will first spend my time on the Sage 4.7.2.alpha3 release.


Btw., the "native" SLES GCC is configured with --enable-languages=...,fortran,..., so in principle gfortran should be available, perhaps packaged separately though. (It's only a wrapper anyway, so should be replaceable by some simple shell script.)

You would think so. However, I cannot find anything on the sles11 dvds (for either x86_64 or ppc64) that explicitly mentions fortran. It could be part of some other gcc rpm; I need to have a closer look.

There are at least PPC ones for openSUSE (GCC 4.3), e.g. here.

Other versions (>4.3.x) should IMHO also work, or grab the sources and build it yourself... ;-)

comment:52 in reply to: ↑ 51 ; follow-up: Changed 10 years ago by fbissey

Replying to leif:

There are at least PPC ones for openSUSE (GCC 4.3), e.g. here.

It is definitely not on the dvd for sles 11. I will try my own 4.6.1.

comment:53 in reply to: ↑ 52 Changed 10 years ago by leif

Replying to fbissey:

It is definitely not on the dvd for sles 11. I will try my own 4.6.1.

I can only tell that 4.5.x's works with both GCC 4.3.x and 4.4.x.

comment:54 follow-up: Changed 10 years ago by leif

Did you play with optimization flags when building the ECL spkg?

By default it uses -O2; if you set SAGE_DEBUG=yes, it will be built with -O0.

spkg-install looks quite ugly [again?] btw., the following for example is definitely wrong:

CPPFLAGS="$CPPFLAGS -I$SAGE_LOCAL/include"
LDFLAGS="$LDFLAGS -L$SAGE_LOCAL/lib"

(The order has to be changed in both cases.)
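
A minimal sketch of the reordered assignments, assuming the point is that Sage's own directories should be searched before anything inherited from the environment. The default value given to SAGE_LOCAL is a placeholder so the snippet runs outside a Sage build; it is not a real path.

```shell
# Placeholder so the snippet is runnable outside a Sage build environment.
SAGE_LOCAL="${SAGE_LOCAL:-/path/to/sage/local}"

# Prepend Sage's directories so they take precedence over any -I/-L
# options already present in the inherited flags.
CPPFLAGS="-I$SAGE_LOCAL/include $CPPFLAGS"
LDFLAGS="-L$SAGE_LOCAL/lib $LDFLAGS"
export CPPFLAGS LDFLAGS
```

With the original order, a user-supplied -I or -L pointing at a system copy of a library would be searched first and could shadow the version installed in SAGE_LOCAL.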

comment:55 in reply to: ↑ 54 Changed 10 years ago by fbissey

Replying to leif:

Did you play with optimization flags when building the ECL spkg?

By default it uses -O2; if you set SAGE_DEBUG=yes, it will be built with -O0.

spkg-install looks quite ugly [again?] btw., the following for example is definitely wrong:

CPPFLAGS="$CPPFLAGS -I$SAGE_LOCAL/include"
LDFLAGS="$LDFLAGS -L$SAGE_LOCAL/lib"

(The order has to be changed in both cases.)

Haven't, but that's a good idea. I may have OK-ed those changes quickly for .p1 when a problem appeared with altivec and I couldn't make a new spkg myself due to time constraints.

comment:56 Changed 10 years ago by benjaminfjones

Playing around with the SAGE_DEBUG=yes flag, I've found that ECL doesn't even build on skynet/silius using the default system GCC 4.6.1. The ECL build fails with:

;;; Compiling (DEFUN PPRINT-RAW-ARRAY ...).
;;; Compiling (DEFUN PPRINT-LAMBDA-LIST ...).

;;;;;; Stack overflow.
;;; Jumping to the outermost toplevel prompt
;;;

Without SAGE_DEBUG set, it builds fine (but then Maxima fails, of course).

comment:57 Changed 10 years ago by benjaminfjones

Even though Maxima builds when we force ECL to compile under GCC-4.3 on silius, the resulting Sage installation has *lots* of doctest errors. See the ptestlong.log file.

At a glance, there are a couple different errors happening:

  1. OverflowError: value too large to convert to int (may have to do with the architecture?)
  2. A couple of NameErrors in optimize.py and pushout.py
File "/tmp/bjones/sage-4.7.2.alpha2/devel/sage-main/sage/numerical/optimize.py", line 571:
    sage: fit[a], fit[b], fit[c]
Exception raised:
    Traceback (most recent call last):
      File "/tmp/bjones/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/tmp/bjones/sage-4.7.2.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/tmp/bjones/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_7[8]>", line 1, in <module>
        fit[a], fit[b], fit[c]###line 571:
    sage: fit[a], fit[b], fit[c]
    NameError: name 'fit' is not defined
  3. mpfr errors: RuntimeError: Aborted
  4. many ValueError: Refining interval that does not bound unique root! type errors from the elliptic curves modules

etc..

comment:58 Changed 10 years ago by fbissey

I think you should put this in #11705 as at first glance errors are not just from maxima. I am going into that log with a fine comb.

comment:59 Changed 10 years ago by jhpalmieri

Now I'm confused: I just successfully built Sage 4.7.2.alpha4 on silius, with gcc 4.6.2 (the default on that machine). I had to replace the mpir package with the one from #11964, but that was the only change. Can anyone else verify this? I see that mpir is a dependency of ecl; could the new mpir spkg have fixed the problem with maxima? If so, we can close this ticket.

With the build, I do get lots of test failures, as above. I'll post the log at #11705.

comment:60 Changed 10 years ago by fbissey

It could. I have to take some time to do a build, but the log from Benjamin indicated there was potentially a lot of trouble with mpir. So it could indeed have been cured by the upgrade, although it is difficult to guess the chain of events leading to that. Unless there is a competition between the mpir/gmp used by gcc and the one installed in sage.

comment:61 Changed 10 years ago by leif

Replying to jhpalmieri:

Now I'm confused: I just successfully built Sage 4.7.2.alpha4 on silius, with gcc 4.6.2 (the default on that machine). I had to replace the mpir package with the one from #11964, but that was the only change. Can anyone else verify this?

I haven't tried GCC 4.6.2 yet, although I downloaded it the day it had been released ;-) (but hadn't had the time to build it on Silius)

Perhaps they've fixed some things, although my impression was that something with ECL is wrong.

Note that I've previously built ECL and Maxima "successfully" (modulo lots of doctest errors) with -mcpu=power4 -mtune=power4 (Maxima uses the CFLAGS from ECL anyway), and Sage with GCC 4.6.1 and GCC 4.4.6 (the other parts with -mcpu=power7 -mtune=power7).


I see that mpir is a dependency of ecl; could the new mpir spkg have fixed the problem with maxima? If so, we can close this ticket.

No, it just triggered the rebuild of ECL with GCC 4.6.2.

(I did build Sage 4.7.2.alpha3 which has [almost] the same MPIR spkg, i.e., the p4; the p7 just fixes a configure regression introduced by the p5/p6.)

I wouldn't close this ticket unless you know that none of the doctest errors is related to / caused by Maxima.

comment:62 follow-up: Changed 10 years ago by jhpalmieri

Just to clarify: I logged into silius, sourced /usr/local/skynet_bash_profile, set MAKE='make -j12' and SAGE_PARALLEL_SPKG_BUILD=yes, and typed make. I didn't modify any other settings. gcc-4.6.2 is the default compiler there now.
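
Spelled out as a shell session, the steps above amount to roughly the following sketch. The guard around the profile is an addition so the snippet does not fail on machines other than skynet hosts, and the final make is left as a comment because it starts the full Sage build.

```shell
# Run from the top of the Sage source tree on silius.
# The profile path is the one quoted above; it only exists on skynet hosts.
if [ -r /usr/local/skynet_bash_profile ]; then
    . /usr/local/skynet_bash_profile
fi

# Parallel build settings as described; nothing else was changed.
export MAKE='make -j12'
export SAGE_PARALLEL_SPKG_BUILD=yes

# Then start the build:
#   make
```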

comment:63 in reply to: ↑ 62 Changed 10 years ago by leif

Replying to jhpalmieri:

Just to clarify: I logged into silius, sourced /usr/local/skynet_bash_profile, set MAKE='make -j12' and SAGE_PARALLEL_SPKG_BUILD=yes, and typed make. I didn't modify any other settings. gcc-4.6.2 is the default compiler there now.

Which means that upgrading from 4.6.1 to 4.6.2 seems worthwhile, although it's not immediately clear (or I don't know yet) what they changed. It is configured slightly differently, but without --with-cpu=... etc.; it might just default to some other CPU now.

Can anybody look at the GCC diffs? I didn't have and don't have the time right now...

comment:64 follow-up: Changed 9 years ago by jdemeyer

  • Milestone changed from sage-4.8 to sage-duplicate/invalid/wontfix
  • Status changed from new to needs_review

Proposing to close this, since building sage-4.8.alpha6 on silius works. There are many test failures (probably unrelated to ecl) but at least it builds.

comment:65 in reply to: ↑ 64 Changed 9 years ago by leif

Replying to jdemeyer:

Proposing to close this, since building sage-4.8.alpha6 on silius works. There are many test failures (probably unrelated to ecl) but at least it builds.

Unless a couple of doctest failures are caused by ECL and/or Maxima...

Maybe François can tell better.

I don't know right now what Maxima/ECL/GCC version you are referring to, but at least [with] GCC 4.6.x [ECL] seemed to produce buggy or invalid code for POWER7; -mcpu=power4 in contrast "worked", at least better, i.e. did build and produced fewer test failures. [Note that Maxima uses those CFLAGS specified when configuring ECL.]
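
As a rough illustration of the power4 setting, the flags could be supplied through the environment before the ECL spkg is built. This is a sketch under the assumption that ECL's spkg-install keeps the CFLAGS it inherits, which is not confirmed in this ticket; how the flags were actually passed is not stated here.

```shell
# Assumption: CFLAGS from the environment reach ECL's configure step.
# Target POWER4 code generation instead of the compiler's default CPU.
CFLAGS="-mcpu=power4 -mtune=power4 ${CFLAGS:-}"
export CFLAGS
echo "CFLAGS for ECL (and hence Maxima): $CFLAGS"
```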

comment:66 Changed 9 years ago by jdemeyer

  • Resolution set to worksforme
  • Reviewers set to Jeroen Demeyer
  • Status changed from needs_review to closed

I'm closing this since maxima does build on silius. If there are doctest failures related to the building of Maxima, that's for a different ticket.
