Opened 11 years ago
Closed 8 years ago
#10051 closed defect (worksforme)
Building ATLAS fails at STAGE 2-1-2: CacheEdge DETECTION
Reported by: | tux21b | Owned by: | GeorgSWeber |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | build | Keywords: | |
Cc: | tux21b | Merged in: | |
Authors: | | Reviewers: | Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | | Commit: | |
Dependencies: | | Stopgaps: | |
Description
I think this is the same bug as the one mentioned here (the link contains a workaround, but no solution yet).
Distribution: Fedora 13 (i686)
System: EeePC 1000H / Intel Atom N270
Compiler: gcc-4.4.4-10.fc13
RAM: 1GB (completely used during the compilation of ATLAS)
Swap: 2GB (nearly unused)
- http://www.tux21b.org/public/eeepc-sage-install.log.bz2
- http://www.tux21b.org/public/eeepc-sage-atlas-3.8.3.p14.log.bz2
- http://www.tux21b.org/public/eeepc-sage-cpuinfo.txt
Grep for "STAGE 2-1-2: CacheEdge DETECTION", "make[3]: *** [build] Error 255" or "Too many failures to build ATLAS. Giving up!"
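The search described above can be sketched in Python for anyone who prefers a script to grep. This is a minimal sketch, not part of Sage or ATLAS: the function names are made up for illustration, and `scan_log` assumes the log has already been downloaded to a local path.

```python
import bz2

# The three markers quoted in the report.
MARKERS = (
    "STAGE 2-1-2: CacheEdge DETECTION",
    "make[3]: *** [build] Error 255",
    "Too many failures to build ATLAS. Giving up!",
)


def find_markers(lines, markers=MARKERS):
    """Return (line_number, marker) pairs for every line containing a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for marker in markers:
            if marker in line:
                hits.append((number, marker))
    return hits


def scan_log(path):
    """Scan a plain or bz2-compressed log file (hypothetical local path)."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt", errors="replace") as handle:
        return find_markers(handle)
```

Calling `scan_log` on a local copy of one of the logs returns the line numbers where the build looped or gave up.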
Note: After the "too many failures..." message, I set SAGE_ATLAS_LIB manually, so the rest of install.log isn't related to that build anymore (and it is incomplete, because the current build is still running). I'm sorry for that.
Please do not hesitate to ask for more information. If you want, I can also test anything you suggest, since this bug only seems to show up on certain systems.
Regards,
Christoph
Change History (8)
comment:1 Changed 11 years ago by
comment:2 Changed 11 years ago by
- Description modified (diff)
comment:3 follow-up: ↓ 4 Changed 10 years ago by
- Resolution set to worksforme
- Status changed from new to closed
I recently decided to update to Sage 4.6, so I compiled everything again (I was using 4.5.3 before).
My workaround of manually setting the SAGE_ATLAS_LIB env variable to skip the compilation of ATLAS wasn't working anymore (I think Sage was expecting static libraries, but my distribution only provided shared ones). So I decided to give ATLAS a try again - and tada: everything compiled perfectly!
I am still using the same system, the same configuration and I've tried to compile ATLAS in Sage 4.5.3 several times before (including compiling on a freshly rebooted system), so I am quite sure that something was changed in Sage 4.6 which solves my problem.
Many thanks for that!
Regards,
Christoph
comment:4 in reply to: ↑ 3 Changed 10 years ago by
- Resolution worksforme deleted
- Status changed from closed to new
Replying to tux21b:
so I am quite sure that something was changed in Sage 4.6 which solves my problem.
Many thanks for that!
Regards,
Christoph
The changelog in SPKG.txt for ATLAS tells you what has changed.
== ChangeLog ==

== atlas-3.8.3.p16 (John Palmieri, September 19th 2010) ==
 * Make spkg-check work when using SAGE_ATLAS_LIB: if SAGE_ATLAS_LIB is set, skip the self-tests.

== atlas-3.8.3.p15 (David Kirkby, September 6th 2010) ==
 * Make SAGE_ATLAS_LIB use static libraries on all platforms, as building two shared libraries often fails on Linux, and messes things up on Solaris. The static library is less hassle all around. Worth noting is that the ATLAS package only builds the static library and Wolfram Research only ship the static library with Mathematica, despite they usually use shared libraries. To ensure full compatibility with a fresh build of ATLAS, the symbolic links are created for the shared libraries too. The links will fail to be created if the shared libraries do not exist, but will not cause any extra problems.
 * Update the list of dependencies to include Python and Lapack (see spkg/deps)
 * Note that the ATLAS build process could be made much quicker if its depenancy on Python was removed. Since the amount of Python code is very small compared to the bash code, this seems logical to do at a later date. The Fortran package would need the same change - but again the amount of Python in that is trivial.
 * Add a note that make-correct-shared.sh is badly named, as it often fails.
 * Remove the OS X specific code from make-correct-shared.sh, as ATLAS is never installed on OS X - see the spkg-install-script.

== atlas-3.8.3.p14 (David Kirkby, August 10th 2010) ==
atlas-3.8.3.p15 should link both the shared and static libraries. If it does not, then we have a bug.
If you do not use SAGE_ATLAS_LIB, I can't think of anything that has changed that would have affected your CacheEdge DETECTION problem. I think it's more luck than anything else.
This probably depends on system load. The actual building of ATLAS has remained unchanged for ages - only the libraries have been altered, but the CacheEdge DETECTION fails well before any library issues are touched.
Only the release manager should generally close tickets, though those with admin privileges do so occasionally when it is very clear an issue needs closing. In this case, you should not have closed it.
This issue does not tend to be totally reproducible, and I think there is a real problem here.
Dave
comment:5 Changed 10 years ago by
Ok, sorry for closing the ticket. I won't do that again :)
The thing is, I tried to build Sage 4.5.3 a) normally (that is, running in X with a lot of other applications) and b) twice on a freshly booted system, without X, with a minimal number of system processes, and with a reduced niceness (I think I read somewhere that all this might help). Neither attempt succeeded, and then I used the system libraries by setting SAGE_ATLAS_LIB.
And now I've compiled Sage 4.6 on my system while I was Skyping and watching a movie (which is quite a load for my poor little Eee). I haven't seen any loops in the log, and it worked on the first try. (I can provide the log if you are interested.)
Maybe it was just luck, or it might depend on compiling it as a static lib - does it? I don't know. Hopefully you will find out soon :)
comment:6 Changed 10 years ago by
The unmodified ATLAS source code only builds static libraries. We later produce shared ones from the static libraries. But your failure was before the static libraries were built.
I thought one would get fewer problems like this if the system is lightly loaded, but I have seen comments suggesting that the problem goes away when the machine is more heavily loaded.
I've seen this problem every time I try to create a virtual machine with VirtualBox on my system (quad core 3.33 GHz). When the machine is just running the host operating system (OpenSolaris), it's OK, but a Linux guest always fails to build.
Dave
comment:7 Changed 10 years ago by
Hi
I get this on Ubuntu 11.04 (32-bit OS install) on a Compaq laptop with an Intel P6200 2.13 GHz CPU (64-bit capable) and 2G RAM (about 1.9G visible). I tried initially with throttling, then without, and now with a 512M swap file added.
Note it once made it past 2-1-2 to 2-3-2.
jan@mamana-PC:~$ grep CacheEdge /usr/local/src/sage-4.7.1/spkg/logs/atlas-3.8.3.p16.log
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
TA TB M N K alpha beta CacheEdge TIME MFLOPS
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-2-2: CacheEdge DETECTION
STAGE 2-3-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
TA TB M N K alpha beta CacheEdge TIME MFLOPS
jan@mamana-PC:~$
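To see at a glance how often the build restarted each stage, the grep output can be tallied. The following is a minimal Python sketch; the function name is hypothetical, and the pattern tolerates a stray "?" after "CacheEdge" in case the log was copied through a wiki renderer.

```python
import re
from collections import Counter

# Matches e.g. "STAGE 2-1-2: CacheEdge DETECTION"; the optional "?" covers
# copies of the log where a wiki renderer appended one to "CacheEdge".
STAGE_RE = re.compile(r"STAGE (\d+-\d+-\d+): CacheEdge\??\s+DETECTION")


def count_stage_attempts(lines):
    """Count how many times each CacheEdge detection stage appears."""
    counts = Counter()
    for line in lines:
        for stage in STAGE_RE.findall(line):
            counts[stage] += 1
    return counts
```

A stage whose count is much higher than the others (here 2-1-2) is the one the build keeps retrying.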
atlas-3.8.4.spkg inserted into sage-4.7.1 failed pretty quickly:
gcc -DL2SIZE=4194304 -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/include -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/../srcinclude -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/../srcinclude/contrib -DUNKNOWN -DUNKNOWN -DStringUNKNOWN -DATL_OS_Linux -DATL_ARCH_Corei2 -DATL_CPUMHZ=2133 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=sse -mavx -O2 -fno-schedule-insns2 -fPIC -m32 -o xL1 L1CacheSize.o time.o
/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/bin/ATLrun.sh /usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/tune/sysinfo xL1 64
Illegal instruction
make[7]: *** [RunL1] Error 132
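Exit status 132 is 128 + 4, which on Linux corresponds to SIGILL and matches the "Illegal instruction" message. The compile line includes -mavx, so one possible reading (only an assumption, not confirmed anywhere in this ticket) is that the detected architecture enabled AVX code on a machine that cannot run it. The Python sketch below compares compiler options against the flags in /proc/cpuinfo; the helper names and the small option-to-flag table are illustrative and are not part of Sage or ATLAS.

```python
# Illustrative mapping from gcc -m options to /proc/cpuinfo flag names.
# Linux reports SSE3 as "pni"; the table is deliberately small.
OPTION_TO_FLAG = {
    "-msse2": "sse2",
    "-msse3": "pni",
    "-mavx": "avx",
}


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def unsupported_isa_options(compiler_args, flags):
    """List the -m options in compiler_args whose CPU flag is missing."""
    return [
        arg
        for arg in compiler_args
        if arg in OPTION_TO_FLAG and OPTION_TO_FLAG[arg] not in flags
    ]
```

On the affected machine, one could pass the contents of /proc/cpuinfo and the gcc arguments from the failing command; a non-empty result would support the assumption above.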
It looks like the prebuilt sage-4.7-linux-32bit-ubuntu_10.04_lts-i686-Linux is running OK.
I will have periodic access to this laptop to run more tests if requested.
comment:8 Changed 8 years ago by
- Milestone set to sage-duplicate/invalid/wontfix
- Resolution set to worksforme
- Reviewers set to Jeroen Demeyer
- Status changed from new to closed
Assuming this is fixed, for example by #10508.
A failure of ATLAS to build has been seen by a number of people, on different systems. We need to keep an eye on this.
Dave