Opened 11 years ago

Closed 8 years ago

#10051 closed defect (worksforme)

Building ATLAS fails at STAGE 2-1-2: CacheEdge DETECTION

Reported by: tux21b
Owned by: GeorgSWeber
Priority: major
Milestone: sage-duplicate/invalid/wontfix
Component: build
Keywords:
Cc: tux21b
Merged in:
Authors:
Reviewers: Jeroen Demeyer
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by tux21b)

I think this is the same bug as the one mentioned here (the link contains a workaround, but no solution yet).

Distribution: Fedora 13 (i686)
System: EeePC 1000H / Intel Atom N270
Compiler: gcc-4.4.4-10.fc13
RAM: 1GB (completely used during the compilation of ATLAS)
Swap: 2GB (nearly unused)

Grep for "STAGE 2-1-2: CacheEdge DETECTION", "make[3]: *** [build] Error 255" or "Too many failures to build ATLAS. Giving up!"
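A minimal sketch of that search, assuming the build log is a plain-text file such as install.log. The function name `find_atlas_failures` and the default path are illustrative and do not come from the ticket:

```shell
#!/bin/sh
# Print every line (prefixed with its line number) of a build log that
# contains one of the three failure markers quoted above.
# Assumption: the log path defaults to ./install.log if none is given.
find_atlas_failures() {
    log="${1:-install.log}"
    # -F treats each pattern as a fixed string, so '[', ']' and '*' are literal.
    grep -n -F \
        -e 'STAGE 2-1-2: CacheEdge DETECTION' \
        -e 'make[3]: *** [build] Error 255' \
        -e 'Too many failures to build ATLAS. Giving up!' \
        "$log"
}
```

Called as `find_atlas_failures /path/to/install.log`, it returns a non-zero status when none of the markers are present.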

Note: After the "Too many failures..." message, I set SAGE_ATLAS_LIB manually, so the rest of install.log is no longer related to that build (and it is incomplete, because the current build is still running). Sorry about that.

Please do not hesitate to ask for more information. If you want, I can also test anything you suggest, since this bug only seems to show up on certain systems.

Regards,
Christoph

Change History (8)

comment:1 Changed 11 years ago by drkirkby

A failure of ATLAS to build has been seen by a number of people, on different systems. We need to keep an eye on this.

Dave

comment:2 Changed 11 years ago by tux21b

  • Description modified (diff)

comment:3 follow-up: Changed 10 years ago by tux21b

  • Resolution set to worksforme
  • Status changed from new to closed

I recently decided to update to Sage 4.6, so I compiled everything again (I was using 4.5.3 before).

My workaround of manually setting the SAGE_ATLAS_LIB environment variable to skip the compilation of ATLAS no longer worked (I think Sage was expecting static libraries, but my distribution only provided shared ones). So I decided to give ATLAS another try, and tada: everything compiled perfectly!
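A rough way to check the static-versus-shared suspicion is sketched below. The library names (libatlas, libcblas, libf77blas, liblapack) and the flat directory layout are assumptions for illustration; the ticket does not say which files Sage looks for under SAGE_ATLAS_LIB:

```shell
#!/bin/sh
# For each assumed library name, report whether a static archive (.a) or
# only a shared object (.so) exists in the given directory.
# Assumption: the names below and the directory layout are hypothetical.
check_atlas_libs() {
    dir="$1"
    for name in libatlas libcblas libf77blas liblapack; do
        if [ -f "$dir/$name.a" ]; then
            echo "$name: static"
        elif [ -f "$dir/$name.so" ]; then
            echo "$name: shared only"
        else
            echo "$name: missing"
        fi
    done
}
```

For example, `check_atlas_libs /usr/lib` prints one line per library, which makes it easy to see whether a distribution ships only shared objects.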

I am still using the same system and the same configuration, and I had tried to compile ATLAS in Sage 4.5.3 several times before (including on a freshly rebooted system), so I am quite sure that something was changed in Sage 4.6 which solves my problem.

Many thanks for that!

Regards,
Christoph

comment:4 in reply to: ↑ 3 Changed 10 years ago by drkirkby

  • Resolution worksforme deleted
  • Status changed from closed to new

Replying to tux21b:

so I am quite sure that something was changed in Sage 4.6 which solves my problem.

Many thanks for that!

Regards,
Christoph

The changelog in SPKG.txt for ATLAS tells you what has changed.

== ChangeLog ==

== atlas-3.8.3.p16 (John Palmieri, September 19th 2010) ==
 * Make spkg-check work when using SAGE_ATLAS_LIB: if SAGE_ATLAS_LIB
   is set, skip the self-tests.

== atlas-3.8.3.p15 (David Kirkby, September 6th 2010) ==
 * Make SAGE_ATLAS_LIB use static libraries on all platforms, 
   as building two shared libraries often fails on Linux, and 
   messes things up on Solaris. The static library is less hassle
   all around. Worth noting is that the ATLAS package only builds
   the static library and Wolfram Research only ship the static 
   library with Mathematica, despite they usually use shared
   libraries. To ensure full compatibility with a fresh build
   of ATLAS, the symbolic links are created for the shared libraries too.
   The links will fail to be created if the shared libraries do not exist, 
   but will not cause any extra problems. 
 * Update the list of dependencies to include Python and Lapack (see
   spkg/deps)
 * Note that the ATLAS build process could be made much quicker if its 
   depenancy on Python was removed. Since the amount of Python code is 
   very small compared to the bash code, this seems logical to do at 
   a later date. The Fortran package would need the same change - but again
   the amount of Python in that is trivial. 
 * Add a note that make-correct-shared.sh is badly named, as it often fails. 
 * Remove the OS X specific code from make-correct-shared.sh, as ATLAS is 
   never installed on OS X - see the spkg-install-script.

== atlas-3.8.3.p14 (David Kirkby, August 10th 2010) ==

atlas-3.8.3.p15 should link both the shared and static libraries. If it does not, then we have a bug.

If you do not use SAGE_ATLAS_LIB, I can't think of anything that should have changed that would have affected your CacheEdge DETECTION problem. I think it's just more luck than anything else.

This probably depends on system load. The actual building of ATLAS has remained unchanged for ages. Only the libraries have been altered, but the CacheEdge DETECTION fails well before any library issues are touched.

Only the release manager should generally close tickets, though those with admin privileges do so occasionally when it is very clear an issue needs closing. In this case, you should not have closed it.

This issue does not tend to be totally reproducible, and I think there is a real problem here.

Dave

comment:5 Changed 10 years ago by tux21b

Ok, sorry for closing the ticket. I won't do that again :)

The thing is, I tried to build Sage 4.5.3 a) normally (which means running in X, with a lot of other applications) and b) twice on a freshly booted system, without X, with a minimal number of system processes and with a reduced niceness. (I think I've read somewhere that all this might help.) Anyway, without success (and then I used the system libraries by setting SAGE_ATLAS_LIB).

And now I've compiled Sage 4.6 on my system while I was Skyping and watching a movie (which is quite a load for my poor little Eee). I didn't see any loops in the log, and it worked at the first attempt. (I can provide the log if you are interested.)

Maybe it was just luck, or it might depend on compiling it as a static lib - does it? I don't know. Hopefully you will find out soon :)

comment:6 Changed 10 years ago by drkirkby

The unmodified ATLAS source code only builds static libraries. We later produce shared ones from the static libraries. But your failure was before the static libraries were built.

I thought one would get fewer problems like this if the system is lightly loaded, but I have seen comments suggesting the problem goes away when the machine is more heavily loaded.

I've seen this problem every time I try to create a virtual machine with VirtualBox on my system (quad core 3.33 GHz). But when the machine is just running the host operating system (OpenSolaris) it's ok. But a Linux guest always fails to build.

Dave

comment:7 Changed 10 years ago by pipedream

Hi

I get this on Ubuntu 11.04 (32-bit OS install) on a Compaq laptop with an Intel P6200 2.13 GHz CPU (64-bit capable) and 2G RAM (about 1.9G visible), initially with throttling, then without, now with a 512M swap file added.

Note that it once made it past 2-1-2 to 2-3-2:

jan@mamana-PC:~$ grep CacheEdge /usr/local/src/sage-4.7.1/spkg/logs/atlas-3.8.3.p16.log
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
TA TB M N K alpha beta CacheEdge TIME MFLOPS
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-2-2: CacheEdge DETECTION
STAGE 2-3-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
STAGE 2-1-2: CacheEdge DETECTION
TA TB M N K alpha beta CacheEdge TIME MFLOPS
jan@mamana-PC:~$
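To see at a glance how often each stage was retried, the grep above can be extended into a per-stage tally. This is only a sketch: the function name `count_cacheedge_stages` is made up here, and it assumes the log spells the marker exactly as "STAGE x-y-z: CacheEdge DETECTION":

```shell
#!/bin/sh
# Count how many times each "STAGE x-y-z: CacheEdge DETECTION" marker
# appears in an ATLAS build log. Repeated counts for the same stage
# suggest that the build restarted that stage.
count_cacheedge_stages() {
    # -o prints only the matched text, so identical markers can be grouped.
    grep -o 'STAGE [0-9]-[0-9]-[0-9]: CacheEdge DETECTION' "$1" | sort | uniq -c
}
```

Usage: `count_cacheedge_stages /usr/local/src/sage-4.7.1/spkg/logs/atlas-3.8.3.p16.log`.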

atlas-3.8.4.spkg inserted into sage-4.7.1 failed pretty quickly:

gcc -DL2SIZE=4194304 -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/include -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/../srcinclude -I/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/../srcinclude/contrib -DUNKNOWN -DUNKNOWN -DStringUNKNOWN -DATL_OS_Linux -DATL_ARCH_Corei2 -DATL_CPUMHZ=2133 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=sse -mavx -O2 -fno-schedule-insns2 -fPIC -m32 -o xL1 L1CacheSize.o time.o
/usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/bin/ATLrun.sh /usr/local/src/sage-4.7.1/spkg/build/atlas-3.8.4/ATLAS-build/tune/sysinfo xL1 64
Illegal instruction
make[7]: *** [RunL1] Error 132
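The compile line contains -mavx and the test program died with "Illegal instruction"; exit status 132 is 128 + 4, which is how a shell reports termination by SIGILL. One possible reading, which nothing in this ticket confirms, is that the build enabled an instruction set the CPU does not support. A quick check on Linux could look like the sketch below; the function name `cpu_has_flag` is hypothetical, and it assumes the usual /proc/cpuinfo layout with a "flags" line:

```shell
#!/bin/sh
# Succeed (exit status 0) if the given CPU feature flag is listed on the
# first "flags" line of a cpuinfo-style file, and fail otherwise.
# Assumption: the file follows the Linux /proc/cpuinfo format.
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # Split the flags line into one word per line, then match the whole word
    # so that e.g. "sse" does not match "sse2".
    grep -m1 '^flags' "$cpuinfo" | tr ' \t' '\n\n' | grep -qx "$flag"
}
```

For example, `cpu_has_flag avx && echo "AVX available" || echo "no AVX"` reports whether the running CPU advertises AVX.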

It looks like the prebuilt sage-4.7-linux-32bit-ubuntu_10.04_lts-i686-Linux is running OK.

I will have periodic access to this laptop to run more tests if requested.

comment:8 Changed 8 years ago by jdemeyer

  • Milestone set to sage-duplicate/invalid/wontfix
  • Resolution set to worksforme
  • Reviewers set to Jeroen Demeyer
  • Status changed from new to closed

Assuming this is fixed, for example by #10508.
