Opened 12 years ago
Last modified 9 years ago
#10508 closed enhancement
Update ATLAS to stable version 3.10 — at Version 127
Reported by: | vbraun | Owned by: | tbd |
---|---|---|---|
Priority: | major | Milestone: | sage-5.10 |
Component: | packages: standard | Keywords: | ATLAS spkg |
Cc: | dimpase, fbissey, leif, kcrisman | Merged in: | |
Authors: | Volker Braun | Reviewers: | Benjamin Jones |
Report Upstream: | Reported upstream. No feedback yet. | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #13160 | Stopgaps: |
Description (last modified by )
The new atlas release now builds netlib lapack itself, so the lapack tarball is now included in the ATLAS spkg.
- Updated to newest upstream source, various patches are no longer required
- SAGE_ATLAS_LIB=path now searches in path/libatlas.so instead of path/lib/libatlas.so so it works for people with atlas in /lib64, too.
- Threading is now enabled by default
- Flush before os.system (#13210)
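The changed SAGE_ATLAS_LIB lookup can be sketched as follows. This is only an illustration, not the actual spkg-install code: the function name `find_atlas_library` is hypothetical, and the sketch assumes the variable points at the directory that directly contains `libatlas.so`.

```python
import os


def find_atlas_library(atlas_lib_dir, name="libatlas.so"):
    """Return the path to the ATLAS shared library, or None if it is absent.

    Hypothetical helper: looks at <dir>/libatlas.so directly (the new
    behaviour described above), not at <dir>/lib/libatlas.so.
    """
    candidate = os.path.join(atlas_lib_dir, name)
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # SAGE_ATLAS_LIB is the variable from the description; /usr/lib64 is
    # only an example fallback for this sketch.
    directory = os.environ.get("SAGE_ATLAS_LIB", "/usr/lib64")
    print(find_atlas_library(directory))
```

With this layout a system that keeps ATLAS in /lib64 or /usr/lib64 can be pointed at directly, without needing a lib/ subdirectory.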
Upstream has made some attempt at changing the layout of the shared libraries, which is now different from the static libraries. The atlas spkg contains a stub autoconf/libtools project that unpacks the static libraries and repacks them into equivalent shared libraries.
By default, ATLAS will now try twice to get timings and fail immediately if throttling is enabled. If auto-tuning fails, build with SAGE_ATLAS_ARCH=fast, and if that fails, with SAGE_ATLAS_ARCH=base. On x86, the fast and base targets are the new ATLAS generic targets x86SSE3 and x86SSE2/x86x87.
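The fallback order described here (auto-tuning, then fast, then base) can be written down as a small loop. This is a sketch under assumptions: the ticket leaves the retry to the user, who sets SAGE_ATLAS_ARCH by hand, and both `build_with_fallback` and the `try_build` callable are hypothetical names that are not part of the spkg.

```python
def build_with_fallback(try_build, archs=(None, "fast", "base")):
    """Try each SAGE_ATLAS_ARCH setting in turn until one build succeeds.

    `try_build` is a hypothetical callable that takes the arch value
    (None meaning "auto-tune") and returns a process-style return code.
    """
    for arch in archs:
        if try_build(arch) == 0:
            return arch
    raise RuntimeError("ATLAS build failed for all of: %r" % (archs,))


if __name__ == "__main__":
    # Stand-in build function: pretend only the "base" target works.
    print(build_with_fallback(lambda arch: 0 if arch == "base" else 1))
```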
The current spkg version is at
http://www.stp.dias.ie/~vbraun/Sage/spkg/atlas-3.10.0.spkg
Apply trac_10508_root_repo.patch to the SAGE_ROOT repository and 10508_doctest.patch, trac_10508_update_atlas_docs.patch to the Sage repository.
Remove the lapack and blas packages.
Change History (128)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
- Cc dimpase added
comment:3 Changed 12 years ago by
- Status changed from new to needs_info
Does it mean that LAPACK spkg can be removed, too?
comment:4 Changed 12 years ago by
Yes, the lapack spkg can be removed.
I'm still trying to debug some issues with linbox...
comment:5 Changed 12 years ago by
- Cc fbissey added
BLAS can also be removed if we go with this. f2c, which was used to provide cblas by f2c-ing the BLAS package before we got ATLAS, should also be removed (I think it is listed only in scipy's dependencies).
comment:6 follow-up: ↓ 7 Changed 12 years ago by
on 32-bit x86 Linux (Debian squeeze) I get the following, when trying to install the spkg (applied the patches to a pristine Sage 4.6.1.alpha3):
...
make -j1 libatlas.so libptf77blas.so libf77blas.so \
    libptcblas.so libcblas.so liblapack.so
make[3]: Entering directory `/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/lib'
ld -melf_i386 -shared -soname /usr/local/src/sage/sage-4.6.1.alpha3/local/lib/libatlas.so -o libatlas.so \
    -rpath-link /usr/local/src/sage/sage-4.6.1.alpha3/local/lib \
    --whole-archive libatlas.a --no-whole-archive -lc -lm
make[3]: *** No rule to make target `libptf77blas.a', needed by `libptf77blas.so'. Stop.
make[3]: Leaving directory `/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/lib'
make[2]: *** [ptshared] Error 2
make[2]: Leaving directory `/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/lib'
Configuration:
SAGE_LOCAL: /usr/local/src/sage/sage-4.6.1.alpha3/local
linker_Solaris?: False
PPC?: False
SPKG_DIR: /usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32
linker_GNU?: True
ld: GNU
system: Linux
Darwin?: False
machine: i686
fortran: gfortran
Solaris?: False
fortran_g95?: False
bits: 32bit
CYGWIN?: False
SPARC?: False
fortran_GNU?: True
FreeBSD?: False
32bit?: True
Linux?: True
64bit?: False
Intel?: False
processor:
comment:7 in reply to: ↑ 6 Changed 12 years ago by
Replying to dimpase:
on 32-bit x86 Linux (Debian squeeze) I get the following, when trying to install the spkg (applied the patches to a pristine Sage 4.6.1.alpha3):
complete install.log is here: http://boxen.math.washington.edu/home/dima/tmp/install-alt3.9.log.gz
comment:8 follow-up: ↓ 9 Changed 12 years ago by
Hi Dima,
I think the problem is
make[6]: Entering directory `/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/tune/sysinfo'
gcc -c -DL2SIZE=4194304 -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/include -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/../src//include -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/../src//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_CoreDuo -DATL_CPUMHZ=800 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 -fomit-frame-pointer -O3 -mfpmath=387 -fPIC -m32 /usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/../src//tune/sysinfo/findNT.c
gcc -DL2SIZE=4194304 -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/include -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/../src//include -I/usr/local/src/sage/sage-4.6.1.alpha3/spkg/build/atlas-3.9.32/ATLAS-build/../src//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_CoreDuo -DATL_CPUMHZ=800 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 -fomit-frame-pointer -O3 -mfpmath=387 -fPIC -m32 -o xfindNT findNT.o ATL_walltime.o -lm
/usr/lib/crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: ld returned 1 exit status
make[6]: *** [xfindNT] Error 1
This fails because ATL_NCPU is not set. Do you have a single-core processor? I guess the threaded libraries are not built in that case.
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 18 Changed 12 years ago by
Replying to vbraun:
This fails because ATL_NCPU is not set. Do you have a single-core processor? I guess the threaded libraries are not built in that case.
yes, it's single-core. An old Pentium M. (Atlas 3.8 spkg does not build on it, at all)
comment:10 follow-up: ↓ 11 Changed 12 years ago by
I changed the spkg-install to make only single-threaded shared libraries if necessary. Should work now.
For simplicity, I made spkgs for cvxopt and sage_scripts. So you need to add
http://www.stp.dias.ie/~vbraun/Sage/spkg/atlas-3.9.32.spkg
http://www.stp.dias.ie/~vbraun/Sage/spkg/cvxopt-1.1.3.p0.spkg
http://www.stp.dias.ie/~vbraun/Sage/spkg/sage_scripts-4.6.1.alpha3.p0.spkg
to $SAGE_ROOT/spkg/standard and then
- replace spkg/install with the attached version (Note that sage_scripts overwrites this file during installation, aargh)
- replace spkg/standard/deps with the attached version.
I'm still having doctest errors with linbox...
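One way to "make only single-threaded shared libraries if necessary" is to build a shared library only when its static archive exists. The sketch below assumes that approach; the actual spkg-install may decide differently. The library names come from the make invocation quoted in comment:6, while `shared_targets` is a hypothetical helper.

```python
import os

# Library names taken from the make invocation quoted in comment:6.
SERIAL = ["libatlas", "libf77blas", "libcblas", "liblapack"]
THREADED = ["libptf77blas", "libptcblas"]


def shared_targets(lib_dir):
    """List the .so targets whose static .a archive exists in lib_dir."""
    targets = []
    for name in SERIAL + THREADED:
        if os.path.isfile(os.path.join(lib_dir, name + ".a")):
            targets.append(name + ".so")
    return targets
```

On a single-core machine where ATLAS did not produce libptf77blas.a and libptcblas.a, the returned list then contains only the serial libraries, so make is never asked for a target it has no rule for.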
comment:11 in reply to: ↑ 10 Changed 12 years ago by
Replying to vbraun:
I changed the spkg-install to make only single-threaded shared libraries if necessary. Should work now.
It builds OK, but then testlong gives quite a few failures:
The following tests failed:
sage -t -long -force_lib "devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst"
sage -t -long -force_lib "devel/sage/sage/modular/modsym/space.py"
sage -t -long -force_lib "devel/sage/sage/modular/modsym/tests.py"
sage -t -long -force_lib "devel/sage/sage/modular/modsym/subspace.py"
sage -t -long -force_lib "devel/sage/sage/modular/modsym/ambient.py"
sage -t -long -force_lib "devel/sage/sage/modular/modform/space.py"
sage -t -long -force_lib "devel/sage/sage/modular/modform/element.py"
sage -t -long -force_lib "devel/sage/sage/modular/modform/ambient.py"
sage -t -long -force_lib "devel/sage/sage/modular/hecke/submodule.py"
sage -t -long -force_lib "devel/sage/sage/modular/abvar/abvar.py"
sage -t -long -force_lib "devel/sage/sage/modular/abvar/torsion_subgroup.py"
sage -t -long -force_lib "devel/sage/sage/modular/abvar/cuspidal_subgroup.py"
sage -t -long -force_lib "devel/sage/sage/modular/abvar/finite_subgroup.py"
sage -t -long -force_lib "devel/sage/sage/matrix/matrix_integer_dense.pyx"
sage -t -long -force_lib "devel/sage/sage/matrix/matrix_integer_dense_hnf.py"
sage -t -long -force_lib "devel/sage/sage/rings/qqbar.py"
sage -t -long -force_lib "devel/sage/sage/finance/time_series.pyx"
sage -t -long -force_lib "devel/sage/sage/schemes/elliptic_curves/padic_lseries.py"
sage -t -long -force_lib "devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py"
sage -t -long -force_lib "devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py"
sage -t -long -force_lib "devel/sage/sage/schemes/elliptic_curves/sha_tate.py"
Total time for all tests: 30045.4 seconds
I do not know how many of them are Atlas related, though. The log is here: http://boxen.math.washington.edu/home/dima/tmp/testlong-atl3.9.log
comment:12 follow-up: ↓ 13 Changed 12 years ago by
A lot of it is probably related. Was this a build from scratch? linbox calls seem to be particularly affected; there are a lot of failures in this path: sage.matrix.matrix_integer_dense.Matrix_integer_dense._charpoly_linbox. Stuff going through sage.matrix.matrix_rational_dense.Matrix_rational_dense.right_kernel is also affected, and that smells like atlas at work too.
I must say we have observed a number of failures related to ATLAS-3.9.23 in sage-on-gentoo:
https://github.com/cschwan/sage-on-gentoo/issues#issue/3 you have failures there too, but not due to iml/ATLAS as far as I can see.
https://github.com/cschwan/sage-on-gentoo/issues#issue/6 more subtle.
I see you have the failure with devel/sage/sage/finance/time_series.pyx; you may want to look at the comment I wrote about it in the section "known test failures for Sage on Gentoo": https://github.com/cschwan/sage-on-gentoo/wiki/Known-test-failures
Note that sage-on-gentoo can use (c)blas-reference, ATLAS or gsl-cblas or some combinations. In fact the AMD and Intel libraries could in principle be used, but no one I know has tried. http://www.gentoo.org/proj/en/science/blas-lapack.xml
Not sure about the SIGFPE, it could come from a result rounded to 0 or the like.
comment:13 in reply to: ↑ 12 Changed 12 years ago by
Replying to fbissey:
A lot of it is probably related. Was this a build from scratch?
yes, it's a build from scratch, on the same machine (old Pentium M) as already discussed above.
linbox calls seem to be particularly affected, there are a lot of failures in that path:
I wonder if these are Atlas bugs, or Linbox bugs...
One should try an OSX build.
comment:14 follow-up: ↓ 15 Changed 12 years ago by
why OS X in particular? I could do that but not before the 5th of January, when my university reopens. Ok I could do it before that but I'll enjoy the break a little bit more :)
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 12 years ago by
Replying to fbissey:
why OS X in particular? I could do that but not before the 5th of January, when my university reopens. Ok I could do it before that but I'll enjoy the break a little bit more :)
Or does OSX remain disabled for Atlas, i.e. it's not built?
comment:16 in reply to: ↑ 15 Changed 12 years ago by
Replying to dimpase:
Replying to fbissey:
why OS X in particular? I could do that but not before the 5th of January, when my university reopens. Ok I could do it before that but I'll enjoy the break a little bit more :)
Or does OSX remain disabled for Atlas, i.e. it's not built?
I had forgotten about that. I checked the spkg and ATLAS is not built on cygwin and OS X. So it makes sense.
comment:17 follow-up: ↓ 20 Changed 12 years ago by
I get the same doctest errors. Most of them are linbox related. The SIGFPE is from converting a NAN into a GMP integer, but I haven't gotten to the root of the NAN yet. In trying to debug this I've noticed that there are a bunch of valgrind warnings in the linbox code path we are using. I've asked some more specific questions on the linbox-use mailing list:
https://groups.google.com/d/topic/linbox-use/N3QNNOQuTAc/discussion
But so far no final conclusion.
comment:18 in reply to: ↑ 9 ; follow-up: ↓ 19 Changed 12 years ago by
Replying to dimpase:
Replying to vbraun:
This fails because ATL_NCPU is not set. Do you have a single-core processor? I guess the threaded libraries are not built in that case.
yes, it's single-core. An old Pentium M. (Atlas 3.8 spkg does not build on it, at all)
I notice you say it is a Pentium M, yet ATLAS is compiling things with the assumption that it is a coreduo: -DATL_ARCH_CoreDuo -DATL_CPUMHZ=800 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 I am guessing the speed is right but the rest is not. Was your successful build using these parameters? I had in one instance a non-working ATLAS because the cpu type was misdetected (wanted to use sse3 when it didn't have them). The build was curiously successful but the library was unusable.
comment:19 in reply to: ↑ 18 Changed 12 years ago by
Replying to fbissey:
Replying to dimpase:
Replying to vbraun:
This fails because ATL_NCPU is not set. Do you have a single-core processor? I guess the threaded libraries are not built in that case.
yes, it's single-core. An old Pentium M. (Atlas 3.8 spkg does not build on it, at all)
I notice you say it is a Pentium M, yet ATLAS is compiling things with the assumption that it is a coreduo: -DATL_ARCH_CoreDuo -DATL_CPUMHZ=800 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 I am guessing the speed is right but the rest is not. Was your successful build using these parameters? I had in one instance a non-working ATLAS because the cpu type was misdetected (wanted to use sse3 when it didn't have them). The build was curiously successful but the library was unusable.
The processor is Pentium M Banias 1.1GHz, http://ark.intel.com/Product.aspx?id=27600 (according to GotoBlas installation procedure :-))
The Atlas build is not totally useless, it works for many doctests. By the way, 3.8 also thinks it's a CoreDuo. And it works. I saw somewhere a note that it's OK.
comment:20 in reply to: ↑ 17 Changed 11 years ago by
Replying to vbraun:
I get the same doctest errors. Most of them are linbox related. The SIGFPE is from converting a NAN into a GMP integer, but I haven't gotten to the root of the NAN yet. In trying to debug this I've noticed that there are a bunch of valgrind warnings in the linbox code path we are using. I've asked some more specific questions on the linbox-use mailing list:
https://groups.google.com/d/topic/linbox-use/N3QNNOQuTAc/discussion
But so far no final conclusion.
Converting a NAN into a GMP integer is exactly what was happening in https://github.com/cschwan/sage-on-gentoo/issues/3 and it didn't happen when using gslcblas. I will do a full build of sage-on-gentoo with 3.9.40 (or 41) and see if I can see anything.
comment:21 Changed 11 years ago by
Yuck, still got problems leading to linbox, and I got the ones leading to iml too. Note that with another cblas like gslcblas/openblas/reference (netlib) all these problems disappear, which seems to indicate that either there is something going on in ATLAS itself or all the others get it wrong. (I realise that on a stock sage you may have trouble compiling iml with anything other than ATLAS; it requires some patching of the configure script to be able to do so.)
comment:22 Changed 11 years ago by
It may be worth noting that new releases are available for both stable and development:
Stable: http://sourceforge.net/projects/math-atlas/files/Stable/3.8.4/
Unstable: http://sourceforge.net/projects/math-atlas/files/Developer%20%28unstable%29/3.9.41/
comment:23 Changed 11 years ago by
I have 3.9.41 now; the only difference was that it compiled faster because there was tuning for my cpu. But we could investigate 3.8.4. Thanks for pointing it out; some of us (me at least) didn't think it would ever happen. We may have a stable ATLAS supporting newer CPUs at last, and it looks like I could test it quickly.
comment:24 Changed 11 years ago by
OK 3.8.4 doesn't suffer from any of the drawbacks that are apparent in 3.9.23+.
The big drawback is that it doesn't build lapack nicely on its own, provided we point to the sources like the latest 3.9.xx do. That means spkg-install would need a little bit more work.
Opinions?
comment:25 Changed 11 years ago by
Can we keep this ticket focused on the development version of atlas and discuss the stable ATLAS on #10226? I updated the spkg on the latter ticket to the new stable ATLAS release.
comment:26 Changed 11 years ago by
Sure. My current opinion, and that's something I am pushing to sage-on-gentoo users, is to avoid ATLAS 3.9.xx for the time being. It is possible that ATLAS-3.9.xx is doing something permissible that iml and linbox are not ready to catch. My line of thought is that I remember that some algorithm in iml takes the inverse of a result returned by cblas. If the result is 0 instead of a small value we may have our NaN; more likely, some result from ATLAS is a NaN.
comment:27 Changed 11 years ago by
- Cc leif added
Note that you can copy updated spkgs into spkg/standard/ and then do
env SAGE_UPGRADING=yes make build
to rebuild all dependent packages.
(We should perhaps add make targets for that to the top-level Makefile and document them in the Developer's Guide, as this is currently just a side-effect of the effort to make upgrading work.)
If you at the same time need to apply patches to the Sage library, things get a bit more complicated, as e.g. Sage switches to the main branch before reinstalling the Sage library package. One safe way is to first apply the patches, create a new sage-x.y.z-whatever spkg (with devel/sage/spkg-dist) and replace the one in spkg/standard/ with that one (or at least make sure newest_version will pick up the right one).
Note that the extension modules' dependencies in module_list.py are currently far from complete. #8664 adds some in a generic way by adding them automatically in setup.py, i.e. lets modules also depend on the headers of the libraries they use (which [only] works if the headers' mtimes get modified / updated during installation of their corresponding libraries). The dumb alternative is to run sage -ba-force after an "upgrade" process.
comment:28 follow-up: ↓ 29 Changed 11 years ago by
Ping.
comment:29 in reply to: ↑ 28 Changed 11 years ago by
Replying to leif:
Ping.
just fired up my vintage MacOSX 32-bit PowerBook PPC... Will know more in, like, 12 hours...
comment:30 follow-up: ↓ 31 Changed 11 years ago by
Several issues:
- atlas is now at version 3.9.49 (last I checked)
- I have reports of it failing to build on some old opterons
- @dimpase: do you need atlas on OS X ppc? Don't you use vectorize like the other OS X?
- I haven't checked but I am quite sure from other reports that cblas_dgem{m,v} is still buggy (someone posted that R built against that Atlas lapack was giving them trouble and R upstream pointed the finger to Atlas).
comment:31 in reply to: ↑ 30 Changed 11 years ago by
Replying to fbissey:
Several issues:
- atlas is now at version 3.9.49 (last I checked)
- I have reports of it failing to build on some old opterons
so what? I have a 5-year old 32-bit Intel laptop on which the sage-current Atlas does not build.
- @dimpase: do you need atlas on OS X ppc? Don't you use vectorize like the other OS X?
indeed. But that's for #10509. Oops, was posting on the wrong ticket...
comment:32 Changed 10 years ago by
With respect to comment:33:ticket:12011, should this be closed as a dup of #12011?
comment:33 Changed 10 years ago by
- Status changed from needs_info to needs_review
I propose to close this one and refer to #12011 as the continuation of the upgrade work.
comment:34 Changed 10 years ago by
- Milestone changed from sage-5.0 to sage-duplicate/invalid/wontfix
- Status changed from needs_review to positive_review
I would say #12011 is a duplicate of my ticket but oh well ;-)
comment:35 Changed 10 years ago by
- Description modified (diff)
- Milestone changed from sage-duplicate/invalid/wontfix to sage-5.0
- Status changed from positive_review to needs_work
- Summary changed from Update ATLAS to version 3.9.32 to Update ATLAS to version 3.9.x
Seems like #12011 isn't a duplicate after all...
comment:36 Changed 10 years ago by
- Description modified (diff)
comment:37 Changed 10 years ago by
- Description modified (diff)
comment:38 Changed 10 years ago by
- Milestone changed from sage-5.1 to sage-5.3
Obviously, the patches to spkg/install and spkg/standard/deps must be rebased.
comment:39 Changed 10 years ago by
It makes a lot more sense to me to put the LAPACK tarball at the top level of the spkg instead of in patches/.
patches/ATLAS-lib/autom4te.cache should be removed.
comment:40 Changed 10 years ago by
I don't like using assert for control flow, because that's not what it's meant for.
Why not replace those by (see #13210)
if rc != 0:
    print "Error: foo"
    sys.exit(rc)
comment:41 Changed 10 years ago by
There is something wrong with the history in SPKG.txt (atlas-3.8.4 is completely missing and there is atlas-3.9.68 for #12011 which never got merged)
comment:42 follow-up: ↓ 44 Changed 10 years ago by
I don't have any strong opinion on where to put the lapack tarball, except that our convention of only allowing a single src/ directory is shortsighted.
Note that I'm not using assertions for control flow, i.e. I'm not using
try:
    <command>
except AssertionError:
    <alternative>
Note that you could theoretically also catch the SystemExit exception, so sys.exit() isn't different from assert in that regard. I'm only using assertions to ensure the following contract holds: spkg-install completes successfully only if the relevant atlas configure/make completed with rc==0
I'll replace the asserts by something that returns rc as exit code, though.
comment:43 Changed 10 years ago by
- Description modified (diff)
comment:44 in reply to: ↑ 42 Changed 10 years ago by
Replying to vbraun:
I'm only using assertions to ensure the following contract holds: spkg-install completes successfully only if the relevant atlas configure/make completed with rc==0
I think (IMHO but I might be wrong) that assertions should be used only in a situation where an assertion being false indicates a bug in the program. If rc != 0 in spkg-install, that is an ordinary error condition, not a bug in the spkg-install script.
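The distinction drawn here can be illustrated with a short sketch. It is written for Python 3 (the spkg-install of the time was Python 2), and `run_or_exit` is a hypothetical helper name, not a function from the spkg. One concrete difference from `assert rc == 0` is that assert statements are removed when Python runs with `-O`, while an explicit check is not.

```python
import subprocess
import sys


def run_or_exit(command):
    """Run a shell command; exit with its return code if it failed."""
    rc = subprocess.call(command, shell=True)
    if rc != 0:
        # An ordinary error condition: report it and propagate rc.
        # Unlike `assert rc == 0`, this check survives `python -O`.
        print("Error: %r failed with exit code %d" % (command, rc))
        sys.exit(rc)
```

A caller would use it as `run_or_exit("make")`; the script's own exit status then matches that of the failed command.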
comment:45 follow-up: ↓ 46 Changed 10 years ago by
Starting from sage-5.1, I get one doctest failure:
File "/release/merger/sage-5.1-atlas/devel/sage-main/sage/rings/polynomial/polynomial_element.pyx", line 1039:
    sage: parent(poly)([ 0.0 if abs(c)<=1e-14 else c for c in poly.coeffs() ])
Expected:
    1.0
Got:
    1.02140518266e-14*x^2 + 1.0
comment:46 in reply to: ↑ 45 Changed 10 years ago by
Replying to jdemeyer:
Starting from sage-5.1, I get one doctest failure:
sage: parent(poly)([ 0.0 if abs(c)<=1e-14 else c for c in poly.coeffs() ])
it's most probably not an Atlas problem, but rather that 1e-14 (any solid rationale behind this choice? I guess not.) needs to be adjusted.
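The doctest failure can be reproduced with plain floats. In the sketch below, `clip_small` is a hypothetical stand-in for the list comprehension in the doctest, and the looser tolerance 1e-13 only shows the effect of a threshold change; nobody in the thread proposes that value.

```python
def clip_small(coeffs, tol=1e-14):
    """Replace coefficients with |c| <= tol by 0.0, as the doctest does."""
    return [0.0 if abs(c) <= tol else c for c in coeffs]


# Coefficients of the polynomial in the failing output,
# 1.02140518266e-14*x^2 + 1.0, lowest degree first.
coeffs = [1.0, 0.0, 1.02140518266e-14]

print(clip_small(coeffs))             # x^2 term survives: just above 1e-14
print(clip_small(coeffs, tol=1e-13))  # a looser tolerance removes it
```

The leftover coefficient exceeds the threshold by only about 2%, which fits the reading that the result depends on rounding behaviour more than on a wrong answer from ATLAS.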
comment:47 Changed 10 years ago by
- Reviewers set to Benjamin Jones
Testing in sage-5.1.rc1 on x86_64 debian Linux:
$ uname -a
Linux sage 2.6.32 #1 SMP Fri Sep 2 21:08:57 CDT 2011 x86_64 GNU/Linux
$ head -18 /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5690 @ 3.47GHz
stepping : 2
cpu MHz : 3465.790
cache size : 12288 KB
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat
bogomips : 6931.58
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
BEFORE new ATLAS spkg
sage -t "devel/sage-main/sage/modular/modsym/ambient.py" [11.6 s]
sage -t "devel/sage-main/sage/modular/hecke/ambient_module.py" [4.2 s]
sage -t "devel/sage-main/sage/modular/hecke/hecke_operator.py" [3.0 s]
AFTER new spkg
- untar'd fresh sage-5.1.rc1
- replaced atlas-3.8.4.p1.spkg with atlas-3.9.85.spkg
- make build
SPKG build log: http://sage.math.washington.edu/home/bjones/atlas-3.9.85.log
Build atlas-3.9.85
real 11m18.595s
user 11m15.906s
sys 2m56.099s
Successfully installed atlas-3.9.85
Tests
sage -t "devel/sage-main/sage/modular/modsym/ambient.py" [11.6 s]
sage -t "devel/sage-main/sage/modular/hecke/ambient_module.py" [4.2 s]
sage -t "devel/sage-main/sage/modular/hecke/hecke_operator.py" [3.1 s]
All tests
$ make ptestlong
...
All tests passed!
Total time for all tests: 1282.1 seconds
comment:48 Changed 10 years ago by
3.10.0 was released a few hours ago. Do we try to go for it? As Jeroen remarked, there may be tolerance issues depending on the hardware. I know I have one when using openblas instead of ATLAS.
comment:49 Changed 10 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
- Summary changed from Update ATLAS to version 3.9.x to Update ATLAS to stable version 3.10
comment:50 follow-up: ↓ 51 Changed 10 years ago by
The new spkg will attempt to build on OSX if SAGE_ATLAS_ARCH is set. Untested, but a start.
I am also planning to allow SAGE_ATLAS_LIB=system and use ldconfig -p to get the system atlas, if available. Though perhaps we should do that in a later version. This spkg seems to fix some build issues that people reported on the sage mailing lists, so it might be good to get it in sooner rather than later.
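The ldconfig -p idea could look roughly like the sketch below. It is not code from the spkg (the feature was only planned at this point): `find_in_ldconfig` is a hypothetical name, and the function only parses text it is handed, relying on the `name (flags) => path` layout of each cache entry.

```python
def find_in_ldconfig(output, soname_prefix="libatlas.so"):
    """Return paths of cached libraries whose name starts with soname_prefix.

    `output` is the text printed by `ldconfig -p`; each entry has the
    form "<name> (<flags>) => <path>".
    """
    paths = []
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip the header line
        name, path = line.split("=>", 1)
        if name.strip().startswith(soname_prefix):
            paths.append(path.strip())
    return paths
```

The input text could come from `subprocess.check_output(["ldconfig", "-p"], text=True)` on Linux; ldconfig often lives in /sbin, so it may not be on an unprivileged user's PATH.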
comment:51 in reply to: ↑ 50 Changed 10 years ago by
Replying to vbraun:
so it might be good to get it in sooner rather than later.
+1, let's not worry now about changing the working of SAGE_ATLAS_LIB.
comment:52 Changed 10 years ago by
- Dependencies set to sage-5.2.beta0
comment:53 Changed 10 years ago by
- Description modified (diff)
comment:54 Changed 10 years ago by
- Description modified (diff)
comment:55 Changed 10 years ago by
In spkg-install, I find it confusing that you have a function build() which calls configure() and make_atlas().
I would get rid of build() and simply call configure() and make_atlas() directly.
comment:56 Changed 10 years ago by
- Status changed from needs_review to needs_work
About SAGE_FAT_BINARY: at one point you are checking
os.environ.has_key('SAGE_FAT_BINARY')
and at another you are checking
os.environ.get('SAGE_FAT_BINARY', 'no') == 'yes' and conf['Intel?']
I think the correct check is
os.environ.get('SAGE_FAT_BINARY', 'no') == 'yes'
"SAGE_FAT_BINARY" has evolved to mean "build a generic binary on any processor", so it's not Intel specific anymore. Just call configure_base().
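A uniform check could be factored into a single helper, as in the following sketch. The name `fat_binary_requested` is hypothetical and the code targets Python 3, where `dict.has_key` no longer exists.

```python
import os


def fat_binary_requested(environ=os.environ):
    """True only if SAGE_FAT_BINARY is set to exactly 'yes'."""
    return environ.get("SAGE_FAT_BINARY", "no") == "yes"
```

The practical difference from the `has_key` test is that `SAGE_FAT_BINARY=no` counts as "not requested" here, whereas a key-presence test would treat it as set.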
comment:57 Changed 10 years ago by
Somebody needs to review 10508_doctest.patch
comment:58 Changed 10 years ago by
Detail: in system_with_flush(), your print command should not have a space after "Running". The space is automatically added by print.
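For illustration, a Python 3 version of such a helper might look like the sketch below. Only the name `system_with_flush` comes from the thread; the body is an assumption about what it does, based on the "Flush before os.system" item in the description. Flushing first makes Python's own buffered output appear in the log before anything the child process writes.

```python
import os
import sys


def system_with_flush(command):
    """Flush Python's buffered output, then run command via os.system."""
    # print() separates its arguments with a single space, so no
    # trailing space is needed after "Running".
    print("Running", command)
    sys.stdout.flush()
    sys.stderr.flush()
    return os.system(command)
```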
comment:59 Changed 10 years ago by
Updated spkg to remove the space and make the SAGE_FAT_BINARY check uniform.
I'm fine with the doctests patch...
comment:60 Changed 10 years ago by
I'm always calling build() except in one place where I want to allow configure() to fail (because throttling is enabled).
comment:61 Changed 10 years ago by
Is this expected? Compare the build time on arando (Ubuntu 12.04 i386):
real 104m44.021s
user 97m25.753s
sys 1m18.157s
Successfully installed atlas-3.8.4.p1

real 259m50.222s
user 233m21.879s
sys 7m38.405s
Successfully installed atlas-3.10.0
comment:62 Changed 10 years ago by
- Status changed from needs_work to needs_review
I've noticed before that the build time for the "generic" binary is rather long. It's not entirely generic; it is still doing timing for cache edges. But the result will be a working library; it won't probe funky asm implementations that other CPUs might not support. I'll ask on the ATLAS mailing list for clarification.
comment:63 follow-up: ↓ 64 Changed 10 years ago by
Timing on x86_64 with SAGE_FAT_BINARY=yes on sage.math:
real 139m50.733s
user 134m30.710s
sys 7m12.390s
Successfully installed atlas-3.8.4.p1

real 350m5.662s
user 330m23.460s
sys 22m1.010s
Successfully installed atlas-3.10.0
Why does the new ATLAS take so much longer to build than the old one?
comment:64 in reply to: ↑ 63 Changed 10 years ago by
Replying to jdemeyer:
Timing on x86_64 with SAGE_FAT_BINARY=yes on sage.math:
real 139m50.733s
user 134m30.710s
sys 7m12.390s
Successfully installed atlas-3.8.4.p1

real 350m5.662s
user 330m23.460s
sys 22m1.010s
Successfully installed atlas-3.10.0
Why does the new ATLAS take so much longer to build than the old one?
I can beat that:
Finished installing shared ATLAS library.
real 821m14.881s
user 739m58.560s
sys 58m23.440s
Successfully installed atlas-3.10.0
(That's with sage -f ... on an otherwise idle machine, an AMD Fusion E-450 running Ubuntu 10.04.4 LTS x86_64.)
It took far more time than building all of Sage [in parallel, on that dual-core CPU], including the old ATLAS spkg. I currently have no figures for a separate ATLAS 3.8.4 build, but the timings from previous parallel Sage builds on that machine vary between
real 214m52.096s
user 175m5.390s
sys 9m56.210s
Successfully installed atlas-3.8.4.p1
and
real 256m32.949s
user 217m30.760s
sys 10m31.050s
Successfully installed atlas-3.8.4.p1
(The LAPACK and BLAS spkg build times in these builds are a few minutes and less than one minute [wall time], respectively.)
I was actually hoping ATLAS 3.9.x / 3.10 meanwhile "knows" these AMD CPUs and therefore builds at least a bit faster...
comment:65 Changed 10 years ago by
FWIW, ptestlong passed (after rebuilding all dependent packages) with Sage 5.2.beta0 (without applying the doctest patch).
comment:66 Changed 10 years ago by
We have noticed the build time increase in Gentoo as well, at 3.9.82 I think. Apparently they changed how they detect the compiler, and that's what is causing the spike. But it seems fixed in 3.10.0, unless we are carrying a specific patch in Gentoo.
Fri Jun 29 13:42:08 2012 >>> sci-libs/atlas-3.9.80
merge time: 8 minutes and 4 seconds.
Wed Jul 4 14:56:21 2012 >>> sci-libs/atlas-3.9.82
merge time: 3 hours, 54 minutes and 30 seconds.
Wed Jul 11 11:11:23 2012 >>> sci-libs/atlas-3.10.0
merge time: 8 minutes and 34 seconds.
comment:67 Changed 10 years ago by
If I set SAGE_ATLAS_ARCH=Corei2,AVX,SSE3,SSE2,SSE1 then I can also compile atlas-3.10.0 in about 10 minutes. The issue is only with the "generic" archdefs, which seem to be not sufficiently specialized. I've raised this issue on the ATLAS mailing list.
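The value used here appears to be a comma-separated list with the architecture first and the instruction-set extensions after it. That reading is inferred from this one example and is not a documented format; `parse_atlas_arch` below is a hypothetical helper that splits the value under this assumption.

```python
def parse_atlas_arch(value):
    """Split 'Corei2,AVX,SSE3' into ('Corei2', ['AVX', 'SSE3'])."""
    parts = [p.strip() for p in value.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty SAGE_ATLAS_ARCH value")
    # Assumption: first item is the architecture, the rest are ISA extensions.
    return parts[0], parts[1:]


print(parse_atlas_arch("Corei2,AVX,SSE3,SSE2,SSE1"))
```

A bare value such as "base" or "fast" yields an empty extension list.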
comment:68 Changed 10 years ago by
The updated atlas spkg also installs a script atlas-config in $SAGE_LOCAL/bin which can be used to compute new architectural defaults. I need somebody on Linux i386 to first install the new spkg and then run
SAGE_FAT_BINARY=yes atlas-config --archdef
to grind out the archdefs for the i386 "generic" target. I'm currently doing this for x86_64, but I don't have an i386 machine. This will use sudo to turn off CPU throttling so you need to be a sudoer.
[vbraun@volker-desktop sage-5.2.beta1]$ ./local/bin/atlas-config --help
usage: atlas-config [-h] [--unthrottle PID] [--archdef]

(Re-)Build ATLAS (http://math-atlas.sourceforge.net) according to the SAGE_ATLAS_ARCH environment variable

optional arguments:
  -h, --help        show this help message and exit
  --unthrottle PID  switch CPU throttling off until PID finishes
  --archdef         build archdef tarball and save it to the current directory
comment:69 follow-up: ↓ 70 Changed 10 years ago by
I've updated the spkg with new 64-bit generic archdefs, this now builds in about 10 mins.
comment:70 in reply to: ↑ 69 Changed 10 years ago by
Replying to vbraun:
I've updated the spkg with new 64-bit generic archdefs, this now builds in about 10 mins.
md5sum?
I incidentally just downloaded some new version (163f090f18bb8616e93617677a644cd8) and triggered a full build from scratch.
comment:71 Changed 10 years ago by
The newest version is 8a16c9d39add1c6c3f37e13986e2a3cc and that's what's linked in the ticket.
comment:72 Changed 10 years ago by
The new version d33e9114156d8373fa61f957b379e029 changes the "fast" 64-bit archdef to P4E64SSE3.
It turns out that there are no 64-bit generic archdefs, which might have been one reason why ATLAS was slow to compile. The SPKG uses the existing 32-bit archdefs or the 64-bit one that I made. On x86 the spkg should produce a working ATLAS library within 10-30 mins, and only go through the tuning process if either
- CPU throttling is disabled (scaling_governor=performance, needs root), or
- SAGE_ATLAS_ARCH is explicitly set to something different from "base" / "fast".
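The throttling condition can be checked by reading the cpufreq governor from sysfs. A minimal sketch, assuming the standard Linux location /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; the function names are hypothetical, and this is not necessarily how the spkg itself detects throttling:

```python
import glob


def governors(pattern='/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor'):
    """Return the governor string of every CPU that exposes one."""
    result = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            result.append(f.read().strip())
    return result


def throttling_disabled(governor_list):
    """True only if at least one governor was found and all are 'performance'."""
    return bool(governor_list) and all(g == 'performance' for g in governor_list)


print(throttling_disabled(governors()))
```

On machines without cpufreq support the list is empty and the sketch conservatively reports that throttling is not known to be disabled.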
comment:73 follow-up: ↓ 74 Changed 10 years ago by
PS: Surprisingly enough, the new ATLAS spkg actually compiled on OSX (bsd.math)! If you want to try yourself just set SAGE_ATLAS_ARCH="base" to force building on OSX.
comment:74 in reply to: ↑ 73 Changed 10 years ago by
Replying to vbraun:
PS: Surprisingly enough, the new ATLAS spkg actually compiled on OSX (bsd.math)! If you want to try yourself just set SAGE_ATLAS_ARCH="base" to force building on OSX.
Very cool. I just got ATLAS to build on my OS X 10.6.8 machine setting SAGE_ATLAS_ARCH=Corei2. Running long tests now.
comment:75 follow-up: ↓ 76 Changed 10 years ago by
Update: ATLAS-3.10.0 built successfully on my OS X 10.6.8 machine with SAGE_ATLAS_ARCH=Corei2, the build took approx. 16 mins. The build log is at http://sage.math.washington.edu/home/bjones/atlas-3.10.0.log. Sage passes all make ptestlong tests. The spkg looks very good to me. I'd give this a positive review, but maybe it should be tested by a few other reviewers on other platforms first.
comment:76 in reply to: ↑ 75 Changed 10 years ago by
Replying to benjaminfjones:
Update: ATLAS-3.10.0 built successfully on my OS X 10.6.8 machine with SAGE_ATLAS_ARCH=Corei2, the build took approx. 16 mins. The build log is at http://sage.math.washington.edu/home/bjones/atlas-3.10.0.log. Sage passes all make ptestlong tests. The spkg looks very good to me. I'd give this a positive review, but maybe it should be tested by a few other reviewers on other platforms first.
I've built it successfully on OS X 10.6.8 (with Core2 Duo) and setting SAGE_ATLAS_ARCH="base".
comment:77 Changed 10 years ago by
Well, there's at least room for nitpicking (a couple of typos and some inconsistencies as well as superfluous code in spkg-install and probably configuration.py, don't recall)... I'll maybe take a look at it again tomorrow, and probably provide a patch (provided Volker doesn't plan to make further major changes to these files).
How about also installing a user script for convenience to save the built ATLAS libraries to another place (for later use with SAGE_ATLAS_LIB)?
Regarding the mentioned excessive tuning times, I also wonder whether we should use something like SAGE_ATLAS_ARCH=fast (or base) by default, i.e., only do self-tuning if the user explicitly asks for it in some way. I guess the Sage Installation Guide needs to get updated anyway w.r.t. ATLAS and environment variables.
comment:78 Changed 10 years ago by
The root repo patch should get rebased for Sage 5.2.rc0.
comment:79 Changed 10 years ago by
On hawk (OpenSolaris):
DONE configure
Finished configuring ATLAS.
Running make -j1
make[2]: Entering directory `/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/spkg/build/atlas-3.10.0/ATLAS-build'
make[2]: warning: -jN forced in submake: disabling jobserver mode.
make -j1 -f Make.top build
make[3]: Entering directory `/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/spkg/build/atlas-3.10.0/ATLAS-build'
Make.top:1: Make.inc: No such file or directory
Make.top:325: warning: overriding commands for target `/AtlasTest'
Make.top:76: warning: ignoring old commands for target `/AtlasTest'
make[3]: *** No rule to make target `Make.inc'. Stop.
make[3]: Leaving directory `/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/spkg/build/atlas-3.10.0/ATLAS-build'
make[2]: *** [build] Error 2
make[2]: Leaving directory `/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/spkg/build/atlas-3.10.0/ATLAS-build'
------------------------------------------------------------
  File "./spkg-install", line 478, in <module>
    assert_success(rc, bad='Failed to build ATLAS.', good='Finished building ATLAS core.')
  File "./spkg-install", line 74, in assert_success
    traceback.print_stack(file=sys.stdout)
------------------------------------------------------------
Error: Failed to build ATLAS.

real 4m10.778s
user 0m7.766s
sys 0m8.391s
Successfully installed atlas-3.10.0
Deleting temporary build directory /export/home/palmieri/testing/ATLAS/sage-5.2.rc0/spkg/build/atlas-3.10.0
Finished installing atlas-3.10.0.spkg
I don't know why it's not building, but it shouldn't exit saying "Successfully installed atlas-3.10.0". I added a print statement, and "rc" is 512. The documentation for sys.exit says that for the argument, "Most systems require it to be in the range 0-127, and produce undefined results otherwise." We could instead do this:
diff --git a/spkg-install b/spkg-install
--- a/spkg-install
+++ b/spkg-install
@@ -74,8 +74,8 @@ def assert_success(rc, good=None, bad=No
         traceback.print_stack(file=sys.stdout)
         print '-'*60
         if bad is not None:
-            print 'Error: ', bad
-        sys.exit(rc)
+            sys.exit('Error: %s' % bad)
+        sys.exit(1)
 
 ######################################################################
 ### Skip building ATLAS on specific systems
comment:80 Changed 10 years ago by
- Status changed from needs_review to needs_work
comment:81 follow-up: ↓ 93 Changed 10 years ago by
On Ubuntu 10.04.4 LTS x86_64 (AMD E-450), with Sage 5.2.rc0 and SAGE_ATLAS_ARCH=fast I get:
...
Building using specific architecture.
Fast configuration on Intel x86_64 compatible CPUs.
Running configure with arch = P4E64SSE3, isa extensions ('SSE3', 'SSE2', 'SSE1'), archdef dir None
Traceback (most recent call last):
  File "./spkg-install", line 454, in <module>
    rc = build()
  File "./spkg-install", line 447, in build
    rc = configure(arch, isa_ext, archdef_dir)
  File "./spkg-install", line 315, in configure
    cmd += ' -A '+str(ATLAS_MACHTYPE.index(arch))
ValueError: tuple.index(x): x not in tuple

real 0m0.701s
user 0m0.090s
sys 0m0.060s
************************************************************************
Error installing package atlas-3.10.0
************************************************************************
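The ValueError in this traceback comes from tuple.index(), which raises when the requested name is not an element of the tuple. A sketch of a lookup that fails with a more readable message; the machine-type tuple below is a made-up stand-in for illustration, not the real ATLAS_MACHTYPE list or its ordering:

```python
# Hypothetical stand-in for the tuple of ATLAS machine types.
MACHTYPES = ('UNKNOWN', 'P4', 'P4E', 'Corei2')


def machtype_flag(arch, machtypes=MACHTYPES):
    """Return the ' -A <index>' configure flag for arch, or raise a clear error."""
    try:
        return ' -A ' + str(machtypes.index(arch))
    except ValueError:
        raise ValueError('unknown ATLAS architecture %r; known: %s'
                         % (arch, ', '.join(machtypes)))


print(machtype_flag('P4E'))
```

With this stand-in, a combined name such as P4E64SSE3 is rejected with a message that lists the accepted names instead of the bare tuple.index error.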
comment:82 Changed 10 years ago by
Built successfully on power7. A few oddities in the log, but I don't think they are important:
make -j1 atlas_run atldir=/hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build exe=xprobe_comp redir=config1.out \
  args="-v 0 -o atlconf.txt -O 1 -A 7 -Si nof77 0 -V 6 -Fa ic '-fPIC' -C sm 'gcc' -Fa sm '-fPIC' -C dm 'gcc' -Fa dm '-fPIC' -C sk 'gcc' -Fa sk '-fPIC' -C dk 'gcc' -Fa dk '-fPIC' -C xc 'gcc' -Fa xc '-fPIC' -Fa gc '-fPIC' -C if 'sage_fortran' -Fa if '-fPIC' -b 64 -d b /hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build"
make[1]: Entering directory `/hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build'
cd /hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build ; ./xprobe_comp -v 0 -o atlconf.txt -O 1 -A 7 -Si nof77 0 -V 6 -Fa ic '-fPIC' -C sm 'gcc' -Fa sm '-fPIC' -C dm 'gcc' -Fa dm '-fPIC' -C sk 'gcc' -Fa sk '-fPIC' -C dk 'gcc' -Fa dk '-fPIC' -C xc 'gcc' -Fa xc '-fPIC' -Fa gc '-fPIC' -C if 'sage_fortran' -Fa if '-fPIC' -b 64 -d b /hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build > config1.out
sh: -c: line 0: unexpected EOF while looking for matching ``'
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching ``'
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching ``'
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching ``'
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching ``'
sh: -c: line 1: syntax error: unexpected end of file
probe_f2c.o: In function `ATL_tmpnam':
/hpc/scratch/frb15/sandbox/sage-5.1.beta5/spkg/build/atlas-3.10.0/ATLAS-build/../src//CONFIG/include/atlas_sys.h:224: warning: the use of `tmpnam' is dangerous, better use `mkstemp'
I noticed that libcblas.so is using rpath
ldd -r local/lib/libcblas.so.2.1.0
    linux-vdso64.so.1 => (0x0000040000000000)
    libatlas.so.2 => /hpc/scratch/frb15/sandbox/sage-5.1.beta5/local/lib/libatlas.so.2 (0x0000040000060000)
    libpthread.so.0 => /lib64/power7/libpthread.so.0 (0x00000400008a0000)
    libm.so.6 => /lib64/power7/libm.so.6 (0x00000400008e0000)
    libc.so.6 => /lib64/power7/libc.so.6 (0x00000400009c0000)
    /lib64/ld64.so.1 (0x0000000024560000)
And that's from outside the sage shell. However for f77blas and presumably lapack, libgfortran is not rpath-ed (I am using sage gcc's spkg in this case).
That is important info if you want to put atlas libraries in another location.
I will run a few tests shortly.
comment:83 Changed 10 years ago by
Volker, do you still need architecture defaults for i386? I have access to an old laptop with a Centrino M processor that reports being i386 / i686 in uname -a and in /proc/cpuinfo:
cpu family 6 cpu model 13 Intel Pentium M
I haven't ever built Sage on it, I imagine it would take several years, but I'm willing to give it a try.
comment:84 follow-up: ↓ 85 Changed 10 years ago by
- John, I'm against changing the exit codes; we report whatever the sub-process spat out. So even if it's >127, that's just what we were handed, so clearly it's a supported exit code.
I don't have an account on hawk, but in any case that should not stop us from shipping an updated ATLAS that fixes build issues on modern hard/software. We can always clean up second-tier platforms later. But do post the whole log, the error is likely further up.
- Leif, are you working on a patch? In configure_fast(), we should set arch = 'P4E' instead of P4E64SSE3; the 64SSE3 suffix is added later. This is what causes your build failure.
- Francois: the weird messages are normal. The rpaths are set by libtool and are what libtool likes to set. As long as we set LD_LIBRARY_PATH this doesn't matter except when we distribute the binaries. This needs to be fixed some day in all shared libs but not on this ticket.
- Benjamin: no, I don't need 32-bit archdefs, this should be covered by what ATLAS already ships with. The atlas spkg should build relatively quickly now except for the cases in comment:72.
comment:85 in reply to: ↑ 84 Changed 10 years ago by
Replying to vbraun:
- John, I'm against changing the exit codes; we report whatever the sub-process spat out. So even if it's >127, that's just what we were handed, so clearly it's a supported exit code.
But after sys.exit(512), Python has a return code of zero, on sage.math, OS X, and OpenSolaris. So sage-spkg thinks the spkg installed correctly, which it clearly didn't. From the documentation for os.system, it looks to me that its output should be divided by 512 to get a return code suitable for sys.exit, but I'm not sure about that. However you want to fix it, it has to be changed: it's not acceptable for the compilation to fail but for spkg-install to have a return code of zero.
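The zero return code is expected on POSIX systems: only the low 8 bits of the argument to sys.exit() reach the parent process, and 512 & 0xff is 0. A small sketch that shows this by running a child interpreter (assumes a POSIX platform; the helper name is hypothetical):

```python
import subprocess
import sys


def child_exit_status(code):
    """Run a child Python that calls sys.exit(code); return what the parent sees."""
    return subprocess.call([sys.executable, '-c',
                            'import sys; sys.exit(%d)' % code])


print(child_exit_status(512))  # the parent sees 0 on POSIX
print(child_exit_status(2))    # small codes pass through unchanged
```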
I don't have an account on hawk, but in any case that should not stop us from shipping an updated ATLAS that fixes build issues on modern hard/software. We can always clean up second-tier platforms later. But do post the whole log, the error is likely further up.
The log is posted at http://sage.math.washington.edu/home/palmieri/misc/atlas-3.10.0.log.
comment:86 Changed 10 years ago by
Something along the following lines should be used to handle rc, see http://docs.python.org/library/os.html
if os.WIFEXITED(rc):
    rc = os.WEXITSTATUS(rc)
elif os.WIFSIGNALED(rc):
    rc = 128 + os.WTERMSIG(rc)
else:
    raise SystemError("Unknown return value %i for os.system()"%rc)
comment:87 Changed 10 years ago by
I've investigated and reported the Solaris build issue upstream at https://sourceforge.net/tracker/?func=detail&aid=3545418&group_id=23725&atid=379483
comment:88 Changed 10 years ago by
The return value is actually the return value of os.system(), which is described in http://docs.python.org/library/os.html#os.wait
To extract the exit status, we should just do sys.exit((rc >> 8) & 0x7f). Leif, are you working on the spkg right now?
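On Unix, os.system() returns a 16-bit wait status in which the exit code occupies the high byte, so rc = 512 corresponds to exit code 2. A sketch of a decoder that combines the shift from this comment with the signal handling from comment:86; the function name is hypothetical and this is not the code that ended up in the spkg:

```python
import os


def exit_code_from_system(rc):
    """Convert an os.system() wait status (Unix) into a value for sys.exit()."""
    if os.WIFEXITED(rc):
        return os.WEXITSTATUS(rc)        # same as (rc >> 8) & 0xff
    if os.WIFSIGNALED(rc):
        return 128 + os.WTERMSIG(rc)     # shell convention for killed processes
    raise SystemError('Unknown return value %i for os.system()' % rc)


print(exit_code_from_system(512))
print(exit_code_from_system(os.system('exit 3')))
```

Passing the decoded value to sys.exit() keeps the result in the 0-255 range, so a failed sub-process can no longer be reported as success.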
comment:89 Changed 10 years ago by
The workaround for the Solaris build issue is to use the fully qualified name (full path) for $CC and sage_fortran.
comment:90 Changed 10 years ago by
Since Leif apparently isn't around I implemented the fully-qualified-name workaround for the Solaris build and fixed the return status issues. The Solaris build is still broken, but now at a different place. Updated spkg at the same place, md5sum is 878695a26071cfe73a9977bd8413b748.
comment:91 Changed 10 years ago by
This version now builds on hawk: log file here.
comment:92 Changed 10 years ago by
Sounds good! I updated the spkg with yet another SPARC Solaris fix, md5sum is 6dbcf22c920626380f2cba877cca4cb1. It still doesn't work on mark/skynet, but at least it now makes it into the compile phase. In any case SPARC Solaris issues shouldn't delay this ticket.
comment:93 in reply to: ↑ 81 ; follow-up: ↓ 94 Changed 10 years ago by
Replying to leif:
On Ubuntu 10.04.4 LTS x86_64 (AMD E-450), with Sage 5.2.rc0 and SAGE_ATLAS_ARCH=fast I get:
ValueError: tuple.index(x): x not in tuple
Using SAGE_ATLAS_ARCH=base in contrast worked (and ptestlong passed with Sage 5.2.rc0, FWIW):
real 34m12.187s
user 30m3.170s
sys 5m30.850s
Successfully installed atlas-3.10.0
Still not that fast, but approximately within your estimates...
comment:94 in reply to: ↑ 93 ; follow-up: ↓ 95 Changed 10 years ago by
Replying to leif:
real 34m12.187s
That's pretty good for 18W TDP. I take it compiling all of Sage takes 2+ hours on that machine?
comment:95 in reply to: ↑ 94 ; follow-up: ↓ 103 Changed 10 years ago by
Replying to vbraun:
Replying to leif:
real 34m12.187s
That's pretty good for 18W TDP. I take it compiling all of Sage takes 2+ hours on that machine?
Sure. Although ATLAS currently consumes only <= 9W ;-)
Unfortunately ATLAS is built quite late (due to its odd dependency on Sage's Python -- while your script is apparently designed to support Python 2.4 as well), so for a fair amount of the time spent building Sage only one core is used (because the remaining packages directly or indirectly depend on ATLAS).
I reinstalled the updated spkg again with SAGE_ATLAS_ARCH=fast:
real 40m20.227s
user 36m26.220s
sys 6m9.500s
Successfully installed atlas-3.10.0
comment:96 follow-up: ↓ 97 Changed 10 years ago by
An update: on hawk, I unpacked a sage-5.2.rc0 tarball, replaced the old ATLAS spkg with this one, and built from scratch. There are a bunch of doctest failures:
The following tests failed:
    sage -t --long -force_lib devel/sage/sage/matrix/matrix2.pyx # 12 doctests failed
    sage -t --long -force_lib devel/sage/sage/misc/functional.py # 1 doctests failed
    sage -t --long -force_lib devel/sage/sage/finance/time_series.pyx # 6 doctests failed
    sage -t --long -force_lib devel/sage/sage/numerical/test.py # Killed/crashed
    sage -t --long -force_lib devel/sage/sage/modular/modform/numerical.py # 3 doctests failed
    sage -t --long -force_lib devel/sage/sage/numerical/optimize.py # Killed/crashed
    sage -t --long -force_lib devel/sage/sage/matrix/matrix_double_dense.pyx # 68 doctests failed
    sage -t --long -force_lib devel/sage/doc/en/a_tour_of_sage/index.rst # Killed/crashed
    sage -t --long -force_lib devel/sage/doc/en/numerical_sage/cvxopt.rst # Killed/crashed
    sage -t --long -force_lib devel/sage/doc/fr/a_tour_of_sage/index.rst # Killed/crashed
    sage -t --long -force_lib devel/sage/doc/tr/a_tour_of_sage/index.rst # Killed/crashed
    sage -t --long -force_lib devel/sage/sage/combinat/e_one_star.py # Killed/crashed
For example:
File "/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/devel/sage-main/sage/matrix/matrix2.pyx", line 8157:
    sage: (A - M*G).zero_at(10^-12)
Expected:
    [0.0 0.0 0.0]
    [0.0 0.0 0.0]
    [0.0 0.0 0.0]
Got:
    [ 0.0 0.0 0.0]
    [ -0.10532733041 + 0.0950573490006*I 0.017805411596 - 0.0512258178986*I -0.0226596712913 + 0.0414519876977*I]
    [ 0.100615400305 + 0.0962034401538*I -0.0779990660567 - 0.0543172822202*I 0.057608664751 + 0.0154619373789*I]
and
File "/export/home/palmieri/testing/ATLAS/sage-5.2.rc0/devel/sage-main/sage/misc/functional.py", line 1144:
    sage: norm(M)
Expected:
    10.6903311292
Got:
    10.4323182134
I'll try to build again in case something went wrong the first time.
comment:97 in reply to: ↑ 96 Changed 10 years ago by
Replying to jhpalmieri:
An update: on hawk, I unpacked a sage-5.2.rc0 tarball, replaced the old ATLAS spkg with this one, and built from scratch. There are a bunch of doctest failures:
Did you apply the root repo patch (to remove the BLAS and LAPACK spkgs)?
comment:98 Changed 10 years ago by
Oops, no, I forgot. One more time...
comment:99 Changed 10 years ago by
Now Sage doesn't build on hawk, I guess due to the problems noted on #10509: cvxopt doesn't build, because it says
ld: fatal: library -lblas: not found
I'll skip building cvxopt and continue with the rest of the build.
comment:100 follow-up: ↓ 109 Changed 10 years ago by
I don't think the cvxopt problem is due to #10509. The cvxopt spkg explicitly links against blas, this is bad. From the cvxopt patches/setup.py.patch:
+ libraries = ['m','lapack','gsl','blas','gslcblas','cblas','gfortran','atlas']
this is wrong, it should be f77blas if the Fortran version is actually used or not there at all. Of course all modern systems have a libblas.so somewhere so the linker finds it, notes that it is not used, and proceeds. Except that on Hawk, I guess, there is no system-wide libblas. We should proceed removing blas in this ticket and then fix cvxopt on second-tier platforms later.
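A quick way to see which of the listed libraries a given machine can actually locate is ctypes.util.find_library from the Python standard library. This is only a rough sketch: find_library approximates what the linker sees and does not know about $SAGE_LOCAL/lib, and the helper name missing_libraries is hypothetical:

```python
from ctypes.util import find_library


def missing_libraries(names, finder=find_library):
    """Return the names for which the finder reports no shared library."""
    return [name for name in names if finder(name) is None]


# The list from the cvxopt setup.py patch quoted above.
libraries = ['m', 'lapack', 'gsl', 'blas', 'gslcblas', 'cblas', 'gfortran', 'atlas']
print(missing_libraries(libraries))
```

On a system without a system-wide libblas, 'blas' would be expected to show up in the printed list, which matches the link failure seen on hawk.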
comment:101 Changed 10 years ago by
I meant that linking to blas was noted at #10509 as a possible problem, not that #10509 was causing this issue.
comment:102 Changed 10 years ago by
You are right that the patch on #10509 should have been applied to cvxopt a long time ago; then it wouldn't break here.
comment:103 in reply to: ↑ 95 ; follow-up: ↓ 110 Changed 10 years ago by
Replying to leif:
Replying to vbraun:
Replying to leif:
real 34m12.187s
That's pretty good for 18W TDP. I take it compiling all of Sage takes 2+ hours on that machine?
[...]
I reinstalled the updated spkg again with SAGE_ATLAS_ARCH=fast:
real 40m20.227s
user 36m26.220s
sys 6m9.500s
Successfully installed atlas-3.10.0
ROFL, with SAGE_ATLAS_ARCH="AMD64K10h,SSE3,SSE2,SSE1,3DNow" (which involves self-tuning) it took
real 36m15.153s
user 31m23.290s
sys 5m56.960s
Successfully installed atlas-3.10.0
Also a bit strange is that the timing for ptestlong (all for Sage 5.2.rc0, GCC 4.4.3) was base < fast < AMD64K10h (i.e., fastest with SAGE_ATLAS_ARCH=base), although I think at least during the last run the machine was partially loaded with other stuff as well, and clearly ptestlong isn't very appropriate to benchmark ATLAS performance... ;-)
[Not going to use the ATLAS tools for comparison right now, perhaps later...]
comment:104 Changed 10 years ago by
P.S.:
Another weird thing are (non-fatal w.r.t. the build) errors like
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (armv7) for -march= switch
FlagCheck.c:1: error: bad value (armv7) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (ultrasparc) for -mtune= switch
FlagCheck.c:1: error: bad value (970) for -mtune= switch
even if one specifies the architecture (i.e., on x86).
comment:105 Changed 10 years ago by
On hawk: with the appropriate patches applied, I still get some test failures, but as far as I can tell, they're all due to cvxopt being broken. So it looks pretty good.
comment:106 follow-up: ↓ 108 Changed 10 years ago by
Is it intentional that static libraries no longer get installed (although built)?
comment:107 Changed 10 years ago by
On cvxopt: I am doing a new spkg in #13160, I'll check what I have done there. My main issue with the current spkg is that it is horribly overlinked.
comment:108 in reply to: ↑ 106 ; follow-up: ↓ 111 Changed 10 years ago by
Replying to leif:
Is it intentional that static libraries no longer get installed (although built)?
Upstream really only builds static libraries. But static libraries suck for our purposes. So yes, it is intentional that the static libraries are not installed.
comment:109 in reply to: ↑ 100 Changed 10 years ago by
Replying to vbraun:
I don't think the cvxopt problem is due to #10509. The cvxopt spkg explicitly links against blas, this is bad. From the cvxopt patches/setup.py.patch:
+ libraries = ['m','lapack','gsl','blas','gslcblas','cblas','gfortran','atlas']
this is wrong, it should be f77blas if the Fortran version is actually used or not there at all. Of course all modern systems have a libblas.so somewhere so the linker finds it, notes that it is not used, and proceeds. Except that on Hawk, I guess, there is no system-wide libblas. We should proceed removing blas in this ticket and then fix cvxopt on second-tier platforms later.
Perhaps this ticket should get #13160 as a dependency. One thing I checked is that it appears to work on OSX 10.6.8, both with native blas/lapack, and with Atlas 3.10 from this ticket. I imagine #13160 can get finalized quickly.
comment:110 in reply to: ↑ 103 Changed 10 years ago by
Replying to leif:
[Not going to use the ATLAS tools for comparison right now, perhaps later...]
FWIW, while make time works, make atlvat2.pdf ... (to build ATLAS vs. ATLAS comparison charts) seems to be broken -- for me it always fails with a buffer overflow.
comment:111 in reply to: ↑ 108 Changed 10 years ago by
Replying to vbraun:
Replying to leif:
Is it intentional that static libraries no longer get installed (although built)?
Upstream really only builds static libraries. But static libraries suck for our purposes. So yes, it is intentional that the static libraries are not installed.
Well, as long as also the shared libraries are present, they're (usually) preferred over the static ones (i.e., unless one explicitly asks for linking against the latter), so copying these into $SAGE_LOCAL/lib/ IMHO shouldn't hurt. (The static libraries are btw. needed to compare different ATLAS installations; the only way to "keep" them is to reinstall the ATLAS spkg with ./sage -f -s ... or to set SAGE_KEEP_BUILT_SPKGS, and manually copy them.)
Note that previously installed static ATLAS libraries currently don't get removed. Don't know whether that may cause trouble (e.g. with upgrading); see above.
comment:112 Changed 10 years ago by
It's true that it doesn't hurt to have the static libraries as long as you don't use them. This is like saying that a knife doesn't hurt until you are stabbed with it. True, but why put a sharp blade under the couch pillow in hopes that nobody will sit on it?
It would be nice to have some system to compare different atlas versions and compile runs, but that's definitely for another ticket. Ideally the atlas-config python script could save the atlas libraries in a private directory, for example by setting a special environment variable while building atlas. And then have some way to tabulate the performance of different installs.
comment:113 Changed 10 years ago by
- Dependencies changed from sage-5.2.beta0 to #13160
- Status changed from needs_work to needs_review
I think the only remaining blocker is that it (or rather, cvxopt) doesn't build on hawk. Since I don't have an account, can someone test it (the spkg + patches from this ticket + the cvxopt spkg from #13160)? Everything else in this ticket has been reviewed already, we just have to check that the interaction with cvxopt is fixed on the last "fully supported" platform.
comment:114 Changed 10 years ago by
Cvxopt still doesn't build. I see the same error when using the ATLAS spkg here or when setting SAGE_ATLAS_LIB=/ATLAS32. Here is the log.
Does the ATLAS spkg here build on skynet/mark (and Solaris on sparc in general)?
comment:115 Changed 10 years ago by
Why the heck is it not finding gsl? Oh I see the include line is actually wrong. I'll check the spkg but we should continue with cvxopt issues at #13160.
comment:116 Changed 10 years ago by
John, now that you verified that it works on Hawk is there anything else that prevents you from pressing the positive review button? ;-)
comment:117 Changed 10 years ago by
I'm testing on a few skynet machines. Should I expect it to work on mark?
comment:118 follow-up: ↓ 119 Changed 10 years ago by
It worked for me on mark (sparc solaris)
comment:119 in reply to: ↑ 118 Changed 10 years ago by
Replying to vbraun:
It worked for me on mark (sparc solaris)
OMG cool, Skynet is back up. I was totally unaware of that!
comment:120 Changed 10 years ago by
I'm still not happy with "discarding" the built static libraries; there should at least be some convenient way to save them (other than SAGE_KEEP_BUILT_SPKGS=yes or installing with sage (-i|-f) -s ...).
Another issue is the extremely increased build time on some machines if one doesn't set SAGE_ATLAS_ARCH. Don't know how we could handle that, but it certainly gives rise to a lot of user complaints.
comment:121 Changed 10 years ago by
P.S.: W.r.t. the "knife": If you don't want to install static libraries (somewhere), at least previous ones should get removed (or moved somewhere else) upon a successful ATLAS build.
comment:122 follow-up: ↓ 124 Changed 10 years ago by
Now that I added the generic 64-bit archdefs the default build time (without setting SAGE_ATLAS_ARCH) should be moderate on all x86 systems, i.e. less CPU time than building the rest of Sage.
Your suggestions about handling the static libraries are enhancement requests. By itself, it's useless to keep a backup of the static libraries somewhere. I agree that one should keep them around and devise a way to benchmark them, but not on this ticket. Also I'm against attempting to delete stuff from previous installs unless it actively conflicts with the new spkg. Which it does not, the damage of statically linking is already done.
comment:123 Changed 10 years ago by
I'm willing to give this a positive review now. Leif, what about you? Can we defer your issues to a follow-up?
comment:124 in reply to: ↑ 122 ; follow-up: ↓ 126 Changed 10 years ago by
Replying to vbraun:
Now that I added the generic 64-bit archdefs the default build time (without setting SAGE_ATLAS_ARCH) should be moderate on all x86 systems, i.e. less CPU time than building the rest of Sage.
Ok, hopefully...
Your suggestions about handling the static libraries are enhancement requests. By itself, it's useless to keep a backup of the static libraries somewhere. I agree that one should keep them around and devise a way to benchmark them, but not on this ticket.
I'd rather say not installing them [anywhere] is a regression w.r.t. the previous spkg.
Also I'm against attempting to delete stuff from previous installs unless it actively conflicts with the new spkg. Which it does not, the damage of statically linking is already done.
If so, it shouldn't hurt to keep ATLAS installing them either... ;-)
[I'd expect "more damage" when having different .a and .so library versions.]
comment:125 Changed 10 years ago by
I don't want to hold up this ticket, but IMHO the Installation Guide should get updated, at least documenting the new atlas-config script.
comment:126 in reply to: ↑ 124 Changed 10 years ago by
Replying to leif:
I'd rather say not installing them [anywhere] is a regression w.r.t. the previous spkg.
It's a major improvement, not a regression!
If so, it shouldn't hurt to keep ATLAS installing them either... ;-)
As I explained previously, that's not true. We have to change things to make them better. But we can't un-link the static linkage that has happened previously, so when you upgrade from an existing spkg you potentially keep the old code. And there is nothing a new atlas spkg can do about this. If you want to be sure that you don't have cruft statically linked you'll have to do a clean install. This is precisely why it was a bad idea to install static libraries previously.
comment:127 Changed 10 years ago by
- Description modified (diff)
I've added documentation to the installation guide for the atlas-config script and updated the environment variables.
Note that you cannot just use
sage -f atlas-3.9.32.spkg
to update atlas only. Many other spkgs use blas/lapack and must be rebuilt. The easiest way is to do a separate Sage installation... The cvxopt spkg needs to be updated to link correctly with this atlas release, see #10509.