Opened 7 years ago

Closed 14 months ago

#20507 closed defect (invalid)

Slight annoyance wrt Numpy and ATLAS

Reported by: Erik Bray
Owned by:
Priority: minor
Milestone: sage-duplicate/invalid/wontfix
Component: packages: standard
Keywords:
Cc:
Merged in:
Authors:
Reviewers: Michael Orlitzky
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description

Since #20157, when I build Numpy, or anything else that uses numpy.distutils (namely SciPy), I get many instances of this warning:

atlas_blas_info:
Disabled atlas_blas_info: (ATLAS is None)
  libraries f77blas,cblas,atlas not found in []
  NOT AVAILABLE

/home/embray/src/sagemath/sage/local/var/tmp/sage/build/numpy-1.11.0/src/numpy/distutils/system_info.py:1584: UserWarning:
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.

This caused me deep confusion considering that my laptop just spent a long, hard night compiling ATLAS :) Of course, what's actually going on is that Numpy is being configured to use whatever generic blas/lapack it has (which is correctly including -latlas in its linker flags).

However, the spkg-install for Numpy sets the environment variable ATLAS=None, which forces Numpy's system_info to report that it doesn't have ATLAS. Of course, it does, so AFAICT this is just a user interface issue, albeit a confusing one.

Is there some way we could modify this so that when we do know what BLAS library we're using we at least let Numpy report it correctly? It could figure this out at install time by checking the lapack.pc and/or blas.pc files.
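
As a rough illustration of the install-time check suggested above, the sketch below reads the Libs lines of a .pc file and guesses the implementation from the -l flags. The function names and the KNOWN_LIBS table are hypothetical, and matching on library names is only a heuristic, not something Sage or Numpy is known to do.

```python
import re

# Hypothetical mapping from library names to implementation names
# (an assumption for illustration; extend as needed).
KNOWN_LIBS = {
    "atlas": "ATLAS",
    "openblas": "OpenBLAS",
}


def libs_from_pc(pc_text):
    """Return library names (-lfoo -> "foo") from the Libs lines of a .pc file."""
    names = []
    for line in pc_text.splitlines():
        key, sep, value = line.partition(":")
        # Only "Libs:" and "Libs.private:" carry linker flags.
        if sep and key.strip() in ("Libs", "Libs.private"):
            # Match -l flags only at the start of a whitespace-separated token.
            names.extend(re.findall(r"(?<!\S)-l(\S+)", value))
    return names


def guess_blas(pc_text):
    """Return a known implementation name, or None if nothing is recognised."""
    for name in libs_from_pc(pc_text):
        if name in KNOWN_LIBS:
            return KNOWN_LIBS[name]
    return None
```

A blas.pc whose Libs line contains -latlas would be reported as ATLAS; one that lists only -lblas yields None, in which case the caller would have to fall back to the current behaviour.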

Change History (23)

comment:1 Changed 7 years ago by François Bissey

It would be nice, but it may not be what we want. The only way to feed numpy your user configuration, whatever it is, is to set those variables to None. I did this in #20157, and the goal is to have the user's choices respected whatever they are. As you can read in the description of #20157, without disabling everything I could get whatever is found first in the list rather than what I want. Note that at least one new implementation has been added to the list in 1.11, if I am not mistaken (BLIS).

As far as I am concerned numpy should take user's wishes first and work out what it is if it can.

comment:2 Changed 7 years ago by Erik Bray

I tried reading through #20157 but there's a lot of context missing that makes it hard to tease out what the issues really are.

So what do you mean by "feed your user configuration" in this case? In #20157 you write "Since we want the choice of blas/lapack to be configurable by .pc files" but that could be better specified. What do you mean by that exactly? What if I don't want to edit a .pc file to change my BLAS or LAPACK libs and instead want to tell Numpy exactly what I'm using through the means it already provides?

What this should really be doing is setting all the ATLAS, PTATLAS, OPENBLAS, MKL, BLIS, etc. environment variables to 'None' *except* the one for the implementation I'm actually using.
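
A minimal sketch of that idea, assuming the variable names loosely follow the list above (the exact names numpy.distutils honours should be checked against its system_info.py). The function name blas_environment is hypothetical.

```python
# Candidate variable names, loosely following the list in the comment;
# treat them as assumptions to verify against numpy.distutils.
BLAS_ENV_VARS = ("ATLAS", "PTATLAS", "OPENBLAS", "MKL", "BLIS")


def blas_environment(selected, base_env, candidates=BLAS_ENV_VARS):
    """Return a copy of base_env with every candidate disabled except `selected`."""
    env = dict(base_env)
    for var in candidates:
        if var == selected:
            # Drop a leftover "None" so detection of this library can proceed;
            # a real path supplied by the user is left untouched.
            if env.get(var) == "None":
                del env[var]
        else:
            env[var] = "None"
    return env
```

The result could be passed as the env argument of a subprocess call that runs the build, so the calling process's own environment is not modified.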

For now I changed my spkg-install for Numpy to just search for -latlas in blas.pc and unset ATLAS if found. This does the correct thing, but it's still a hack since relying on the .pc file alone is not a great way to determine what BLAS I'm actually using (except in this case there is a way, just ad-hoc).

comment:3 Changed 7 years ago by François Bissey

The means used by numpy sucks. Why do you even need to know which blas/lapack you are using? For ATLAS it means that numpy may use ATLAS's C interface to lapack. Well, if you want to do that, maybe you should look for lapacke instead, something with a standard interface.

The point being, if you are after blas/lapack functions, they are standard and you shouldn't worry about the implementation underneath. numpy's mechanism is a cute way (at best) to try various known libraries in a given order of preference. My preference would be for numpy to start looking for something I feed it before/rather than trying autodetection.

comment:4 Changed 7 years ago by Erik Bray

If I set ATLAS to something more useful then it will look for something I'm feeding it.

Like I said in the original ticket, it's a minor but confusing annoyance that Numpy tells me it can't find ATLAS even when it is using ATLAS. I agree it doesn't matter for the end result, but it makes it impossible to read the build log and determine what BLAS implementation Numpy did find. This is even more confusing when configuring SciPy.

Is there some discussion you could refer me to where it was decided what the best way should be to configure BLAS for Sage? Because somewhere along the line you probably *do* know what you're using. And if you don't then you're using ATLAS since that's what Sage builds by default. So there's no reason for Numpy to report something inaccurate.

As I wrote in my previous comment if I know I'm using ATLAS and I unset the ATLAS environment variable then the issue is resolved. What the spkg-install is doing currently removes context from the Numpy configuration unnecessarily.

comment:5 Changed 7 years ago by François Bissey

OK, the current default in sage is to build ATLAS, but Volker and I have been in this conspiracy for at least a couple of years to make sage use any blas/lapack we like, mirroring a possibility already existing in sage-on-gentoo. We left a trail of tickets, although there have been a few private emails as well, starting with bringing pkgconf into sage (although it had other side purposes) and then its python interface. So we got the current ATLAS spkg to install a .pc file, and probably also the optional openblas spkg, and started to make use of .pc files all over the place. numpy/scipy was the last place not using the .pc file for blas/lapack and I made it behave, with a slightly lighter hand than what is done in pure Gentoo actually.

There has been the occasional email on sage-devel about using MKL and other alternative blas/lapack implementations, but only a few people have shown much interest.

comment:6 Changed 7 years ago by Erik Bray

I think those are fine goals to have, and are not necessarily contradictory with what I'm proposing--in fact the opposite. If there were a clearer way during a build of sage+dependencies to specify which BLAS implementation is being built with, it wouldn't be necessary to disable detection of that library in Numpy, Scipy, or anything else that uses numpy.distutils.

In other words, it would be good to configure Sage at build time what BLAS it's using in such a way that it can pass that on to other packages as well in case they do care, or just for the sake of provenance / reporting.

In the meantime this *can* be sort of done in the limited case of ATLAS, albeit in a somewhat fragile sense, by checking for -latlas. If/when Sage improves support for other libraries that could be handled on a case-by-case basis.

comment:7 in reply to:  6 Changed 7 years ago by Jeroen Demeyer

Replying to embray:

If there were a clearer way during a build of sage+dependencies to specify which BLAS implementation is being built with

Wasn't that the whole point of #17075?

comment:8 Changed 7 years ago by Erik Bray

I don't know if that was the whole point of #17075, but the result of that ticket does not achieve what I'm talking about.

comment:9 in reply to:  8 Changed 7 years ago by Jeroen Demeyer

Replying to embray:

I don't know if that was the whole point of #17075, but the result of that ticket does not achieve what I'm talking about.

Really? Then I am missing something, either in this ticket or in #17075...

Because of #17075, you no longer need to know which BLAS implementation Sage uses, just query pkgconfig. See src/module_list.py for an implementation of this. So we just need numpy to respect these pkgconfig files, right?
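
For readers unfamiliar with the approach, querying pkg-config for build flags could look roughly like the sketch below. It is not the code in src/module_list.py; parse_flags and query_pkg_config are hypothetical names, and only the standard `pkg-config --cflags --libs` invocation is assumed to be available.

```python
import shlex
import subprocess


def parse_flags(flags):
    """Split a pkg-config flag string into setuptools-style lists."""
    result = {"libraries": [], "library_dirs": [], "include_dirs": []}
    prefixes = {"-l": "libraries", "-L": "library_dirs", "-I": "include_dirs"}
    for token in shlex.split(flags):
        key = prefixes.get(token[:2])
        if key and len(token) > 2:
            result[key].append(token[2:])
    return result


def query_pkg_config(package):
    """Run pkg-config for `package` (e.g. "blas") and parse its output."""
    out = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_flags(out)
```

The caller never needs to know which implementation sits behind the generic "blas" name; it only receives the directories and library names to link against.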

comment:10 Changed 7 years ago by François Bissey

Next step may be to add options in configure for users to provide details of the blas/lapack setup instead of using the old environment variable for system ATLAS. But that's orthogonal to this ticket.

comment:11 Changed 7 years ago by Erik Bray

Maybe I can try to be more specific. Indeed there are two issues here. The first is pretty far removed from this issue but not "orthogonal" either.

On the pkg-config files, I think it would be good to emulate Debian's work on this. For each BLAS implementation (and LAPACK) it installs a .pc file specific to that implementation, for example "blas-atlas.pc". (It also makes a good distinction between Libs and Libs.private in those files; for example, -latlas is only needed in Libs.private.) The libraries themselves are installed in ${prefix}/lib/atlas-base (roughly speaking). Then under ${prefix}/lib and ${prefix}/lib/pkgconfig there are generic symlinks for libblas.so and blas.pc respectively. The Debian alternatives system is used to manage which implementation those symlinks point to. The point is that it's clear on inspection which implementation is being used at any given time, and the library name and version can also be used at build time for provenance purposes.

Sage doesn't have anything quite like Debian alternatives that I'm aware of. But it could, at least for the specific case of linear algebra libraries, and I think that would be a little nicer than the current situation where installing ATLAS just (partially) clobbers OpenBLAS and vice-versa. For example up above fbissey suggested using configure to provide the BLAS we want to use. Right now the only supported values are "atlas" or "openblas" but that's fine. Those spkgs can also be responsible for deciding whether to build a copy for Sage or use the user's system installation, as the ATLAS spkg currently does (but not OpenBLAS from the look of it). Selecting one of these at configure time would set a make variable called $(BLAS) (for example) that points to the appropriate spkg.

A generic meta-package called simply "blas" can be used as a dependency of other packages that currently explicitly depend on "atlas" (unless some package specifically requires ATLAS). The only (?) dependency of the "blas" metapackage is $(BLAS). Installing the "blas" metapackage can also handle the update-alternatives-like functionality of moving symlinks around depending on $(BLAS).
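
The symlink handling described above can be sketched as follows. This is only an illustration of the idea, assuming implementation-specific files named like "blas-atlas.pc" sit in the same directory as the generic "blas.pc"; select_blas is a hypothetical helper, not an existing Sage or Debian tool.

```python
import os


def select_blas(pkgconfig_dir, implementation, generic="blas.pc"):
    """Point the generic .pc symlink at blas-<implementation>.pc."""
    target = f"blas-{implementation}.pc"
    if not os.path.exists(os.path.join(pkgconfig_dir, target)):
        raise FileNotFoundError(target)
    link = os.path.join(pkgconfig_dir, generic)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    # Create the new link under a temporary name, then rename it over the
    # old one so the generic name never disappears during the switch.
    os.symlink(target, tmp)  # relative link within the same directory
    os.replace(tmp, link)
    return link
```

Because both .pc files stay installed, switching implementations only moves the link; neither library has to be uninstalled or rebuilt.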

Anyways that's my current thinking on the subject. Getting back to Numpy specifically (and scipy), if there is a $SAGE_BLAS-like variable available when its spkg-install is run we don't have to set all of Numpy's BLAS-related environment variables to None. Just all except the one we have explicitly selected to build Sage with. Then it is unnecessary as far as I can tell to completely hobble Numpy's auto-detection. It is guiding the auto-detection to pick exactly the implementation you want and nothing more, and it can at least accurately report what BLAS it was configured with (which it currently does not in Sage, and that's what was confusing to me in the first place).

I'll grant you that currently there are hardly any places where this matters practically (there is one place in SciPy that it does and I'm not sure it's that significant). That's why I marked this as "minor". It just struck me as unnecessarily limited, and an indication of an area where improvement could still be used.

comment:12 Changed 7 years ago by François Bissey

I am familiar with debian alternatives and its equivalent implementation in gentoo. I am sure Jeroen is too. Like you say I am not sure we want to go the full way in sage.

A meta package is I guess a vehicle we all have in mind but no one has committed to writing that yet. Although the current spkg is partly a meta-package already.

When I wrote about configure I didn't think about choosing the "internal" blas/lapack but providing a system one if desired, but choosing an internal one should probably be covered there as well.

I'll note that numpy/scipy is an oddity in informing you of the blas/lapack used. Auto-detection by configure scripts is difficult because of the variety of implementations. Most configure scripts just ask for the library you want to use and don't really care what it is, only that it works.

But to go back to your point: once you have gone to .pc files, some information is held in {blas,cblas,lapack}.pc. The content could be a bit more informative, at least when set up by an internal spkg.

SAGE_ROOT=/Users/fbissey/build/sage-7.2.beta5
prefix=${SAGE_ROOT}/local
libdir=${prefix}/lib
includedir=${prefix}/include
Name: blas
Version: 1.0
Description: blas for sage, set up by the ATLAS spkg.
Libs: -L${libdir} -lblas

Above is the blas.pc on my Mac. On Mac we use Apple's Accelerate framework (and it is set up by the atlas spkg; as I say, that spkg already has meta bits). The description could reflect the fact that it is Accelerate, not just that it has been set up by the atlas spkg.
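
A more informative Description could be generated along these lines. The template mirrors the file shown above; render_blas_pc and its parameters are hypothetical, and the wording of the Description line is just one possibility.

```python
# Template modelled on the blas.pc shown above. Doubled braces are literal
# braces in str.format, so "${{prefix}}" is written out as "${prefix}".
PC_TEMPLATE = """\
prefix={prefix}
libdir=${{prefix}}/lib
includedir=${{prefix}}/include
Name: blas
Version: 1.0
Description: blas for sage, provided by {implementation} (set up by the {spkg} spkg).
Libs: -L${{libdir}} {libs}
"""


def render_blas_pc(prefix, implementation, spkg, libs):
    """Return .pc file text naming the actual implementation in Description."""
    return PC_TEMPLATE.format(
        prefix=prefix,
        implementation=implementation,
        spkg=spkg,
        libs=" ".join(f"-l{name}" for name in libs),
    )
```

With implementation="Accelerate" and spkg="atlas", the Description line would state both facts, which is enough for a build log or a human reader to tell what is really being linked.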

comment:13 in reply to:  11 Changed 7 years ago by Jeroen Demeyer

Replying to embray:

Sage doesn't have anything quite like Debian alternatives that I'm aware of. But it could, at least for the specific case of linear algebra libraries, and I think that would be a little nicer than the current situation where installing ATLAS just (partially) clobbers OpenBLAS and vice-versa.

I think it's overkill to allow side-by-side installation of different BLAS installations within Sage. I agree that there should be a choice (just like we already have between MPIR and GMP or between Python2 and Python3) but there is no need to support two installations at the same time.

comment:14 Changed 7 years ago by Erik Bray

I don't think it's overkill at all, and as I described above is fairly straightforward to do. For example if you wanted to do performance comparisons between BLAS libraries having parallel installations would make it easy--otherwise one would have to have fully separate parallel Sage installs.

comment:15 in reply to:  14 Changed 7 years ago by Jeroen Demeyer

Replying to embray:

as I described above is fairly straightforward to do.

I have big doubts about this statement...

It's not sufficient to deal just with the BLAS libraries themselves, but also other packages (Python packages, other libraries) depending on BLAS...

comment:16 Changed 7 years ago by Jeroen Demeyer

I created a ticket #20542 to allow using openblas instead of ATLAS in Sage (but not a side-by-side installation of both).

comment:17 Changed 7 years ago by Erik Bray

If I spent some 24 hours waiting for ATLAS to build, and then I install OpenBLAS, it will clobber my ATLAS installation. That alone is a good reason to allow side-by-side installation.

I'm not saying that other packages might not need to be rebuilt if you switch (some maybe, others no), but switching shouldn't be a matter of effectively uninstalling one library or the other.

comment:18 Changed 7 years ago by Jeroen Demeyer

I see your point but I still think it's a large effort for a small gain...

comment:19 Changed 7 years ago by Erik Bray

I don't know. I think #20542 did a large portion of the work (thanks!). Maybe there's something I'm not seeing that you think makes this difficult.

comment:20 Changed 6 years ago by Erik Bray

Component: PLEASE CHANGE → packages: standard

comment:21 Changed 15 months ago by Matthias Köppe

Milestone: sage-duplicate/invalid/wontfix
Status: new → needs_review

Outdated, should close

comment:22 Changed 14 months ago by Michael Orlitzky

Reviewers: Michael Orlitzky
Status: needs_review → positive_review

SPKG management problems should generally be solved with a system package, not by adding more complexity to sage's duct-tape package manager. In any case, Atlas is being removed soon in #30350.

comment:23 Changed 14 months ago by Matthias Köppe

Resolution: invalid
Status: positive_review → closed