Opened 10 months ago

Last modified 2 months ago

#26018 needs_work defect

Race conditions because of DESTDIR

Reported by: jdemeyer Owned by:
Priority: major Milestone: sage-8.8
Component: build Keywords:
Cc: embray Merged in:
Authors: Erik Bray Reviewers:
Report Upstream: N/A Work issues:
Branch: u/embray/build/python_toolchain (Commits) Commit: 6c9482b6d65a153cb87f1c07801f503e4c7e7437
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

Using a staging directory for package install can introduce race conditions. For example, when installing a Python package, the package is installed first and then the egg-info directory is installed. The latter is what pkg_resources uses to consider a package "installed".

However, when a DESTDIR install is used, the order that files appear in the final installation directory might be different.

If the egg-info directory is copied to site-packages before the actual Python package, then this can happen for example:

Installing alabaster-0.7.10
Installing package alabaster using pip2
Ignoring indexes: https://pypi.python.org/simple
Processing /home/jdemeyer/sage-test/local/var/tmp/sage/build/alabaster-0.7.10/src
  Running setup.py (path:/tmp/pip-0ac_4C-build/setup.py) egg_info for package from file:///home/jdemeyer/sage-test/local/var/tmp/sage/build/alabaster-0.7.10/src
    Running command python setup.py egg_info
    running egg_info
    creating pip-egg-info/alabaster.egg-info
    writing pip-egg-info/alabaster.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/alabaster.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/alabaster.egg-info/dependency_links.txt
    writing manifest file 'pip-egg-info/alabaster.egg-info/SOURCES.txt'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-0ac_4C-build/setup.py", line 37, in <module>
        'Programming Language :: Python :: 3.4',
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/__init__.py", line 129, in setup
        return distutils.core.setup(**attrs)
      File "/home/jdemeyer/sage-test/local/lib/python2.7/distutils/core.py", line 151, in setup
        dist.run_commands()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
        self.run_command(cmd)
      File "/home/jdemeyer/sage-test/local/lib/python2.7/distutils/dist.py", line 972, in run_command
        cmd_obj.run()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 278, in run
        self.find_sources()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 293, in find_sources
        mm.run()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 524, in run
        self.add_defaults()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 563, in add_defaults
        rcfiles = list(walk_revctrl())
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools/command/sdist.py", line 20, in walk_revctrl
        for item in ep.load()(dirname):
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2405, in load
        return self.resolve()
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2411, in resolve
        module = __import__(self.module_name, fromlist=['__name__'], level=0)
      File "/home/jdemeyer/sage-test/local/lib/python2.7/site-packages/setuptools_scm/__init__.py", line 9, in <module>
        from .utils import trace
    ImportError: No module named utils

What you see here is that setuptools_scm is used by pkg_resources even though the package setuptools_scm has not fully been installed.

Change History (29)

comment:1 Changed 10 months ago by jdemeyer

Very naive fix at least for Python packages:

  • build/bin/sage-spkg

    diff --git a/build/bin/sage-spkg b/build/bin/sage-spkg
    index 9b90421..eaf271e 100755
    a b if [ -d "$SAGE_DESTDIR" ]; then 
    881881    FILE_LIST=""
    882882    FIRST=1
    883883    old_IFS="$IFS"; IFS=$'\n'
    884     for filename in $(find "$PREFIX" -type f -o -type l | sort); do
     884    for filename in $(find "$PREFIX" -type f -o -type l | sort -r); do
    885885        filename="${filename#$PREFIX}"
    886886        if [ $FIRST -eq 1 ]; then
    887887            FILE_LIST="\"$filename\""

comment:2 Changed 10 months ago by jdemeyer

  • Description modified (diff)

comment:3 Changed 10 months ago by embray

Thanks for figuring out what was going on there. Do you know a reliable way to reproduce this? I wonder why I've never seen it before.

comment:4 follow-up: Changed 10 months ago by embray

Another thing to note: Previously we worked around this kind of race condition by putting a lock around pip, so only one package can be pip-installed at a time. Now we still have that lock, but I think it's not actually necessary (though maybe still worth keeping?) since we in general do not pip-install packages to the same site-packages at the same time (when using DESTDIR).

However, similar problems arise if packages can be copied into site-packages by sage-spkg while other python packages are simultaneously being installed. So maybe we keep the exiting pip lock, but also require sage-spkg to obey it somehow.

comment:5 Changed 9 months ago by embray

I just had this happen trying to rebuild the Windows patchbot, which broke again, for some reason.

While I believe this is a more general problem in principle, it's happening specifically because installing setuptools_scm can alter the behavior of pip (in particular, it installs some setuptools.file_finder entrypoints). It's also pulled in as a setup_requires from a couple of packages that have it listed as a prerequisite.

This is relatively unusual, and I think among packages installed by Sage, setuptools_scm is probably the only one like this. So I wonder if we shouldn't just list it as a prerequisite to almost all Python packages, like we currently do with pip.

comment:6 Changed 9 months ago by embray

  • Authors set to Erik Bray
  • Branch set to u/embray/build/python_toolchain
  • Commit set to 6c16368a6416d1e3655e026c63811fc9fbe23d2f
  • Status changed from new to needs_review

Just as one possible idea, what about this?

Technically, what I called $(PYTHON_TOOLCHAIN), could just be spelled "setuptools_scm", since there's already a direct dependency sequence of setuptools -> pip -> setuptools_scm. But I like that having a PYTHON_TOOLCHAIN variable makes this relationship to other packages explicit, and can be modified in one place.


New commits:

6c16368Add a PYTHON_TOOLCHIN variable to the makefile, which makes explicit the prerequisites to installing other Python packages with pip

comment:7 Changed 9 months ago by git

  • Commit changed from 6c16368a6416d1e3655e026c63811fc9fbe23d2f to b6268ef708d10573b283cb4dea1c691e90a83872

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b6268efAdd a PYTHON_TOOLCHIN variable to the makefile, which makes explicit the prerequisites to installing other Python packages with pip

comment:8 Changed 9 months ago by embray

(had to fix one case, build/pkgs/simplegeneric/dependencies, where my sed call was a little over-aggressive; I should have made it just apply to the first line :)

comment:9 in reply to: ↑ 4 Changed 9 months ago by jdemeyer

Replying to embray:

Another thing to note: Previously we worked around this kind of race condition by putting a lock around pip, so only one package can be pip-installed at a time. Now we still have that lock, but I think it's not actually necessary (though maybe still worth keeping?) since we in general do not pip-install packages to the same site-packages at the same time (when using DESTDIR).

Good thinking! The problem is that the part which copies files from DESTDIR to the final installation directory is not locked. Adding an exclusive lock there would also mitigate this and maybe also #18438.

comment:10 Changed 9 months ago by embray

I have in the past considered putting a lock around any kind of writing to $SAGE_LOCAL--not for any specific reason but it just seemed like it would probably be a good idea. I had a feeling problems like this one would come up less frequently with such a lock. However, for that to be practical, while still keeping parallel builds reasonably fast, we need as many packages as possible using DESTDIR builds.

I can experiment with that, if you think it's a good idea. In the meantime what do you think about the workaround I attached? I think it makes a certain amount of sense independently of this issue, but I'm not sure.

comment:11 Changed 8 months ago by jhpalmieri

I don't understand the branch: it looks like pip need not be installed before all of the packages that (without the branch) have it as a dependency. In any case, with a Python 3 build, I get

No record that 'setuptools_scm' was ever installed; skipping uninstall
Installing setuptools_scm-1.15.6
your setuptools is too old (<12)
setuptools_scm functionality is degraded
/Users/palmieri/Desktop/Sage_stuff/sage_builds/PYTHON3/sage-8.4.beta5/local/bin/python3: No module named pip

Perhaps pip should not have $(PYTHON_TOOLCHAIN) as a dependency.

comment:12 follow-up: Changed 8 months ago by jhpalmieri

  • Status changed from needs_review to needs_work

Similar error with a Python 2 build. I don't think it's too much to ask for you to actually try your own branch before marking it as "needs review". Please do better in the future.

comment:13 in reply to: ↑ 12 Changed 8 months ago by embray

Replying to jhpalmieri:

Similar error with a Python 2 build. I don't think it's too much to ask for you to actually try your own branch before marking it as "needs review". Please do better in the future.

With respect, this is extremely condescending and unnecessary. I'm a professional software engineer and you are not my employer or my mentor--you don't have to tell me how to do my job. Obviously I tested it, being that it's a fairly non-trivial change. I see what the problem is about pip and I'm not sure why I didn't hit on it, though I only would have if I were fully rebuilding from scratch (I believe, instead, I was doing some ./sage -f calls that were designed to trigger the race condition).

Sage-the-distribution is a very large, complex pile of software, and it's difficult to test all possible scenarios, especially for builds. I'm not sure what you think the point of "needs review" is, if not to provide peer-review, which includes peer-testing. If you feel your time has been wasted you were also welcome to check the current results of the patchbots (which also show a build failure) and point that out rather than lecturing me needlessly.

Last edited 8 months ago by embray (previous) (diff)

comment:14 Changed 8 months ago by git

  • Commit changed from b6268ef708d10573b283cb4dea1c691e90a83872 to e7efb49659ca300b0f894a6624bb6e7ab3466415

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e7efb49Add a PYTHON_TOOLCHIN variable to the makefile, which makes explicit the prerequisites to installing other Python packages with pip

comment:15 Changed 8 months ago by embray

  • Status changed from needs_work to needs_review

Updated to ensure that $(PYTHON_TOOLCHAIN) is not used in the dependencies of the PYTHON_TOOLCHAIN members themselves.

IIRC (it was several weeks ago) I did do this in the first place, but I think when I ran sed over the remaining packages I might have forgotten to exclude pip and setuptools_scm in particular.

Of course, let's wait and see what my patchbot says.

comment:16 follow-up: Changed 8 months ago by jhpalmieri

My point is exactly that this is a ticket about building Sage from scratch, so running make distclean; make is one of the first things to test, it seems to me.

Edit: let me tone this down a little.

So here is a serious question: if I think you have not done a good job on this ticket, should I not say anything? I think it's not bad to provide constructive criticism, like saying "you should have run make distclean; make before marking this as needs_review."

Last edited 8 months ago by jhpalmieri (previous) (diff)

comment:17 follow-up: Changed 8 months ago by jhpalmieri

Let me put this another way. As you have no doubt experienced, in the Sage community there are different styles of reviewers. Some are very thorough, while some will take a glance at the code and say "looks good to me, positive review". One can get away with that superficial kind of review some of the time, for example a Python 3 ticket which just replaces some iteritems by items, but on a ticket like this, it needs thorough testing. I believe that the burden of that thorough testing falls on both the author and the reviewer. In this particular case, the problem gets caught by the patchbots, but that does not always happen.

So especially on a ticket involving Sage's build process, I think it is important for both the authors and the reviewers to be more careful and thorough than they might otherwise be. It's better to get it right the first time than to merge a ticket with problems and have to put out a bunch of fires later.

Last edited 8 months ago by jhpalmieri (previous) (diff)

comment:18 in reply to: ↑ 16 Changed 8 months ago by embray

Replying to jhpalmieri:

My point is exactly that this is a ticket about building Sage from scratch, so running make distclean; make is one of the first things to test, it seems to me.

It isn't necessarily though. The issue at hand can occur in any situation where, at the most, the python2 package needs to be rebuilt, as do its dependencies. A full build from scratch was not necessary to test it--I had found it sufficient for reliably reproducing the problem to do ./sage -f <something> where I forget exactly what <something> was but it was one of python's early dependencies. Possibly something fast, that wouldn't require a whole lot of other packages to be rebuilt. Now that I look, it I think it was probably sqlite.

Edit: let me tone this down a little.

So here is a serious question: if I think you have not done a good job on this ticket, should I not say anything? I think it's not bad to provide constructive criticism, like saying "you should have run make distclean; make before marking this as needs_review."

I think that would have been completely fair, though I have mixed feelings about "before marking this as needs_review". What I find insulting is the assumption that I (or anyone, if it were someone else) just didn't test it at all. It might be a little more reasonable if were something that obviously would never have worked at all under any circumstances, like having a syntax error in some code such that it would not have even compiled.

Even then there are cases I've seen, both from myself and others, where I had something working fine, but accidentally hit a key in my editor and inserted a stray symbol somewhere just before committing the change. I almost always review my diff before committing, but something like that can be easy to fix (and of course if I did skip diff review entirely, that's on me). In fact, as I pointed out, this was in some ways such a case, since I had it "right" in the first place but then made a mistake before committing. I did do a diff review but it was the sort that was easy to glaze over since it was dozens of almost equivalent changes.

But I don't think anyone should be lectured to about setting a ticket to "needs review" if there happens to be a mistake. That's the whole point of asking for review--it's saying "I've done everything I can or have time for to work on this; please take a look".

Last edited 8 months ago by embray (previous) (diff)

comment:19 in reply to: ↑ 17 Changed 8 months ago by embray

Replying to jhpalmieri:

So especially on a ticket involving Sage's build process, I think it is important for both the authors and the reviewers to be more careful and thorough than they might otherwise be. It's better to get it right the first time than to merge a ticket with problems and have to put out a bunch of fires later.

I don't disagree of course. Part of the problem is that we do not have a QA department to set review policy. But you will never, ever get everything "right the first time" and that's why we have peer review and continuous integration in the first place (and even with those things some things will be wrong and that's OK).

In the meantime, I think there would be less of a problem here if there were a test policy and review policy for build issues. Lacking that, you can't point fingers if someone didn't follow a policy that doesn't exist. (Though as I've noted, I don't think it's fair to "point fingers" at all on code that is still under review in the first place).

I appreciate your code reviews so I hope you'll keep doing them as they are very valuable; but please don't assume that if something is broken it's not because the contributor was lazy or careless or doesn't know what they're doing.

comment:20 Changed 5 months ago by embray

  • Milestone changed from sage-8.4 to sage-8.6

comment:21 Changed 5 months ago by vbraun

  • Status changed from needs_review to needs_work

comment:22 Changed 5 months ago by embray

Per #26990, the future package also inserts itself at a fairly "low level" in that it ends up getting imported, once it's installed, by other Python build tools like setuptools and/or pip. So it should also be included in the list of toolchain packages.

comment:23 Changed 5 months ago by git

  • Commit changed from e7efb49659ca300b0f894a6624bb6e7ab3466415 to cce3a23d86197f4d9603d3640d4196cbafa8a7fe

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b04b029Add a PYTHON_TOOLCHAIN variable to the makefile, which makes explicit the prerequisites to installing other Python packages with pip
cce3a23add the "future" package to PYTHON_TOOLCHAIN

comment:24 Changed 5 months ago by git

  • Commit changed from cce3a23d86197f4d9603d3640d4196cbafa8a7fe to a06c851550132fb137aafa09b2f749a52dd5fede

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a06c851add the "future" package to PYTHON_TOOLCHAIN

comment:25 Changed 5 months ago by git

  • Commit changed from a06c851550132fb137aafa09b2f749a52dd5fede to 6c9482b6d65a153cb87f1c07801f503e4c7e7437

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b75c772Add a PYTHON_TOOLCHAIN variable to the makefile, which makes explicit the prerequisites to installing other Python packages with pip
6c9482badd the "future" package to PYTHON_TOOLCHAIN

comment:26 Changed 5 months ago by embray

  • Status changed from needs_work to needs_review

Rebased on 8.6.beta1, added future to the list of PYTHON_TOOLCHAIN packages and ensured that those packages are installed in order and can't be installed in parallel. This should still work otherwise.

Last edited 5 months ago by embray (previous) (diff)

comment:27 Changed 4 months ago by embray

  • Milestone changed from sage-8.6 to sage-8.7

Retarging tickets optimistically to the next milestone. If you are responsible for this ticket (either its reporter or owner) and don't believe you are likely to complete this ticket before the next release (8.7) please retarget this ticket's milestone to sage-pending or sage-wishlist.

comment:28 Changed 3 months ago by chapoton

  • Status changed from needs_review to needs_work

red branch

comment:29 Changed 2 months ago by embray

  • Milestone changed from sage-8.7 to sage-8.8

Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)

Note: See TracTickets for help on using tickets.