Opened 16 months ago

Closed 15 months ago

Last modified 14 months ago

#27124 closed defect (fixed)

File manifests complete broken on OSX

Reported by: embray Owned by:
Priority: critical Milestone: sage-8.7
Component: build Keywords:
Cc: jhpalmieri Merged in:
Authors: John Palmieri Reviewers: Erik Bray
Report Upstream: N/A Work issues:
Branch: f8c7f67 (Commits) Commit:
Dependencies: Stopgaps:

Description

The JSON files written to $SAGE_LOCAL/var/lib/sage/installed/ are syntactically invalid on OSX due to the filenames in the the files list not being quoted. I believe this may be a regression introduced in #26011. In particular the sed command in the line

FILE_LIST="$(echo "$FILE_LIST" | sed 's/^\(.\+\)$/"\1"/; 2,$s/^/        /; $!s/$/,/')"

does not work at all correctly with the BSD sed on OSX.

The main impact of this is that packages cannot be properly uninstalled, which may in turn have had various snowball effects on OSX builds.

Change History (37)

comment:1 Changed 16 months ago by jhpalmieri

  • Branch set to u/jhpalmieri/sage-spkg-FILE_LIST

comment:2 Changed 16 months ago by jhpalmieri

  • Authors changed from Erik Bray to John Palmieri
  • Commit set to 5a49bb7d2936d21f4fbaf52756df51528edbc690
  • Status changed from new to needs_review

This works for me. I can now uninstall packages successfully.

Note that independent of this ticket, some manifests have empty file lists: gap, python3 (but not python2), libhomfly are some examples.

Edit: I think I'm wrong about this. libhomfly is an example, but not the others: they have proper manifests.


New commits:

5a49bb7trac 27124: for file manifests in sage-spkg, use python instead of sed
Last edited 16 months ago by jhpalmieri (previous) (diff)

comment:3 Changed 16 months ago by git

  • Commit changed from 5a49bb7d2936d21f4fbaf52756df51528edbc690 to 0b4dc9c3790eaab06b94ea2647318844beec83c1

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

0b4dc9ctrac 27124: for file manifests in sage-spkg, use python instead of sed

comment:4 Changed 16 months ago by jhpalmieri

  • Status changed from needs_review to needs_work

comment:5 Changed 16 months ago by jhpalmieri

I can't get this to work with Python: $FILE_LIST can be too long, apparently, at least the way I was trying to do things: something like

FILE_LIST=`sage-system-python -c "from sys import argv; s = ',\n      '.join(['\"' + x + '\"' for x in argv[1].split()]); print(s)" "$FILE_LIST"`

The file manifest for Python 3 was too long, so it ended up completely blank.

I think I can do everything in bash, though. Ugly but it seems to work. I'll post a branch once I'm done building.

comment:6 Changed 16 months ago by git

  • Commit changed from 0b4dc9c3790eaab06b94ea2647318844beec83c1 to 60a2f06a53229d2e19bbd6f4fe6d3ac1966eef40

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

60a2f06trac 27124: for file manifests in sage-spkg, use python instead of sed

comment:7 Changed 16 months ago by jhpalmieri

  • Status changed from needs_work to needs_review

Okay, let's try this.

comment:8 Changed 16 months ago by git

  • Commit changed from 60a2f06a53229d2e19bbd6f4fe6d3ac1966eef40 to 11a0a1d9ce60545ebb208bc80a06bfdda72add37

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

11a0a1dtrac 27124: for file manifests in sage-spkg, avoid GNUisms in sed

comment:9 Changed 16 months ago by embray

I think previously it was like this, and this bug was introduced when I tried to fix it, because Jeroen complained that the previous code was too slow on long file lists (this was #26011). However, your version is using += for string concatenation, which is a bash-ism. That might be okay since I think the script requires bash anyways. What I don't know for sure (and maybe you do) is if this is any faster as a method of string concatenation (maybe it is!)

A few weeks ago before I got side-tracked I started on a little Python script that would just generate the stamp file though, obviating the need for any fragile shell code for generating JSON :/

Last edited 16 months ago by embray (previous) (diff)

comment:10 Changed 16 months ago by jhpalmieri

The script starts with #!/usr/bin/env bash, so bash-isms are fine.

I just checked some timings. On my machine, boost_cropped has one of the largest file manifests, and also the package itself is quick to install. With the develop branch:

real	0m21.080s
user	0m7.821s
sys	0m12.785s

With this branch:

real	0m21.285s
user	0m7.959s
sys	0m12.966s

For the pari_galpol packaged mentioned at #26011, with develop:

real	0m21.661s
user	0m3.776s
sys	0m17.486s

With this branch:

real	0m20.393s
user	0m4.042s
sys	0m15.856s

So timing is not an issue, I think.

comment:11 follow-up: Changed 16 months ago by jhpalmieri

With Python, by the way, I had problems feeding the whole file list to Python: for some packages it was too long for a command line argument. At some point, as you and others have said, it would be good to rewrite all of sage-spkg in Python, at which point this would be easy to handle.

comment:12 Changed 15 months ago by jhpalmieri

I'm also tempted to add a doctest:

  • src/sage/misc/package.py

    diff --git a/src/sage/misc/package.py b/src/sage/misc/package.py
    index 8501acb53f..f2ea30aa44 100644
    a b def experimental_packages(): 
    453453    return (sorted(pkg['name'] for pkg in pkgs if pkg['installed']),
    454454            sorted(pkg['name'] for pkg in pkgs if not pkg['installed']))
    455455
     456def package_manifest(package, package_type='standard'):
     457    """
     458    Return the file manifest for ``package``.
     459
     460    INPUT:
     461
     462    - ``package`` -- package name
     463    - ``package_type`` (optional, default "standard") -- either
     464      "standard", "optional", or "experimental"
     465
     466    The manifest is written in the file
     467    ``SAGE_SPKG_INST/package-VERSION``. It is a JSON file containing
     468    dictionary with the package name, version, installation date, list
     469    of installed files, etc.
     470
     471    EXAMPLES::
     472
     473        sage: from sage.misc.packages import package_manifest
     474        sage: sagetex_manifest = package_manifest('sagetex')
     475        sage: sagetex_manifest == 'sagetex'
     476        True
     477        sage: 'files' in sagetex_manifest
     478        True
     479    """
     480    version = package_versions(package_type)[package][0]
     481    stamp_file = os.path.join(os.environ['SAGE_SPKG_INST'],
     482                              '{}-{}'.format(package, version))
     483    spkg_meta = {}
     484    try:
     485        with open(stamp_file) as f:
     486            spkg_meta = json.load(f)
     487    except ValueError: # Error reading JSON file
     488        pass
     489    return spkg_meta
     490

It probably needs more error checking that I want to do right now: is it actually a valid package name, etc.?

comment:13 in reply to: ↑ 11 Changed 15 months ago by embray

Replying to jhpalmieri:

With Python, by the way, I had problems feeding the whole file list to Python: for some packages it was too long for a command line argument. At some point, as you and others have said, it would be good to rewrite all of sage-spkg in Python, at which point this would be easy to handle.

Yes, I had thought of maybe writing the list to a file and having the script read the list from that file. I do want to rewrite that entire script in python too, but that's another issue.

Your timings are helpful, thank you. ISTM += is much more efficient than repeated string interpolation like I had previously.

The utility function + tests sounds like a good idea. Let's add that too. I think that if it can't be parsed as a valid JSON document that should raise an error and not just silently return an empty dict.

comment:14 Changed 15 months ago by git

  • Commit changed from 11a0a1d9ce60545ebb208bc80a06bfdda72add37 to 7b7a402dcbe4e88d8e2d8d6963cc5a8ec63906ff

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

3a58f8dtrac 27124: for file manifests in sage-spkg, avoid GNUisms in sed
b5de668trac 27124: add a doctest
7b7a402trac 27124: bump package version on sagetex so the new doctest will pass.

comment:15 follow-up: Changed 15 months ago by jhpalmieri

Okay, here is a doctest. It tests the manifest for sagetex, so I bumped the version number on that package to make sure that the new manifest gets built; otherwise this doctest will fail on OS X machines doing incremental upgrades.

People should probably rebuild everything once this is in place to make sure they have valid file manifests, but I don't want to impose that.

Last edited 15 months ago by jhpalmieri (previous) (diff)

comment:16 in reply to: ↑ 15 Changed 15 months ago by embray

Replying to jhpalmieri:

People should probably rebuild everything once this is in place to make sure they have valid file manifests, but I don't want to impose that.

Yes, it's an unfortunate corner case in a way (just one that happens to affect anyone building Sage on OSX--too bad).

Thank you for the test.

comment:17 Changed 15 months ago by embray

The line

+    spkg_meta = {}

is now superfluous but I can remove that in a squash, if you want. Or if you get to it first that's fine too.

comment:18 follow-up: Changed 15 months ago by embray

Another neat test would be to actually provide a schema for these files, and try to validate all of them against the schema.

If, in the off chance, someone is upgrading some very old SAGE_ROOT and has some files that don't fit the schema it could warn them that they should rebuild those packages.

comment:19 Changed 15 months ago by git

  • Commit changed from 7b7a402dcbe4e88d8e2d8d6963cc5a8ec63906ff to f8c7f67c11c55c5102baa67d03277a3d44ddcf87

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b3687a2trac 27124: add a doctest
f8c7f67trac 27124: bump package version on sagetex so the new doctest will pass.

comment:20 in reply to: ↑ 18 Changed 15 months ago by jhpalmieri

Replying to embray:

Another neat test would be to actually provide a schema for these files, and try to validate all of them against the schema.

If, in the off chance, someone is upgrading some very old SAGE_ROOT and has some files that don't fit the schema it could warn them that they should rebuild those packages.

I think that would belong on a separate ticket; I want to get this problem fixed quickly.

I'm also not sure how to do it. If it's a doctest, it would fail for everyone using OS X, unless it's lenient about the format. Or would it run (once?) when you start up Sage? What did you have in mind for a schema?

comment:21 Changed 15 months ago by embray

I think if the test failed that would be a good thing: It informs that that system still has some old stamp files either in the old format (before I made it a JSON format which was quite a few Sage versions ago now), or the system was affected by this bug in which case it would be good to rebuild and regenerate those files anyways.

comment:22 Changed 15 months ago by embray

But I agree, it should be a separate issue.

Not sure what the patchbots are saying about pyflakes. I think it might just be bugginess on the patchbot's part.

comment:23 Changed 15 months ago by embray

I made #27363 for the schema idea. I'm going to test this a bit on Windows but otherwise it looks good.

comment:24 Changed 15 months ago by embray

  • Reviewers set to Erik Bray
  • Status changed from needs_review to positive_review

Finally got a chance to test this. Thanks again!

comment:25 Changed 15 months ago by jhpalmieri

Thank you!

comment:26 Changed 15 months ago by vbraun

  • Branch changed from u/jhpalmieri/sage-spkg-FILE_LIST to f8c7f67c11c55c5102baa67d03277a3d44ddcf87
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:27 Changed 14 months ago by gh-timokau

  • Commit f8c7f67c11c55c5102baa67d03277a3d44ddcf87 deleted

The new doctest fails for me on distro, since there are no package manifests. Is there any point to those manifests on distro?

comment:28 follow-up: Changed 14 months ago by jhpalmieri

Can you point out where distro versions of Sage are documented? I don't know the answers to all sorts of basic questions:

  • do you ever run sage -i PKG? (Installing and uninstalling packages are the reason for the manifests: if you want to uninstall a package, which for example is part of upgrading a package, Sage uses the file manifest to know which files to remove.)
  • is there a directory SAGE_ROOT? If so, what subdirectories of it exist?
  • does SAGE_SPKG_INST exist?

One way to help make distro versions of Sage more prominent is to actually document them, probably in the developer's guide. That could be more efficient than having people with no distro experience make changes which break things.

comment:29 in reply to: ↑ 28 ; follow-up: Changed 14 months ago by gh-timokau

Replying to jhpalmieri:

Can you point out where distro versions of Sage are documented?

Does https://wiki.sagemath.org/Distribution#Package_managers cover what you are looking for?

I don't know the answers to all sorts of basic questions:

I can only speak for sage on nix with authority, but the answers will probably hold for other distros as well.

  • do you ever run sage -i PKG? (Installing and uninstalling packages are the reason for the manifests: if you want to uninstall a package, which for example is part of upgrading a package, Sage uses the file manifest to know which files to remove.)

No, sage -i is never used. The package manager is responsible for installing and uninstalling packages.

  • is there a directory SAGE_ROOT? If so, what subdirectories of it exist?

In the case of nix I just set SAGE_ROOT to a directory just containing the sage source code. I think that's not really important, I just do that to statisfy some doctest IIRC. Other distros will differ here. SAGE_LOCAL etc. are not subdirectories of SAGE_ROOT but explicitly set.

  • does SAGE_SPKG_INST exist?

Not sure what that is exactly. The only thing I do currently is create a file in SAGE_LOCAL/installed (that is probably SAGE_SPKG_INST?) for every installed dependency. The files are empty and again are mostly to statisfy some sage doctest that tries to determine weather or not a package is installed. This will hopefully become unnecessary at some point in the future due to https://trac.sagemath.org/ticket/20382.

One way to help make distro versions of Sage more prominent is to actually document them, probably in the developer's guide. That could be more efficient than having people with no distro experience make changes which break things.

There is the wiki page, is that not enough? Not sure how much sage-the-project directly supports the distro packages.

comment:30 in reply to: ↑ 29 ; follow-up: Changed 14 months ago by jhpalmieri

Replying to gh-timokau:

Replying to jhpalmieri:

Can you point out where distro versions of Sage are documented?

Does https://wiki.sagemath.org/Distribution#Package_managers cover what you are looking for?

No, because it doesn't answer my questions.

I don't know the answers to all sorts of basic questions:

I can only speak for sage on nix with authority, but the answers will probably hold for other distros as well.

  • do you ever run sage -i PKG? (Installing and uninstalling packages are the reason for the manifests: if you want to uninstall a package, which for example is part of upgrading a package, Sage uses the file manifest to know which files to remove.)

No, sage -i is never used. The package manager is responsible for installing and uninstalling packages.

  • is there a directory SAGE_ROOT? If so, what subdirectories of it exist?

In the case of nix I just set SAGE_ROOT to a directory just containing the sage source code. I think that's not really important, I just do that to statisfy some doctest IIRC. Other distros will differ here. SAGE_LOCAL etc. are not subdirectories of SAGE_ROOT but explicitly set.

  • does SAGE_SPKG_INST exist?

Not sure what that is exactly.

It's set to SAGE_LOCAL/var/lib/sage/installed in sage/env.py.

The only thing I do currently is create a file in SAGE_LOCAL/installed (that is probably SAGE_SPKG_INST?) for every installed dependency. The files are empty and again are mostly to statisfy some sage doctest that tries to determine weather or not a package is installed. This will hopefully become unnecessary at some point in the future due to https://trac.sagemath.org/ticket/20382.

One way to help make distro versions of Sage more prominent is to actually document them, probably in the developer's guide. That could be more efficient than having people with no distro experience make changes which break things.

There is the wiki page, is that not enough? Not sure how much sage-the-project directly supports the distro packages.

There should be guidance in the developer's guide about for example: what directories can be assumed to exist when writing code for the Sage library or Sage scripts, what sorts of doctests should be avoided and/or are incompatible with distros, etc.

I'm not sure how much support there is, either. If you want more support, more documentation is always good.


In this particular case, the doctest could perhaps be changed to something like: if the directory SAGE_SPKG_INST exists, then check the validity of the file manifest.

comment:31 in reply to: ↑ 30 ; follow-up: Changed 14 months ago by gh-timokau

Replying to jhpalmieri:

Replying to gh-timokau:

  • does SAGE_SPKG_INST exist?

Not sure what that is exactly.

It's set to SAGE_LOCAL/var/lib/sage/installed in sage/env.py.

Oh nevermind I misread my own build script. SAGE_LOCAL/var/lib/sage/installed is where I install those files to.

There should be guidance in the developer's guide about for example: what directories can be assumed to exist when writing code for the Sage library or Sage scripts, what sorts of doctests should be avoided and/or are incompatible with distros, etc.

I'm not sure how much support there is, either. If you want more support, more documentation is always good.

The issue is that there is no central authority to say that there is or isn't support. But I think at least Erik has been pushing in the support direction.

In this particular case, the doctest could perhaps be changed to something like: if the directory SAGE_SPKG_INST exists, then check the validity of the file manifest.

Back to my first question: what are the usecases for these manifests?

And to answer your question: As a generalization, trac tickets that only touch "sagelib" (i.e. most stuff under src/, the actual math library) shouldn't need any special attention for distros. Similarly, everything that only changes the build system/"package manager"/"sage-the-distro" aspects of sage (most stuff under build/) should be fine, since distros don't use any of that.

Issues are likely when a ticket touches both sagelib and sage-the-distro. For example when a package is updated and the sagelib doctests have to be adjusted because of some minor precision changes, all distros will have to cherry-pick the doctest fixes when they update the dependency (solution for this is not to limit the precision of the doctests to something realistic instead of just adjusting them with every dependency update). Another example is when, like it is the case here, aspects of the build system are tested. The nicest solution for this would be Erik's plan to separate sagelib and sage-the-distro. The whole package.py file doesn't serve any purpose on distros.

Sorry if this is a bit hard to follow, I can imagine the separation between sagelib and sage-the-distro is something non-packagers don't think about often. I'd be glad to clear up any unclarity. By the way, if you run a linux machine you could test-drive the sage nix package if you feel like it :)

comment:32 in reply to: ↑ 31 ; follow-up: Changed 14 months ago by jhpalmieri

Replying to gh-timokau:

Back to my first question: what are the usecases for these manifests?

Here's what I said above: Installing and uninstalling packages are the reason for the manifests: if you want to uninstall a package, which for example is part of upgrading a package, Sage uses the file manifest to know which files to remove.

There was a bug in writing the manifests which caused packages to not be removed properly, and fixing that bug motivated the doctest.

comment:33 follow-up: Changed 14 months ago by jhpalmieri

Maybe there should be a tag for doctests which are skipped by distros. If so, then in addition to being able to specify it directly when running "sage -t", maybe there could be a way of trying to intelligently detect whether to set it. That would require knowing what things hold in common among all of the distros and not for sage-the-distro: the existence of SAGE_ROOT/upstream, or some other directory or some file created just for this purpose: a file like SAGE_ROOT/lib/sage-current-location.txt which is written by sage-the-distro but not by everyone else.

comment:34 in reply to: ↑ 32 Changed 14 months ago by gh-timokau

Replying to jhpalmieri:

Replying to gh-timokau:

Back to my first question: what are the usecases for these manifests?

Here's what I said above

Sorry, I asked again because I wasn't completely sure if the mentioned usecase was the only one.

Installing and uninstalling packages are the reason for the manifests: if you want to uninstall a package, which for example is part of upgrading a package, Sage uses the file manifest to know which files to remove.

Okay, so in that case it should be fine to continue to ignore those manifests on distro. Thanks for the explanation :)

comment:35 in reply to: ↑ 33 Changed 14 months ago by gh-timokau

Replying to jhpalmieri:

Maybe there should be a tag for doctests which are skipped by distros.

Yes I thought about that too, maybe something similar to #26110 with a # optional - buildsystem that is enabled by default. Distros can then skip that.

Its not perfect since some distros will differ in how they handle certain things (for example #27230 introduced a doctest that fails on nixos although no functionality is actually broken, but will probably work on most other distros). But a lot of stuff could be covered, pretty much all of package.py for example.

This could be seen as a first step of separating sage and the sage build system, so Erik may have an opinion on that.

If so, then in addition to being able to specify it directly when running "sage -t", maybe there could be a way of trying to intelligently detect whether to set it.

I think intelligent detection isn't that important: Just enable it by default (since most manual runs of sage tests will be in a non-distro context) and let distros disable it (as it only needs to be done once).

comment:36 Changed 14 months ago by embray

New ticket please.

I would just change the test to try/except and ignore errors if the file does not exist.

It also occurs to me that the package_manifest function should not use os.environ['SAGE_SPKG_INST'] but rather should probably take that value from sage.env (and add it is a variable to sage.env if one does not already exist).

comment:37 Changed 14 months ago by jhpalmieri

See #27621 for the distro question and #27622 for getting SAGE_SPKG_INST from sage.env instead of os.environ. (They're separate issues and one has an easy obvious fix. The other I'm not so sure about.)

Last edited 14 months ago by jhpalmieri (previous) (diff)
Note: See TracTickets for help on using tickets.