Opened 11 years ago

Last modified 8 years ago

#4949 closed enhancement

Optionally build spkgs in $SAGE_BUILD_TMPDIR — at Version 31

Reported by: mabshoff
Owned by: mabshoff
Priority: minor
Milestone: sage-5.0
Component: build
Keywords: sd32
Cc: drkirkby, leif
Merged in:
Authors: John Palmieri
Reviewers: Mariah Lenox, Leif Leonhardy, Maarten Derickx
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by jhpalmieri)

$HOME can be slow, for example when it is NFS-mounted, so building on local scratch space or, better still, a RAM disk should speed up the build noticeably. To enable this, use $SAGE_BUILD_TMPDIR/build/, if it exists, instead of $SAGE_ROOT/spkg/build/.
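A minimal sketch of the selection rule described above, assuming a POSIX shell. The function name choose_build_dir is hypothetical and this is not the code from the attached patches; it only illustrates "use $SAGE_BUILD_TMPDIR/build/ if it exists, otherwise $SAGE_ROOT/spkg/build/".

```shell
# Hypothetical helper: print the directory in which spkgs would be built.
# Assumes SAGE_ROOT is set; SAGE_BUILD_TMPDIR is optional.
choose_build_dir() {
    if [ -n "${SAGE_BUILD_TMPDIR:-}" ] && [ -d "$SAGE_BUILD_TMPDIR/build" ]; then
        # Local scratch space (or a RAM disk) is available: build there.
        printf '%s\n' "$SAGE_BUILD_TMPDIR/build"
    else
        # Fall back to the default location inside the Sage tree.
        printf '%s\n' "$SAGE_ROOT/spkg/build"
    fi
}
```

With SAGE_BUILD_TMPDIR unset or empty, the function prints the default path, so existing builds would be unaffected.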


Apply trac_4949-root.patch to the Sage root repository.

Apply trac_4949-installation.patch to the Sage library.

Change History (34)

comment:1 Changed 11 years ago by mabshoff

  • Milestone changed from sage-3.3 to sage-3.2.4

comment:2 Changed 11 years ago by was

As a temporary hack to see how this "feels" you could delete spkg/build, then make it a symlink to /tmp/build/.
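The hack above can be sketched as a small shell function. The name link_build_dir and its two parameters are hypothetical; note that it deletes the existing build directory, so it is only meant for experiments.

```shell
# Hypothetical helper: replace <sage_root>/spkg/build with a symlink to a
# scratch directory. WARNING: removes the existing build directory.
link_build_dir() {
    sage_root=$1    # root of the Sage tree
    scratch=$2      # scratch location, e.g. /tmp/build
    mkdir -p "$scratch" || return 1
    # If spkg/build is already a symlink, this removes only the link.
    rm -rf "$sage_root/spkg/build" || return 1
    ln -s "$scratch" "$sage_root/spkg/build"
}
```

It would be called as link_build_dir "$SAGE_ROOT" /tmp/build before starting the build.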

comment:3 Changed 9 years ago by jhpalmieri

  • Authors set to John Palmieri
  • Priority changed from critical to minor
  • Report Upstream set to N/A
  • Status changed from new to needs_review

Here's a patch. This implements both SAGE_BUILD_TMPDIR and SAGE_KEEP_BUILT_SPKGS -- see #9444. (This is an incremental change rather than a complete reworking of the sage-spkg script, which might be called for.)

comment:4 Changed 9 years ago by jhpalmieri

A little explanation: BUILD is defined (as "build") by sage-env, but it was used sporadically in sage-spkg. With this patch, it is used more consistently.

comment:5 Changed 9 years ago by mpatel

If/when this merges, we should consider closing #6550. SAGE_KEEP_BUILT_SPKGS is a much better name than SAGE_KEEP_SPKG_BUILD.

comment:6 Changed 9 years ago by leif

  • Cc drkirkby leif added
  • Description modified (diff)

Nice to see progress in the build process... :)

See also http://trac.sagemath.org/sage_trac/ticket/6550#comment:7 .

comment:7 follow-up: Changed 9 years ago by leif

Should SAGE_BUILD_TMPDIR default to SAGE_TMPDIR?

(We have btw. lots of - in some cases not very well-named - environment variables.)

Making use of e.g. a RAM disk (or some user-provided directory) for doctesting is also worth doing.

comment:8 in reply to: ↑ 7 Changed 9 years ago by mpatel

Replying to leif:

Making use of e.g. a RAM disk (or some user-provided directory) for doctesting is also worth doing.

You can already set SAGE_TESTDIR (or DOT_SAGE) to do this. Or maybe I misunderstand?

comment:9 Changed 9 years ago by mariah

  • Milestone changed from sage-4.7 to sage-4.7.1
  • Status changed from needs_review to needs_work

trac_4949-scripts.patch does not apply:

sage: hg_sage.apply("/home/mariah/trac_4949-scripts.patch")
cd "/home/mariah/sage/sage-4.7.rc2-x86_64-Linux-core2-fc/devel/sage" && hg status
cd "/home/mariah/sage/sage-4.7.rc2-x86_64-Linux-core2-fc/devel/sage" && hg status
cd "/home/mariah/sage/sage-4.7.rc2-x86_64-Linux-core2-fc/devel/sage" && hg import   "/home/mariah/trac_4949-scripts.patch"
applying /home/mariah/trac_4949-scripts.patch
unable to find 'sage-spkg' for patching
5 out of 5 hunks FAILED -- saving rejects to file sage-spkg.rej
abort: patch failed to apply
sage: 

comment:10 Changed 9 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

Here's a rebased version of trac_4949-scripts.patch. Note that it's for the scripts repository, so you have to apply it with "hg_scripts.apply(...)" rather than "hg_sage.apply(...)".

comment:11 Changed 9 years ago by mariah

  • Description modified (diff)
  • Reviewers set to Mariah Lenox
  • Status changed from needs_review to positive_review

I applied the patch trac_4949-scripts.patch, then moved the modified sage-spkg file to a fresh source directory of sage-4.7.rc4. I set SAGE_BUILD_TMPDIR and SAGE_KEEP_BUILT_SPKGS, and did 'make testlong'. The builds took place in the location SAGE_BUILD_TMPDIR and all tests passed. I applied the patch trac_4949-installation.patch, did 'sage -b', then 'sage -docbuild installation html' and verified that the documentation change was made and makes sense. Positive review.

comment:12 Changed 9 years ago by jdemeyer

  • Description modified (diff)

comment:13 follow-up: Changed 9 years ago by jdemeyer

  • Status changed from positive_review to needs_info

I think one should add a note in the documentation about how much disk space this is expected to use. Are the spkgs first all built and then all deleted or are they built-deleted one by one?

comment:14 Changed 8 years ago by leif

Wouldn't it be sufficient to just (keep the -- perhaps slightly modified -- documentation and the three lines honoring SAGE_KEEP_BUILT_SPKGS and) change the variable BUILD (bad name btw.) in sage-env if SAGE_BUILD_TMPDIR is set?

That way this ticket would hardly collide with #11021, which fixes a lot in sage-spkg (and a bug in sage-env w.r.t. BUILD), also using $BUILD consistently there.

I also would use SAGE_BUILD_TMPDIR "directly", without creating yet another subdirectory (build) in it; the spkgs are extracted into their own directories anyway.

comment:15 Changed 8 years ago by leif

Ooops, unfortunately it's not that easy, because $BUILD is also just used as a subdirectory name in sage-spkg.

comment:16 in reply to: ↑ 13 ; follow-up: Changed 8 years ago by jhpalmieri

  • Status changed from needs_info to needs_review

Replying to jdemeyer:

I think one should add a note in the documentation about how much disk space this is expected to use. Are the spkgs first all built and then all deleted or are they built-deleted one by one?

I've modified the documentation to try to address this. I built Sage on various machines (sage.math, David Kirkby's machine hawk, various skynet machines), and found that

  • the single largest subdirectory of "build" can be up to 1165M (building eclib on the skynet machines iras and cleo, ia64 processors). On all of the other machines, it took at most 880M. On sage.math, cicero, and my mac, the largest took 320M.
  • the total amount of disk space, if you keep all of the subdirectories, can be as large as 5.3G (iras and cleo again) or as small as 2.2G (hawk).

I've put in conservative estimates for these in the documentation.
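Measurements of this kind can be reproduced with du. The following is a sketch only: the function names are hypothetical, and it assumes the build subdirectories were kept (for example with SAGE_KEEP_BUILT_SPKGS=yes) so that there is something left to measure.

```shell
# Hypothetical helpers for measuring a build directory.
# du -sk prints "<kilobytes><TAB><path>" for each argument.

# Print the largest entry directly under the build directory.
largest_build_entry() {
    du -sk "$1"/* 2>/dev/null | sort -n | tail -n 1
}

# Print the total size of the build directory in kilobytes.
total_build_kb() {
    du -sk "$1" | cut -f1
}
```

For example, largest_build_entry "$SAGE_ROOT/spkg/build" would report the single largest subdirectory, and total_build_kb the space used if everything is kept.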

comment:17 in reply to: ↑ 16 ; follow-up: Changed 8 years ago by leif

  • Reviewers changed from Mariah Lenox to Mariah Lenox, Leif Leonhardy

Replying to jhpalmieri:

I've modified the documentation to try to address this.

The usage of $ is a bit inconsistent: :file:`$SAGE_ROOT/...` vs. :file:`SAGE_BUILD_TMPDIR/...`.

I would add a warning that SAGE_BUILD_TMPDIR must not contain spaces, and should be an absolute path (starting with a slash or whatever). Note that none of this is checked in sage-spkg; also, a broken test might return true for an empty string, so I would also test -n "$SAGE_BUILD_TMPDIR".

Also, if SAGE_BUILD_TMPDIR is set but the directory does not exist, no warning or error message is printed.
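The checks suggested above could look roughly like this. It is a sketch assuming a POSIX shell; the function name check_build_tmpdir and the wording of the messages are hypothetical, not taken from sage-spkg.

```shell
# Hypothetical validation of SAGE_BUILD_TMPDIR.
# Returns 0 if the variable is unset/empty (default location is used)
# or names an existing absolute directory without spaces; 1 otherwise.
check_build_tmpdir() {
    dir=${SAGE_BUILD_TMPDIR:-}
    # Test for emptiness explicitly, so an empty string is never
    # mistaken for a usable directory.
    if [ -z "$dir" ]; then
        return 0
    fi
    case "$dir" in
        *" "*)
            echo "Error: SAGE_BUILD_TMPDIR must not contain spaces." >&2
            return 1 ;;
    esac
    case "$dir" in
        /*) ;;   # absolute path: fine
        *)
            echo "Error: SAGE_BUILD_TMPDIR must be an absolute path." >&2
            return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "Error: SAGE_BUILD_TMPDIR ($dir) does not exist." >&2
        return 1
    fi
    return 0
}
```

Printing a message in each failure case addresses the point that a missing directory would otherwise go unnoticed.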


I built Sage on various machines [...] and found that [...] I've put in conservative estimates for these in the documentation.

The actual space required or used hardly depends on the platform, but rather on the file system characteristics, i.e. the block size.

The worst case space usage is theoretically unlimited when taking into account rebuilds and (re)installations of newer packages, as old build dirs are moved to the $BUILD/old/ directory if -s was specified or SAGE_KEEP_BUILT_SPKGS=yes.

(Btw., for some reason the build dirs of the base packages never get deleted. Perhaps that's a side-effect of the "BUILD bug", haven't tracked this down.)

I would mention the relationship to the -s parameter when installing packages with sage; the main reason for the additional environment variable is that there's no other way to achieve what -s does when using make.
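The relationship between the two can be sketched as a single predicate. The function name and its parameter are hypothetical; the sketch assumes the script has already recorded whether -s was given.

```shell
# Hypothetical predicate: should the build directory be kept after
# a package has been installed?
#   $1 = "yes" if the -s option was given, anything else otherwise.
should_keep_build_dir() {
    [ "$1" = yes ] || [ "${SAGE_KEEP_BUILT_SPKGS:-}" = yes ]
}
```

Either source of the request has the same effect; the environment variable is simply the one that can be set when the build is driven by make.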

comment:18 in reply to: ↑ 17 Changed 8 years ago by jhpalmieri

Replying to leif:

The actual space required or used hardly depends on the platform, but rather on the file system characteristics, i.e. the block size.

Well, many of the systems on which I tested were on skynet, built in subdirectories of a shared home directory -- all of the skynet machines use the same $HOME. On some of those machines, building eclib took over 1 gigabyte, while on others, it took under 320 megabytes. There are certainly differences between the types of libraries produced: .so files on linux, .dylib files on darwin, etc. I would also guess that the size of the library files might vary depending on the compiler, whether it's 32- or 64-bit, etc.

The worst case space usage is theoretically unlimited when taking into account rebuilds and (re)installations of newer packages, as old build dirs are moved to the $BUILD/old/ directory if -s was specified or SAGE_KEEP_BUILT_SPKGS=yes.

Right, but the documentation as written is accurate ("After a full build of Sage...") and I think is good enough. Anyone who sets this variable should be paying attention to the build directory anyway.

(Btw., for some reason the build dirs of the base packages never get deleted. Perhaps that's a side-effect of the "BUILD bug", haven't tracked this down.)

prereq and bzip2 are not installed by sage-spkg but by their own install scripts (prereq-0.9-install and bzip2-1.0.5-install), which create subdirectories of build but don't delete them when they're done.

Adding a comment about the relationship to the -s option is a good idea. I'll try to add some tests for SAGE_BUILD_TMPDIR, too.

comment:19 Changed 8 years ago by jhpalmieri

Here are two new patches. The scripts patch checks for the existence of SAGE_BUILD_TMPDIR, and it also should correctly delete the build subdirectories afterwards -- I had missed this in the previous patch.

Changed 8 years ago by jhpalmieri

scripts repo

comment:20 follow-up: Changed 8 years ago by leif

There's a $ missing in

:file:`$SAGE_ROOT/spkg/build` or :file:`SAGE_BUILD_TMPDIR/build`

I would clarify that SAGE_KEEP_BUILT_SPKGS=yes affects all spkg installations (whether with ./sage [-i|-f] or make, the latter also when rebuilding [parts of] Sage), and that the build directory (within the Sage tree or in $SAGE_BUILD_TMPDIR/) will definitely grow over time, i.e., whenever new packages get installed or already existing / built packages reinstalled, unless one unsets SAGE_KEEP_BUILT_SPKGS at some point (which of course doesn't delete existing subdirectories in the first place).


Your observations regarding the build tree sizes on skynet are interesting; there IMHO shouldn't be such a large difference, at least not when doing "the same thing".

There are differences in object code size between RISC and CISC architectures (on the former usually larger, but at most by a factor of 2 I think) and between 32-bit and 64-bit (mostly on RISC architectures, and also if there's a lot of static data involving e.g. pointers or integers of different size); other differences might be due to debug symbols and how and what we build (e.g. assembly implementations, static or dynamic libraries in addition) on a specific platform.

I would mention the effect of the block size of the file system though (as a note perhaps), since many packages consist of a large number of small files.
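To illustrate the block-size effect, here is a simplified model in shell arithmetic: each file is assumed to occupy a whole number of blocks. Real file systems may behave differently (for example by packing very small files), so treat this as a rough estimate; the function name is hypothetical.

```shell
# Hypothetical estimate: bytes allocated for a file of a given size,
# assuming the file system rounds every file up to whole blocks.
#   $1 = file size in bytes, $2 = block size in bytes
allocated_bytes() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}
```

Under this model, 10000 files of 100 bytes each hold 1,000,000 bytes of data but occupy 40,960,000 bytes with 4096-byte blocks, compared with 5,120,000 bytes with 512-byte blocks.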

Changed 8 years ago by jhpalmieri

sage repo: update installation guide

comment:21 in reply to: ↑ 20 Changed 8 years ago by jhpalmieri

Replying to leif:

I would clarify that SAGE_KEEP_BUILT_SPKGS=yes affects all spkg installations.

Done

and that the build directory will definitely grow over time

I've added some explanation. It doesn't grow quite as fast as it might, since pre-existing subdirectories are moved to SAGE_ROOT/spkg/build/old/, overwriting copies that were already there. So just reinstalling Sage over and over again will use only twice as much space as doing it once. Upgrading will then take up more space.
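The behaviour described here can be sketched as follows. The function name archive_old_build is hypothetical and the code is an illustration, not an excerpt from sage-spkg.

```shell
# Hypothetical helper: before rebuilding a package, move its existing build
# directory to <build_dir>/old/, overwriting any copy already stored there.
#   $1 = build directory, $2 = package directory name
archive_old_build() {
    build_dir=$1
    pkg=$2
    if [ -d "$build_dir/$pkg" ]; then
        mkdir -p "$build_dir/old" || return 1
        # Drop the previously archived copy so at most one old copy remains.
        rm -rf "$build_dir/old/$pkg"
        mv "$build_dir/$pkg" "$build_dir/old/$pkg"
    fi
}
```

Because each archived copy replaces the previous one, repeated reinstalls of the same version keep at most two copies per package: the current build and one under old/. A new version has a different directory name, which is why upgrading uses additional space.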

Your observations regarding the build tree sizes on skynet are interesting; there IMHO shouldn't be such a large difference, at least not when doing "the same thing".

"eclib" is the usual culprit. There are huge differences in the amount of disk space it uses, so on some systems it is by far the largest, and on others, it isn't. On the skynet machines, "moin" uses a consistent 320 megabytes, whereas eclib ranges from something under that to over 1 gig, depending on the OS and the processor.

I would mention the effect of the block size of the file system though (as a note perhaps), since many packages consist of a large number of small files.

Done.

comment:22 Changed 8 years ago by mderickx

  • Reviewers changed from Mariah Lenox, Leif Leonhardy to Mariah Lenox, Leif Leonhardy, Maarten Derickx
  • Status changed from needs_review to positive_review

I've built Sage entirely from scratch after applying the patch and replacing the bootstrap version of sage-spkg in SAGE_ROOT/spkg/base, and all doctests passed, both with and without the environment variables set. So I think this one is ready to be merged.

comment:23 Changed 8 years ago by mderickx

  • Description modified (diff)

comment:24 Changed 8 years ago by was

  • Keywords sd32 added

comment:25 Changed 8 years ago by leif

Then all of the changes of #11021 will have to be rebased on this one...

Perhaps easier the other way around. (Note that I still haven't updated the patches there though.)

comment:26 Changed 8 years ago by leif

  • Description modified (diff)
  • Milestone changed from sage-4.7.2 to sage-pending

comment:27 follow-up: Changed 8 years ago by jhpalmieri

By the way, regarding "After applying, replace the bootstrap version of sage-spkg in $SAGE_ROOT/spkg/base/ with the new version": this is taken care of automatically by the sage-sdist script, if one is making a new source distribution. You just have to make sure that the version in local/bin is up to date.

comment:28 in reply to: ↑ 27 Changed 8 years ago by leif

Replying to jhpalmieri:

By the way, regarding "After applying, replace the bootstrap version of sage-spkg in $SAGE_ROOT/spkg/base/ with the new version": this is taken care of automatically by the sage-sdist script, if one is making a new source distribution. You just have to make sure that the version in local/bin is up to date.

Yep.

Hope you don't mind me temporarily moving this to "sage-pending"; I intend to finish #11021 and rebase the patch(es) here on that, the latter presumably much easier than the other way around.

If I don't find the time, I'll set the milestone back to 4.7.2 of course.

comment:29 Changed 8 years ago by jhpalmieri

See also #329 which touches sage-spkg, although not in a very complicated way.

comment:30 Changed 8 years ago by swenson

Is there any update to this? It has been several months since the last update, and some of us are eagerly anticipating this. :)

comment:31 Changed 8 years ago by jhpalmieri

  • Description modified (diff)
  • Milestone changed from sage-pending to sage-5.0

I've rebased this to Sage 5.0.beta1. It had a positive review already, so I'm leaving it that way.

Changed 8 years ago by jhpalmieri

root repo
