#14204 closed defect (fixed)
Building unchanged documentation takes very long
Reported by: | jdemeyer | Owned by: | jdemeyer |
---|---|---|---|
Priority: | blocker | Milestone: | sage-5.8 |
Component: | build | Keywords: | |
Cc: | jhpalmieri, vbraun, hivert, was | Merged in: | sage-5.8.beta4 |
Authors: | Jeroen Demeyer | Reviewers: | John Palmieri |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Even with #14156, rebuilding the documentation when nothing needs to be done (make doc right after make doc) still takes a very long time.
In sage-5.7:
real	0m48.142s
user	0m37.120s
sys	0m10.110s
In sage-5.8.beta2:
real	5m52.020s
user	4m39.910s
sys	1m5.140s
Attachments (1)
Change History (29)
comment:1 follow-ups: ↓ 3 ↓ 6 Changed 8 years ago by
- Cc vbraun hivert added
comment:2 Changed 8 years ago by
For what it's worth, you can cut down on the build time by setting MAKE='make -jN', where N is somewhat larger than the number of processors, at least if nothing needs to be rebuilt. On my machine, which has 2 processors:

$ export MAKE='make -j2'
$ time make doc
...
real	4m59.845s
user	4m33.320s
sys	2m0.269s

and

$ export MAKE='make -j4'
$ time make doc
...
real	4m4.408s
user	4m41.729s
sys	1m59.842s
comment:3 in reply to: ↑ 1 Changed 8 years ago by
Replying to jhpalmieri:
Should this really be a blocker?
That's a personal opinion of course.
In my opinion, the answer is absolutely "Yes". If you regularly run make ptestlong, this really is a massive regression. I think the problem mentioned in this ticket is much worse than the problem solved in #6495.
comment:4 follow-up: ↓ 5 Changed 8 years ago by
How long does the rest of make ptestlong take? Why does a few extra minutes matter that much?
comment:5 in reply to: ↑ 4 Changed 8 years ago by
Replying to jhpalmieri:
How long does the rest of make ptestlong take? Why does a few extra minutes matter that much?
OK, with 4 processes on sage.math, the documentation takes 188s and the doctesting (without --long) takes 1065s, with #12415. So, it's certainly noticeable.
Why does a few extra minutes matter that much?
It's not that those extra minutes matter that much, but those lost minutes matter much more to me than the minutes saved by #6495.
comment:6 in reply to: ↑ 1 Changed 8 years ago by
Replying to jhpalmieri:
For each build, on my machine, it takes 6 seconds to run Sphinx and decide that nothing needs to be done.
Do you have any idea what Sphinx does in those 6 seconds? Does it import something from sage?
comment:7 Changed 8 years ago by
If I put a call to cProfile in the docbuild command in spkg/bin/sage when building reference/homology, it doesn't tell me much:
178370 function calls (174697 primitive calls) in 5.695 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2    4.087    2.043    4.087    2.043 {posix.waitpid}
    29    1.125    0.039    1.125    0.039 {method 'read' of 'file' objects}
     1    0.041    0.041    5.696    5.696 builder.py:12(<module>)
    15    0.019    0.001    0.032    0.002 sre_compile.py:301(_optimize_unicode)
...
If I put it instead into the Python command called in builder.py, I get more information:
962034 function calls (948847 primitive calls) in 4.297 seconds

Ordered by: internal time

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      18537    1.142    0.000    1.142    0.000 {method 'read' of 'file' objects}
         64    0.199    0.003    2.480    0.039 all.py:1(<module>)
      23751    0.159    0.000    0.279    0.000 intersphinx.py:90(split_lines)
35847/35841    0.133    0.000    0.172    0.000 posixpath.py:60(join)
         57    0.112    0.002    0.571    0.010 intersphinx.py:74(read_inventory_v2)
          1    0.081    0.081    2.803    2.803 all.py:38(<module>)
          2    0.079    0.039    0.079    0.040 {cPickle.load}
      29365    0.063    0.000    0.063    0.000 {_codecs.utf_8_decode}
   2269/568    0.057    0.000    0.163    0.000 sre_parse.py:379(_parse)
         17    0.053    0.003    2.075    0.122 all.py:3(<module>)
...
We're calling certain functions a huge number of times, but we're not importing sage.all (as far as I can tell). I don't know why we're calling posixpath.join 35000 times, for example: that seems like a lot. Could it be reading every file in the Sage library each time, instead of only the files referenced in that section?
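The profiling approach described above can be sketched as follows; profile_call is a hypothetical helper (the ticket's actual runs wrapped the docbuild command itself), but the report format matches the "Ordered by: internal time" listings quoted in this comment.

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    # Hypothetical helper: profile a single call and return both its
    # result and a report of the top functions, sorted by internal
    # time, as in the listings above.
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats('tottime').print_stats(10)
    return result, buf.getvalue()
```

To profile a docbuild, one would pass the function that invokes Sphinx instead of an arbitrary callable.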
comment:8 Changed 8 years ago by
Actually, it is importing sage.all every time.
comment:9 follow-up: ↓ 10 Changed 8 years ago by
I think the issue is
build_command = 'python '+os.path.join(SAGE_DOC, 'common', 'custom-sphinx-build.py')
build_command += ' -b %s -d %s %s %s %s'%(type, self._doctrees_dir(), options, self.dir, output_dir)
logger.debug(build_command)
subprocess.call(build_command, shell=True)
Could we replace this subprocess stuff by a simple Python function call or an execfile()?
comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 8 years ago by
Replying to jdemeyer:
I think the issue is
build_command = 'python '+os.path.join(SAGE_DOC, 'common', 'custom-sphinx-build.py')
build_command += ' -b %s -d %s %s %s %s'%(type, self._doctrees_dir(), options, self.dir, output_dir)
logger.debug(build_command)
subprocess.call(build_command, shell=True)

Could we replace this subprocess stuff by a simple Python function call or an execfile()?
I tried to do that, but I think Sphinx has a memory leak (this has been suggested before). For whatever reason, after a little while it uses a lot of memory and slows way down. Actually, the inventory build sped up (from almost 2 minutes before down to about half a minute on an OS X machine with 2 cores), but the html build does not.
In more detail, I tried importing sage.all once, and then using a Python function call to build each document: I basically stuck all of custom-sphinx-builder.py into builder.py, and replaced the subprocess.call(...) with

from sphinx.cmdline import main
main(args)

This way sage.all would be imported just once, not 75 times. But there are issues. Maybe we should call main(args) some other way, so the memory is guaranteed to be released afterward? Any suggestions?
comment:11 in reply to: ↑ 10 Changed 8 years ago by
Replying to jhpalmieri:
Any suggestions?
Use multiprocessing.Pool(maxtasksperchild=1). This way, every child would just call Sphinx once, no cleanup to worry about. Make sure that everything that needs to be imported is imported by the master process (the one that calls multiprocessing.Pool()).
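A minimal sketch of this suggestion, with hypothetical names (build_one and build_all stand in for the real docbuilder machinery, which would call sphinx.cmdline.main(args) in the child):

```python
import multiprocessing
import os

def build_one(doc):
    # Stand-in for one Sphinx run; the real docbuilder would call
    # sphinx.cmdline.main(args) here for `doc`.  With
    # maxtasksperchild=1 the worker process exits after this task, so
    # any memory Sphinx leaks is reclaimed by the operating system.
    return (doc, os.getpid())

def build_all(docs, processes=2):
    # Heavy imports (e.g. sage.all) would be done here, in the master
    # process, before the pool forks, so each child inherits them.
    pool = multiprocessing.Pool(processes=processes, maxtasksperchild=1)
    try:
        # chunksize=1 makes each task a single document, so together
        # with maxtasksperchild=1 every child does exactly one build.
        return pool.map(build_one, docs, chunksize=1)
    finally:
        pool.close()
        pool.join()
```

Since every document is built in a fresh child process, a leak inside Sphinx can never accumulate across builds, yet the expensive imports happen only once in the master.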
comment:12 follow-ups: ↓ 13 ↓ 16 Changed 8 years ago by
- Cc was added
My laptop builds the new-style documentation in 55 seconds with 8x parallel if nothing is changed. And this is from cold caches.
Jeroen, you are just trying to replace the subprocess call with a fork. That just saves us a couple of Sage startups, each of which is 1.5 seconds. That's maybe 10% performance that you can get out of there, right?
You need to pull in a number of files across the different doc pieces to figure out if stuff is out-of-date, so I guess this makes this ticket extremely dependent on small-file read performance. That's just the price of having to synchronize multiple separate sub-documents.
William, is there a chance to get an SSD (presumably into sage.math) for Jeroen?
comment:13 in reply to: ↑ 12 Changed 8 years ago by
Replying to vbraun:
My laptop builds the new-style documentation in 55 seconds with 8x parallel if nothing is changed. And this is from cold caches.
Which is probably about the same time as one process for the old docbuilder.
Jeroen, you are just trying to replace the subprocess call with a fork.
Sure.
That just saves us a couple of Sage startups, each of which is 1.5 seconds. That's maybe 10% performance that you can get out of there, right?
I doubt it. My timings suggest that some documents need 2 ~ 3 seconds to build, which is pretty close to the Sage startup time. So I think there is a substantial gain there.
I think the elephant in the room is still: why did we care so much about speeding up the initial build of the documentation (#6495) at the expense of rebuilds of the documentation? When doing a few rebuilds of the documentation, you lose all the time which is gained by #6495.
comment:14 Changed 8 years ago by
Why did we care so much about speeding up the initial build of the documentation (#6495) at the expense of rebuilds of the documentation?
When I was thinking about #6495, I was imagining the following situations, among others:
- the initial make
- some people still clone the Sage repo, which usually triggers a full rebuild of the docs
- edit some files (say in combinat) and run sage --docbuild reference/combinat html
- building the pdf docs
For all of these, #6495 helps considerably. When I think about running make ptestlong (which I do more frequently than make ptest), I didn't think a little added time was much of an issue.
comment:15 Changed 8 years ago by
Could we at the very least get rid of the subprocess call? It probably won't make the documentation build as fast as before, but there should be some considerable gain.
comment:16 in reply to: ↑ 12 Changed 8 years ago by
Replying to vbraun:
William, is there a chance to get a SSD (presumably into sage.math) for Jeroen?
There already is an SSD in sage.math for Jeroen, and the timings I mention are with that SSD.
comment:17 Changed 8 years ago by
This is a proof-of-concept patch for getting rid of subprocess, giving a speedup of roughly a factor of 2.
comment:18 Changed 8 years ago by
Perhaps more imports could be done in the master process, but I don't know how much there is to gain. Running strace shows that the docbuilder tries to open() a non-existing file 270200 times(!).
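The measurement described here can be reproduced along these lines; the log below is a tiny synthetic stand-in, since the real trace.log would come from running strace against the docbuild command itself.

```shell
# Sketch: count how many open() calls failed with ENOENT (file does
# not exist) in an strace log.  A real log would be produced by
# something like:
#   strace -f -e trace=open -o trace.log <docbuild command>
printf 'open("a.rst") = -1 ENOENT (No such file or directory)\nopen("b.rst") = 3\nopen("c.rst") = -1 ENOENT (No such file or directory)\n' > trace.log
grep -c 'ENOENT' trace.log
```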
comment:19 Changed 8 years ago by
Add a cache for the intersphinx inventory, reducing the total time again by more than a factor of 2. Still untested proof-of-concept.
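The actual cache is in the attached patch; the mtime-keyed caching idea can be sketched as follows, with hypothetical names (load_inventory_cached and the parse callback stand in for however the real code reads an objects.inv inventory).

```python
import os

# Hypothetical module-level cache: maps an inventory file's path to
# (mtime, parsed_data), so repeated builds reuse the parsed inventory
# instead of re-reading and re-parsing it every time.
_inventory_cache = {}

def load_inventory_cached(path, parse):
    """Return parse(path), re-parsing only if the file has changed."""
    mtime = os.path.getmtime(path)
    cached = _inventory_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]
    data = parse(path)
    _inventory_cache[path] = (mtime, data)
    return data
```

Since make doc consults the same inventories once per document, caching them in a long-lived master process avoids most of the repeated parsing seen in the profiles above.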
comment:20 Changed 8 years ago by
- Status changed from new to needs_review
Patch seems to work, I'm setting it to needs_review very tentatively. Somebody who knows Sphinx better than me should look at it.
With this patch, building the documentation with 2 threads is now faster than before #6495, so it would fix the issue.
comment:21 Changed 8 years ago by
The patch seems to work as advertised, and the changes make sense to me. Volker, do you concur? (Note that the commit message is not ideal ;)
Changed 8 years ago by
comment:23 Changed 8 years ago by
- Reviewers set to John Palmieri
- Status changed from needs_review to positive_review
comment:24 Changed 8 years ago by
- Merged in set to sage-5.8.beta4
- Resolution set to fixed
- Status changed from positive_review to closed
comment:25 follow-up: ↓ 27 Changed 8 years ago by
I have a followup question. I'm working with a student on learning to develop, and his first patch is a pretty basic documentation upgrade. Changing just one file and then doing sage -docbuild reference html takes eons on a reasonably modern Ubuntu laptop because it checks ALL the files. It's true that we didn't necessarily set the number of parallel cores. But it still really slows down checking whether things look right. Any thoughts, or is this a wontfix? I'm hesitant to open a ticket until I know it really should be considered a bug, though I do think it is a pretty nasty (non-mathematical) regression.
comment:26 Changed 8 years ago by
All the more reason to set MAKE so it uses multiple threads! On my (not reasonably modern) OS X machine, rebuilding the documentation after modifying one file, with MAKE unset, takes

real	2m44.637s
user	2m7.278s
sys	0m22.776s

With MAKE set to use two cores:

real	1m55.552s
user	2m12.224s
sys	0m24.907s

On sage.math, with MAKE='make -j12':

real	0m34.899s
user	1m34.650s
sys	0m16.800s
So using multiple cores is very advantageous. Note also that if you are only modifying a single file, or files within one section of the reference manual, you can do

sage --docbuild reference/homology html

(or reference/combinat or reference/matrices or ...). This will be very fast (except when rebuilding the combinat docs ;), even with a single core.
comment:27 in reply to: ↑ 25 Changed 8 years ago by
Replying to kcrisman:
I have a followup question. I'm working with a student on learning to develop, and his first patch is a pretty basic documentation upgrade. Changing just one file and then doing sage -docbuild reference html takes eons on a reasonably modern Ubuntu laptop because it checks ALL the files.
On my reasonably modern Gentoo laptop, this takes about 40 seconds (with 1 thread), which I personally find reasonable.
I do think it is a pretty nasty (non-mathematical) regression.
I don't think it's such a big regression. With 1 thread, it is slightly slower than before, but not by much.
If you do want to create a ticket, this is the place to do it: http://bitbucket.org/birkenfeld/sphinx/issues/
comment:28 Changed 8 years ago by
I see, so this isn't something we can fix more without getting rid of parallel documentation building. I think this was a lot faster before because it only had to do a sweep through one directory.
With Sage 5.2, after touching one file and sage -b, the documentation build takes well under 15 seconds, but in 5.9 the same thing took about 80 seconds (I had to use a watch for this, since it didn't print it out).
Maybe we could add something about *both* of John's solutions to the Sage manuals developer page? I'd be open to opening such a ticket. Maybe my student could even write the patch ;-)
I don't know. I think there are 74 documents that need to be built: 56 modules in the reference manual, 1 master reference manual doc, and 17 others. When you run make doc, 57 of those get built twice. For each build, on my machine, it takes 6 seconds to run Sphinx and decide that nothing needs to be done. This all adds up.

I wonder if it could be sped up much by calling Sphinx internally in Python, instead of as a command-line call each time. Otherwise, I don't have any ideas. Users should be advised to run sage --docbuild reference/algebras html, etc., rather than rebuilding the full reference manual. They should also be advised to buy a machine with lots of cores and set MAKE='make -jN' for some large number N :)

Should this really be a blocker?