Opened 10 years ago

Last modified 6 years ago

#6495 closed enhancement

Build the reference manual incrementally — at Version 117

Reported by: mpatel Owned by: tba
Priority: major Milestone: sage-5.8
Component: documentation Keywords: days38
Cc: jhpalmieri, leif, niles, hivert, mguaypaq, mhansen Merged in:
Authors: Mitesh Patel, John Palmieri, Florent Hivert Reviewers: Volker Braun
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #13126, #11078, #11874, #12229, #11503, #12327, #11080, #9774, #8473, #11913, #12299, #13121 Stopgaps:

Description (last modified by jhpalmieri)

Building the Sage reference manual can take a significant amount of time. Decreasing this time could speed up Sage development.

The patch is large, but most of it consists of moving files from one location to another, as described below. A summary of the changes:

Changes in doc/en/reference — this is where the size of the patch comes from, although the changes are pretty simple:

  • rearrange the directory doc/en/reference: for each file like algebras.rst, create a subdirectory algebras and move algebras.rst to algebras/index.rst. Also create a file algebras/conf.py for the build configuration. All of these new conf.py files are identical. Deal with the contents of the directory reference/media similarly, moving the pictures to the appropriate subdirectory.
  • modify reference/index.rst to point to these new files.
  • reorganize reference/index.rst so it is arranged, at least somewhat, by topic.
  • add intersphinx to conf.py — see below. Also add the new subdirectories to the list exclude_trees.
  • new file conf_sub.py, configuration for the pieces of the documentation (as opposed to the main conf.py, which is for building reference/index.rst). This file is imported by each of the files SUBDIRECTORY/conf.py.

Changes to doc/common/builder.py:

  • add code to build the reference manual in sections, and also to build the sections in parallel. The reference manual ought to be built twice to resolve references now, so typing "sage -docbuild all html" will build it twice (along with all of the other documents, of course). "sage -docbuild reference html" will just build it once. You can also run "sage -docbuild reference/combinat html", for example, to just build one part of the manual.
  • the different parts of the manual are separate documents as far as sphinx is concerned, so allow them to reference each other using the "intersphinx" extension. (This is why we need to build it twice: the first pass assembles the intersphinx databases, the second pass uses the databases to create the references correctly.)
  • to accomodate the changes in #11251, which don't seem to be easily compatible with intersphinx, search through the output files looking for "todo" items, and accumulate them in one master "todo" list.
  • for pdf format, since it now produces 30 different pdf files, write an html file which links to each of them.

Other changes:

  • doc/common/conf.py: add the intersphinx extension to the build process.
  • doc/common/themes/sage/layout.html: fix a bug where clicking the Sage logo in the upper left corner of the docs wouldn't take you to the right place, at least in the local documentation.
  • doc/common/themes/sageref/: add a new theme for the pieces of the reference manual. This is almost identical to the "sage" theme, except for a little tinkering to the links along the top and bottom lines.
  • in the main Sage library, change a few pathnames to media files in the reference manual, since those files have been moved.
  • make the necessary changes to .hgignore and MANIFEST.in to deal with the relocated files.

The html docs for Sage 5.4.beta0, built without MathJax and with MathJax, built after applying the patches here.


Apply:

For ease of review, you can instead look at the following four patches:

  • trac_6495-part1-moving-files.patch — this moves 'algebras.rst' to 'algebras/index.rst', and similarly for all other files. It adds .. include:: ../footer.txt to the end of each of these files, and it removes any cross-referencing information like .. _ch:groups:, since that doesn't work anymore with the new structure. It also creates identical files 'DIR/conf.py' in each of the new subdirectories of doc/en/reference, except for doc/en/algebras/conf.py. That file is created in the next patch so that you can focus on reviewing just the second patch.
  • trac_6495-part2-everything-else.patch — this does everything else; in other words, all of the important content is in this patch.
  • trac_6495-part3-the-remaining-vs-5.4.patch — this patch takes care of merging the indexes and the bibliography.
  • trac_6495-part4-interrupts.patch — handle ctrl-c.

Before building the docs, you should delete the documentation output directory: rm -rf SAGE_ROOT/devel/sage/doc/output.

Change History (126)

Changed 10 years ago by mpatel

Experimental.

comment:1 Changed 10 years ago by mpatel

  • Authors set to Mitesh Patel
  • Cc jhpalmieri added
  • Summary changed from Break up the PDF reference manual into smaller pieces to [with patch, needs work] Break up the PDF reference manual into smaller pieces

The attached patch is experimental. Notes:

  • sage -docbuild reference pdf fails to build arithgroup.pdf, apparently because of the math macro \ZZ in the title. Unfortunately, I don't know how to fix this.
  • Since it replaces the top level PDF file with several smaller files, it breaks the patch at #4460.
  • It's not clear what happens to cross-ReST document links. I'll try to investigate.

comment:2 Changed 10 years ago by mpatel

On cross-PDF document links: It seems that Sphinx does not produce these. This may OK, since file:// URLs can break easily.

comment:3 Changed 10 years ago by mpatel

On the \ZZ in arithgroup.tex: It seems the problem stems from \@title in

    \ifsphinxpdfoutput
      \begingroup
      % This \def is required to deal with multi-line authors; it               
      % changes \\ to ', ' (comma-space), making it pass muster for             
      % generating document info in the PDF file.                               
      \def\\{, }
      \pdfinfo{
        /Author (\@author)
        /Title (\@title)
      }
      \endgroup
    \fi

in Sphinx's manual.cls. For some reason, the \math* font commands do not work in this context. I gather that \mathbf is preferred, but one workaround is to use

Arithmetic Subgroups of `{\rm SL}_2({\bf Z})`

in place of

Arithmetic Subgroups of `{\rm SL}_2(\ZZ)`

in arithgroup.rst.

Changed 10 years ago by mpatel

Another approach. Depends on #7549. Still experimental. This patch only. sage repo.

comment:4 Changed 10 years ago by mpatel

  • Priority changed from minor to major
  • Report Upstream set to N/A
  • Summary changed from [with patch, needs work] Break up the PDF reference manual into smaller pieces to Build the reference manual incrementally
  • Type changed from defect to enhancement

The new patch may make it possible to build and update reference manual chapters semi-independently. I think we can use the intersphinx extension to fix the cross-chapter references. But we'll need to build the manual twice, a la LaTeX.

To build just a chapter, try, e.g.,

sage -docbuild reference/algebras html -juiv3

Still to do:

  • Make a combined index page and search page.
  • Check that PDF generation works.
  • Combine chapter PDF files into one large [optional] PDF file (with pdfjam's pdfjoin)?
  • Use a specific LaTeX doc title in each conf.py.
  • Fix the "Arithmetic Subgroups" heading on the top-level page.
  • Use a visual, 2D layout for the top-level page? Group by general area? Add icons?
  • Get a reply from sphinx-dev about making relative paths work.
  • Build docs in parallel (cf. #6255) with multiprocessing?
  • Replace the "website" PDF link?
  • User-friendliness improvements.
  • Encourage more compact chapters? It seems that "Combinatorics" takes the most time and memory.
  • ...

comment:5 Changed 10 years ago by mpatel

Another important item:

  • Use just one _static directory for the manual, not 50+!

comment:6 Changed 10 years ago by mpatel

If this approach is viable, I suggest leaving many (most?) of the "To Do" items for other tickets.

comment:7 Changed 10 years ago by mpatel

While I'm here:

  • Copy PDFs from output/latex/ to output/pdf, so that make all-pdf, at least, doesn't do unnecessary work?

comment:8 Changed 10 years ago by mpatel

  • Description modified (diff)

Changed 10 years ago by mpatel

PDF fixes. This patch only. sage repo.

comment:9 follow-up: Changed 10 years ago by mpatel

Sphinx caches "foreign" object inventories in a document's environment.pickle. These now use a lot of disk space.

comment:10 in reply to: ↑ 9 Changed 10 years ago by mpatel

Replying to mpatel:

Sphinx caches "foreign" object inventories in a document's environment.pickle. These now use a lot of disk space.

Another sphinx-dev query.

comment:11 Changed 9 years ago by leif

  • Cc leif added

comment:12 Changed 8 years ago by jhpalmieri

Here's a new patch, rebased to Sage 4.7.1.alpha4. This implements parallel building, and it provides a great speedup, at least on systems with lots of processors. For example, on sage.math, the time to execute sage -docbuild reference html -j went from about 18 minutes to just under 2 minutes. The main idea is to build each module of the reference manual separately, and use the Sphinx intersphinx extension to handle cross-references (so :class:`blah` will work in the algebras module, even if blah is defined in the rings module).

Remaining issues:

  • The new build uses up more disk space than the old, by about 120 megabytes. I don't know if anything can be done about this, and I also don't think it's a big deal. (With the previous patch, it took about 1 gigabyte more, but the more recent patch manages to cut that down: in the previous patch, the _static subdirectory of the documentation was being copied, once for each module of the reference manual, and with the new version, a symlink is used instead.)
  • There are now some missing bibliographic references: at some point in the past, people have gone through the documentation and unified the references, but this means that references in one module are not seen by any other. This can be fixed just by copying the references to the module where they're used. For example: CMR05 is referenced somewhere in the module on polynomial rings, but the actual item is described in crypto/mq/sr.py.
  • The cross-referencing in intersphinx is not perfect; in particular, it doesn't seem to work after building the documents once, it needs to have the full doctree "inventory" for any module available before resolving references to that module. Since the inventory files are built alongside the documentation, this means it has to be built twice (as far as I can tell) before cross-all of the references work. We could try to figure out dependencies and make sure that if module A is referenced in module B, then A is built first, but that seems complicated, and there is no reason for there not to be circular references. I'm tempted to just allow broken cross-references. For the docs on the web site, we would have to make sure they got built twice.
  • There is a main index for the reference manual, but once you click on any entry (like "Cryptography"), you get to that module's index, and there is no link taking you back to the main index. There ought to be a way to fix this, but I haven't figured it out yet.

comment:13 follow-up: Changed 8 years ago by vbraun

  • Milestone set to sage-4.7.2
  • Reviewers set to Volker Braun

In an ideal world sphinx would be multithreaded, but we probably shouldn't wait for that ;-) The remaining issues about disk space, bibliographic references, and needing two runs seem to be unavoidable. Building parallel gets more and more important, so I think the benefits outweigh the disadvantages.

I tried the patch on Sage-4.7.1.alpha4 without any other patches applied:

  • Only the main page has proper css. For example, html/en/reference/cmd/index.html refers to _static/sage.css but the correct path would be ../_static/sage.css.
  • patch conflicts with #11251 (todo extension). Given that the latter is already positively reviewed, maybe this ticket could be rebased on top of it?
  • During the sage build, I think we should then run the docbuilder twice for the reference manual. Perhaps this should always be done for sage -docbuild all.
  • Can we make output less verbose? The whole intersphinx output scrolled forever off my screen. Ideally, an interspinx failure to find an inventory file would only add one extra line at the end of the build along the lines of "You should re-run docbuild to get references right."

comment:14 in reply to: ↑ 13 Changed 8 years ago by jhpalmieri

  • Authors changed from Mitesh Patel to Mitesh Patel, John Palmieri
  • Dependencies set to 11251

Replying to vbraun:

I tried the patch on Sage-4.7.1.alpha4 without any other patches applied:

  • Only the main page has proper css. For example, html/en/reference/cmd/index.html refers to _static/sage.css but the correct path would be ../_static/sage.css.

This was a mistake in the previous version: it was supposed to create a link from reference/_static to reference/cmd/_static. Now it should work.

  • patch conflicts with #11251 (todo extension). Given that the latter is already positively reviewed, maybe this ticket could be rebased on top of it?

Good point. This raises another problem: intersphinx doesn't easily pass todo lists between different documents, so I don't know how to get a master todo list for the Sage library. Right now, I've put the todolist for each module after its table of contents. I think combinat is the only module with any actual to do elements.

  • During the sage build, I think we should then run the docbuilder twice for the reference manual. Perhaps this should always be done for sage -docbuild all.

Done: sage -docbuild all now builds the reference manual twice. I also added a few print statements to the docbuild process.

  • Can we make output less verbose? The whole intersphinx output scrolled forever off my screen. Ideally, an interspinx failure to find an inventory file would only add one extra line at the end of the build along the lines of "You should re-run docbuild to get references right."

I've tried to do this when doing sage -docbuild all and not in general, but it may be suppressing too much output. (In the first pass, all warnings are suppressed, including intersphinx warnings, and in the second pass, any warnings should be printed. But in the second pass, it's just rewriting output, taking intersphinx links into account -- it's not reading the sources a second time, so it doesn't produce warnings about missing bibliographic references.)

Other issues:

  • In PDF output, this produces one PDF file for each module, but there is no "master" file linking to them. I hope we can create one. Should it be an html file or a PDF file?
  • We could perhaps speed things up more by breaking the combinat module, which is by far the largest, into several pieces. This can happen on another ticket.
  • I've reorganized the main index for the reference manual, grouping modules together by topic. I hope it's easier to find things this way. I wonder if we can get intersphinx to produce a master index for all of the documents...
  • in the old version, at least on my computer, when I clicked on the Sage logo in the top left corner, it wasn't taking me to the right place. I've fixed that. Along the same lines, with the new reorganization, the other links on the top line look a little funny to me in the reference manual. They looked worse before and I've tried to clean them up, but maybe they could use more work?

comment:15 Changed 8 years ago by jhpalmieri

  • Description modified (diff)

comment:16 Changed 8 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

Here's a new version of the patch. This still has the same issue with "todo" items: I don't know a way to accumulate all of them from the different Sage modules, so they are just recorded module-by-module. For PDF output, the main documentation page (in SAGE_ROOT/devel/sage/doc/output/html/en/index.html) has the little PDF icons, and now when you click on the one for the reference manual, it actually opens an html file with links to all of the different PDF files.

I'm marking this for review. If we can come up with a good solution for "todo" items, that would be great, but perhaps we could defer it to another ticket.

comment:17 Changed 8 years ago by jhpalmieri

Okay, so this is not the most elegant solution, but in the most recent patch, in the html version of the reference manual, after everything is built, it searches through all of the output files algebra/index.html, arithgroup/index.html, etc., looking for todo lists. When it finds them, it copies them to todolist/index.html. This only works for the html version; for other formats, the todo list file says "The combined to do list is only available in the html version of the reference manual."

comment:18 Changed 8 years ago by jhpalmieri

Here's a new version; the only difference is this change to SAGE_ROOT/devel/sage/spkg-dist:

  • spkg-dist

    diff --git a/spkg-dist b/spkg-dist
    a b fi 
    3838
    3939# Remove the .cython_hash file, since including this in the bdist will
    4040# completely break "sage -br".
    41 # Also, do not distribute these build files (.os and .os);
     41# Also, do not distribute these build files (.so and .os);
    4242# it wastes space and causes problems!
    4343
    44 rm -f .cython_hash c_lib/*.so c_lib/*.os 
     44rm -f .cython_hash c_lib/*.so c_lib/*.os
    4545
    4646# Delete all the autogenerated files, since they will get created again
    4747# when SAGE is built or upgraded. 
    4848cd sage; "$CUR"/spkg-delauto .; cd ..
    4949
     50# Delete the autogenerated files in the doc directory.
     51rm -rf doc/output
     52rm -rf doc/en/reference/sage
     53rm -rf doc/en/reference/sagenb
     54rm -rf doc/en/reference/static
     55rm -rf doc/en/reference/templates
     56rm -rf doc/en/reference/*/sage sage/doc/en/reference/*/static sage/doc/en/reference/*/templates
     57
    5058# Create the sdist using Python's distutils.
    5159python setup.py sdist

This makes for a slightly smaller sage-....spkg file, and more importantly, if the old autogenerated files are there, they can confuse the docbuild process.

comment:19 Changed 8 years ago by jhpalmieri

Some recent timings on sage.math.

Before the patch:

$ rm -rf SAGE_ROOT/devel/sage/doc/output
$ time sage -docbuild reference html
...
real	17m49.313s
user	16m57.530s
sys	0m45.390s

$ rm -rf SAGE_ROOT/devel/sage/doc/output
$ time sage -docbuild reference pdf
...
real	26m3.623s
user	24m40.290s
sys	0m43.030s

After the patch:

$ rm -rf SAGE_ROOT/devel/sage/doc/output
$ time sage -docbuild reference html
...
real	2m30.092s
user	10m34.900s
sys	1m12.590s

$ rm -rf SAGE_ROOT/devel/sage/doc/output
$ time sage -docbuild reference pdf
...
real	3m35.064s
user	15m49.790s
sys	1m11.070s

comment:20 Changed 8 years ago by jhpalmieri

Question: if you type "sage -docbuild -D" now, it says

$ sage -docbuild -D
DOCUMENTs:
    de/tutorial          a_tour_of_sage       bordeaux_2008        
    constructions        developer            faq                  
    installation         numerical_sage       reference            
    thematic_tutorials   tutorial             website              
    fr/a_tour_of_sage    fr/tutorial          ru/tutorial          
    all  (!)             
(!) Builds everything.

Should we also mention "reference/MODULE" as a valid target?

Changed 8 years ago by jhpalmieri

use only this patch

comment:21 Changed 8 years ago by jhpalmieri

  • Description modified (diff)

Changed 8 years ago by jhpalmieri

use only this patch

comment:22 Changed 8 years ago by jhpalmieri

  • Dependencies changed from 11251 to 11251, 11298
  • Description modified (diff)

Here's a new version, with #11298 as a new dependency. (It may apply without #11298, perhaps with fuzz.) To help with reviewing, I've broken the patch into two pieces, as explained in the ticket description.

comment:23 Changed 8 years ago by jhpalmieri

  • Dependencies changed from 11251, 11298 to #11251, #11298

comment:24 follow-up: Changed 8 years ago by robertwb

Could you post a link to the generated docs so people could browse them?

comment:25 in reply to: ↑ 24 Changed 8 years ago by jhpalmieri

Replying to robertwb:

Could you post a link to the generated docs so people could browse them?

Good idea:

comment:26 Changed 8 years ago by niles

  • Cc niles added

comment:27 Changed 8 years ago by was

  • Keywords sd32 added

comment:28 Changed 8 years ago by jdemeyer

  • Dependencies #11251, #11298 deleted

Testing this against sage-4.8.alpha1 + #10620...

comment:29 Changed 8 years ago by jdemeyer

Against sage-4.8.alpha1:

patching file doc/en/reference/games/index.rst
Hunk #1 FAILED at 6
1 out of 1 hunks FAILED -- saving rejects to file doc/en/reference/games/index.rst.rej
patching file doc/en/reference/graphs/index.rst
Hunk #1 succeeded at 52 with fuzz 1 (offset 2 lines).
abort: patch failed to apply

comment:30 Changed 8 years ago by jdemeyer

Also: I'm not sure whether building totally in parallel should be the default.

comment:31 Changed 8 years ago by jhpalmieri

Here are rebased patches, along with the following change: there is now an environment variable, SAGE_PARALLEL_DOCBUILD, which if set to anything nonempty which doesn't start with "n", causes the reference manual to be built in parallel. I also added "doc-parallel" and "doc-pdf-parallel" targets to the main Makefile with a patch to the root repo.

comment:32 Changed 8 years ago by jhpalmieri

  • Description modified (diff)

Changed 8 years ago by jhpalmieri

root repo

comment:33 Changed 8 years ago by jhpalmieri

By the way, the default in the new patch is to build serially. I've also added a brief description of SAGE_PARALLEL_DOCBUILD to the installation guide.

comment:34 follow-up: Changed 8 years ago by jhpalmieri

  • Description modified (diff)

Some other possible changes: in the parallel-building code (from builder.py)

            from multiprocessing import Pool, cpu_count
            max_cpus = 8 if SAGE_PARALLEL_DOCBUILD else 1
            pool = Pool(min(max_cpus, cpu_count()))

perhaps change "else 1" to "else 2"? As it is, building serially (with max_cpus set to 1) is slower than the current system, because in the new system, the manual has to be built twice to resolve cross-references.

We could also change "pool" to just "Pool(cpu_count())" or "Pool(int(1.5 * cpu_count()))" or something like that, eliminating the minimum of 8 and possibly increasing the maximum.

comment:35 in reply to: ↑ 34 ; follow-up: Changed 8 years ago by jdemeyer

Replying to jhpalmieri:

Some other possible changes: in the parallel-building code (from builder.py)

            from multiprocessing import Pool, cpu_count
            max_cpus = 8 if SAGE_PARALLEL_DOCBUILD else 1
            pool = Pool(min(max_cpus, cpu_count()))

perhaps change "else 1" to "else 2"?

Why? It wouldn't make sense to build with more processes than there are CPUs.

We could also change "pool" to just "Pool(cpu_count())" or "Pool(int(1.5 * cpu_count()))" or something like that, eliminating the minimum of 8 and possibly increasing the maximum.

Why? It wouldn't make sense to build with more processes than there are CPUs.

As I mentioned on sage-devel, I don't like that there is an option to doctest in parallel, a different option to build the docs in parallel, a different option to build in parallel... I would say: let there be one environment variable SAGE_NUM_PROCESSES or something like that and use that for everything.

comment:36 in reply to: ↑ 35 Changed 8 years ago by jhpalmieri

Replying to jdemeyer:

Replying to jhpalmieri:

Some other possible changes: in the parallel-building code (from builder.py)

            from multiprocessing import Pool, cpu_count
            max_cpus = 8 if SAGE_PARALLEL_DOCBUILD else 1
            pool = Pool(min(max_cpus, cpu_count()))

perhaps change "else 1" to "else 2"?

Why? It wouldn't make sense to build with more processes than there are CPUs.

This version still does min(max_cpus, cpu_count()), so it won't use more processes than there are CPUs.

We could also change "pool" to just "Pool(cpu_count())" or "Pool(int(1.5 * cpu_count()))" or something like that, eliminating the minimum of 8 and possibly increasing the maximum.

Why? It wouldn't make sense to build with more processes than there are CPUs.

I see lots of suggestions on the internet to set MAKE=make -jN where N is 1.5 * (the number of cpus), or 1 or 2 more than the number of cpus. Why not here as well?

As I mentioned on sage-devel, I don't like that there is an option to doctest in parallel, a different option to build the docs in parallel, a different option to build in parallel... I would say: let there be one environment variable SAGE_NUM_PROCESSES or something like that and use that for everything.

I think maybe two variables: one (SAGE_PARALLEL) to enable parallel processes, one (SAGE_NUM_THREADS) to determine the maximum number of processes. The first could be "no" by default, and the second could be "0" by default, meaning use cpu_count() or min(8, cpu_count()) or some other variant on this, the way we do with NUM_THREADS in Makefile and sage-ptest. Then it's easy to turn on and off without remembering how many cores your machine has.

comment:37 follow-up: Changed 8 years ago by was

Why not exactly one environment variable "MAKE" which gets set to "make -jN" for some N, and that is it? I suggest this, since that's what I'm used to already for years. I don't claim it is the best solution, but it's in all my .bash* files already, and as John points out above it is documented already all over. Why don't we just stick with it?

comment:38 in reply to: ↑ 37 Changed 8 years ago by jhpalmieri

Replying to was:

Why not exactly one environment variable "MAKE" which gets set to "make -jN" for some N, and that is it?

That's an interesting idea. See #12016.

comment:39 follow-ups: Changed 8 years ago by jhpalmieri

  • Dependencies set to #12016

Here's a new patch which uses the setting of MAKE to build docs in parallel (or not). It's very similar to the code in sage-ptest from #12016, except that when you run sage-ptest, the assumption should be that you want to work in parallel, so the default number of threads (if MAKE is not set) is min(8, #cpus). For doc building, we shouldn't assume parallel building by default, I guess, so the default number of threads is 1.

comment:40 in reply to: ↑ 39 Changed 8 years ago by jdemeyer

Replying to jhpalmieri:

For doc building, we shouldn't assume parallel building by default, I guess, so the default number of threads is 1.

Indeed!

comment:41 in reply to: ↑ 39 Changed 8 years ago by jdemeyer

  • Status changed from needs_review to needs_work

Replying to jhpalmieri:

Here's a new patch which uses the setting of MAKE to build docs in parallel (or not). It's very similar to the code in sage-ptest from #12016, except that when you run sage-ptest, the assumption should be that you want to work in parallel, so the default number of threads (if MAKE is not set) is min(8, #cpus). For doc building, we shouldn't assume parallel building by default, I guess, so the default number of threads is 1.

So then we don't need the environment variable SAGE_PARALLEL_DOCBUILD, nor the Makefile patch, anymore.

comment:42 Changed 8 years ago by jdemeyer

With sage-4.8.alpha1, I get

Building reference manual, first pass.

sphinx-build -b html -d /mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/output/doctrees/en/reference   -A hide_pdf_links=1 -Q  /mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/en/reference /mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
Build finished.  The built documents can be found in /mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
Traceback (most recent call last):
  File "/mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/common/builder.py", line 1332, in <module>
    getattr(get_builder(name), type)()
  File "/mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/common/builder.py", line 288, in _wrapper
    getattr(get_builder(document), 'html')(*args, **kwds)
  File "/mnt/usb1/scratch/jdemeyer/merger/sage-4.8.alpha2/devel/sage/doc/common/builder.py", line 405, in _wrapper
    N = re.search('(-j|--jobs[=]?)\s*([0-9]*)', MAKE).groups()[1]
UnboundLocalError: local variable 're' referenced before assignment
make: *** [doc-html] Error 1

comment:43 Changed 8 years ago by jhpalmieri

  • Description modified (diff)
  • Status changed from needs_work to needs_review

So then we don't need the environment variable SAGE_PARALLEL_DOCBUILD, nor the Makefile patch, anymore.

Right, fixed. Also, I added import re into builder.py; this should fix the other problem as well.

comment:44 Changed 8 years ago by jhpalmieri

This now uses SAGE_NUM_THREADS from #12016.

comment:45 follow-up: Changed 8 years ago by jdemeyer

  • Status changed from needs_review to needs_work

Regarding #12016: you should simply use the value of SAGE_NUM_THREADS, nothing fancier than that.

Regarding spkg-dist: this is essentially a duplicate of #12081 and #12086, so this patch should be removed. In any case, removing the files from MANIFEST.in is the proper way of dealing with this, as opposed to removing the files before packaging the repository. Ideally, sage-sdist should not change the current Sage source tree at all.

What's the rationale for adding all these files to doc/common/themes/sageref?

Instead of always building twice, would it be possible to detect whether the manual has already been built once. For example, if I want both the HTML and PDF documentation, the current patch would do 4 passes, even if 3 would be sufficient.

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:46 Changed 8 years ago by jdemeyer

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:47 in reply to: ↑ 45 Changed 7 years ago by jhpalmieri

  • Cc hivert added
  • Keywords sd32 removed
  • Work issues set to intersphinx

Rebased to 5.0.beta13, but the intersphinx stuff needs fixing (the use of intersphinx here conflicts with the changes in #9128, and I haven't fixed this). Part 1 of the patch was mostly generated automatically using the attached script. After running the script, I removed lines like .. _ch:algebras: (from algebras/index.rst, in this case) by hand.

Replying to jdemeyer:

Regarding #12016: you should simply use the value of SAGE_NUM_THREADS, nothing fancier than that.

Okay, done.

Regarding spkg-dist: this is essentially a duplicate of #12081 and #12086, so this patch should be removed. In any case, removing the files from MANIFEST.in is the proper way of dealing with this, as opposed to removing the files before packaging the repository. Ideally, sage-sdist should not change the current Sage source tree at all.

I removed that part of the patch.

What's the rationale for adding all these files to doc/common/themes/sageref?

The new structure of the reference manual, in particular the new directory structure, means we need new templates for the files coming from reference/algebras/index.rst, as opposed to the old templates, which work for the main file reference/index.rst.

Instead of always building twice, would it be possible to detect whether the manual has already been built once. For example, if I want both the HTML and PDF documentation, the current patch would do 4 passes, even if 3 would be sufficient.

I don't know how to do this.

comment:48 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:49 Changed 7 years ago by jdemeyer

This patch would allow building the reference manual with less memory usage. 1.5GB (still a lot) is sufficient to build the manual, as opposed to more than 2GB without the patch.

comment:50 Changed 7 years ago by jdemeyer

I do get several warnings when building the HTML manual. The following is repeated many times:

sphinx-build -b html -d /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/doctrees/de/tutorial   -A hide_pdf_links=1  /fast_scr
atch/jdemeyer/sage-5.0.beta13/devel/sage/doc/de/tutorial /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/de/tutorial
Running Sphinx v1.1.2
loading translations [de]... done
loading pickled environment... not yet created
loading intersphinx inventory from /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/common/python.inv...
loading intersphinx inventory from /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/en/reference/objects.inv...
WARNING: intersphinx inventory '/fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/en/reference/objects.inv' not fetchable
due to <type 'exceptions.IOError'>: [Errno 2] No such file or directory: '/fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/htm
l/en/reference/objects.inv'
building [html]: targets for 22 source files that are out of date
updating environment: 22 added, 0 changed, 0 removed
[...]
Build finished.  The built documents can be found in /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/de/tutorial

There are several of the form:

sphinx-build -b html -d /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/doctrees/en/reference/schemes   -A hide_pdf_links=1  -q  -a  /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/en/reference/schemes /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/en/reference/schemes
None:37: WARNING: citation not found: Fulton
Build finished.  The built documents can be found in /fast_scratch/jdemeyer/sage-5.0.beta13/devel/sage/doc/output/html/en/reference/schemes

comment:51 Changed 7 years ago by jhpalmieri

  • Work issues changed from intersphinx to citations

The intersphinx stuff actually seems to be working, despite my earlier comment. I'll keep testing it to make sure, but I think it's okay.

Regarding the warnings: the missing intersphinx inventories are expected. That's why we have to build everything twice: once to create all of the inventory files, and then a second time to use them. At least for me, if I run sage --docbuild reference html twice in a row, I don't get these warnings the second time through. The citation warnings need to be fixed, though.

comment:52 Changed 7 years ago by hivert

Hi John,

What is the status of this one ? It this robust ?

I'm quite glad that I didn't stomp on your foot with #9128, but I may be right now in the process on jumping high to land on your foot ;-/: I just dived deep inside Sphinx and using @parallel, I managed to have the "writing output..." part of the doc compilation in parallel. I've right now no idea how robust it is. I'll put a log of my experiment on #6255. Also, it seems possible that the read part could also be made parallel.

Florent

comment:53 follow-ups: Changed 7 years ago by jhpalmieri

Hi Florent,

This is in pretty good shape, but it's not perfect. It undoes some of what you did in #9128 (mainly because I haven't tried to rewrite the patch to do it differently), and in particular, I'm not sure that the other parts of the Sage documentation can use intersphinx to access information from the reference manual. Citations may be an issue, in particular if the same reference is cited twice in two different parts of the reference manual: we may just need to add another copy of the citation, or perhaps a master list of citations that gets used by everything. I don't know if that's practical.

There are also issues with having to build the reference manual twice so that all of the references are resolved. This is not ideal.

I think that doing the reading and/or writing in parallel would be good, but given the size of the reference manual, breaking it into pieces seems worthwhile as well. If the parallel reading and writing help to cut down on the memory usage, which seems to be getting out of hand, then maybe that is good enough for now. (At least on sage.math, the writing part seems to take way too long, so doing that in parallel might help significantly.)

So if you have a workable solution which accomplishes some of what is done here, and perhaps does it more simply, go right ahead. I'll take a look at your comments at #6255.

John

comment:54 in reply to: ↑ 53 ; follow-up: Changed 7 years ago by hivert

Hi John,

Thanks for the quick answer.

This is in pretty good shape, but it's not perfect. It undoes some of what you did in #9128 (mainly because I haven't tried to rewrite the patch to do it differently), and in particular, I'm not sure that the other parts of the Sage documentation can use intersphinx to access information from the reference manual.

I'll have a look at it. Please do not hesitate to ask for some more explanation on the hack I did with intersphinx. Is there a specific reason why you doubt intersphinx will work for the other part of the doc ?

There are also issues with having to build the reference manual twice so that all of the references are resolved. This is not ideal.

It doesn't seem to be a huge problem with LaTeX, since this never has been solved since years... Though I never seen a LaTeX compilation as long as Sage's doc.

I think that doing the reading and/or writing in parallel would be good, but given the size of the reference manual, breaking it into pieces seems worthwhile as well.

I agree.

If the parallel reading and writing help to cut down on the memory usage, which seems to be getting out of hand, then maybe that is good enough for now. (At least on sage.math, the writing part seems to take way too long, so doing that in parallel might help significantly.)

I don't think it will cut down memory usage in any way. I'd rather expect the contrary. My solution seems to be working but since I currently for a sage for every single file, a lot of time is wasted in forking. I'll try to improve it tomorrow.

So if you have a workable solution which accomplishes some of what is done here, and perhaps does it more simply, go right ahead. I'll take a look at your comments at #6255.

I don't think I really will. As I said I'll probably trade speed against memory usage... I'll keep you in touch.

Cheers,

Florent

comment:55 in reply to: ↑ 54 ; follow-up: Changed 7 years ago by jdemeyer

Replying to hivert:

I don't think it will cut down memory usage in any way. I'd rather expect the contrary.

Memory usage is already too much, so anything that further increases memory usage is very bad.

comment:56 in reply to: ↑ 55 ; follow-up: Changed 7 years ago by hivert

Replying to jdemeyer:

Replying to hivert:

I don't think it will cut down memory usage in any way. I'd rather expect the contrary.

Memory usage is already too much, so anything that further increases memory usage is very bad.

I respectfully disagree if

  • it is optional
  • it divide by more than 10 the documentation compile time

comment:57 in reply to: ↑ 56 Changed 7 years ago by leif

Replying to hivert:

Replying to jdemeyer:

Replying to hivert:

I don't think it will cut down memory usage in any way. I'd rather expect the contrary.

Memory usage is already too much, so anything that further increases memory usage is very bad.

Like adding more modules and functions to the Sage library, and rising the "doc[test] coverage"... ;-)


I respectfully disagree if

  • it is optional

That would be fine.

  • it divides by more than 10 the documentation compile time

Wow... :P

comment:58 Changed 7 years ago by hivert

This is an extremely useful patch which should go in Sage ASAP. It effectively reduce the compilation time to 10min on my I7 laptop. This is brilliant. It will certainly prove more maintainable in the long term than my parallelization of Sphinx expertiment. Though, I think we should discuss of is with Georges Brandl (the maintainer of Sphinx).

I'm now reviewing the code itself. You'll find below some comments.

Florent

Suggestion of improvement:

  • I think we should get rid of the warning on the first pass, maybe by patching a little bit intersphinx. Also, now the compilation is a lot more verbose. Maybe we should silent the info for loading the bunch of intersphinx load at least in reference/*. If you agree with the idea, I'm Ok to look at this one.
  • Some directories are much longer than the other (eg: combinat) they should be launched first (or maybe in this case broken in even smaller parts). This could wait for another ticket.

Some remaining problems:

  • Interrupting the compilation with ctrl+C doesn't work. I had to kill by hand the master compilation process. I'm not sure what we can do for solving this problem.
  • On the main page, you managed to gather the todos which is great ! Is it possible to do the same for the index (mine is empty). The module index is moreover broken.

comment:59 Changed 7 years ago by hivert

Hi John,

Using your patch I realized that for the first stage, you don't need to get the html file only the pickle and object.inv file. Did you try to do that ? It seems that this is easily feasible with a custom Builder. I'm currently testing this workflow.

Florent

comment:60 Changed 7 years ago by hivert

Another comment: You should start by building the reference manual and then the other component. Currently in class AllBuilder method _wrapper you start by building the other documents.

Florent

comment:61 Changed 7 years ago by hivert

I just attached a new patch which defines a new builder for creating the pickle and object.inv file. It can be called by

sage -docbuild DOCUMENT invpickle

It is automatically called for the first pass when building DOCUMENT=all. As a consequence on my laptop the first pass in only 2 min and a half long.

Florent

comment:62 Changed 7 years ago by hivert

The invpickle builder seem quite efficient here are the timing on my I7 laptop:

  • First stage: 02:36
  • Second Stage: 08:06

Without my patch the first stage is 10:19 long.

Florent

comment:63 Changed 7 years ago by hivert

There is also a failure:

sage -t  builder.py
File "/home/data/Sage-Install/sage-5.0.beta13/devel/sage-doc/doc/common/builder.py", line 862:
    [...]
    sage: builder.ReferenceBuilder("reference").auto_rest_filename("sage.combinat.partition")
      File "/home/data/Sage-Install/sage-5.0.beta13/devel/sage/doc/common/builder.py", line 409, in _wrapper
        getattr(DocBuilder(self.name, lang), format)(*args, **kwds)
    AttributeError: 'DocBuilder' object has no attribute 'auto_rest_filename'

comment:64 Changed 7 years ago by jhpalmieri

Hi Florent,

Your ideas look great; I hope you can keep working on them. I am happy with all of your suggestions. In particular:

  • building the reference manual before the other documents is fine. I think that was your approach in #9128, and the original patch here built the reference manual last. I hadn't gotten around to changing it to build the reference manual first.
  • I'll try to take a look at combining the indices, the way the todo lists are combined.
  • further subdividing combinat (for example) is a good idea, but I also agree that it belongs on another ticket.
  • I've also noticed that ctrl-C doesn't quit the process. Any ideas why? Should we run the parallel processes differently, maybe using the methods you suggested at #6255?

By the way, mpatel should get a lot of the credit for this; he wrote the original patches, which I've just cleaned up and reorganized.

comment:65 Changed 7 years ago by jhpalmieri

I got two doctest failures (after applying your patch), which can be fixed with this patch:

  • doc/common/builder.py

    diff --git a/doc/common/builder.py b/doc/common/builder.py
    a b class DocBuilder(object): 
    200200            sage: import os, sys; sys.path.append(os.environ['SAGE_DOC']+'/common/'); import builder
    201201            sage: b = builder.DocBuilder('tutorial')
    202202            sage: b._output_formats()
    203             ['changes', 'html', 'htmlhelp', 'json', 'latex', 'linkcheck', 'pickle', 'web']
     203            ['changes', 'html', 'htmlhelp', 'invpickle', 'json', 'latex', 'linkcheck', 'pickle', 'web']
    204204
    205205        """
    206206        #Go through all the attributes of self and check to
    class ReferenceSubBuilder(DocBuilder): 
    859859
    860860            sage: import os, sys; sys.path.append(os.environ['SAGE_DOC']+'/common/'); import builder
    861861            sage: import builder
    862             sage: builder.ReferenceBuilder("reference").auto_rest_filename("sage.combinat.partition")
     862            sage: builder.ReferenceSubBuilder("reference").auto_rest_filename("sage.combinat.partition")
    863863            '.../devel/sage/doc/en/reference/sage/combinat/partition.rst'
    864864        """
    865865        return self.dir + os.path.sep + module_name.replace('.',os.path.sep) + '.rst'

comment:66 Changed 7 years ago by jhpalmieri

By the way, searching is broken with these patches. The files searchindex.js (both in reference/ and in its various subdirectories) are all essentially empty. Hmm.

It looks like combining the indices should be doable, but a little annoying: read in the html from each, extract the entries, sort them, and rewrite to a new index file. The same should work for the module index. I think talking to Georg Brandl makes sense: if we wanted Sphinx to support this kind of thing out of the box, what exactly would we be looking for? I guess at the least we want separate Sphinx documents which share indices, searches, and inventories. It would be nice to have a specific "Sphinx Enhancement Proposal", or even some code to contribute.

comment:67 Changed 7 years ago by hivert

Hi John,

Another reason to have this in Sage ASAP: #12878

I'm willing to help on this one but we need to coordinate. I'll be in Montreal for the next Sage days 38 so hopefully will have so time for Sphinx. Do you think you'll have so time on it in the forthcoming weeks ? Please tell me what I can do to help. Here are a few suggestions:

  • avoiding to build the full doc twice: If you agree with the idea of the invpicke builder, I can polish it a little more. Also the name is far from being perfect and I'm open to any suggestion.
  • Intersphinx being too verbose: I don't think we can silence intersphinx a little without patching Sphinx. I can try to do it by some Monkey patch.
  • Parallelizing the rests of the doc: I noticed that the refman is build in parallel but the other documents are build in series. We should build them in parallel too. I'll look for the easiest way.

Cheers,

Florent

comment:68 Changed 7 years ago by jhpalmieri

I hope I have time to work on it next week, although right now I have other things I need to be doing (like reading my student's PhD thesis).

  • I really like the idea of the invpickle builder, so please continue with that. I don't have any better suggestions for the name.
  • The verbosity of intersphinx is a minor issue. If we can reduce it, that would be fine, but it's not the highest priority, in my opinion.
  • I think it shouldn't be too hard to parallelize building the rest of the docs. We want to do it so that hitting ctrl-c will quit the build. I think we should try using tools from sage.parallel rather than the Python multiprocessing module. If you want to work on that, that would be great.

Also:

  • I can try to work on the indices.
  • I will investigate why the searchindex.js files are empty, but right now I have no idea.

I will try to post any progress that I make, and you should do likewise, so we don't end up duplicating each other's work.

comment:69 follow-up: Changed 7 years ago by jhpalmieri

As far as handling ctrl-c, I found this question. It seems we can replace pool.apply_async(ARGS) with pool.apply_async(ARGS).get(999999). This apparently adds a (very long) timeout to the process, and it apparently works around a bug in Python. In practice, it seems to work for me.

  • doc/common/builder.py

    diff --git a/doc/common/builder.py b/doc/common/builder.py
    a b class ReferenceBuilder(AllBuilder): 
    415415            for doc in self.get_all_documents(refdir):
    416416                pool.apply_async(build_ref_doc,
    417417                                 (doc, lang, format,
    418                                   os.path.split(output_dir)[0]) + args, kwds)
     418                                  os.path.split(output_dir)[0]) + args, kwds).get(999999)
    419419            pool.close()
    420420            pool.join()
    421421            if format == 'html':

comment:70 Changed 7 years ago by leif

Replying to jhpalmieri:

  • I really like the idea of the invpickle builder, so please continue with that. I don't have any better suggestions for the name.

One should perhaps just expand the name to contain "inventory"; "pickle" is IMHO minor (or irrelevant) and won't tell most(?) people much.

  • The verbosity of intersphinx is a minor issue. If we can reduce it, that would be fine, but it's not the highest priority, in my opinion.

Mine, too. Although I hate Sphinx's / Sage's current messages already, especially since "Build succeeded. The built documents can be found in ..." is always printed, so is plain wrong in case of an error. (We tried to fix that once, but then I think Jeroen decided to keep it as is.)

  • I think it shouldn't be too hard to parallelize building the rest of the docs. We want to do it so that hitting ctrl-c will quit the build. I think we should try using tools from sage.parallel rather than the Python multiprocessing module. If you want to work on that, that would be great.

Why not just use make? Either add (and change the) targets in the top-level Makefile, or add one to devel/sage/doc/. To me seems cleanest (preferably the latter), and docbuilding IMHO shouldn't depend [more] on the Sage library [as needed / it already does].

comment:71 Changed 7 years ago by jhpalmieri

The problem with using make is that I want to be able to run sage --docbuild all html, and I don't think that running this should call make. The fix I mentioned above for interrupts (adding a timeout) doesn't make use of the sage.parallel module, just plain Python, by the way. So we should be able to do this for building all of the documents in parallel.

comment:72 in reply to: ↑ 69 Changed 7 years ago by hivert

Replying to jhpalmieri:

As far as handling ctrl-c, I found this question. It seems we can replace pool.apply_async(ARGS) with pool.apply_async(ARGS).get(999999). This apparently adds a (very long) timeout to the process, and it apparently works around a bug in Python. In practice, it seems to work for me.

Does it really work ? For me is seems that ctrl-c is now working but the doc is no more compiling in parallel.

Florent

comment:73 follow-up: Changed 7 years ago by hivert

Concerning todos (and maybe indexes too), I think I've found an alternative: they are pickled in the files environment.pickle. Here is an example, in the directory doc/output/doctrees/en/reference/modules:

sage: import cPickle
sage: f = open('environment.pickle', 'rb')
sage: env = cPickle.load(f)
sage: f.close()
sage: env.todo_all_todos
[{'docname': 'sage/modules/free_module', 'source': u'/home/data/Sage-Install/sage-5.0.beta14/local/lib/python2.7/site-packages/sage/modules/free_module.py:docstring of sage.modules.free_module.FreeModuleFactory', 'todo': <todo_node: <title...><paragraph...>>, 'lineno': 122, 'target': <target: >}]
sage: env.indexentries.keys()
['index', 'sage/modules/free_module_element', 'sage/modules/real_double_vector', 'sage/modules/matrix_morphism', 'sage/modules/fg_pid/fgp_module', 'sage/modules/free_module_homspace', 'sage/modules/vector_space_homspace', 'sage/modules/fg_pid/fgp_element', 'sage/modules/complex_double_vector', 'sage/modules/fg_pid/fgp_morphism', 'sage/modules/vector_callable_symbolic_dense', 'sage/modules/module', 'sage/modules/free_module_morphism', 'sage/modules/vector_space_morphism', 'sage/modules/free_module']

So it seems that we can get them from there from the pickle concatenate them and let sphinx output the todo list and the index.

Florent

Last edited 7 years ago by hivert (previous) (diff)

comment:74 in reply to: ↑ 73 Changed 7 years ago by hivert

Replying to hivert:

So it seems that we can get them from there from the pickle concatenate them and let sphinx output the todo list and the index.

I've a proof of concept which seems to be working upto two problems:

  • The generated links to the "original entry" generated by app.builder.get_relative_uri is missing the subdirectory (ie: "reference/sage/modules/..." instead of "reference/modules/sage/modules/...". This could probably either be fixed by fixing the builder or playing with symlinks.
  • if one recompile the doc twice, all the todos are duplicated. This could probably also be fixed by removing duplicates or clearing the todo-list at the right moment.

I think both problem can be fixed.

comment:75 Changed 7 years ago by hivert

Hi there,

I just uploaded a new invbuilder.patch; you need to apply it after the two mentionned patch in the header of this ticket. It seems to solve the problems of

  • merging todo lists
  • merging the html indexes

It remains to merge the javascript indexes.

Please tests.

Florent

Last edited 7 years ago by hivert (previous) (diff)

comment:76 Changed 7 years ago by hivert

I just uploaded a new patch which merge the javascript indexes as well !!! It was a little more tricky but it seems to work now. The only things that remains to be merged is the list of modules. I'll write a proposal to Sphinx for my merging tool.

Florent

comment:77 Changed 7 years ago by hivert

The new version now also merge modules indexes !!! The obtained doc seems to be in quite a good shape.

Changed 7 years ago by hivert

inventory builder + merge todo list & html / js indexes

comment:78 in reply to: ↑ 53 Changed 7 years ago by hivert

Citations may be an issue, in particular if the same reference is cited twice in two different parts of the reference manual: we may just need to add another copy of the citation, or perhaps a master list of citations that gets used by everything. I don't know if that's practical.

I've probably a way to handle cross-citation as well. They are stored in the environement and I can get them to gather the link in the main reference manual, exactly the way I gather TODO and indexes. It is just a little more tricky because I need to redispatch them to the other documents. I'm experimenting...

Florent

Changed 7 years ago by hivert

comment:79 Changed 7 years ago by hivert

  • Authors changed from Mitesh Patel, John Palmieri to Mitesh Patel, John Palmieri, Florent Hivert
  • Description modified (diff)
  • Status changed from needs_work to needs_review
  • Work issues citations deleted

Hi there,

I just uploaded which seems to solve most of the problems we had here, including merging of

  • the todo list if this extension is activated;
  • the python indexes;
  • the list of python modules;
  • the javascript index;
  • the citations.

I put a need review though I'm quite not sure everything is ready for inclusion into Sage. I do need feedback and people shaking my code to see if it's robust.

Florent

comment:80 Changed 7 years ago by hivert

  • Keywords days38 added

comment:81 follow-up: Changed 7 years ago by jhpalmieri

Hi Florent,

I am quite busy with other things right now, so I hope you can get people at Sage Days 38 to take a good look at this. Maybe you can figure out the ctrl-c issue as well; maybe tinkering with what I had might work; if not, the link I provided had some other ideas, too.

Thanks, John

comment:82 in reply to: ↑ 81 Changed 7 years ago by hivert

Hi John,

Replying to jhpalmieri:

I am quite busy with other things right now, so I hope you can get people at Sage Days 38 to take a good look at this. Maybe you can figure out the ctrl-c issue as well; maybe tinkering with what I had might work; if not, the link I provided had some other ideas, too.

well, they are very few people knowing the documentation building system here and I used quite a few Sphinx internal stuff. I think a good reviewer would be George Brandle himself but I need to sit for a moment to write him an e-mail.

Cheers,

Florent

comment:83 Changed 7 years ago by mguaypaq

I haven't had time yet to read the whole ticket description and discussion, but I will note that this patch has allowed me to build the documentation on my old laptop (with only 1GB of RAM) in a couple of hours, whereas trying to build it without the patch causes my laptop to become unresponsive and eventually crash. (Also, Florent had a look at the output and seemed satisfied that everything was fine.) So, a huge +1 from me!

I will try to do as much as I can to review this ticket, but I know almost nothing about the documentation building process.

comment:84 Changed 7 years ago by mguaypaq

  • Cc mguaypaq added

comment:85 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

After a lot of swearing at my computer, I think I have the ctrl-c situation figured out. We've been using Pool.apply_async, and this didn't handle ctrl-c well. Adding a timeout (using the get method) caused it to handle interrupts, but it no longer build in parallel. If we instead use Pool.map_async together with get to provide a timeout, it seems to work. Please take a look at the new patch.

comment:86 Changed 7 years ago by jhpalmieri

Several quick comments and questions:

  • First, the combined indices and search are great! At first I thought that searching was broken, but that's my browser, not the code. I'm still looking over the patch, but in practice it seems to work.
  • This is an old problem: I object to the use of "type" as a global variable, set by __main__. Then I can't use print type(var) anywhere in the rest of the code (as I tried to do when debugging something). I may change type to something else; any suggestions? Perhaps format (which also has a meaning in Python)?
  • When we're happy about the patches, I'll try to combine them into just one or two chunks. Right now, for example, one of my patches creates the "todolist" directory, and then Florent's patch moves all of the files out of it, so we end up with an empty directory. I don't know if there are other similar things which will need cleaning up.
  • I get some warnings about missing citations. Should we worry about those now?

comment:87 Changed 7 years ago by jhpalmieri

I've modified the part 4 patch to do a little cleanup:

  • I moved the detection of the number of threads to build_options.py, and expanded on the comment about loading build_options.py. This way if someone is reading builder.py and wants to know where NUM_THREADS is defined, at least they'll find a comment pointing them to that file.
  • I modified the mkdir function, making it a bit more secure: the new approach seems to be preferred, as far as I can tell.

comment:88 Changed 7 years ago by jhpalmieri

Another update to part 4: fix the main index.html file for the pdf build of the reference manual.

comment:89 Changed 7 years ago by jhpalmieri

One more update to part 4, to fix a few doctests.

comment:90 follow-up: Changed 7 years ago by jhpalmieri

I think this looks very good. I'm just about ready to give Florent's part a positive review. Two questions: can we reinstate the -Q flag for the first pass on the reference manual, to silence all of the warnings? Also, I see this warning message; do you know if it's important?

preparing documents... WARNING: search index couldn't be loaded, but not all documents will be built: the index will be incomplete.

(This occurs if I do sage --docbuild all html, at the end of the second pass through the reference manual.)

I'm attaching one more version of the part 4 patch, just to fix a few typos and grammar issues in the files Florent added.

comment:91 in reply to: ↑ 90 Changed 7 years ago by hivert

Hi John,

I think this looks very good. I'm just about ready to give Florent's part a positive review.

Wow !!! This is extremely cool. Thanks a lot ! I'm sorry for my current silence. I didn't had the time to look at your part4 code. I'll try to do it shortly.

Two questions: can we reinstate the -Q flag for the first pass on the reference manual, to silence all of the warnings?

No problem. I just wanted to have some idea of the progress and forgot to switch back to a silent mode.

Also, I see this warning message; do you know if it's important?

preparing documents... WARNING: search index couldn't be loaded, but not all documents will be built: the index will be incomplete.

(This occurs if I do sage --docbuild all html, at the end of the second

pass through the reference manual.)

I'm not sure now. I'll though I had silenced this warning. Give me a few day to investigate a little more. It is probably not important as the produced index is correct but I may be missing to merge somme part of it.

I'm attaching one more version of the part 4 patch, just to fix a few typos and grammar issues in the files Florent added.

Thanks for those rereading. I planned to polish more the code and try to get some feedback from Sphinx and didn't find the time. However I now think that we should let the code enter sage as soon as possible because it allows the doc to compile on small machine. At sage days 48, with mguaypaq (see above) we manage to compile the documentation in a seemingly satisfactory way on a 1GB machine. Moreover its a requirement to the feature #12878 which I think is definitely needed at least for huge classes such as graphs...

Thanks a lot,

Florent

comment:92 Changed 7 years ago by jhpalmieri

See #12991 for a peripherally-related ticket: don't doctest the autogenerated rst files.

comment:93 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to needs_work

This needs to be rebased to sage-5.1.beta5:

applying /release/merger/patches/trac_6495-part2-everything-else.patch
patching file doc/en/reference/combinat/index.rst
Hunk #1 FAILED at 3
1 out of 2 hunks FAILED -- saving rejects to file doc/en/reference/combinat/index.rst.rej
patching file doc/en/reference/misc/index.rst
Hunk #1 succeeded at 31 with fuzz 1 (offset 3 lines).
abort: patch failed to apply

comment:94 Changed 7 years ago by jdemeyer

  • Dependencies #12016 deleted

comment:95 Changed 7 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

Okay, this is now rebased.

comment:96 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

I've added a link to the built documents for Sage 5.1.beta5, in case anyone wants to look at it without applying the patches here. The reference manual is organized differently and as a result looks different. The other documents should be unchanged.

comment:97 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:98 Changed 7 years ago by jhpalmieri

  • Dependencies set to #9774
  • Description modified (diff)

This now depends on #9774. If you want to try it without the patches at #9774 (thus using jsMath instead of MathJax), apply trac_6495-part2-everything-else.patch instead of trac_6495-part2-everything-else-9774.patch.

comment:99 Changed 7 years ago by jhpalmieri

A slight issue with the reference manual is that it takes up a little more room, and most of the difference is in devel/sage/doc/output/doctrees. There is some redundancy in the environment.pickle files. I wonder if there is any way to avoid this.

In more detail: using MathJax instead of jsMath saves room in the directory doc/output/html: with the old version, jsMath and non-parallel, that directory takes about 325M, while the new version, MathJax and parallel, takes about 200M, saving 125M. Unfortunately, the doctree directory has increased by about 150M, so the whole thing has gone up by 25M. Not a big difference, but if there were a way to improve it, it would be nice. Maybe on another ticket...

comment:100 Changed 7 years ago by jdemeyer

  • Dependencies changed from #9774 to #13126, #11078, #11874, #12229, #11503, #12327, #11080, #9774, #8473, #11913, #12299, #13121
  • Milestone changed from sage-5.1 to sage-pending

Clarified recursive dependencies.

comment:101 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to needs_work
  • Work issues set to too many WARNINGs

I get tons of warnings (1043 to be precise) when building the manuals. I don't like this, because it's hard to distinguish warnings due to this ticket with true warnings about bad formatting. Is there a way not to produce any warnings during the first run (I assume the warnings should only appear in the first run)?

comment:102 Changed 7 years ago by jhpalmieri

Rebased because of #12299. I'll try to work on the warnings. I think it might be best to only suppress the warnings when you run sage --docbuild all html, because that is when both passes of the reference manual are actually run, and the first pass is designed to be fast: it just produces the inventory files, not html output. If you run sage --docbuild reference html, then it will just do one pass, html format.

Edit: actually, the warnings are easy to get rid of. Florent intentionally left them there while working on the ticket: see these lines in builder.py:

        global ALLSPHINXOPTS
        # ALLSPHINXOPTS += ' -Q -D multidoc_first_pass=1'
        ALLSPHINXOPTS += ' -D multidoc_first_pass=1'

If we uncomment the second line and remove the third, the warnings will go away. I'll incorporate this change into the 4th patch.

Last edited 7 years ago by jhpalmieri (previous) (diff)

comment:103 Changed 7 years ago by jhpalmieri

  • Status changed from needs_work to needs_review

comment:104 Changed 7 years ago by jhpalmieri

  • Work issues too many WARNINGs deleted

comment:105 Changed 7 years ago by jhpalmieri

I see one warning now:

WARNING: search index couldn't be loaded, but not all documents will be built: the index will be incomplete.

I don't know what this means. The search index looks pretty good to me. Florent might have some idea, when he has a chance to look at this.

comment:106 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:107 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:108 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:109 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:110 Changed 7 years ago by jhpalmieri

In addition to the warning mentioned in comment #105, I have a question about .buildinfo files. It seems that on the first pass through the reference manual, these files contain lines like

config: 6a23e6beb735e39dc46994bfb813cf55
tags: fbb0d17656682115ca4d033fb2f83ba1

On the second pass, these files get overwritten, and the new files have

config:
tags:

Then running sage --docbuild reference all produces warnings like

WARNING: unsupported build info format in '.../devel/sage/doc/output/html/en/reference/libs/.buildinfo', building all

and then everything is rebuilt again. I propose this change, which seems to fix this issue:

  • doc/common/builder.py

    diff --git a/doc/common/builder.py b/doc/common/builder.py
    a b class AllBuilder(object): 
    300300        logger.warning("Building reference manual, second pass.\n")
    301301        ALLSPHINXOPTS = ALLSPHINXOPTS.replace(
    302302            'multidoc_first_pass=1', 'multidoc_first_pass=0')
    303         ALLSPHINXOPTS = ALLSPHINXOPTS.replace('-Q', '-q') + ' -a '
     303        ALLSPHINXOPTS = ALLSPHINXOPTS.replace('-Q', '-q') + ' '
    304304        for document in refs:
    305305            getattr(get_builder(document), name)(*args, **kwds)
    306306
Last edited 7 years ago by jhpalmieri (previous) (diff)

Changed 7 years ago by jhpalmieri

apply second (if you've applied the patches at #9774)

comment:111 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:112 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:113 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:114 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:115 Changed 7 years ago by jhpalmieri

Rebased to Sage 5.4.beta0.

comment:116 Changed 7 years ago by jhpalmieri

  • Description modified (diff)

comment:117 Changed 7 years ago by jhpalmieri

  • Description modified (diff)
Note: See TracTickets for help on using tickets.