Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#14626 closed defect (fixed)

Docbuilder hangs if latex fails

Reported by: jdemeyer Owned by: mvngu
Priority: blocker Milestone: sage-5.10
Component: documentation Keywords:
Cc: jhpalmieri, leif Merged in: sage-5.10.beta5
Authors: Jeroen Demeyer Reviewers: John Palmieri
Report Upstream: Reported upstream. Developers acknowledge bug. Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

When building the PDF documentation, if there is a problem while running latex, then the docbuilder just hangs forever after building all documentation. There is no obvious clue what the problem is, apart from a message like the following (example from #9107) in the log file:

! LaTeX Error: Too deeply nested.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.27819 \begin{Verbatim}[commandchars=\\\{\}]
                                             
? 
! Emergency stop.
 ...                                              
                                                  
l.27819 \begin{Verbatim}[commandchars=\\\{\}]
                                             
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
make[1]: *** [categories.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 376, in _handle_results
    task = get()
TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())

This hang is http://bugs.python.org/issue9400

Also: the docbuilder should use $MAKE instead of make.
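
One way to honour $MAKE from Python is sketched below. This is only an illustration under assumptions, not the code in builder.py: the helper name make_command is hypothetical, and it assumes that $MAKE may carry extra options (for example MAKE="make -j12"), so the value has to be split into separate arguments before the target is appended.

```python
import os
import shlex


def make_command(target, env=None):
    """Build the argument list for running `target` with the user's make.

    Hypothetical helper: falls back to plain "make" when MAKE is unset.
    """
    env = os.environ if env is None else env
    make = env.get("MAKE", "make")
    # MAKE may contain options such as "-j12", so split it like a shell would.
    return shlex.split(make) + [target]
```

The resulting list can be passed to subprocess.check_call together with a cwd argument pointing at the LaTeX output directory.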

Attachments (1)

14626_workaround.patch (899 bytes) - added by jdemeyer 6 years ago.

Change History (24)

comment:1 Changed 6 years ago by jdemeyer

  • Description modified (diff)

comment:2 Changed 6 years ago by jdemeyer

  • Description modified (diff)

comment:3 Changed 6 years ago by jdemeyer

  • Description modified (diff)

comment:4 Changed 6 years ago by jdemeyer

I have an idea, patch possibly coming up...

comment:5 Changed 6 years ago by jhpalmieri

  • Cc jhpalmieri added

Changed 6 years ago by jdemeyer

comment:6 Changed 6 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Priority changed from critical to blocker
  • Status changed from new to needs_review

comment:7 Changed 6 years ago by jhpalmieri

The patch makes a lot of sense at first glance, but I should test it to make sure. I'll try to get to it soon.

comment:8 Changed 6 years ago by leif

  • Cc leif added

comment:9 Changed 6 years ago by jhpalmieri

With the patch and with bad LaTeX code, I see the hang occur earlier (soon after trying to build the bad document), but it still hangs.

comment:10 Changed 6 years ago by jdemeyer

John, it seems to work for me, so could you please send me the docpdf.log file?

comment:11 Changed 6 years ago by jdemeyer

When I apply the patch from this ticket together with the patches from #9107 (which cause a LaTeX failure), I get

! LaTeX Error: Too deeply nested.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

?
! Emergency stop.
 ...

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
make[1]: *** [categories.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
    getattr(get_builder(document), name)(*args, **kwds)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/categories
make: *** [doc-pdf] Error 1

after which I get back into the shell as expected.

comment:12 Changed 6 years ago by jdemeyer

The cause of the crash seems to be a combination of:

  1. subprocess.CalledProcessError instances cannot be unpickled properly.
  2. The multiprocessing module uses pickles to transfer exceptions from the child process to the master process and apparently doesn't gracefully handle unpickling errors.
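
The failure mode in item 1 can be illustrated without running LaTeX. The following is a minimal sketch under assumptions, not the code from the attached patch: UnpicklableError is a hypothetical stand-in for an exception whose constructor requires arguments but does not forward them to the base class, so its args tuple stays empty and unpickling calls the constructor with no arguments. The tracebacks in comment:11 show a plain RuntimeError reaching the master process, so the to_picklable helper (also hypothetical) wraps the failure in a RuntimeError that carries only a string.

```python
import pickle


class UnpicklableError(Exception):
    """Hypothetical stand-in: needs two arguments but leaves self.args empty."""

    def __init__(self, returncode, cmd):
        Exception.__init__(self)  # nothing forwarded, so self.args == ()
        self.returncode = returncode
        self.cmd = cmd


def survives_pickle(exc):
    """Return True if exc can be pickled and then unpickled again."""
    try:
        pickle.loads(pickle.dumps(exc))
        return True
    except Exception:
        return False


def to_picklable(exc, directory):
    """Assumed workaround: replace exc by a RuntimeError holding only a string."""
    message = "command %r failed with status %d in %s" % (
        exc.cmd, exc.returncode, directory)
    return RuntimeError(message)
```

A worker function would catch the original exception and raise the result of to_picklable instead, so that the object sent back to the master process can always be reconstructed.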

comment:13 Changed 6 years ago by jdemeyer

  • Description modified (diff)
  • Report Upstream changed from N/A to Reported upstream. Developers acknowledge bug.

comment:14 Changed 6 years ago by jhpalmieri

I mistakenly thought that I wasn't getting an error from the patches at #9107, so I made this change and then built the documentation:

  • sage/algebras/steenrod/steenrod_algebra.py

    diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py
    --- a/sage/algebras/steenrod/steenrod_algebra.py
    +++ b/sage/algebras/steenrod/steenrod_algebra.py
    @@ -10,5 +10,7 @@
       the Steenrod algebra using CombinatorialFreeModule; improved the
       test suite.
     
    +Broken: `\aaaaaa`
    +
     This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
     of its properties, and ways to define elements of it.

With the patch here, it hangs after trying to build reference/algebras. I agree that with just the patches at #9107, the hang is no longer present: once reference/categories fails, I get sent back to the shell.

comment:15 Changed 6 years ago by jdemeyer

John: your change still works for me:

! Undefined control sequence.
<recently read> \aaaaaa

l.4077 Broken: $\aaaaaa
                       $
?
! Emergency stop.
<recently read> \aaaaaa

l.4077 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make[1]: *** [algebras.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/algebras'
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
    getattr(get_builder(document), name)(*args, **kwds)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras
make: *** [doc-pdf] Error 1

I correctly get a shell prompt after this.

Please attach your docpdf.log so that I can try to find out what is happening.

comment:16 Changed 6 years ago by jhpalmieri

Sorry, once again I didn't communicate well enough. I've been running ./sage --docbuild reference pdf, which still exhibits the hang. I see now that running make doc-pdf works as you say (so I'm not going to bother attaching docpdf.log).

comment:17 Changed 6 years ago by jdemeyer

  • Reviewers set to John Palmieri

Also ./sage --docbuild reference pdf works for me...

! Undefined control sequence.
<recently read> \aaaaaa 
                        
l.4066 Broken: $\aaaaaa
                       $
? 
! Emergency stop.
<recently read> \aaaaaa 
                        
l.4066 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make: *** [algebras.pdf] Error 1
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras

comment:18 Changed 6 years ago by jhpalmieri

I've seen this repeatably while running ./sage --docbuild reference pdf on two different OS X machines (with two cores, with MAKE='make -j2'). Also, I just tried applying the patches at #9107 and the one here (without my change to steenrod_algebra.py) on sage.math (with MAKE='make -j12'), and it hangs after failing to compile categories.tex. (It finishes the compilations in progress, but then hangs).

Last edited 6 years ago by jhpalmieri (previous) (diff)

comment:19 Changed 6 years ago by jdemeyer

John, I still cannot reproduce your problems. Can you list the exact steps that you followed?

I am doing the following on sage.math:

  1. Extract a Sage 5.9 binary:
jdemeyer@sage:/release$ tar xzf /home/release/sage-5.9/sage-5.9-boxen-x86_64-Linux.tar.gz
jdemeyer@sage:/release$ cd sage-5.9-boxen-x86_64-Linux
  2. Apply the patch:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage --hg -R devel/sage qimport -P http://trac.sagemath.org/sage_trac/raw-attachment/ticket/14626/14626_workaround.patch
adding 14626_workaround.patch to series file
applying 14626_workaround.patch
now at: 14626_workaround.patch
  3. Break LaTeX:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ( cd devel/sage && patch -p1 )
diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py

Index: sage/algebras/steenrod/steenrod_algebra.py
===================================================================
--- a/sage/algebras/steenrod/steenrod_algebra.py
+++ b/sage/algebras/steenrod/steenrod_algebra.py
@@ -10,5 +10,7 @@
   the Steenrod algebra using CombinatorialFreeModule; improved the
   test suite.
 
+Broken: `\aaaaaa`
+
 This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
 of its properties, and ways to define elements of it.
patching file sage/algebras/steenrod/steenrod_algebra.py
Hunk #1 succeeded at 10 with fuzz 1.
  4. Rebuild Sage:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage -b

[...]

  5. Build the PDF reference manual using 12 threads:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ env MAKE="make -j12" ./sage --docbuild reference pdf 2>&1 |tee docpdf.log

[...]

! Undefined control sequence.
<recently read> \aaaaaa

l.3809 Broken: $\aaaaaa
                       $
?
! Emergency stop.
<recently read> \aaaaaa

l.3809 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
]
Adding blank page after the table of contents.
pdfTeX warning (ext4): destination with the same identifier (name{page.i}) has 
been already used, duplicate ignored
<to be read again> 
                   \relax 
l.129 \tableofcontents
                       [1 [28]]pdfTeX warning (ext4): destination with the same iden
tifier (name{page.ii}) has been already used, duplicate ignored
<to be read again> 
                   \relax 
l.129 \tableofcontents
                       [2make: *** [algebras.pdf] Error 1

[...]

Underfull \hbox (badness 10000) in paragraph at lines 2769--2772
[]\T1/ptm/m/n/10 WalshCode - a bi-nary lin-ear $\OT1/cmr/m/n/10 [2[]\OML/cmm/m/
it/10 ; m; \OT1/cmr/m/n/10 2[]]$ \T1/ptm/m/n/10 code re-lated to Hadamard ma-tr
i-ces.
[30][constants] reading sources... [100%] sage/symbolic/constants_c
 [7] [31Traceback (most recent call last):
  File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
]    getattr(get_builder(name), type)()
  File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/release/sage-5.9-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 528, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras

[...]

Output written on arithgroup.pdf (85 pages, 458174 bytes).
Transcript written on arithgroup.log.
  6. I get back to the shell as expected.

comment:20 follow-up: Changed 6 years ago by jhpalmieri

  • Status changed from needs_review to positive_review

Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes. I stupidly didn't think to hit RET to see if I got a shell prompt.

At some point we might want to provide an error message at the very end, which won't get lost amidst the output from parallel processes, but that can go on another ticket.

comment:21 in reply to: ↑ 20 Changed 6 years ago by jdemeyer

  • Merged in set to sage-5.10.beta5
  • Resolution set to fixed
  • Status changed from positive_review to closed

Replying to jhpalmieri:

Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes.

Do you remember the shell command that you ran (in particular, did you use any unusual redirections or piping)? Otherwise I don't see how what you describe can happen.

comment:22 Changed 6 years ago by jhpalmieri

I just logged into sage.math and did

$ cd /scratch/palmieri/sage-5.10.beta4
$ ./sage --docbuild reference pdf

Then I see, in the middle of a lot of output,

[32 [20 <pairing.png, id=620, 416.9979pt x 217.5327pt>
<use pairing.png>]] [68] <use pairing.png> [69 <./pairing.png [33] [21] [34]
Underfull \hbox (badness 10000) in paragraph at lines 2826--2827


[22][35]palmieri@boxen:sage-5.10.beta4$ 
Underfull \hbox (badness 10000) in paragraph at lines 2950--2951

 [23][36] [24]>] [70]
Chapter 11.

palmieri@boxen:sage-5.10.beta4$ is my shell prompt. At the end of the output:

Output written on homology.pdf (117 pages, 651759 bytes).
Transcript written on homology.log.

but no shell prompt because it was already printed earlier. With make doc-pdf, I see a proper error message at the end.

comment:23 Changed 6 years ago by jdemeyer

John: the "output after shell prompt" problem is probably caused by parallel docbuilding: it seems that, if one thread fails, the docbuilder master process exits and the other threads simply continue working...

Not really a bug, just a peculiarity of multiprocessing.Pool I guess.
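
One possible way to keep the error from getting lost, as suggested in comment:20, is to let every task finish, collect the failures, and report them in a single summary at the very end. The sketch below is an assumption about how that could look, not something taken from builder.py: build_all, format_summary, and fake_build are hypothetical names, and a ThreadPool is used only so that the demonstration runs anywhere (multiprocessing.Pool offers the same map interface but needs picklable, top-level functions).

```python
from multiprocessing.pool import ThreadPool


def _safe_call(job):
    """Run one build and return (doc, error text or None) instead of raising."""
    build, doc = job
    try:
        build(doc)
        return doc, None
    except Exception as exc:
        return doc, "%s: %s" % (type(exc).__name__, exc)


def build_all(build, docs, pool):
    """Build every document; return the list of (doc, error) failures."""
    results = pool.map(_safe_call, [(build, doc) for doc in docs])
    return [(doc, err) for doc, err in results if err is not None]


def format_summary(failures):
    """Produce the message to print once all workers have finished."""
    if not failures:
        return "all documents built successfully"
    lines = ["%d document(s) failed to build:" % len(failures)]
    lines += ["  %s: %s" % (doc, err) for doc, err in failures]
    return "\n".join(lines)


def fake_build(doc):
    """Hypothetical builder used for the demonstration: one document fails."""
    if doc == "algebras":
        raise RuntimeError("failed to run $MAKE all-pdf in " + doc)
```

The trade-off is that the build no longer stops at the first failure: all documents are attempted, and the summary is the last thing printed, so it cannot be buried in output from workers that are still running.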
