Opened 3 years ago

Last modified 2 years ago

#26608 closed defect

Docbuild segfaults when pari is compiled with threading — at Version 13

Reported by: gh-timokau
Owned by:
Priority: major
Milestone: sage-pending
Component: documentation
Keywords: docbuild, pari
Cc: arojas, jdemeyer, fbissey, dimpase, saraedum, embray
Merged in:
Authors:
Reviewers:
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by jdemeyer)

This ticket is a followup to this sage-packaging discussion. To summarize:

Sage does not work with PARI's threading. Instead of relying on PARI being compiled without threading, I used the "nthreads" option to disable threading at runtime in #26002.

However, since #24655 (which unconditionally enabled the threaded docbuild), the docbuild segfaults when PARI is compiled with threading support.

Change History (13)

comment:1 Changed 3 years ago by embray

I think there's a little bit of misinformation / misconception here.

There's nothing about Sage's docbuild program that uses multi-threading. It uses a process pool and builds each sub-document in separate processes.

(There are some cases where it does not run builds in subprocesses when it probably should, and I think that is contributing somewhat to the explosion of memory usage in the docbuild, but that's a separate issue).
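To illustrate the distinction between a process pool and multi-threading, here is a minimal sketch of the pattern described above. It is not Sage's actual docbuild code: build_subdocument and build_all are hypothetical names, the sub-document names are placeholders, and the fork start method is an assumption that only works on POSIX systems.

```python
import multiprocessing
import os


def build_subdocument(name):
    # Hypothetical stand-in for building one sub-document.
    # Returns the name together with the PID of the worker that handled it.
    return name, os.getpid()


def build_all(names, workers=2):
    # Assumption: a fork-based pool (POSIX only). Each worker is a separate
    # process, so no threads are created by the pool itself.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        # map() preserves the order of the input names.
        return pool.map(build_subdocument, names)
```

Calling build_all(["reference/matrices", "reference/geometry"]) returns one (name, pid) pair per sub-document, and every pid differs from the parent's os.getpid(), which shows that the work ran in child processes rather than in threads of the parent.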

comment:2 follow-up: Changed 3 years ago by embray

I don't know if PARI uses openblas in its multi-threaded mode but I wonder if this is related to #26585

comment:3 Changed 3 years ago by arojas

Note that the code excerpts in the last lines of the backtrace are nonsense since I was compiling an older version of the docs. Here's the "translated" version:

  File "sage/misc/cachefunc.pyx", line 1005, in sage.misc.cachefunc.CachedFunction.__call__ (build/cythonized/sage/misc/cachefunc.c:6065)
    w = self.f(*args, **kwds)
  File "/usr/lib/python2.7/site-packages/sage/structure/unique_representation.py", line 1027, in __classcall__
    instance = typecall(cls, *args, **options)
  File "sage/misc/classcall_metaclass.pyx", line 496, in sage.misc.classcall_metaclass.typecall (build/cythonized/sage/misc/classcall_metaclass.c:2148)
    return (<PyTypeObject*>type).tp_call(cls, args, kwds)
  File "/usr/lib/python2.7/site-packages/sage/geometry/triangulation/point_configuration.py", line 367, in __init__
    PointConfiguration_base.__init__(self, points, defined_affine)
  File "sage/geometry/triangulation/base.pyx", line 398, in sage.geometry.triangulation.base.PointConfiguration_base.__init__ (build/cythonized/sage/geometry/triangulation/base.cpp:4135)
    self._init_points(points)
  File "sage/geometry/triangulation/base.pyx", line 456, in sage.geometry.triangulation.base.PointConfiguration_base._init_points (build/cythonized/sage/geometry/triangulation/base.cpp:4982)
    red = matrix([ red.row(i) for i in red.pivot_rows()])
  File "sage/matrix/matrix2.pyx", line 517, in sage.matrix.matrix2.Matrix.pivot_rows (build/cythonized/sage/matrix/matrix2.c:8414)
    v = self.transpose().pivots()
  File "sage/matrix/matrix_integer_dense.pyx", line 2217, in sage.matrix.matrix_integer_dense.Matrix_integer_dense.pivots (build/cythonized/sage/matrix/matrix_integer_dense.c:19162)
    E = self.echelon_form()
  File "sage/matrix/matrix_integer_dense.pyx", line 2019, in sage.matrix.matrix_integer_dense.Matrix_integer_dense.echelon_form (build/cythonized/sage/matrix/matrix_integer_dense.c:17749)
    H_m = self._hnf_pari(flag, include_zero_rows=include_zero_rows)
  File "sage/matrix/matrix_integer_dense.pyx", line 5719, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._hnf_pari (build/cythonized/sage/matrix/matrix_integer_dense.c:46635)
    sig_on()
SignalError: Segmentation fault 

comment:4 Changed 3 years ago by gh-timokau

  • Description modified (diff)

comment:5 in reply to: ↑ 2 ; follow-up: Changed 3 years ago by gh-timokau

Replying to embray:

I don't know if PARI uses openblas in its multi-threaded mode but I wonder if this is related to #26585

I'll test if that openblas patch fixes it. Very interesting ticket, I wonder if that also causes #26130 (I've heard darwin is somewhat more prone to threading bugs).

Last edited 3 years ago by gh-timokau (previous) (diff)

comment:6 follow-up: Changed 3 years ago by embray

I'm not sure if it does, but it might. I grepped the pari/gp source and it doesn't use openblas_set_num_threads directly, but something else might be. I did see some reference to omp_set_num_threads but I don't think we compile with OpenMP by default in Sage.

comment:7 Changed 3 years ago by embray

There's also some multi-threading support in FLINT which could be problematic, but I have no idea if that's relevant in this case.

comment:8 Changed 3 years ago by embray

I read on the mailing list post "It is called indirectly via matplotlib when rendering plots, see full backtrace below (btw, I had to downgrade to an old version of Sage to get a meaningful backtrace - I really dislike this trend of hiding build output, it makes it very hard to debug stuff)"

I had a very similar problem to this; it actually came from the BLAS library by way of a NumPy ufunc (I think for the "dot" product of a matrix and a vector, or of two matrices). I feel like I actually fixed this, but now I can't remember.

comment:9 Changed 3 years ago by embray

Do you know some direct, specific way to reproduce this so that I can try it?

comment:10 in reply to: ↑ 5 Changed 3 years ago by embray

Replying to gh-timokau:

Replying to embray:

I don't know if PARI uses openblas in its multi-threaded mode but I wonder if this is related to #26585

I'll test if that openblas patch fixes it. Very interesting ticket, I wonder if that also causes #26130 (I've heard darwin is somewhat more prone to threading bugs).

I don't think it's related, because this only showed up when we patched fflas-ffpack to allow configuring the number of threads to use with openblas (by default it just sets it to 1). But conceivably there's a similar bug elsewhere. Possibly related to fork(). I have found many bugs in different projects related to threads/fork interaction.
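As a general illustration of why threads and fork() interact badly (not a reproduction of this ticket's segfault), the following sketch starts a background thread, forks, and compares the thread count seen by the parent and by the child. It assumes a POSIX system; count_threads_after_fork is a hypothetical helper name. Only the thread that calls fork() exists in the child, so any library state that assumes its worker threads are still running becomes stale there.

```python
import os
import threading
import warnings


def count_threads_after_fork():
    """Start a thread, fork, and return (parent_count, child_count)."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Newer Python versions warn when forking a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: report how many threads Python still knows about, then exit
        # immediately without running any cleanup handlers.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's report and clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

The parent sees at least two threads (the caller plus the worker), while the child reports only the thread that performed the fork. Native libraries that keep their own thread pools are subject to the same POSIX rule, but they may not notice that their workers are gone.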

comment:11 Changed 3 years ago by embray

You know what, though: I'm looking at the relevant code in openblas, and openblas_set_num_threads(1) might actually cause it to spin up a single thread (beyond the main thread), which would be enough to trigger the bug I fixed. I'm going to try to confirm that, though.

comment:12 in reply to: ↑ 6 Changed 3 years ago by jdemeyer

Replying to embray:

I'm not sure if it does, but it might. I grepped the pari/gp source and it doesn't use openblas_set_num_threads directly, but something else might be.

I don't think that PARI uses BLAS in any way.

comment:13 Changed 3 years ago by jdemeyer

  • Description modified (diff)