Opened 2 years ago

Last modified 15 months ago

#28800 new defect

PARI is not thread-safe from above

Reported by: embray Owned by:
Priority: major Milestone: sage-wishlist
Component: interfaces Keywords: pari threading parallelism gap
Cc: arojas, jdemeyer, fbissey, dimpase, saraedum, embray, gh-timokau Merged in:
Authors: Reviewers:
Report Upstream: Reported upstream. Developers deny it's a bug. Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by embray)

This is a follow-up to #26608. That ticket specifically discussed the issue of multi-threaded PARI causing Sage's docbuild to break. That problem is worked around in #28356, so I decided to close #26608.

However, the general problem remains, which is that PARI is not *thread-safe from above*, meaning that while threads created and managed by the PARI library itself work fine, threads created in a multi-system environment (like Sage) which happen to use PARI (specifically PARI built with multi-threading support) it will segfault.

This has been discussed in #26608 as well as this discussion on the OpenDreamKit project as well as related in-person conversations for which I unfortunately lack notes.

With #26608 resolved this is fortunately not an immediate problem for Sage, though it would be very easy for someone thinking they can just carelessly use threads (e.g. from the Python level) in their own code and experience similar crashes.

This is also not a problem just of PARI; getting this kind of multi-system multi-level parallelism right is hard, and should require cooperation, and specific guidelines to follow. Although I have not yet hit any other specific examples I have no doubt that the exist; for example I would not be surprised if HPC-GAP has similar problems.

Change History (7)

comment:1 Changed 2 years ago by embray

  • Description modified (diff)

comment:2 Changed 2 years ago by gh-timokau

  • Cc gh-timokau added

comment:3 Changed 2 years ago by dimpase

Macaulay2 does have these problems with Pari, AFAIK. They are thinking of removing Pari from their dependencies all together due to this.

comment:4 Changed 2 years ago by embray

As I think I mentioned in the original discussion, the issue could be mitigated in PARI somewhat, with a few strategically-placed checks to ensure that important thread-local variables have been initialized, and re-initialize them as needed (using more-or-less existing code for re-initializing PARI in its own, self-managed threads).

comment:5 Changed 2 years ago by embray

An alternative approach (although one that would still be made easier with some internal refactoring of PARI*) would be like Python's PyGILState_Ensure(). This places some onus on users of PARI in multi-threaded code to make sure PARI's interpreter state is properly initialized before using it in a new thread, which is not an unfair thing to ask users to do.

* I should clarify what I mean by this. When multi-threading was added to PARI, a number of global variables were simply converted directly to thread-local variables (sometimes it's not clear to me if all of them need to be thread-local; I don't know). By contrast, to use CPython again as an example (since I know it well), all variables that need to be thread specific (e.g. in PARI this would include things like the main stack pointer) are collected into a single PyThreadState struct, which makes it much easier to manage. Each thread has its own threadstate stored in a thread-local variable, so just one variable instead of a whole bunch (meaning only one call to the TSS APIs to get/set it). A similar reorganization of PARI's thread-specific variables would be helpful.

Last edited 2 years ago by embray (previous) (diff)

comment:6 Changed 16 months ago by dimpase

one way around this difficulty is to use spawn, not fork, in multiprocessing, something that is available in Python 3.7 and later. One just calls


somewhere early enough.

Version 0, edited 16 months ago by dimpase (next)

comment:7 Changed 15 months ago by embray

Yeah, setting set_start_method('spawn') globally would wreak havoc, though might be useful in some careful cases. I think this specific issue would be better addressed with improvements to how PARI manages its thread-local state.

Note: See TracTickets for help on using tickets.