Opened 4 years ago

Closed 4 years ago

#26579 closed enhancement (fixed)

Fix ecl.pyx doctests with threaded ecl

Reported by: arojas Owned by:
Priority: major Milestone: sage-8.5
Component: distribution Keywords: packaging
Cc: fbissey, gh-timokau, saraedum Merged in:
Authors: Antonio Rojas Reviewers: François Bissey
Report Upstream: N/A Work issues:
Branch: 1f4794c (Commits, GitHub, GitLab) Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by arojas)

Make ecl.pyx tests pass when ecl is built with threads support. Fixes one test suite failure on Arch.

Change History (19)

comment:1 Changed 4 years ago by arojas

  • Branch set to u/arojas/fix_ecl_pyx_doctests_with_threaded_ecl

comment:2 Changed 4 years ago by arojas

  • Authors set to Antonio Rojas
  • Cc fbissey gh-timokau saraedum added
  • Commit set to 1f4794c68c0bbe1b9c33a074e006bbb88119d7a4
  • Component changed from PLEASE CHANGE to distribution
  • Description modified (diff)
  • Keywords packaging added
  • Status changed from new to needs_review
  • Type changed from PLEASE CHANGE to enhancement

New commits:

1f4794cFix doctest with thread-enabled ecl

comment:3 Changed 4 years ago by fbissey

  • Reviewers set to François Bissey
  • Status changed from needs_review to positive_review

Feels like something I have seen before. Anyway the fix in question is extremely minor and is at the level of a typo correction.

comment:4 follow-up: Changed 4 years ago by nbruin

  • Status changed from positive_review to needs_info

At least a while ago, running ecllib in sage with thread support was rather fundamentally not possible: when threads are enabled, ecl will have a separate thread for signal handling (and garbage collection). Furthermore, it will use some very strange signals for syncing processes around garbage collection.

Sage will swap signal handlers when entering/exiting ecllib. That's fundamentally incompatible with a multi-threaded approach.

I think before changing the doctest to just pass, I think you should submit some evidence that these problems do not arise anymore. We could even extend the doctests to more properly test if garbage collection in ecl acts nicely with multithreading in sage now (I have trouble imagining how it could).

comment:5 in reply to: ↑ 4 ; follow-up: Changed 4 years ago by arojas

Replying to nbruin:

I think before changing the doctest to just pass, I think you should submit some evidence that these problems do not arise anymore. We could even extend the doctests to more properly test if garbage collection in ecl acts nicely with multithreading in sage now (I have trouble imagining how it could).

I don't really see what this has to do with this ticket. I'm not proposing to enable threading in sage's ecl or anything of that sort. This patch is just so that distributions that *already* build ecl with threads (something that is not going to change regardless of whether this is merged or not) don't get a bogus test failure that doesn't really provide any value or give any hint about what issues threaded ecl may cause, and only creates noise in the testuite output.

If there are concerns about possible issues with threaded ecl (of which I'm not aware after having packaged Sage for 4 years) they should be addressed elsewhere by adding appropriate tests for the actual issues. Again, I don't understand why they should prevent merging this.

comment:6 Changed 4 years ago by vbraun

  • Branch changed from u/arojas/fix_ecl_pyx_doctests_with_threaded_ecl to 1f4794c68c0bbe1b9c33a074e006bbb88119d7a4
  • Resolution set to fixed
  • Status changed from needs_info to closed

comment:7 Changed 4 years ago by vbraun

  • Commit 1f4794c68c0bbe1b9c33a074e006bbb88119d7a4 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

Its not easy to write tests for signal / multithread interop, things just tend to randomly crash on some systems but not on others. How much of the Sage testsuite passes reliably on Arch?

Last edited 4 years ago by vbraun (previous) (diff)

comment:8 follow-up: Changed 4 years ago by gh-timokau

I agree with nbruin here. If sage doesn't work with an ecl with threading enabled, I'd rather have a doctest that tells me that than having to figure out the underlying reason from other seemingly unrelated test failures.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 4 years ago by arojas

Replying to gh-timokau:

I agree with nbruin here. If sage doesn't work with an ecl with threading enabled, I'd rather have a doctest that tells me that than having to figure out the underlying reason from other seemingly unrelated test failures.

Except that this test doesn't tell you anything, simply that your ecl has been build with different options. I don't see how that's going to help when you get a seemingly unrelated crash somewhere else. So far I've only heard about hypothetical issues that there's no way to test and that no Arch user has ever reported in 4 years, nothing concrete.

Anyway, not going to waste more of my time pushing for this. I'll just keep ignoring the test as I've always done.

comment:10 in reply to: ↑ 5 Changed 4 years ago by nbruin

Replying to arojas:

I don't really see what this has to do with this ticket. I'm not proposing to enable threading in sage's ecl or anything of that sort. This patch is just so that distributions that *already* build ecl with threads (something that is not going to change regardless of whether this is merged or not) don't get a bogus test failure that doesn't really provide any value or give any hint about what issues threaded ecl may cause, and only creates noise in the testuite output.

See #11752 for an actual bug report that led to the disabling of threads. It's 7 years old, so things might have changed, but it was a very concrete issue that crashed sage.

The incompatibility I observed before was so bad that sage should refuse to load ecllib when it has threading enabled. If the situation has improved, it's great. Otherwise, Arch should modify its sage dependencies and link to a libecl-nothreading.so or something. It's unfortunate, but as far as I know, sage will *not* work properly with a threaded libecl.

It would be great if sage could work with a libecl configured with default options (which means threading nowadays -- when sage adopted libecl it didn't have threading), especially because it would mean better integration with distributions like Arch. I'm just sharing my earlier experience that this was (at least at some point) not the case. I expect that the test suite won't exercise ecl enough to reliably trigger the behaviours that would crash sage/ecl-with-threading.

EDIT: In fact, the ticket in question deals exactly with the scenario you are facing: running sage in a way that is more integrated in a larger distribution (Mandriva there). It looked like it turned out to be possible at the time to run a generically compiled ECL, but to turn off the threading with a boot-time option (it looks like the option got renamed and that the test is exactly checking that the option is set appropriately).

Last edited 4 years ago by nbruin (previous) (diff)

comment:11 in reply to: ↑ 9 Changed 4 years ago by gh-timokau

Replying to arojas:

Replying to gh-timokau:

I agree with nbruin here. If sage doesn't work with an ecl with threading enabled, I'd rather have a doctest that tells me that than having to figure out the underlying reason from other seemingly unrelated test failures.

Except that this test doesn't tell you anything, simply that your ecl has been build with different options. I don't see how that's going to help when you get a seemingly unrelated crash somewhere else. So far I've only heard about hypothetical issues that there's no way to test and that no Arch user has ever reported in 4 years, nothing concrete.

Well the first thing I would try is to build ecl with the same options and see if that fixes anything. If there are no actual issues with threading however that is of course different. Would be nice if someone with more experience with the ecl interface could re-evaluate.

comment:12 Changed 4 years ago by nbruin

Good news! I looked up the particular configuration option:

https://common-lisp.net/project/ecl/static/manual/re97.html#table.boot_options

and it's only the signal number that would be used for thread communication in ECL, should multiple threads exist. I don't think we ever should have multiple threads in ECL within sage, because the signal handling would get completely screwed up. However, this option doesn't particularly seem to influence whether we do. I think it might be a bit of a red flag if this option does have a special value (because it might indicate an ECL build that has thread support), but that in itself is not necessarily indicative of problems. Also, with the work on #11752 it seems that just having thread support in ECL built doesn't necessarily trigger problems. So it might be OK to not test this option.

On the other hand, "ECL_OPT_SIGNAL_HANDLING_THREAD" really should be checked to be 0. If it isn't we can really expect trouble.

comment:13 follow-up: Changed 4 years ago by arojas

  • Status changed from new to needs_review

Great, so can this be set back to positive_review?

comment:14 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by fbissey

  • Resolution set to fixed
  • Status changed from needs_review to closed

Replying to arojas:

Great, so can this be set back to positive_review?

While you were having this discussion, this has been merged for the next beta and closed fixed already. Restoring to that state.

comment:15 in reply to: ↑ 14 ; follow-up: Changed 4 years ago by arojas

Replying to fbissey:

Replying to arojas:

Great, so can this be set back to positive_review?

While you were having this discussion, this has been merged for the next beta and closed fixed already. Restoring to that state.

This is not merged in beta2, please reopen.

Last edited 4 years ago by arojas (previous) (diff)

comment:16 in reply to: ↑ 15 Changed 4 years ago by fbissey

  • Resolution fixed deleted
  • Status changed from closed to new

Replying to arojas:

Replying to fbissey:

Replying to arojas:

Great, so can this be set back to positive_review?

While you were having this discussion, this has been merged for the next beta and closed fixed already. Restoring to that state.

This is not merged in beta2.

OK I didn't see it being pulled off again :( putting it back to positive review then.

comment:17 Changed 4 years ago by fbissey

  • Status changed from new to needs_review

comment:18 Changed 4 years ago by fbissey

  • Status changed from needs_review to positive_review

comment:19 Changed 4 years ago by vbraun

  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.