Opened 4 years ago

Closed 3 months ago

Last modified 2 months ago

#22191 closed enhancement (fixed)

update ECL to 20.4.24

Reported by: dimpase
Owned by:
Priority: critical
Milestone: sage-9.2
Component: packages: standard
Keywords: upgrade, ecl
Cc: rws, fbissey, jdemeyer, charpent, jpflori, arojas, gh-timokau, saraedum, infinity0, slelievre, thansen, embray, mjo
Merged in:
Authors: Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray
Reviewers: Dima Pasechnik, Nils Bruin, François Bissey, Emmanuel Charpentier, Matthias Koeppe
Report Upstream: Fixed upstream, but not in a stable release.
Work issues:
Branch: f82c716 (Commits)
Commit:
Dependencies:
Stopgaps:

Description (last modified by mkoeppe)

ECL-20.4.24 has been released (on 2020/04/24)

https://common-lisp.net/project/ecl/posts/ECL-20424-release.html

We should update, as it will resolve some Maxima bugs, and is essential for resolving other Maxima bugs.

Tarball: see checksums.ini [upstream_url]. (To configure Sage to download from the upstream URL, use ./configure --enable-download-from-upstream-url)

Attachments (4)

cra.log (191.1 KB) - added by dimpase 4 years ago.
crash gdb log
chkerrs-22191.txt (25.0 KB) - added by charpent 4 months ago.
chkerrs-V2.txt (24.5 KB) - added by charpent 4 months ago.
ptest.log (823.9 KB) - added by dimpase 3 months ago.
cygwin-standard run

Change History (272)

comment:1 Changed 4 years ago by dimpase

  • Branch set to u/dimpase/ecl16.1.3
  • Commit set to 33e59cb6e6d5b6b16f0b23eacc566d613ffccd07
  • Description modified (diff)

New commits:

3ed32fb Upgrade Jupyter packages, add enum34
72993f2 Merge branch 'u/jdemeyer/upgrade_jupyter_packages' of trac.sagemath.org:sage into develop
2a291ab Merge branch 'develop' of trac.sagemath.org:sage into develop
bd5a04f Update source to 2.11.0, fix checksum.
8356446 Merge branch 'u/charpent/upgrade_git_to_2_10_2__or_2_11___' of trac.sagemath.org:sage into newgit
33e59cb update ECL to 16.1.3 - most of our patches are obsolete

comment:2 Changed 4 years ago by dimpase

oops, somewhat wrong branch... testing anyway

comment:3 Changed 4 years ago by git

  • Commit changed from 33e59cb6e6d5b6b16f0b23eacc566d613ffccd07 to eff60815383ca3b3120a7a26e98e1712072c70ac

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

eff6081 update ECL to 16.1.3 - most of our patches are obsolete

comment:4 Changed 4 years ago by dimpase

  • Authors set to Dima Pasechnik
  • Status changed from new to needs_review

comment:5 Changed 4 years ago by fbissey

  • Cc fbissey added

comment:6 Changed 4 years ago by dimpase

The FPE problem on #18920 in comment 99 might be due to ECL 16.1.3 throwing an FPE exception at

> (/ 1.0 0.0)

Condition of type: DIVISION-BY-ZERO
#<a DIVISION-BY-ZERO>

while 16.1.2 happily returns

#.ext::single-float-positive-infinity
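
The difference between the two ECL versions is the difference between a trapped and an untrapped IEEE exception. As a rough analogy only (Python's decimal module, not ECL and not hardware floats), the same switch can be sketched with a per-context trap flag. The helper name divide_one_by_zero is made up for this illustration:

```python
import decimal


def divide_one_by_zero(trap):
    """Compute 1/0 in a decimal context with the DivisionByZero trap on or off."""
    ctx = decimal.Context()
    ctx.traps[decimal.DivisionByZero] = trap
    try:
        return ctx.divide(decimal.Decimal(1), decimal.Decimal(0))
    except decimal.DivisionByZero:
        # Trap enabled: the operation raises, like the 16.1.3 behaviour above.
        return "DIVISION-BY-ZERO"


print(divide_one_by_zero(True))
print(divide_one_by_zero(False))  # trap disabled: an infinity is returned
```

With the trap disabled the exception is only recorded as a flag in the context and the computation carries on with an infinity, which is the behaviour Sage relies on.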

comment:7 Changed 4 years ago by dimpase

With 16.1.3 there is apparently a new --no-trap-fpe option which emulates the old behaviour. So we somehow have to figure out where to put this option.

comment:8 Changed 4 years ago by fbissey

If it is an option for ecl then somewhere in sage/libs/ecl.pyx is the place, probably in ecl_init but I have to dig a bit.

comment:9 follow-up: Changed 4 years ago by fbissey

There is already an option ECL_OPT_TRAP_SIGFPE is it equivalent?

Changed 4 years ago by dimpase

crash gdb log

comment:10 in reply to: ↑ 9 Changed 4 years ago by dimpase

Replying to fbissey:

There is already an option ECL_OPT_TRAP_SIGFPE is it equivalent?

I guess it should be. Perhaps it's something more subtle... I attach the crash dump.

comment:11 Changed 4 years ago by git

  • Commit changed from eff60815383ca3b3120a7a26e98e1712072c70ac to ac8348c462b11a67ee22f77e265afc0ce0572620

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

62b5825 some more easy doctest fixes
3c28022 more easy doctest fixes
5d86413 Merge branch 'public/t18920' of trac.sagemath.org:sage into max5381
100d9d6 workaround for #<a ARITHMETIC-ERROR> (18920#comment:60
79eb82f fix for maxima bug #3236
814e8ed reflect patch's dependence on an earlier maxima tree commit
af7e5ff new scaling of dynatomic polynomial
68da1e4 Merge branch 'public/t18920' of trac.sagemath.org:sage into ecl1613
6bae4b1 bumped up for 5.39.0. Sage builds (with ecl 16.1.3)
ac8348c disable SIGFPE traps

comment:12 Changed 4 years ago by git

  • Commit changed from ac8348c462b11a67ee22f77e265afc0ce0572620 to a06f00245bfb652270eeea78248348777d006538

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

3ed32fb Upgrade Jupyter packages, add enum34
72993f2 Merge branch 'u/jdemeyer/upgrade_jupyter_packages' of trac.sagemath.org:sage into develop
2a291ab Merge branch 'develop' of trac.sagemath.org:sage into develop
6027df9 Merge branch 'develop' of trac.sagemath.org:sage into develop
b9ee72d update ECL to 16.1.3 - most of our patches are obsolete
a06f002 disable SIGFPE traps

comment:13 Changed 4 years ago by git

  • Commit changed from a06f00245bfb652270eeea78248348777d006538 to ad25254d3445265211081b50501af670daf4dade

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

377e0ed update ECL to 16.1.3 - most of our patches are obsolete
ad25254 disable SIGFPE traps

comment:14 Changed 4 years ago by dimpase

  • Status changed from needs_review to needs_work
  • Work issues set to sort out SIGFPE

this is a clean update branch for ECL 16.1.3 - needs work to sort out SIGFPE.

comment:15 follow-up: Changed 4 years ago by nbruin

I'm pretty sure the change in behaviour comes from this check-in on ECL: https://gitlab.com/embeddable-common-lisp/ecl/commit/c2b2941768f39544f45f24e19a30081d316eeb71

Clearly, this code will be setting FPE generation flags in more places than before. Probably ECL doesn't go out of its way to reset FPE flags when ECL is exiting. So probably, after executing ECL code, it is now the case that (often?) the FPE flags are set the way ECL wants; not the way we want. So in addition to restoring the SIG handlers, we should probably reset _controlfp or something similar to values that match sage's normal conditions upon exiting ECL code.

comment:16 Changed 4 years ago by dimpase

Upstream tells me that https://gitlab.com/embeddable-common-lisp/ecl/blob/develop/src/lsp/top.lsp#L1456 makes write-error patch unnecessary. So we should remove it too.

comment:17 in reply to: ↑ 15 Changed 4 years ago by dimpase

  • Cc jdemeyer added

Replying to nbruin:

I'm pretty sure the change in behaviour comes from this check-in on ECL: https://gitlab.com/embeddable-common-lisp/ecl/commit/c2b2941768f39544f45f24e19a30081d316eeb71

Clearly, this code will be setting FPE generation flags in more places than before.

But this ECL macro is not about the case at hand, as I believe we do have fenv.h available. Or do you mean that Sage does not use fenv.h?

Probably ECL doesn't go out of its way to reset FPE flags when ECL is exiting. So probably, after executing ECL code, it is now the case that (often?) the FPE flags are set the way ECL wants; not the way we want. So in addition to restoring the SIG handlers, we should probably sprinkle an appropriate amount of fesetenv calls around the ECL entry and exit wrappers.

this does not explain why simply setting ECL_OPT_TRAP_SIGFPE to 0 solves the issue for division by 0, but not for FP overflow.

However, I don't understand why doing anything with ECL_OPT_TRAP_SIGFPE has any effect whatsoever (assuming we do not preserve it among other ECL's interrupts such as SIGINT). Shouldn't

    #and put the Sage signal handlers back
    for i in range(1,32):
        sigaction(i, &sage_action[i], NULL)

in ecl.pyx be overriding whatever ECL did with SIGFPE?
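
For readers unfamiliar with the save/restore pattern that loop implements, here is a minimal sketch in plain Python using the standard signal module. It is not the code in ecl.pyx; the helper name swapped_handlers is hypothetical, and it deals only with handlers, not with the floating point environment:

```python
import contextlib
import signal


@contextlib.contextmanager
def swapped_handlers(temporary):
    """Install the handlers in `temporary`, then put the previous ones back."""
    saved = {}
    try:
        for signum, handler in temporary.items():
            # signal.signal returns the handler that was installed before.
            saved[signum] = signal.signal(signum, handler)
        yield
    finally:
        for signum, handler in saved.items():
            # None means the old handler was not installed from Python,
            # so it cannot be restored through this API.
            if handler is not None:
                signal.signal(signum, handler)


previous = signal.getsignal(signal.SIGFPE)
with swapped_handlers({signal.SIGFPE: signal.SIG_IGN}):
    inside = signal.getsignal(signal.SIGFPE)
restored = signal.getsignal(signal.SIGFPE)
```

Restoring handlers this way decides only what happens once SIGFPE arrives. It has no influence on whether a floating point operation raises the signal in the first place.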

comment:18 Changed 4 years ago by dimpase

The experimental branch using ECL's SIGFPE handler is on git trac: u/dimpase/ecl16.1.3exp. So you can have a look --- it also doesn't work.

comment:19 follow-up: Changed 4 years ago by nbruin

I am not very familiar with this kind of coding issue and/or with SIG handling in general.

If we're doing our sig handler setting properly then it shouldn't matter what sig handlers ECL installs when we run non-ECL code, because we switch them out. As far as I can see this is the case. I see no evidence that an ECL sig handler is interfering with our work. The trace-back doesn't mention ECL code at all. Just an uncaught signal.

It seems that the sage action for SIGFPE is to coredump. That's what we're seeing.

There are flags that govern when SIGFPE is raised (rather than propagating a NaN or something similar). The ECL macro I pointed at will be inserted in ECL generated code quite a bit, so I expect that macro to be executed ALL THE TIME. Plus, the patch was made exactly to fix an issue of unraised SIGFPEs (CL apparently prefers the exceptions over NaNs? by default).

We are not taking measures to set fenv to our liking after executing ECL, and one change in the new ECL version is certainly to change fenv more often. So I think it's worth a try to save & restore Sage's own fenv across ECL calls.

It could be that ECL changes its FPE raise mask depending on the value of ECL_OPT_TRAP_SIGFPE. I haven't checked. That would be consistent with observed behaviour.

comment:20 in reply to: ↑ 19 ; follow-ups: Changed 4 years ago by dimpase

Replying to nbruin:

I am not very familiar with this kind of coding issue and/or with SIG handling in general.

If we're doing our sig handler setting properly then it shouldn't matter what sig handlers ECL installs when we run non-ECL code, because we switch them out. As far as I can see this is the case. I see no evidence that an ECL sig handler is interfering with our work. The trace-back doesn't mention ECL code at all. Just an uncaught signal.

But why do we get an uncaught FP overflow signal? That's because booting ECL up destroys Sage's sig handler, even if we switch ECL's SIGFPE handler off. How exactly this is possible is not clear to me.

It seems that the sage action for SIGFPE is to coredump. That's what we're seeing.

No, it is not. Sage's action on FP overflow is to return infinity. Why does ECL continue to mess around with it, even if told not to? In fact, ECL 16.1.3 has a new command line switch, --no-trap-fpe, and it works (i.e. you get ECL's infinity if you try to convert a too-big number to float when this switch is on, and you get an interrupt when the switch is off).

There are flags that govern when SIGFPE is raised (rather than propagating a NaN or something similar). The ECL macro I pointed at will be inserted in ECL generated code quite a bit, so I expect that macro to be executed ALL THE TIME. Plus, the patch was made exactly to fix an issue of unraised SIGFPEs (CL apparently prefers the exceptions over NaNs? by default).

We are not taking measures to set fenv to our liking after executing ECL, and one change in the new ECL version is certainly to change fenv more often. So I think it's worth a try to save & restore Sage's own fenv across ECL calls.

It could be that ECL changes its FPE raise mask depending on the value of ECL_OPT_TRAP_SIGFPE. I haven't checked. That would be consistent with observed behaviour.

comment:21 Changed 4 years ago by nbruin

  • Branch changed from u/dimpase/ecl16.1.3 to u/nbruin/ecl16.1.3

comment:22 in reply to: ↑ 20 Changed 4 years ago by nbruin

  • Commit changed from ad25254d3445265211081b50501af670daf4dade to a2cadc04eb91409ad1f88b5f4d6ad80d5dfc3451

Replying to dimpase:

In fact, ECL 16.1.3 has a new command line switch, --no-trap-fpe, and it works.

The corresponding ECL command is (si::trap-fpe t nil), so we could just pass that to ECL just after cl_boot. That might be easier/more efficient than passing it in the argv.

This should put ECL in agreement with us about what the fenv FPE raise flags are supposed to be. It then also doesn't matter what SIGFPE handler ECL uses, because the signals will not occur, so we don't need to restore ECL's SIGFPE handler for ECL code either.

comment:23 follow-up: Changed 4 years ago by git

  • Commit changed from a2cadc04eb91409ad1f88b5f4d6ad80d5dfc3451 to ea11dd53248462bf2859fb3cbe34bfd47f006682

Branch pushed to git repo; I updated commit sha1. New commits:

ea11dd5 Restore "fenv" upon entry and exit of ECL code

comment:24 in reply to: ↑ 23 Changed 4 years ago by nbruin

Replying to git:

Branch pushed to git repo; I updated commit sha1. New commits:

ea11dd5 Restore "fenv" upon entry and exit of ECL code

OK, committing the change before pushing is probably helpful.

comment:25 Changed 4 years ago by git

  • Commit changed from ea11dd53248462bf2859fb3cbe34bfd47f006682 to 35e9d3d1610bade7ae7d5fac735244be25e009eb

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

377e0ed update ECL to 16.1.3 - most of our patches are obsolete
35e9d3d set ECL to not trap floating point exceptions, so that SIGFPE will not occur

comment:26 Changed 4 years ago by nbruin

OK, that should do the trick! I guess ECL_OPT_TRAP_SIGFPE doesn't have much to do with what we need because it's supposed to control what happens with SIGFPE if ECL is catching signals, whereas we're interested in preventing FP exceptions from generating signals at all. The routine si::trap-fpe is exactly the one to set that behaviour, in a way that is maintained in ECL's environment tracking as well.

comment:27 follow-up: Changed 4 years ago by charpent

  • Cc charpent added

Just as a memento for future testing :

With the current ECL Maxima 5.39.0 gives :

(%i3) taylor(integrate(f(x),x),x,x_0,2); 

taylor: unable to expand at a point specified in:
'integrate(f(x),x)
 -- an error. To debug this try: debugmode(true);

whereas :

(%i2) taylor(integrate(f(x),x),x,x_0,2);

(%o2) 'at('integrate(f(x),x),x = x_0)+f(x_0)*(x-x_0)
                                     +(('at('diff(f(x),x,1),x = x_0))
                                      *(x-x_0)^2)
                                      /2

is expected since Maxima 5.38.1 at the latest...

comment:28 Changed 4 years ago by nbruin

DISCLAIMER: I just found out that all my testing was on 16.1.2. While I expect that my fix should work, I haven't actually successfully tested this. Other people who do have easy access to 16.1.3 can verify this easily.

comment:29 in reply to: ↑ 20 Changed 4 years ago by dimpase

OK, this appears to work. make ptest with this ECL 16.1.3 branch and Maxima from #18920 passes without any errors.

comment:30 Changed 4 years ago by dimpase

  • Report Upstream changed from N/A to Reported upstream. No feedback yet.

Daniel Kochmański (ECL lead dev) invited me to open an issue #347 on this.

comment:31 in reply to: ↑ 27 Changed 4 years ago by dimpase

Replying to charpent:

Just as a memento for future testing :

With the current ECL Maxima 5.39.0 gives :

(%i3) taylor(integrate(f(x),x),x,x_0,2); 

taylor: unable to expand at a point specified in:
'integrate(f(x),x)
 -- an error. To debug this try: debugmode(true);

this unfortunately stays the same with ECL 16.1.3. This is for Maxima/ECL people to fix, not for us.

comment:32 follow-ups: Changed 4 years ago by jdemeyer

It would be really good to explain on this ticket what the SIGFPE issue is all about.

comment:33 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:34 in reply to: ↑ 32 Changed 4 years ago by dimpase

Replying to jdemeyer:

It would be really good to explain on this ticket what the SIGFPE issue is all about.

examples can be found on #18920. More specifically, in Sage Maxima built with ECL 16.1.3 and unpatched libs/ecl.pyx gives crash dumps like

sage: g(x)=taylor(log(x),x,1,6) # this initialises ECL/Maxima
sage: q = (2^53) * 2^971/1
sage: float(q) # boom, instead of "inf" as the answer.
...
Unhandled SIGFPE: An unhandled floating point exception occurred.
...

and

sage: g(x)=taylor(log(x),x,1,6); g
x |--> -1/6*(x - 1)^6 + 1/5*(x - 1)^5 - 1/4*(x - 1)^4 + 1/3*(x - 1)^3 - 1/2*(x - 1)^2 + x - 1
sage: plot(log(x),(x,0,2))
...
Unhandled SIGFPE: An unhandled floating point exception occurred.
...
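
The value of q in the first example is 2^1024, the smallest power of two beyond the double range. A plain-Python check (not a Sage session, and independent of ECL) of what non-trapping IEEE arithmetic does with it:

```python
import math
import sys

# Largest finite double: (2^53 - 1) * 2^971.
largest = float(2**53 - 1) * 2.0**971

# One step further, 2^53 * 2^971 = 2^1024, overflows. Python float
# multiplication does not trap, so the result is infinity.
overflowed = 2.0**53 * 2.0**971

print(largest == sys.float_info.max)
print(math.isinf(overflowed))
```

With the overflow trap enabled in the floating point environment, the same multiplication in C code would raise SIGFPE instead of producing an infinity.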

comment:35 in reply to: ↑ 32 ; follow-up: Changed 4 years ago by nbruin

The mechanics of why SIGFPE is a problem with ECL 16.1.3 by default:

  • Apparently it is desirable in CL to raise floating point exceptions rather than propagate Inf and NaN by default (CL is a formally specified language, so it probably states that FP exceptions should be handled that way)
  • This is most easily done by using fenv.h functionality to configure floating point arithmetic to raise SIGFPE when these things arise. So, ECL does that, and has a SIGFPE handler in place.
  • We switch the ECL sighandlers in/out whenever we're entering/exiting significant ECL code. Sage is not expecting floating point exceptions to raise SIGFPE, so sage does not have a SIGFPE handler.
  • In order to fix a bug elsewhere in ECL, https://gitlab.com/embeddable-common-lisp/ecl/commit/c2b2941768f39544f45f24e19a30081d316eeb71 changed ECL to much more often update the fenv to reflect its desired state (previously, they were not doing this at all). ECL is not actually written to be called as a library, so it doesn't take any precautions to set/restore global state such as fenv upon entry/exit. Hence, after running ECL in its default configuration, you're liable to end up with an fenv in which SIGFPE can be raised. This is what was happening.

For solutions:

  • it's possible to tell ECL to NOT let floating point exceptions raise SIGFPE. Then ECL's SIGFPE handler is irrelevant, and the fenv state ECL wants is basically consistent with what we want. This is the solution currently used.
  • alternatively, we could let ECL do what it wants and set/restore fenv upon entry/exit of ECL code. fesetenv and fegetenv are the relevant routines from fenv.h. In addition, though, we should restore ECL's SIGFPE handler during ECL code execution, if we go that route (currently we don't).
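
The second option, saving and restoring fenv around a call, can be sketched from Python with ctypes. This is only an illustration under several assumptions: a POSIX system whose C library exports fegetenv and fesetenv, a 256-byte buffer being large enough to hold fenv_t, and a made-up helper name preserved_fenv. It is not what ecl.pyx does, and it leaves the SIGFPE handler untouched:

```python
import contextlib
import ctypes
import ctypes.util

# Load libm; if it cannot be located, fall back to the symbols already
# loaded into the process (CDLL(None) on POSIX).
_libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Assumption: comfortably larger than sizeof(fenv_t) on common platforms.
_FENV_BUFFER_SIZE = 256


@contextlib.contextmanager
def preserved_fenv():
    """Save the floating point environment on entry, restore it on exit."""
    saved = ctypes.create_string_buffer(_FENV_BUFFER_SIZE)
    if _libm.fegetenv(ctypes.byref(saved)) != 0:
        raise OSError("fegetenv failed")
    try:
        yield
    finally:
        if _libm.fesetenv(ctypes.byref(saved)) != 0:
            raise OSError("fesetenv failed")


with preserved_fenv():
    # Stand-in for a call into a library that may change the trap masks.
    result = 1.0 + 2.0
```

Whatever the wrapped code does to the trap masks or rounding mode is undone when the block exits, at the cost of two extra library calls per entry.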

This was one of these issues that's really easy to resolve, once you understand the problem.

The main confusion is that there are two components: whether floating point exceptions should raise SIGFPE, and how SIGFPE gets handled once it's raised. You might think these two components are controlled via the same mechanism, but they are not. It looks like ECL might be investigating if its configuration can be streamlined on these two issues: https://gitlab.com/embeddable-common-lisp/ecl/issues/347

Currently there are two options that seem relevant: ECL_OPT_TRAP_SIGFPE and --no-trap-fpe that feed into different places and control different aspects.

comment:36 in reply to: ↑ 35 ; follow-up: Changed 4 years ago by jdemeyer

Replying to nbruin:

This was one of these issues that's really easy to resolve, once you understand the problem.

So the issue is solved now?

comment:37 in reply to: ↑ 36 ; follow-up: Changed 4 years ago by dimpase

Replying to jdemeyer:

Replying to nbruin:

This was one of these issues that's really easy to resolve, once you understand the problem.

So the issue is solved now?

it is desirable to tell ECL to NOT let floating point exceptions raise SIGFPE before booting it up, not during the bootup, for this leaves a time window where SIGFPE can be raised. How can this be done?

comment:38 in reply to: ↑ 37 Changed 4 years ago by nbruin

Replying to dimpase:

it is desirable to tell ECL to NOT let floating point exceptions raise SIGFPE before booting it up, not during the bootup, for this leaves a time window where SIGFPE can be raised. How can this be done?

I think this is an academic issue, because in our current setting, no fp exceptions will occur in the window, and hence the way SIGFPE is handled is irrelevant. Anyway:

I assume ECL installs its SIGFPE handler before it executes any FP instructions and/or sets the fenv. ECL's SIGFPE handler greatly softens the effect of SIGFPE. So if (si::trap-fpe T NIL) is executed right after cl_boot, before switching the SIG handlers, I think the risk is smaller (I don't think there is a risk at all because no FP work occurs in this process, so it's hard to make an accurate estimate of the relative risks here).

Alternatively, you could trust that ECL knows what it's doing and execute cl_boot with an argv that contains --no-trap-fpe. With any luck, that should just get processed in the same way as command line options upon startup. It is then ECL's responsibility to do this before a SIGFPE actually occurs. In reality, this is essentially the same as just feeding ECL (si::trap-fpe T NIL) just after boot.

fenv is supposed to be thread-specific, so concurrency issues should not come into play, even if sage were ever to grow multiple threads.

comment:39 Changed 4 years ago by nbruin

  • Status changed from needs_work to needs_review
  • Work issues sort out SIGFPE deleted

Needs review?

comment:40 follow-up: Changed 4 years ago by dimpase

There is now https://gitlab.com/embeddable-common-lisp/ecl/issues/349 and there is also still the write_error patch to remove.

comment:41 follow-up: Changed 4 years ago by jdemeyer

  • Status changed from needs_review to needs_work

You should document the change to src/sage/libs/ecl.pyx. At a minimum, there should be a link to this ticket.

comment:42 in reply to: ↑ 41 Changed 4 years ago by dimpase

Replying to jdemeyer:

You should document the change to src/sage/libs/ecl.pyx. At a minimum, there should be a link to this ticket.

Sure. I didn't set it up for review, Nils did. IMHO upstream is still working on this.

comment:43 Changed 3 years ago by jpflori

  • Cc jpflori added

comment:44 in reply to: ↑ 40 Changed 3 years ago by nbruin

Replying to dimpase:

There is now https://gitlab.com/embeddable-common-lisp/ecl/issues/349 and there is also still the write_error patch to remove.

That ticket has now been closed and its fix was to change a test.

No movement on https://gitlab.com/embeddable-common-lisp/ecl/issues/347, so if updating ECL is relevant then doing so with the (si::trap-fpe T NIL) given here is probably a good way to go.

comment:45 Changed 2 years ago by thansen

  • Description modified (diff)
  • Summary changed from update ECL to 6.1.3 to update ECL to 16.1.3

comment:46 Changed 2 years ago by thansen

Any chance this could be finished soon? Some of the ECL maintainers in Debian would like to update the ECL package.

comment:47 follow-up: Changed 2 years ago by fbissey

If I am being honest we want a brand new release of ECL. It won't fix everything but we may be in a better position.

comment:48 in reply to: ↑ 47 Changed 2 years ago by dimpase

Replying to fbissey:

If I am being honest we want a brand new release of ECL. It won't fix everything but we may be in a better position.

I second this. 16.1.3 introduced tricky changes, and the upcoming release (they wanted to get something in late 2017, but OK...) is going again to be much different in the way exceptions are handled.

comment:49 Changed 2 years ago by gh-antonio-rojas

  • Cc gh-antonio-rojas added

comment:50 Changed 2 years ago by arojas

  • Cc arojas added; gh-antonio-rojas removed

comment:51 Changed 2 years ago by gh-timokau

  • Cc gh-timokau added

comment:52 Changed 2 years ago by saraedum

  • Cc saraedum added

comment:53 Changed 2 years ago by gh-timokau

In case we're really waiting for the next ecl release, here is the gitlab milestone and the rc testing issue (which was opened a year ago and got practically no attention).

comment:54 follow-up: Changed 22 months ago by slelievre

  • Keywords upgrade ecl added
  • Milestone changed from sage-7.6 to sage-8.6

Does it help with this ticket that upstream issue 347 was closed a month ago:

comment:55 Changed 22 months ago by slelievre

  • Report Upstream changed from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.

Note: debian-science-sagemath discussion about upgrading to ECL 16.1.3:

comment:56 Changed 22 months ago by slelievre

  • Cc infinity0 slelievre thansen added

comment:57 in reply to: ↑ 54 ; follow-up: Changed 22 months ago by fbissey

Replying to slelievre:

Does it help with this ticket that upstream issue 347 was closed a month ago:

It is unclear why it was closed. Possibly because no one cared. As far as it goes, some of the fixes we need would be hard to rebase on top of 16.1.3 - we may just as well try a snapshot from git with all the uncertainty that it entails.

comment:58 in reply to: ↑ 57 ; follow-up: Changed 22 months ago by arojas

Replying to fbissey:

It is unclear why it was closed. Possibly because no one cared. As far as it goes, some of the fixes we need would be hard to rebase on top of 16.1.3 - we may just as well try a snapshot from git with all the uncertainty that it entails.

If I'm reading it correctly it was closed by https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59

Backporting that commit to 16.1.3 still doesn't fix the SIGFPE crashes on Arch unfortunately.

comment:59 follow-up: Changed 22 months ago by fbissey

Oh right. I am obviously not used to gitlab UI for those. Did you also manage to backport the fix for https://gitlab.com/embeddable-common-lisp/ecl/issues/317 ?

comment:60 in reply to: ↑ 59 Changed 22 months ago by arojas

Replying to fbissey:

Oh right. I am obviously not used to gitlab UI for those. Did you also manage to backport the fix for https://gitlab.com/embeddable-common-lisp/ecl/issues/317 ?

I did now, still no improvement.

comment:61 in reply to: ↑ 58 ; follow-up: Changed 22 months ago by gh-spaghettisalat

Replying to arojas:

If I'm reading it correctly it was closed by https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59

Backporting that commit to 16.1.3 still doesn't fix the SIGFPE crashes on Arch unfortunately.

As the ECL developer that made this change: Did you also set the ECL_OPT_TRAP_SIGFPE option to false, i.e. put an extra line ecl_set_option(ECL_OPT_TRAP_SIGFPE, 0); into the ecl.pyx file before testing this? Because otherwise if you use the default value of true, you still instruct ECL to generate and catch floating point exceptions.

comment:62 in reply to: ↑ 61 ; follow-ups: Changed 22 months ago by arojas

Replying to gh-spaghettisalat:

Replying to arojas:

If I'm reading it correctly it was closed by https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59

Backporting that commit to 16.1.3 still doesn't fix the SIGFPE crashes on Arch unfortunately.

As the ECL developer that made this change: Did you also set the ECL_OPT_TRAP_SIGFPE option to false, i.e. put an extra line ecl_set_option(ECL_OPT_TRAP_SIGFPE, 0); into the ecl.pyx file before testing this? Because otherwise if you use the default value of true, you still instruct ECL to generate and catch floating point exceptions.

Yes, still no difference

comment:63 in reply to: ↑ 62 ; follow-up: Changed 22 months ago by fbissey

Replying to arojas:

Replying to gh-spaghettisalat:

Replying to arojas:

If I'm reading it correctly it was closed by https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59

Backporting that commit to 16.1.3 still doesn't fix the SIGFPE crashes on Arch unfortunately.

As the ECL developer that made this change: Did you also set the ECL_OPT_TRAP_SIGFPE option to false, i.e. put an extra line ecl_set_option(ECL_OPT_TRAP_SIGFPE, 0); into the ecl.pyx file before testing this? Because otherwise if you use the default value of true, you still instruct ECL to generate and catch floating point exceptions.

Yes, still no difference

Is the patch set you are working with available somewhere? I would like to try what you have done myself. Maybe more eyeballs can detect an issue that escaped you.

comment:64 in reply to: ↑ 63 ; follow-up: Changed 22 months ago by arojas

Replying to fbissey:

Is the patch set you are working with available somewhere? I would like to try what you have done myself. Maybe more eyeballs can detect an issue that escaped you.

Here it is: http://pkgbuild.com/~arojas/sagemath-ecl-sigfpe.patch

ecl version is 16.1.3 with 2a9084b1 and c2b29417 backported

comment:65 in reply to: ↑ 64 Changed 22 months ago by fbissey

Replying to arojas:

Replying to fbissey:

Is the patch set you are working with available somewhere? I would like to try what you have done myself. Maybe more eyeballs can detect an issue that escaped you.

Here it is: http://pkgbuild.com/~arojas/sagemath-ecl-sigfpe.patch

ecl version is 16.1.3 with 2a9084b1 and c2b29417 backported

I was thinking of the backport patches as well actually.

comment:67 Changed 22 months ago by fbissey

Thank you. I must say my memory of that conversation - started more than a year ago - was making the backport patches way more complicated.

comment:68 Changed 21 months ago by arojas

  • Cc embray added

comment:69 follow-up: Changed 21 months ago by embray

Is this something we should also try to get done for 8.6, along with the GAP 4.10 upgrade? ISTM that would be best from the Debian perspective.

comment:70 in reply to: ↑ 69 Changed 21 months ago by fbissey

Replying to embray:

Is this something we should also try to get done for 8.6, along with the GAP 4.10 upgrade? ISTM that would be best from the Debian perspective.

ecl 16.1.3 has been out for a while. Us in the distro world have been stumped for a bit. If you can move it, it would be very much appreciated. Everyone has their own timetable and that would be more interesting to me in Gentoo than gap 4.10 (because I am literally in charge of gap amongst other things - and don't get me wrong getting gap 4.10 is important especially getting rid of libgap).

comment:71 Changed 21 months ago by dimpase

ECL has not done a new release in years. We were hoping their next release would address some of the problems with signals and exceptions we found in 16.1.3 - something that needed tricky work on the Sage side, only to be thrown away, as ECL is going to change in this area a lot...

comment:72 Changed 21 months ago by slelievre

Note: the ECL release manager tells me the current estimate for the ECL 16.2.0 release date is "first quarter of 2019".

comment:73 Changed 21 months ago by dimpase

So they should have a beta soon, no?

comment:74 Changed 21 months ago by gh-timokau

Well they have been "release candidate testing" without any activity for a year now (https://gitlab.com/embeddable-common-lisp/ecl/issues/333). I wouldn't bet on the Q1 2019 release.

comment:75 in reply to: ↑ 62 ; follow-ups: Changed 21 months ago by gh-spaghettisalat

Replying to arojas:

Replying to gh-spaghettisalat:

Replying to arojas:

If I'm reading it correctly it was closed by https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59

Backporting that commit to 16.1.3 still doesn't fix the SIGFPE crashes on Arch unfortunately.

As the ECL developer that made this change: Did you also set the ECL_OPT_TRAP_SIGFPE option to false, i.e. put an extra line ecl_set_option(ECL_OPT_TRAP_SIGFPE, 0); into the ecl.pyx file before testing this? Because otherwise if you use the default value of true, you still instruct ECL to generate and catch floating point exceptions.

Yes, still no difference

Then I'm guessing the branch at https://git.sagemath.org/sage.git/log/?h=u/nbruin/ecl16.1.3 also failed for you? Because all that the patch in https://gitlab.com/embeddable-common-lisp/ecl/commit/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59 does is call (si::trap-fpe t nil) during startup if the ECL_OPT_TRAP_SIGFPE option is false.

Anyway, I think the real issue is that maxima enables floating point overflow exceptions during startup (see the last line in the src/ecl-port.lisp file of the maxima source directory). Removing this line should fix the observed crashes also with an unpatched ECL 16.1.3 and without the need to explicitly call (si::trap-fpe t nil) in ecl.pyx (as long as the ECL_OPT_TRAP_SIGFPE option is false of course). This is also consistent with the observation in https://gitlab.com/embeddable-common-lisp/ecl/issues/317 that floating point overflows behaved differently than the other floating point exceptions.

comment:76 in reply to: ↑ 75 Changed 21 months ago by arojas

Replying to gh-spaghettisalat:

Then I'm guessing, the branch at https://git.sagemath.org/sage.git/log/?h=u/nbruin/ecl16.1.3 also failed for you?

You're right, the branch never worked for me

Anyway, I think the real issue is that maxima enables floating point overflow exceptions during startup (see the last line in the src/ecl-port.lisp file of the maxima source directory). Removing this line should fix the observed crashes also with an unpatched ECL 16.1.3 and without the need to explicitly call (si::trap-fpe t nil) in ecl.pyx (as long as the ECL_OPT_TRAP_SIGFPE option is false of course).

And you're right again. Removed that line in maxima and there are no more crashes.

comment:77 in reply to: ↑ 75 ; follow-up: Changed 21 months ago by nbruin

Replying to gh-spaghettisalat:

Anyway, I think the real issue is that maxima enables floating point overflow exceptions during startup (see the last line in the src/ecl-port.lisp file of the maxima source directory). Removing this line should fix the observed crashes also with an unpatched ECL 16.1.3 and without the need to explicitly call (si::trap-fpe t nil) in ecl.pyx (as long as the ECL_OPT_TRAP_SIGFPE option is false of course).

In that case, I suspect that for us the right place to fix it is to revert the effect of the line in ecl-port.lisp after initialization in maxima_lib. Then we don't have to patch maxima. Also, maxima may have a good reason to do this under normal operation, and sagemath does build maxima both for stand-alone use and for library use (and it uses the same fas for both!). Unless we have a good reason, we should probably not modify the stand-alone behaviour.

The relevant file would be sage/interfaces/maxima_lib.py, somewhere after the (require 'maxima). It already has a lot of lisp commands that go digging in the internals of maxima. The short window where maxima turns on fpe exceptions and we turn them off again shouldn't be a problem.

comment:78 follow-up: Changed 21 months ago by arojas

With this [1] patch in sagemath it no longer crashes for me, with unmodified ecl and maxima

[1] https://git.archlinux.org/svntogit/community.git/tree/trunk/sagemath-ecl-sigfpe.patch?h=packages/sagemath

comment:79 in reply to: ↑ 77 Changed 21 months ago by embray

Replying to nbruin:

Replying to gh-spaghettisalat:

Anyway, I think the real issue is that maxima enables floating point overflow exceptions during startup (see the last line in the src/ecl-port.lisp file of the maxima source directory). Removing this line should fix the observed crashes also with an unpatched ECL 16.1.3 and without the need to explicitly call (si::trap-fpe t nil) in ecl.pyx (as long as the ECL_OPT_TRAP_SIGFPE option is false of course).

In that case, I suspect that for us the right place to fix it is to revert the effect of the line in ecl-port.lisp after initialization in maxima_lib. Then we don't have to patch maxima. Also, maxima may have a good reason to do this under normal operation, and sagemath does build maxima both for stand-alone use and for library use (and it uses the same fas for both!). Unless we have a good reason, we should probably not modify the stand-alone behaviour.

The relevant file would be sage/interfaces/maxima_lib.py, somewhere after the (require 'maxima). It already has a lot of lisp commands that go digging in the internals of maxima. The short window where maxima turns on fpe exceptions and we turn them off again shouldn't be a problem.

+1 Sage should not depend on non-standard functionality supplied via patch in order to work.

comment:80 Changed 21 months ago by embray

Of the patches we still include for our ECL build, are there any that haven't been submitted upstream, or at least that still need upstream issues?

I know that I got flisten-bug.patch fixed here: https://gitlab.com/embeddable-common-lisp/ecl/merge_requests/51 so that patch can probably be backported to 16.1.3 if need be (and in any case, the bug it fixes was only, to my knowledge, affecting Cygwin, so I don't care about it for Linux).

Last edited 21 months ago by embray

comment:81 Changed 21 months ago by embray

Also, are we still doing the tarball patching with spkg-src? I did not do that, so I'm guessing it's obsolete.

comment:82 follow-up: Changed 21 months ago by embray

The difficulty/problem here is that ECL's si_trap_fpe still unconditionally (when available) calls feenableexcept. The effect of (si::trap-fpe t nil) is effectively to call feenableexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID) which means now SIGFPEs will be sent to the program that weren't otherwise.

The only effect of setting ECL_OPT_TRAP_SIGFPE = 0 is that ECL's signal handler won't be installed (actually this setting could be avoided entirely by saving/restoring ECL's SIGFPE handler like we do some of its other signal handlers).

However, (si::trap-fpe t nil) still results now in unhandled SIGFPEs in random places where previously floating point exceptions were just ignored, or handled manually.
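For reference, the handler save/restore idea mentioned above could look roughly like the following C sketch. The names guest_sigfpe, host_sigfpe and the three functions are hypothetical; they only mirror the pattern eclsig.h already uses for SIGINT, SIGBUS and SIGSEGV, and are not code from Sage or ECL.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical storage: the SIGFPE disposition installed by the embedded
   library (guest) and the one belonging to the embedding program (host). */
static struct sigaction guest_sigfpe;
static struct sigaction host_sigfpe;

/* Call once, right after the library has booted and installed its handler:
   records whatever SIGFPE disposition is currently active. */
int capture_guest_sigfpe(void)
{
    return sigaction(SIGFPE, NULL, &guest_sigfpe);
}

/* Before calling into the library: install its handler, remember ours. */
int enter_guest(void)
{
    return sigaction(SIGFPE, &guest_sigfpe, &host_sigfpe);
}

/* After returning from the library: put our own handler back. */
int leave_guest(void)
{
    return sigaction(SIGFPE, &host_sigfpe, NULL);
}
```

This only swaps the signal disposition; it says nothing about which floating point exceptions are unmasked, which is a separate piece of state.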

comment:83 in reply to: ↑ 82 ; follow-up: Changed 21 months ago by gh-spaghettisalat

Replying to embray:

The difficulty/problem here is that ECL's si_trap_fpe still unconditionally (when available) calls feenableexcept. The effect of (si::trap-fpe t nil) is effectively to call feenableexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID) which means now SIGFPEs will be sent to the program that weren't otherwise.

No, the effect of (si::trap-fpe t nil) is to call feenableexcept(0). What you're describing happens with (si::trap-fpe t t).

The only effect of setting ECL_OPT_TRAP_SIGFPE = 0 is that ECL's signal handler won't be installed (actually this setting could be avoided entirely by saving/restoring ECL's SIGFPE handler like we do some of its other signal handlers).

The effect of ECL_OPT_TRAP_SIGFPE = 0 in ECL 16.1.3 is a) that no signal handler for SIGFPE exceptions is installed and b) that in contrast to ECL_OPT_TRAP_SIGFPE = 1 no floating point exceptions are enabled (i.e. feenableexcept is not called). Because of the feature request https://gitlab.com/embeddable-common-lisp/ecl/issues/347, the next ECL release changes this behaviour to call feenableexcept(0) during startup if ECL_OPT_TRAP_SIGFPE is 0.

However, (si::trap-fpe t nil) still results now in unhandled SIGFPEs in random places where previously floating point exceptions were just ignored, or handled manually.

No, it doesn't, as explained above.

comment:84 in reply to: ↑ 78 Changed 21 months ago by embray

Replying to arojas:

With this [1] patch in sagemath it no longer crashes for me, with unmodified ecl and maxima

[1] https://git.archlinux.org/svntogit/community.git/tree/trunk/sagemath-ecl-sigfpe.patch?h=packages/sagemath

This patch also works for me. It's not ideal but it's okay as a temporary workaround.

comment:85 in reply to: ↑ 83 Changed 21 months ago by embray

Replying to gh-spaghettisalat:

Replying to embray:

The difficulty/problem here is that ECL's si_trap_fpe still unconditionally (when available) calls feenableexcept. The effect of (si::trap-fpe t nil) is effectively to call feenableexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID) which means now SIGFPEs will be sent to the program that weren't otherwise.

No, the effect of (si::trap-fpe t nil) is to call feenableexcept(0). What you're describing happens with (si::trap-fpe t t).

You're right: I was having trouble understanding exactly how this code works. The problem doesn't come from (si::trap-fpe t nil), but rather the fact that maxima later enables traps for overflows (which then needs to be subsequently disabled, per the patch in comment:78).

I guess what I would want is an option to effectively turn trap-fpe into a no-op, because it's still modifying the global floating point environment in a way that's rather opaque and detrimental when embedding.

Anyways, since we identified for now the one place where this was happening--loading maxima--I'm content for now to step away, though we should probably change our ecl_sig_on/off() to save/restore the floating point environment and clear any floating point exceptions.

Happy New Year and have a good night!

comment:86 Changed 21 months ago by embray

Here's my attempt to patch this more generally. It appears to work, but I'm not sure how robust it is. With this, it's not necessary to manually call si::trap-fpe anywhere, because ECL's FPE traps are saved/restored alongside its signal handlers by ecl_sig_on/off():

  • src/sage/libs/ecl.pyx

    diff --git a/src/sage/libs/ecl.pyx b/src/sage/libs/ecl.pyx
    index 8dd2895..fa24ac0 100644
    --- a/src/sage/libs/ecl.pyx
    +++ b/src/sage/libs/ecl.pyx
    @@ -16,7 +16,7 @@ from __future__ import print_function, absolute_import
     #adapted to work with pure Python types.
     
     from libc.stdlib cimport abort
    -from libc.signal cimport SIGINT, SIGBUS, SIGSEGV, SIGCHLD
    +from libc.signal cimport SIGINT, SIGBUS, SIGSEGV, SIGCHLD, SIGFPE
     from libc.signal cimport raise_ as signal_raise
     from posix.signal cimport sigaction, sigaction_t
     cimport cysignals.signals
    @@ -48,9 +48,14 @@ cdef extern from "eclsig.h":
         void ecl_sig_off()
         cdef sigaction_t ecl_sigint_handler
         cdef sigaction_t ecl_sigbus_handler
    +    cdef sigaction_t ecl_sigfpe_handler
         cdef sigaction_t ecl_sigsegv_handler
         cdef mpz_t ecl_mpz_from_bignum(cl_object obj)
         cdef cl_object ecl_bignum_from_mpz(mpz_t num)
    +    cdef int fegetexcept()
    +    cdef int feenableexcept(int)
    +    cdef int fedisableexcept(int)
    +    cdef int ecl_feflags
     
     cdef cl_object string_to_object(char * s):
         return ecl_read_from_cstring(s)
    @@ -239,6 +244,7 @@ def init_ecl():
         global ecl_has_booted
         cdef char *argv[1]
         cdef sigaction_t sage_action[32]
    +    cdef int sage_fpes
         cdef int i
     
         if ecl_has_booted:
    @@ -258,6 +264,8 @@ def init_ecl():
         for i in range(1,32):
             sigaction(i, NULL, &sage_action[i])
     
    +    sage_fpes = fegetexcept()
    +
         #initialize ECL
         ecl_set_option(ECL_OPT_SIGNAL_HANDLING_THREAD, 0)
         cl_boot(1, argv)
    @@ -265,8 +273,12 @@ def init_ecl():
         #save signal handler from ECL
         sigaction(SIGINT, NULL, &ecl_sigint_handler)
         sigaction(SIGBUS, NULL, &ecl_sigbus_handler)
    +    sigaction(SIGFPE, NULL, &ecl_sigfpe_handler)
         sigaction(SIGSEGV, NULL, &ecl_sigsegv_handler)
     
    +    #save ECL's floating point exception flags
    +    ecl_feflags = fegetexcept()
    +
         #verify that no SIGCHLD handler was installed
         cdef sigaction_t sig_test
         sigaction(SIGCHLD, NULL, &sig_test)
    @@ -277,6 +289,9 @@ def init_ecl():
         for i in range(1,32):
             sigaction(i, &sage_action[i], NULL)
     
    +    fedisableexcept(ecl_feflags)
    +    feenableexcept(sage_fpes)
    +
         #initialise list of objects and bind to global variable
         # *SAGE-LIST-OF-OBJECTS* to make it rooted in the reachable tree for the GC
         list_of_objects=cl_cons(Cnil,cl_cons(Cnil,Cnil))
    @@ -320,8 +335,6 @@ def init_ecl():
                         (values nil (princ-to-string cnd)))))
             """))
         safe_funcall_clobj=cl_eval(string_to_object(b"(symbol-function 'sage-safe-funcall)"))
    -
    -    cl_eval(string_to_object("(si::trap-fpe T NIL)"))
         ecl_has_booted = 1
     
     cdef cl_object ecl_safe_eval(cl_object form) except NULL:
  • src/sage/libs/eclsig.h

    diff --git a/src/sage/libs/eclsig.h b/src/sage/libs/eclsig.h
    index f9f2690..c1d5244 100644
    --- a/src/sage/libs/eclsig.h
    +++ b/src/sage/libs/eclsig.h
    @@ -9,25 +9,59 @@
     
     
     #include <signal.h>
    +
    +/* Rummage around to determine how ECL was configured */
    +#define ECL_AVOID_FPE_H  /* Prevent some local includes */
    +#include <ecl/config-internal.h>
    +
    +#ifdef HAVE_FENV_H
    +#include <fenv.h>
    +#ifndef FE_ALL_EXCEPT
    +#define FE_ALL_EXCEPT FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID
    +#endif
    +#else
    +#ifndef FE_ALL_EXCEPT
    +#define FE_ALL_EXCEPT 0
    +#endif
    +#endif
    +
    +#ifndef HAVE_FEENABLEEXCEPT
    +/* These are GNU extensions */
    +#define fegetexcept() 0
    +#define feenablexcept(flags)
    +#define fdisableexcept(flags)
    +#endif
    +
     static struct sigaction ecl_sigint_handler;
     static struct sigaction ecl_sigbus_handler;
    +static struct sigaction ecl_sigfpe_handler;
     static struct sigaction ecl_sigsegv_handler;
     static struct sigaction sage_sigint_handler;
     static struct sigaction sage_sigbus_handler;
    +static struct sigaction sage_sigfpe_handler;
     static struct sigaction sage_sigsegv_handler;
    +static int ecl_feflags;
    +static int sage_feflags;
     
     static inline void set_ecl_signal_handler(void)
     {
         sigaction(SIGINT, &ecl_sigint_handler, &sage_sigint_handler);
         sigaction(SIGBUS, &ecl_sigbus_handler, &sage_sigbus_handler);
    +    sigaction(SIGFPE, &ecl_sigfpe_handler, &sage_sigfpe_handler);
         sigaction(SIGSEGV, &ecl_sigsegv_handler, &sage_sigsegv_handler);
    +    /* sage_feflags should be 0; we don't set them otherwise */
    +    sage_feflags = fedisableexcept(FE_ALL_EXCEPT);
    +    feenableexcept(ecl_feflags);
     }
     
     static inline void unset_ecl_signal_handler(void)
     {
         sigaction(SIGINT, &sage_sigint_handler, NULL);
         sigaction(SIGBUS, &sage_sigbus_handler, NULL);
    +    sigaction(SIGFPE, &sage_sigfpe_handler, NULL);
         sigaction(SIGSEGV, &sage_sigsegv_handler, NULL);
    +    ecl_feflags = fedisableexcept(FE_ALL_EXCEPT);
    +    feenableexcept(sage_feflags);
     }
     
     /* This MUST be a macro because sig_on() must be in the same

comment:87 Changed 21 months ago by embray

  • Authors changed from Dima Pasechnik to Dima Pasechnik, Erik Bray
  • Branch changed from u/nbruin/ecl16.1.3 to public/ticket-22191
  • Commit 35e9d3d1610bade7ae7d5fac735244be25e009eb deleted
  • Status changed from needs_work to needs_review

New branch based on the original one, rebased on 8.6.beta0, and including my floating point exception handling fixes. These non-standard FPE APIs are woefully underdocumented and confusing, but I think this has the right idea and all tests pass for me, docs build, etc. It should be a non-issue on platforms that don't have feenableexcept(), so we just treat it there as a no-op.

comment:88 Changed 21 months ago by git

  • Commit set to f2eea3db4ce90cb5743033c32753b9d9ade75703

Branch pushed to git repo; I updated commit sha1. New commits:

615b29f update ECL to 16.1.3 - most of our patches are obsolete
1a643fa set ECL to not trap floating point exceptions, so that SIGFPE will not occur
f2eea3d improve save/restore of floating point exception handling when entering/leaving libecl

comment:89 Changed 21 months ago by dimpase

some patches don't apply

Setting up build directory for ecl-16.1.3.p0
Finished extraction
Applying patches from ../patches...
Applying ../patches/flisten-bug.patch
patching file src/c/file.d
Hunk #1 FAILED at 5317.
1 out of 1 hunk FAILED -- saving rejects to file src/c/file.d.rej
Error applying '../patches/flisten-bug.patch'
************************************************************************
Error applying patches

comment:90 Changed 21 months ago by dimpase

  • Status changed from needs_review to needs_work

comment:91 Changed 21 months ago by embray

I had renamed that patch while updating, but looks like I didn't push the updated patch.

comment:92 Changed 21 months ago by git

  • Commit changed from f2eea3db4ce90cb5743033c32753b9d9ade75703 to ee5a10b5781f825b4da678f54c8b5589d152e438

Branch pushed to git repo; I updated commit sha1. New commits:

ee5a10b updated this patch to apply cleanly to ECL 16.1.3

comment:93 Changed 21 months ago by embray

  • Status changed from needs_work to needs_review

comment:94 Changed 21 months ago by dimpase

  • Authors changed from Dima Pasechnik, Erik Bray to Nils Bruin, Dima Pasechnik, Erik Bray
  • Reviewers set to Dima Pasechnik

Nils, could you have a look at this?

comment:95 Changed 21 months ago by nbruin

It seems to me that the fedisableexcept/feenableexcept dance is supposed to switch between "ecl" FPE state and "sage" FPE state. However, that is not what those commands do. According to the documentation, fedisableexcept just disables the given exceptions (and leaves the other ones the way they were) and feenableexcept enables them. Furthermore, it seems these are supposedly glibc extensions.

When I look at the relevant manpage, it seems to indicate that fegetenv and fesetenv are defined in C99 and do set the FPE environment wholesale. So, would it be better to do:

fegetenv(&sage_fe_env)
fesetenv(&ecl_fe_env)
<do ecl stuff>
fegetenv(&ecl_fe_env)
fesetenv(&sage_fe_env)

that might do a more comprehensive capture ...

comment:96 follow-up: Changed 21 months ago by embray

I thought of using feget/setenv, but the reason I didn't is simply because ECL does not use them, and rather uses fedisableexcept and feenableexcept. See: https://gitlab.com/embeddable-common-lisp/ecl/blob/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59/src/c/unixint.d#L1234

It's not clear to me or well documented how feenable/disableexcept interact with fesetenv. Presumably the latter should encompass the former, but I have not looked into it in more detail. I agree that the get/setenv functions look better overall and more portable.

So for now I'd be more comfortable just doing exactly the converse to what ECL is doing. In this code I use fedisableexcept(FE_ALL_EXCEPT) to just disable all floating point exceptions, followed by feenableexcept(...) for just the ECL flags, or just the Sage flags (which should be 0, since I'm not aware of anything in Sage or any of its other dependencies that, at least by default, sets FPE traps, but we save/restore it anyways just in case).

Last edited 21 months ago by embray

comment:97 Changed 21 months ago by dimpase

I don’t recall whether numpy sets FPE traps, or merely monitors for FPEs, but this did cause us trouble in the past, see #22799.

comment:98 in reply to: ↑ 96 ; follow-up: Changed 21 months ago by nbruin

Replying to embray:

I thought of using feget/setenv, but the reason I didn't is simply because ECL does not use them, and rather uses fedisableexcept and feenableexcept. See: https://gitlab.com/embeddable-common-lisp/ecl/blob/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59/src/c/unixint.d#L1234

OK, that makes sense. There is one point of concern: as far as I can see, these routines change which exceptions raise a signal. They don't change which flags are set. Can it be that sage or ecl get exception flag states they are not expecting because the state arose under a signal regime they are not used to? (fegetenv/fesetenv would take care of that because they'd swap the *entire* FP state)

Also: can we do something so that the patchbot can test this ticket? It looks like it can't find the ecl tarball at the moment.

Last edited 21 months ago by nbruin

comment:99 in reply to: ↑ 98 ; follow-up: Changed 21 months ago by embray

Replying to nbruin:

Replying to embray:

I thought of using feget/setenv, but the reason I didn't is simply because ECL does not use them, and rather uses fedisableexcept and feenableexcept. See: https://gitlab.com/embeddable-common-lisp/ecl/blob/2a9084b105caf26c89bb16ed4dd0bf5aa7ceab59/src/c/unixint.d#L1234

OK, that makes sense. There is one point of concern: as far as I can see, these routines change which exceptions raise a signal. They don't change which flags are set. Can it be that sage or ecl get exception flag states they are not expecting because the state arose under a signal regime they are not used to? (fegetenv/fesetenv would take care of that because they'd swap the *entire* FP state)

I'm afraid I don't exactly follow you here. Maybe you'd have to describe a specific case. However, the answer is probably "Yes, that's possible, but I don't currently have a case where that's occurring so I'm not going to worry about it until I do."

Also: can we do something so that the patchbot can test this ticket? It looks like it can't find the ecl tarball at the moment.

I don't know. The patchbot has some code for divining a link to the upstream tarball from the ticket description and downloading it, so I don't know why it's not working in this case.

comment:100 in reply to: ↑ 99 Changed 21 months ago by nbruin

Replying to embray:

I'm afraid I don't exactly follow you here. Maybe you'd have to describe a specific case.

I'd expect that signalling-based code would clear exception flags as it goes along, expecting a clear set of exception flags as normal state.

With NaN-producing code, the exception flags would usually pile up. Entering signalling-based code could therefore easily be confused if you feed it a state attained by NaN-producing code (it would be hard to tell what exception triggered the signal if all the flags are on ...)

It might be that ECL functions reasonably well even when that kind of state is poorly defined ... It seems it can turn on/off signals by itself already.

comment:101 Changed 21 months ago by embray

I'm sorry but I'm still not following. What do you mean by "entering signalling-based code"? Do you have some specific suggestions or are you just speculating at this point? I would like to move on here, that's all...

comment:102 Changed 21 months ago by nbruin

OK, I don't know enough about IEEE floats. Someone else with more expertise should take a look at this.

The fact that only part of the FP state gets swapped (which exceptions get trapped) and another part (exception state flags) does not, looks suspicious.

I would normally think that a little more rigorous argument than "it doesn't seem to cause problems right now" would be warranted before implementing such a shortcut in an interface. Someone with more expertise in scientific computing software using IEEE floats can probably make a better assessment on whether this is something to worry about.

comment:103 follow-up: Changed 21 months ago by embray

It's not a question about IEEE floats at all per se, just the API.

And this is literally just doing the converse of what ECL does. I'm not worried about a "more rigorous argument" for a problem I don't have. Maybe that's just the pragmatist in me but I really don't have more time to spend on this unless someone can demonstrate an actual problem.

I do remember reading somewhere that fedisableexcept also clears any exception flags, but I forget where, so I might need to double-check that. If it helps we could also add an explicit feclearexcept() but that's as far as I'm willing to go right now. I don't think there's any point in otherwise delaying this further after...looks...2 years.

comment:104 Changed 21 months ago by dimpase

  • Reviewers changed from Dima Pasechnik to Dima Pasechnik, Nils Bruin
  • Status changed from needs_review to positive_review

With numerics moving to GPUs and TPUs, whether or not to stick to the IEEE floating-point standard is more and more a purely academic matter. Let's go ahead with this, and hope that a new ECL release happens soon...

comment:105 in reply to: ↑ 103 Changed 21 months ago by embray

Replying to embray:

I do remember reading somewhere that fedisableexcept also clears any exception flags, but I forget where, so I might need to double-check that. If it helps we could also add an explicit feclearexcept() but that's as far as I'm willing to go right now.

I did go ahead and check on this since now I'm thinking I must have imagined or misremembered that. Modifying the exception masks does not cause any pending exceptions to be cleared, so theoretically this could result in some odd bugs I guess.

Nils has a point that it would be better to save/restore the entire FPU environment but my point still stands that for the present purpose it's mostly academic. I'm really only concerned about saving/reverting exactly what ECL is (potentially) doing and nothing more. Clearing any pending exceptions first does make sense though so I'll add that in real quick.

comment:106 Changed 21 months ago by git

  • Commit changed from ee5a10b5781f825b4da678f54c8b5589d152e438 to 85fcb48922383cd1c2ac36412ed761a507cb9537
  • Status changed from positive_review to needs_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

85fcb48 add feclearexcept to ecl_sig_on/off to clear any pending floating point status flags before modifying the traps

comment:107 Changed 21 months ago by embray

  • Status changed from needs_review to positive_review

I tested the changes myself and it's not a very significant difference over what was already reviewed, so setting back to positive_review.

comment:108 Changed 21 months ago by dimpase

OK, I ran tests too, just to make sure, it all looks good.

comment:109 Changed 21 months ago by embray

  • Milestone changed from sage-8.6 to sage-8.7

Well, clearly it was too late for this, though I believe this should have been part of 8.6. Changing milestone since it wasn't (and it was more a nice-to-have I think than critical).

If downstream wants to use sagemath with this or a newer ECL they can simply use the patch from this branch.

comment:110 follow-ups: Changed 21 months ago by vbraun

  • Status changed from positive_review to needs_work

Doesn't work on OSX, maxima then fails to build with

for l in ecl; do for d in / /numerical /numerical/slatec; do .././install-sh -c -d binary-$l$d; done; done
ecl -norc \
           -eval '(progn  (load "../lisp-utils/defsystem.lisp") (funcall (intern (symbol-name :operate-on-system) :mk) "maxima" :compile :verbose t) (build-maxima-lib))' \
           -eval '(ext:quit)'
dyld: Library not loaded: @libdir@/libecl.16.1.dylib
  Referenced from: /Users/buildslave-sage/slave/sage_git/build/local/bin/ecl
  Reason: image not found
make[5]: *** [binary-ecl/maxima] Abort trap: 6
make[5]: Target `all' not remade because of errors.

comment:111 in reply to: ↑ 110 Changed 20 months ago by dimpase

Replying to vbraun:

Doesn't work on OSX, maxima then fails to build with

Does building ECL actually work on OSX (and on what version)? It looks as if ECL is broken in some way then.

comment:112 in reply to: ↑ 110 Changed 20 months ago by gh-spaghettisalat

Replying to vbraun:

Doesn't work on OSX, maxima then fails to build with

for l in ecl; do for d in / /numerical /numerical/slatec; do .././install-sh -c -d binary-$l$d; done; done
ecl -norc \
           -eval '(progn  (load "../lisp-utils/defsystem.lisp") (funcall (intern (symbol-name :operate-on-system) :mk) "maxima" :compile :verbose t) (build-maxima-lib))' \
           -eval '(ext:quit)'
dyld: Library not loaded: @libdir@/libecl.16.1.dylib
  Referenced from: /Users/buildslave-sage/slave/sage_git/build/local/bin/ecl
  Reason: image not found
make[5]: *** [binary-ecl/maxima] Abort trap: 6
make[5]: Target `all' not remade because of errors.

This is probably same as https://gitlab.com/embeddable-common-lisp/ecl/issues/398, fixed by https://gitlab.com/embeddable-common-lisp/ecl/commit/612eeb5ed1623c4c7cb71029aab39107caf1cdba

comment:113 Changed 20 months ago by fbissey

That looks like exactly it.

comment:114 Changed 20 months ago by git

  • Commit changed from 85fcb48922383cd1c2ac36412ed761a507cb9537 to 61d68b4a3872e3ae05e3a42f6aa6d960b7847be9

Branch pushed to git repo; I updated commit sha1. New commits:

28def8d Merge branch 'develop' into ticket-22191
61d68b4 Add upstream patch for OS X build

comment:115 Changed 20 months ago by fbissey

  • Status changed from needs_work to needs_review

Added upstream patch. It applies without problem and doesn't break things on linux. But it has to be tested on OS X.

comment:116 Changed 19 months ago by fbissey

I wonder if #27225 will fix the darwin build bot for this ticket once it is in.

comment:117 Changed 19 months ago by git

  • Commit changed from 61d68b4a3872e3ae05e3a42f6aa6d960b7847be9 to 4b4255e39af6d1baf19ba0af9d11e924501946eb

Branch pushed to git repo; I updated commit sha1. New commits:

4b4255e Merge branch 'develop' into ticket-22191

comment:118 Changed 19 months ago by fbissey

Just rebasing on top of 8.7.beta5. Hopefully the patchbot will stop choking on python-3.7.2 and we can see if things work for real.

comment:119 Changed 19 months ago by fbissey

  • Reviewers changed from Dima Pasechnik, Nils Bruin to Dima Pasechnik, Nils Bruin, François Bissey
  • Status changed from needs_review to positive_review

Patchbot still breaking on something unrelated to this ticket. Putting to positive review to have some real testing.

comment:120 Changed 19 months ago by vbraun

  • Status changed from positive_review to needs_work

How is this unrelated (see patchbot)?

[sagelib-8.7.beta5] build/cythonized/sage/libs/ecl.c:4814:36: error: 'ecl_sigfpe_handler' undeclared (first use in this function); did you mean 'ecl_sigbus_handler'?

comment:121 Changed 18 months ago by embray

  • Milestone changed from sage-8.7 to sage-8.8

Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)

comment:122 Changed 16 months ago by embray

  • Milestone changed from sage-8.8 to sage-8.9

Tickets still needing work or clarification should be moved to the next release milestone at the soonest (please feel free to revert if you think the ticket is close to being resolved).

comment:123 Changed 15 months ago by dimpase

Is it possible to revive the current branch? It used to work at some point.

comment:124 Changed 9 months ago by embray

  • Milestone changed from sage-8.9 to sage-9.1

Ticket retargeted after milestone closed

comment:125 Changed 8 months ago by mkoeppe

on macOS (merged with 9.1.beta3):

ImportError: dlopen(/Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/lib/python3.7/site-packages/sage/libs/ecl.cpython-37m-darwin.so, 2): Symbol not found: _fedisableexcept
  Referenced from: /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/lib/python3.7/site-packages/sage/libs/ecl.cpython-37m-darwin.so
  Expected in: flat namespace
 in /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/lib/python3.7/site-packages/sage/libs/ecl.cpython-37m-darwin.so

comment:126 Changed 5 months ago by mkoeppe

  • Milestone changed from sage-9.1 to sage-9.2

comment:127 Changed 5 months ago by dimpase

  • Description modified (diff)
  • Summary changed from update ECL to 16.1.3 to update ECL to 16.1.3 or - rather - 20.4.24

comment:128 Changed 4 months ago by gh-spaghettisalat

  • Branch changed from public/ticket-22191 to u/gh-spaghettisalat/ecl-update
  • Commit changed from 4b4255e39af6d1baf19ba0af9d11e924501946eb to 3f37e8dc6801e78fa79c1cbdc4f5f9a46354c546
  • Status changed from needs_work to needs_review

I have created a commit with an updated ECL and implemented a clean solution encapsulating the floating point environment changes using a new macro introduced in ECL 20.4.24 for this purpose (documented at https://common-lisp.net/project/ecl/static/manual/Embedding-ECL.html#ECL_005fWITH_005fLISP_005fFPE).

I have pushed my solution to a new branch so as not to overwrite the previous changes from this ticket. I hope that's OK and that people can see the branch to review it.

comment:129 Changed 4 months ago by dimpase

You need to rebase your branch over 9.1 (normally it ought to be based on the current develop branch, which at the moment is the same as stable 9.1).

non-interactive rebase leads to a broken branch, I just checked. There were warnings about tabs and trailing spaces too. Our Python code only uses spaces.

Last edited 4 months ago by dimpase (previous) (diff)

comment:130 Changed 4 months ago by dimpase

needless to say, your branch is most welcome!

comment:131 Changed 4 months ago by dimpase

  • Branch changed from u/gh-spaghettisalat/ecl-update to public/packages/ecl20
  • Commit changed from 3f37e8dc6801e78fa79c1cbdc4f5f9a46354c546 to a18783e58b4874c2da4693c0443412d2541f4966
  • Status changed from needs_review to needs_work

rebased branch, with a few extra touches to make ECL build with Sage 9.1. To be tested.


New commits:

677677erebased over 9.1 branch u/gh-spaghettisalat/ecl-update
9bbde18update checksums, add upstream URL
a18783erenaming the patch, tbd if it is still needed

comment:132 follow-up: Changed 4 months ago by dimpase

Maxima does not build: There is no package with the name LISP..:

...
Summary:
ECL enabled. Executable name: "ecl"
default lisp: ecl
wish executable name: "wish"
Building maxima-5.42.2
make[3]: Entering directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src'
Making all in admin
make[4]: Entering directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/admin'
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/admin'
Making all in crosscompile-windows
make[4]: Entering directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/crosscompile-windows'
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/crosscompile-windows'
Making all in src
make[4]: Entering directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src'
ecl -norc -eval '(progn (load "../lisp-utils/defsystem.lisp") (load "../lisp-utils/make-depends.lisp") (funcall (intern "CREATE-DEPENDENCY-FILE" :mk) "binary-ecl/maxima" "ecl-depends.mk") (quit))'
;;; Loading "/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src/../lisp-utils/defsystem.lisp"
;;; Loading #P"/home/scratch2/dimpase/sage/sage/local/lib/ecl/cmp.fas"
An error occurred during initialization:
There is no package with the name LISP..
make bd
make[5]: Entering directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src'
ecl -norc -eval '(progn (load "../lisp-utils/defsystem.lisp") (load "../lisp-utils/make-depends.lisp") (funcall (intern "CREATE-DEPENDENCY-FILE" :mk) "binary-ecl/maxima" "ecl-depends.mk") (quit))'
;;; Loading "/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src/../lisp-utils/defsystem.lisp"
;;; Loading #P"/home/scratch2/dimpase/sage/sage/local/lib/ecl/cmp.fas"
An error occurred during initialization:
There is no package with the name LISP..
for l in ecl; do for d in / /numerical /numerical/slatec; do /usr/bin/mkdir -p binary-$l$d; done; done
make[5]: Leaving directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src'
ecl -norc \
           -eval '(progn  (load "../lisp-utils/defsystem.lisp") (funcall (intern (symbol-name :operate-on-system) :mk) "maxima" :compile :verbose t) (build-maxima-lib))' \
           -eval '(ext:quit)'
;;; Loading "/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src/../lisp-utils/defsystem.lisp"
;;; Loading #P"/home/scratch2/dimpase/sage/sage/local/lib/ecl/cmp.fas"
An error occurred during initialization:
There is no package with the name LISP..
make[4]: *** [Makefile:1350: binary-ecl/maxima] Error 1
make[4]: Leaving directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src'
make[3]: *** [Makefile:458: all-recursive] Error 1
make[3]: Leaving directory '/home/scratch2/dimpase/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src'
****************************************************************************************************************************************************************************************************************************************************************************************************************************
Error building maxima-5.42.2

comment:133 in reply to: ↑ 132 ; follow-up: Changed 4 months ago by gh-spaghettisalat

Replying to dimpase:

Maxima does not build: There is no package with the name LISP..:

I had a patch for that in my original commit. Did it get lost during the rebase?

The error is due to maxima's copy of defsystem using a long obsolete package nickname which has been removed in ECL 20.4.24 and still needs to be reported on the maxima bugtracker.

comment:134 in reply to: ↑ 133 Changed 4 months ago by dimpase

Replying to gh-spaghettisalat:

Replying to dimpase:

Maxima does not build: There is no package with the name LISP..:

I had a patch for that in my original commit. Did it get lost during the rebase?

I just took the diff against 9.0 and patched 9.1 with it. In the process I had to rename spkg-install to spkg-install.in, but otherwise everything else just applied without any problems. So it most probably was never in your branch.

The error is due to maxima's copy of defsystem using a long obsolete package nickname which has been removed in ECL 20.4.24 and still needs to be reported on the maxima bugtracker.

the latter has been reported and fixed by Maxima, as I just found. cf. https://sourceforge.net/p/maxima/bugs/3629/

Last edited 4 months ago by dimpase (previous) (diff)

comment:135 Changed 4 months ago by git

  • Commit changed from a18783e58b4874c2da4693c0443412d2541f4966 to 4f7e1a59e19c46896e51553190ba94d764006c17

Branch pushed to git repo; I updated commit sha1. New commits:

4f7e1a5backport Maxima fix for bug #3629

comment:136 Changed 4 months ago by dimpase

ok, great, a bit of numerical noise, but otherwise everything seems to be in order. an update soon.

comment:137 Changed 4 months ago by dimpase

  • Reviewers changed from Dima Pasechnik, Nils Bruin, François Bissey to Marius Gerbershagen, Dima Pasechnik, Nils Bruin, François Bissey

comment:138 Changed 4 months ago by dimpase

  • Authors changed from Nils Bruin, Dima Pasechnik, Erik Bray to Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray
  • Reviewers changed from Marius Gerbershagen, Dima Pasechnik, Nils Bruin, François Bissey to Dima Pasechnik, Nils Bruin, François Bissey

comment:139 Changed 4 months ago by git

  • Commit changed from 4f7e1a59e19c46896e51553190ba94d764006c17 to e7e7457622e13bb6e67a83df7ff10d36f29c7e5c

Branch pushed to git repo; I updated commit sha1. New commits:

e7e7457doctest fixes

comment:140 Changed 4 months ago by dimpase

  • Status changed from needs_work to needs_review

OK, seems to work and pass all tests. Please review

comment:141 follow-up: Changed 4 months ago by dimpase

  • Status changed from needs_review to needs_work

Absence of makeinfo leads to a build error, cf https://github.com/dimpase/sage/runs/704594060?check_suite_focus=true

[ecl-20.4.24.p0]   configure: error: Unable to build the manual: makeinfo not found.

comment:142 in reply to: ↑ 141 Changed 4 months ago by fbissey

Replying to dimpase:

Absence of makeinfo leads to a build error, cf https://github.com/dimpase/sage/runs/704594060?check_suite_focus=true

[ecl-20.4.24.p0]   configure: error: Unable to build the manual: makeinfo not found.

Some flashback that is!

comment:143 Changed 4 months ago by fbissey

Oh right, it was maxima, not ecl, that had problems with makeinfo before; this is different. That would make it a requirement, which may not be a bad thing (TM).

comment:144 Changed 4 months ago by dimpase

  • Report Upstream changed from Fixed upstream, but not in a stable release. to Reported upstream. No feedback yet.

comment:145 Changed 4 months ago by dimpase

Their configure.ac is buggy, I'm fixing it...

comment:147 Changed 4 months ago by git

  • Commit changed from e7e7457622e13bb6e67a83df7ff10d36f29c7e5c to 4001aeb51385aaf833315c4d73f925295ecde004

Branch pushed to git repo; I updated commit sha1. New commits:

4001aebbackport ECL PR #210

comment:148 Changed 4 months ago by dimpase

re-running tests: https://github.com/dimpase/sage/actions


New commits:

4001aebbackport ECL PR #210

comment:149 Changed 4 months ago by dimpase

  • Report Upstream changed from Reported upstream. No feedback yet. to Fixed upstream, in a later stable release.
  • Status changed from needs_work to needs_review

comment:150 Changed 4 months ago by dimpase

  • Summary changed from update ECL to 16.1.3 or - rather - 20.4.24 to update ECL to 20.4.24

comment:151 Changed 4 months ago by git

  • Commit changed from 4001aeb51385aaf833315c4d73f925295ecde004 to 266d8c103e7a2905f5532a586e292b7e434eb25e

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

469c14arebased over 9.1 branch u/gh-spaghettisalat/ecl-update
c75887cupdate checksums, add upstream URL
2d71967renaming the patch, tbd if it is still needed
0076bc3backport Maxima fix for bug #3629
1d074e8doctest fixes
266d8c1backport ECL PR #210

comment:152 Changed 4 months ago by dimpase

  • Cc mjo added
  • Priority changed from major to critical

Could someone please review this? (rebased over current beta)


New commits:

469c14arebased over 9.1 branch u/gh-spaghettisalat/ecl-update
c75887cupdate checksums, add upstream URL
2d71967renaming the patch, tbd if it is still needed
0076bc3backport Maxima fix for bug #3629
1d074e8doctest fixes
266d8c1backport ECL PR #210

comment:153 Changed 4 months ago by charpent

  • Description modified (diff)

Updated the goddamn tarball address...

Changed 4 months ago by charpent

comment:154 Changed 4 months ago by charpent

  • Reviewers changed from Dima Pasechnik, Nils Bruin, François Bissey to Dima Pasechnik, Nils Bruin, François Bissey, Emmanuel Charpentier
  • Status changed from needs_review to positive_review

On Debian testing running on a Core i7 with 16 GB RAM, using the system packages recommended by README.md, ptestlong on the present branch raises the same failures already reported for 9.2.beta0 on the same machine.

These failures are similar to those reported for 9.1.rc3 and explained away by Jonathan Kliem.

Re-running the failed/timed-out doctests shows that the timed-out ones pass and all the others still fail; see attached chkerrs-22191.txt.

This branch's results do not seem to differ from those of 9.2.beta0 as far as ECL is concerned.

==> (tentative) positive_review. As in other cases, a check on a system using Sage packages (i.e. one on which ptestlong raises no failures) would be useful.

comment:155 Changed 4 months ago by mkoeppe

  • Description modified (diff)

comment:156 Changed 4 months ago by mkoeppe

  • Description modified (diff)

comment:157 follow-up: Changed 4 months ago by mkoeppe

  • Status changed from positive_review to needs_work

Compile fails on macOS:

[ecl-20.4.24.p0] manual.txi:13: warning: unrecognized encoding name `UTF-8'.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Unknown command `inlinefmtifelse'.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Misplaced {.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Misplaced }.
...
[ecl-20.4.24.p0] Too many errors!  Gave up.
[ecl-20.4.24.p0] make[5]: *** [ecl.info.gz] Error 1
[ecl-20.4.24.p0] make[4]: *** [info] Error 2
[ecl-20.4.24.p0] make[3]: *** [doc] Error 2
[ecl-20.4.24.p0] make[2]: *** [all] Error 2
[ecl-20.4.24.p0] ********************************************************************************
[ecl-20.4.24.p0] Error building ecl-20.4.24.p0
[ecl-20.4.24.p0] ********************************************************************************

comment:158 in reply to: ↑ 157 Changed 4 months ago by gh-spaghettisalat

Replying to mkoeppe:

Compile fails on macOS:

[ecl-20.4.24.p0] manual.txi:13: warning: unrecognized encoding name `UTF-8'.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Unknown command `inlinefmtifelse'.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Misplaced {.
[ecl-20.4.24.p0] /Users/mkoeppe/s/sage/sage-rebasing/worktree-algebraic-2018-spring/local/var/tmp/sage/build/ecl-20.4.24.p0/src/build/doc/manual//user-guide/invoking.txi:58: Misplaced }.
...
[ecl-20.4.24.p0] Too many errors!  Gave up.
[ecl-20.4.24.p0] make[5]: *** [ecl.info.gz] Error 1
[ecl-20.4.24.p0] make[4]: *** [info] Error 2
[ecl-20.4.24.p0] make[3]: *** [doc] Error 2
[ecl-20.4.24.p0] make[2]: *** [all] Error 2
[ecl-20.4.24.p0] ********************************************************************************
[ecl-20.4.24.p0] Error building ecl-20.4.24.p0
[ecl-20.4.24.p0] ********************************************************************************

What does makeinfo --version report?

comment:159 Changed 4 months ago by dimpase

IIRC I saw a similar weird error in case of a broken texinfo installation.

comment:160 Changed 4 months ago by dimpase

With Homebrew, a setup that works for me is with texinfo installed, and /usr/local/Cellar/texinfo/6.7/bin coming early in PATH, so that the native makeinfo is not used.

comment:161 Changed 4 months ago by jhpalmieri

To make it accessible to more users, how about using ./configure --enable-manual=no? Or use this if the version of makeinfo is too old?

The system version of makeinfo on OS X:

% makeinfo --version                     
makeinfo (GNU texinfo) 4.8

Copyright (C) 2004 Free Software Foundation, Inc.
There is NO warranty.  You may redistribute this software
under the terms of the GNU General Public License.
For more information about these matters, see the files named COPYING.
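
The version check suggested above could be sketched as follows. This is only an illustration in Python, not the code from the "reject old makeinfo" commit: the function names and the minimum version are hypothetical, and the ticket does not state which threshold is actually used.

```python
import re
from typing import Optional, Tuple

# Hypothetical threshold for illustration only; the ticket does not say
# which makeinfo version the ECL manual really needs.
ASSUMED_MIN_VERSION = (5, 0)


def parse_makeinfo_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract the version tuple from the first line of `makeinfo --version`."""
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"\(GNU texinfo\)\s+(\d+(?:\.\d+)*)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def manual_can_be_built(output: str,
                        minimum: Tuple[int, ...] = ASSUMED_MIN_VERSION) -> bool:
    """Return True only if a version was found and it is at least `minimum`."""
    version = parse_makeinfo_version(output)
    return version is not None and version >= minimum
```

With the output quoted above, `parse_makeinfo_version` yields `(4, 8)`, so under the assumed threshold the manual would be skipped (for example by passing --enable-manual=no).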

comment:162 Changed 4 months ago by git

  • Commit changed from 266d8c103e7a2905f5532a586e292b7e434eb25e to 12447bc4ba5e9443a95ac0bb4b2fe916410e3c6f

Branch pushed to git repo; I updated commit sha1. New commits:

12447bcreject old makeinfo

comment:163 follow-up: Changed 4 months ago by dimpase

  • Status changed from needs_work to needs_review

OK, this does the trick. Another question is whether on Homebrew we want to recommend installing texinfo, and make the following change:

  • .homebrew-build-env

     # that activate keg-only homebrew package installations

     HOMEBREW=`brew --prefix` || return 1
    -for l in gettext; do
    +for l in texinfo gettext; do
         if [ -d "$HOMEBREW/opt/$l/bin" ]; then
             PATH="$HOMEBREW/opt/$l/bin:$PATH"
         fi
Last edited 4 months ago by dimpase (previous) (diff)
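
For readers less familiar with the shell idiom, the PATH manipulation in the proposed change to .homebrew-build-env can be modelled roughly as below. This is a Python sketch of the same loop, not part of the branch; the function name and the injectable `is_dir` parameter are hypothetical and exist only to make the logic easy to try out.

```python
import os
from typing import Callable, Iterable


def prepend_keg_bins(path: str, homebrew_prefix: str, kegs: Iterable[str],
                     is_dir: Callable[[str], bool] = os.path.isdir) -> str:
    """Prepend $HOMEBREW/opt/<keg>/bin to PATH for every keg that is installed."""
    for keg in kegs:
        bin_dir = f"{homebrew_prefix}/opt/{keg}/bin"
        if is_dir(bin_dir):
            # Later kegs end up earlier in PATH, exactly as in the shell loop.
            path = f"{bin_dir}:{path}"
    return path
```

Because each directory is prepended, a Homebrew-installed makeinfo is found before the older system one whenever the texinfo keg is present.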

comment:164 follow-up: Changed 4 months ago by charpent

Do you need/want another check on Debian ?

comment:165 in reply to: ↑ 164 ; follow-up: Changed 4 months ago by dimpase

Replying to charpent:

Do you need/want another check on Debian ?

Yes, with the optional packages dependent on ecl, please - fricas and kenzo

comment:166 in reply to: ↑ 165 ; follow-up: Changed 4 months ago by charpent

Replying to dimpase:

Replying to charpent:

Do you need/want another check on Debian ?

Yes, with the optional packages dependent on ecl, please - fricas and kenzo

Underway. Expect about 3 hours...

comment:167 in reply to: ↑ 166 Changed 4 months ago by charpent

Replying to charpent:

Replying to dimpase:

Replying to charpent:

Do you need/want another check on Debian ?

Yes, with the optional packages dependent on ecl, please - fricas and kenzo

Underway. Expect about 3 hours...

The build phase of time make ptestlong doesn't seem to have changed anything. I'll keep it running in order to test kenzo (not in my initial setup...).

Changed 4 months ago by charpent

comment:168 Changed 4 months ago by charpent

Well, that was faster than expected.

I get the same 11 permanent failures as reported for 2.0.beta1. See chkerrs-V2.txt for details.

==> Nihil obstat for positive_review, but I won't set it myself (someone testing this on a Mac might...).

comment:169 in reply to: ↑ 163 Changed 4 months ago by mkoeppe

Replying to dimpase:

Another question is whether on Homebrew we want to recommend installing texinfo, and make the following change:

  • .homebrew-build-env

     # that activate keg-only homebrew package installations

     HOMEBREW=`brew --prefix` || return 1
    -for l in gettext; do
    +for l in texinfo gettext; do
         if [ -d "$HOMEBREW/opt/$l/bin" ]; then
             PATH="$HOMEBREW/opt/$l/bin:$PATH"
         fi

I think this should go to #29557 (Add script package _recommended and generate "recommended system packages" info) - having a current texinfo should be on the same footing as having an installation of texlive etc.

comment:170 follow-up: Changed 4 months ago by mkoeppe

  • Status changed from needs_review to positive_review

Builds OK now on macOS. I haven't tested or looked at anything else in this ticket. Resetting to charpent's tentative positive review.

comment:171 in reply to: ↑ 170 ; follow-up: Changed 4 months ago by charpent

Replying to mkoeppe:

Builds OK now on macOS. I haven't tested or looked at anything else in this ticket. Resetting to charpent's tentative positive review.

You should add your name in the "Reviewers" field (I tend to forget this pretty regularly, and get reminded by kcrisman equally regularly...).

comment:172 Changed 4 months ago by mkoeppe

  • Authors changed from Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray to Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray, Matthias Koeppe

comment:173 Changed 4 months ago by mkoeppe

  • Authors changed from Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray, Matthias Koeppe to Marius Gerbershagen, Nils Bruin, Dima Pasechnik, Erik Bray
  • Reviewers changed from Dima Pasechnik, Nils Bruin, François Bissey, Emmanuel Charpentier to Dima Pasechnik, Nils Bruin, François Bissey, Emmanuel Charpentier, Matthias Koeppe

comment:174 in reply to: ↑ 171 Changed 4 months ago by mkoeppe

Replying to charpent:

You should add your name in the "Reviewers" field (I tend to forget this pretty regularly, and get reminded by kcrisman equally regularly...).

Thanks, done.

comment:175 Changed 4 months ago by mkoeppe

  • Status changed from positive_review to needs_work

comment:176 follow-up: Changed 4 months ago by mkoeppe

Tests at https://github.com/mkoeppe/sage/runs/745921410:

On ubuntu-trusty-standard:

In file included from /sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun.d:18:0:
/sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun_dispatch.d: In function 'fixed_dispatch1':
/sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun_dispatch.d:92:3: error: 'for' loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < 1; i++)
   ^

Likewise in debian-jessie-standard (https://github.com/mkoeppe/sage/runs/745921476)

Last edited 4 months ago by mkoeppe (previous) (diff)

comment:177 follow-ups: Changed 4 months ago by mkoeppe

And on ubuntu-xenial-standard (https://github.com/mkoeppe/sage/runs/745921419), maxima fails:

ecl -norc \
           -eval '(progn  (load "../lisp-utils/defsystem.lisp") (funcall (intern (symbol-name :operate-on-system) :mk) "maxima" :compile :verbose t) (build-maxima-lib))' \
           -eval '(ext:quit)'

Internal or unrecoverable error in:
Can't set the size of the C stack

;;; ECL C Backtrace
;;; /sage/local/lib/libecl.so.20.4(_ecl_dump_c_backtrace+0x22) [0x14564d659042]
;;; /sage/local/lib/libecl.so.20.4(ecl_internal_error+0x41) [0x14564d647431]
;;; /sage/local/lib/libecl.so.20.4(+0x1d0a20) [0x14564d66fa20]
;;; /sage/local/lib/libecl.so.20.4(ecl_cs_set_org+0x5d) [0x14564d66fe9d]
;;; /sage/local/lib/libecl.so.20.4(cl_boot+0x15c) [0x14564d53581c]
;;; ecl() [0x40090a]
;;; /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x14564d0f5830]
;;; ecl() [0x4009f9]
Makefile:1349: recipe for target 'binary-ecl/maxima' failed
make[3]: *** [binary-ecl/maxima] Aborted (core dumped)

comment:178 in reply to: ↑ 176 ; follow-up: Changed 4 months ago by dimpase

Replying to mkoeppe:

Tests at https://github.com/mkoeppe/sage/runs/745921410:

On ubuntu-trusty-standard:

In file included from /sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun.d:18:0:
/sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun_dispatch.d: In function 'fixed_dispatch1':
/sage/local/var/tmp/sage/build/ecl-20.4.24.p0/src/src/c/cfun_dispatch.d:92:3: error: 'for' loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < 1; i++)
   ^

Likewise in debian-jessie-standard (https://github.com/mkoeppe/sage/runs/745921476)

trusty is long past EOL (oops, more precisely, past End of Support). jessie is past EOL, and jessie LTS will be EOL in less than a month.

If the C compiler there defaults to c89 mode, I don't see why we should care.

Last edited 4 months ago by dimpase (previous) (diff)

comment:179 Changed 4 months ago by dimpase

I've left a comment on https://gitlab.com/embeddable-common-lisp/ecl/-/issues/484 (cleanup: consider removing c89 validation from buildsystem)

comment:180 Changed 4 months ago by dimpase

  • Report Upstream changed from Fixed upstream, in a later stable release. to Reported upstream. No feedback yet.

comment:181 in reply to: ↑ 177 Changed 4 months ago by dimpase

Replying to mkoeppe:

And on ubuntu-xenial-standard (https://github.com/mkoeppe/sage/runs/745921419), maxima fails:

ecl -norc \
           -eval '(progn  (load "../lisp-utils/defsystem.lisp") (funcall (intern (symbol-name :operate-on-system) :mk) "maxima" :compile :verbose t) (build-maxima-lib))' \
           -eval '(ext:quit)'

Internal or unrecoverable error in:
Can't set the size of the C stack
...

I can reproduce this on arando buildbot - which is an Ubuntu Xenial Bionic 32-bit machine. So this seems to have more to do with the OS than anything else.

Last edited 4 months ago by dimpase (previous) (diff)

comment:182 Changed 4 months ago by dimpase

issue in comment:181 is reproducible without Sage, just ECL and Maxima: I filed https://gitlab.com/embeddable-common-lisp/ecl/-/issues/596

comment:183 in reply to: ↑ 177 ; follow-up: Changed 4 months ago by dimpase

Replying to mkoeppe:

And on ubuntu-xenial-standard (https://github.com/mkoeppe/sage/runs/745921419),

this says

2020-06-06T21:39:17.5489746Z Ubuntu
2020-06-06T21:39:17.5489978Z 18.04.4

which is Bionic, not Xenial (cf https://wiki.ubuntu.com/Releases), so I don't know why this is mixed up.

maxima fails: ...

comment:184 in reply to: ↑ 183 Changed 4 months ago by mkoeppe

Replying to dimpase:

Replying to mkoeppe:

on ubuntu-xenial-standard (https://github.com/mkoeppe/sage/runs/745921419),

this says

2020-06-06T21:39:17.5489746Z Ubuntu
2020-06-06T21:39:17.5489978Z 18.04.4

which is Bionic, not Xenial (cf https://wiki.ubuntu.com/Releases), so I don't know why this is mixed up.

You're looking in the wrong place. The GitHub runner is on Ubuntu, but the test is in a Docker container running xenial

Digest: sha256:db6697a61d5679b7ca69dbde3dad6be0d17064d5b6b0e9f7be8d456ebb337209
Status: Downloaded newer image for ubuntu:xenial

comment:185 follow-up: Changed 3 months ago by gh-sheerluck

as mentioned in https://trac.sagemath.org/ticket/23011#comment:15

sage with ecls-20.4.24/maxima-5.44.0 fails to build because ECL_OPT_TRAP_SIGCHLD is gone

comment:186 Changed 3 months ago by git

  • Commit changed from 12447bc4ba5e9443a95ac0bb4b2fe916410e3c6f to a3e0eca1690d463a8aeef9cac915f9fe6ff8baea

Branch pushed to git repo; I updated commit sha1. New commits:

a3e0ecaadd upstream fix from MR 215

comment:187 in reply to: ↑ 185 Changed 3 months ago by dimpase

  • Status changed from needs_work to needs_review

Replying to gh-sheerluck:

as mentioned in https://trac.sagemath.org/ticket/23011#comment:15

sage with ecls-20.4.24/maxima-5.44.0 fails to build because ECL_OPT_TRAP_SIGCHLD is gone

we're still on Maxima 5.42.2, and it works with this ecl.

Upstream fix for Ubuntu xenial is added, please review.

comment:188 in reply to: ↑ 178 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

If the C compiler there defaults to c89 mode, I don't see why we should care.

Certainly we should care about the bug that our compiler flags that are part of $CC are not passed on correctly...

comment:189 Changed 3 months ago by dimpase

right, I forgot the patch I made on https://gitlab.com/embeddable-common-lisp/ecl/-/merge_requests/214 - I should add it too.

comment:190 Changed 3 months ago by dimpase

  • Status changed from needs_review to needs_work

comment:191 in reply to: ↑ 188 ; follow-up: Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

If the C compiler there defaults to c89 mode, I don't see why we should care.

Certainly we should care about the bug that our compiler flags that are part of $CC are not passed on correctly...

by the way, ECL sets CFLAGS in its c/Makefile.in without taking $CFLAGS into account, only @CFLAGS@. What would be the correct setting? To take $CFLAGS into account in the Makefile, or in ./configure?

comment:192 Changed 3 months ago by git

  • Commit changed from a3e0eca1690d463a8aeef9cac915f9fe6ff8baea to 89b006b106ca050240f580cb80d2a6320e5ade2e

Branch pushed to git repo; I updated commit sha1. New commits:

89b006badd the patch from upstream MR 214

comment:193 Changed 3 months ago by dimpase

  • Status changed from needs_work to positive_review

OK, now std=c99 should be enforced in ECL, too.

comment:194 Changed 3 months ago by dimpase

  • Status changed from positive_review to needs_review

comment:195 in reply to: ↑ 191 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

ECL sets CFLAGS in its c/Makefile.in without taking $CFLAGS into account, only @CFLAGS@. What would be the correct setting? To take $CFLAGS into account in the Makefile, or in ./configure?

Well, as ECL is not using automake, I guess one cannot expect automake features such as being able to pass user CFLAGS to make without breaking things. So one needs to set user CFLAGS before running ./configure.
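
A toy model of why the flags have to be set before ./configure runs: configure replaces each @NAME@ placeholder in a Makefile.in once, so the generated Makefile carries a fixed value from then on. The Python sketch below only imitates that substitution step; the template line and the flag value are stand-ins chosen for illustration, not copied from ECL's c/Makefile.in.

```python
from typing import Dict


def substitute(template: str, variables: Dict[str, str]) -> str:
    """Replace each @NAME@ placeholder with its value, once, like configure does."""
    for name, value in variables.items():
        template = template.replace(f"@{name}@", value)
    return template


# Stand-in for the relevant line of a Makefile.in (assumed, not quoted from ECL).
MAKEFILE_IN = "CFLAGS = @CFLAGS@\n"

# The value is whatever CFLAGS held when configure ran (example value only).
makefile = substitute(MAKEFILE_IN, {"CFLAGS": "-std=c99 -O2"})
```

After this step the placeholder is gone from the generated file, so flags exported only at make time are never picked up unless the Makefile itself refers to $(CFLAGS) from the environment.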

comment:196 in reply to: ↑ 195 Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

ECL sets CFLAGS in its c/Makefile.in without taking $CFLAGS into account, only @CFLAGS@. What would be the correct setting? To take $CFLAGS into account in the Makefile, or in ./configure?

Well, as ECL is not using automake, I guess one cannot expect automake features such as being able to pass user CFLAGS to make without breaking things. So one needs to set user CFLAGS before running ./configure.

anyhow ​89b006b does the job, it's more a question of improving ECL further.

comment:198 Changed 3 months ago by mkoeppe

  • Status changed from needs_review to positive_review

Looking good.

comment:199 Changed 3 months ago by mkoeppe

On Linux/macOS, that is.

However, on Cygwin (https://github.com/mkoeppe/sage/runs/765619947), I am getting:

;; Compiling /cygdrive/d/a/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src/numerical/slatec/dqawc.lisp.
;;; OPTIMIZE levels: Safety=2, Space=0, Speed=3, Debug=2
;;;
;;; End of Pass 1.
;;; Finished compiling /cygdrive/d/a/sage/sage/local/var/tmp/sage/build/maxima-5.42.2/src/src/numerical/slatec/dqawc.lisp.
;;;
      1 [main] ecl 57306 child_info_fork::abort: address space needed by 'eclcTpROy.dll' (0x190000) is already occupied
An error occurred during initialization:
Could not spawn subprocess to run "gcc"..
make[3]: *** [Makefile:1350: binary-ecl/maxima] Error 1

comment:200 Changed 3 months ago by mkoeppe

  • Status changed from positive_review to needs_work

comment:201 Changed 3 months ago by dimpase

strange. looks like a temporary dll created for some reason. Anyhow, this could come from a BLODA routine breaking Cygwin fork.

comment:202 Changed 3 months ago by gh-spaghettisalat

Damn, looks like changes I made to fix compiler issues for MSVC, after we had release-tested on Cygwin, broke Cygwin compilation. Merge request https://gitlab.com/embeddable-common-lisp/ecl/-/merge_requests/216 contains a fix for that.

Last edited 3 months ago by gh-spaghettisalat (previous) (diff)

comment:203 Changed 3 months ago by mjo

We should be able to use abs tol to future-proof tests like these,

-            [0.5,0.55511151231257...e-14,21,0]
+            [0.5,5.5511151231257...e-15,21,0]

More generally, there are a lot of e.g. maxima tests that check for a particular output string when a boolean comparison (actual == expected) that checks for True would be more reliable. When we start accepting system installations of maxima this will become more annoying.
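
To illustrate the point, the two outputs in the diff above denote the same number and differ only in how it is printed. The sketch below uses plain Python rather than Sage's doctest framework; the helper name and the tolerance are hypothetical, and the digits are the ones visible in the diff with the "..." dropped.

```python
import math

# Digits exactly as shown in the two doctest outputs (ellipsis omitted).
OLD_OUTPUT = "0.55511151231257e-14"
NEW_OUTPUT = "5.5511151231257e-15"


def same_within(a: str, b: str, abs_tol: float) -> bool:
    """Compare two printed floats by value, using an absolute tolerance only."""
    return math.isclose(float(a), float(b), rel_tol=0.0, abs_tol=abs_tol)


# Textual comparison: sensitive to the printing convention of the Lisp.
strings_match = OLD_OUTPUT == NEW_OUTPUT

# Numerical comparison: insensitive to where the decimal point is placed.
values_match = same_within(OLD_OUTPUT, NEW_OUTPUT, abs_tol=1e-20)
```

The string comparison fails while the numerical one succeeds, which is why a tolerance-based or boolean check survives a change in ECL's float printing.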

comment:204 Changed 3 months ago by git

  • Commit changed from 89b006b106ca050240f580cb80d2a6320e5ade2e to 0b777377289ba21166ea8d4f647e38e8b6ea1d23

Branch pushed to git repo; I updated commit sha1. New commits:

0b77737add upstream MR 216 (to fix cygwin fork)

comment:205 follow-up: Changed 3 months ago by dimpase

OK, I've added this patch to the branch - however, on a Gentoo box where I tried this branch (with and without the last commit), running Sage tests comes to a grinding halt because of a lot of zombie Maxima processes eating up CPU and RAM.

I wonder whether this is reproducible on a mainstream Linux.

comment:206 Changed 3 months ago by dimpase

it looks as if after a doctest crash (or perhaps even without one) a running maxima process related to it does not always get cleaned up. I don't know much about sage-cleaner script - but it is active during the test run.

Thank goodness it turned out to be a red herring. Well, almost - I had a live instance of sage-cleaner from a previous test, and starting a new test run did not clean it up (which probably is a bug), so a new sage-cleaner didn't start, and the old one didn't do anything useful for the current test run.

Last edited 3 months ago by dimpase (previous) (diff)

comment:208 in reply to: ↑ 205 Changed 3 months ago by mkoeppe

Replying to dimpase:

... on a Gentoo box I tried this branch (with and without the last commit) running Sage tests ...

Overdue to add Gentoo to our CI infrastructure ... #29105

comment:209 follow-up: Changed 3 months ago by dimpase

it's not quite clear how to add Gentoo, which is meant to be built from source, from ground up. Yes, one can get an image from https://hub.docker.com/r/gentoo/stage3-amd64/ - but adding Sage dependencies means building them from source. Anything short of a custom image won't work, I think.

comment:210 in reply to: ↑ 209 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

it's not quite clear how to add Gentoo, which is meant to be built from source, from ground up. Yes, one can get an image from https://hub.docker.com/r/gentoo/stage3-amd64/ - but adding Sage dependencies means building them from source.

That's fine - why would that take longer than the 6h that our -minimal builds take? Presumably the image has the compilers already?

comment:211 in reply to: ↑ 210 ; follow-up: Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

it's not quite clear how to add Gentoo, which is meant to be built from source, from ground up. Yes, one can get an image from https://hub.docker.com/r/gentoo/stage3-amd64/ - but adding Sage dependencies means building them from source.

That's fine - why would that take longer than the 6h that our -minimal builds take? Presumably the image has the compilers already?

right, I presume it has some kind of a toolchain, but how can we talk about -standard builds?

comment:212 Changed 3 months ago by dimpase

  • Status changed from needs_work to needs_review

comment:213 Changed 3 months ago by dimpase

  • Report Upstream changed from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.

comment:214 in reply to: ↑ 211 Changed 3 months ago by mkoeppe

Replying to dimpase:

but how can we talk about -standard builds?

Same as on all other platforms: We install system packages first - it does not make a difference that they are built from source rather than downloaded as a binary.

comment:215 follow-up: Changed 3 months ago by gh-sheerluck

I vote for François Bissey aka @fbissey aka @kiwifb (solo contributor to sage-on-gentoo) to deal with gentoo-on-docker stuff (maybe by creating stage4-amd64 image, so only sci-mathematics/sage package would be built from source, and all the rest would come FROM fbissey/sage-on-gentoo-stage4:latest)

comment:216 in reply to: ↑ 215 Changed 3 months ago by fbissey

Replying to gh-sheerluck:

I vote for François Bissey aka @fbissey aka @kiwifb (solo contributor to sage-on-gentoo) to deal with gentoo-on-docker stuff (maybe by creating stage4-amd64 image, so only sci-mathematics/sage package would be built from source, and all the rest would come FROM fbissey/sage-on-gentoo-stage4:latest)

Thanks for nominating me :(

Being solo for most (but not all) of the last 12 years has not been without its ups and downs. I could even use the word burnout for a few periods. On top of that, I don't have an academic-style position and I don't have all the time I would like. I am currently a one-man eResearch department (doing things solo must be my thing - kidding, we were almost ready to get me a collaborator when a change at the top led to cancelling all ongoing hiring processes, even for staff like me who have a truck factor of one). I currently have to deal with stuff like strategic planning, data management, cloud provisioning, HPCCF, carpentries... I sometimes call myself an undertaker. I do all the stuff other people don't want to do. There is a moment where you learn the word "no".

So, thank you very much but I would very much prefer a helper.

comment:217 Changed 3 months ago by gh-timokau

Thanks for all you do, François. I don't even want to think about how much harder it would be to maintain sage in nixpkgs without all your work. I can attest that solo-maintaining sage can be a lot at times. sage on nixpkgs is still stuck at 8.9 due to the not-quite-py3 switch in 9.0.

Sorry for derailing the topic further, but sometimes a thanks is in order. Thanks.

comment:218 Changed 3 months ago by mkoeppe

Further discussion of extending the CI to Gentoo please on #29105. The corresponding ticket for nixpkgs is #29130.


comment:219 in reply to: ↑ 207 Changed 3 months ago by mkoeppe

Replying to mkoeppe:

Cygwin test runs at https://github.com/mkoeppe/sage/actions/runs/134541665

Maxima now builds successfully on Cygwin (stage-ii-b, https://github.com/mkoeppe/sage/runs/768953492), but the Sage testsuite (stage-iv-a) fails with various Maxima-related errors. (There are also many GIAC-related failures, but that's for another ticket.)

sage -t src/doc/en/constructions/plotting.rst  # 3 doctests failed
sage -t src/doc/en/tutorial/tour_algebra.rst  # 2 doctests failed
sage -t src/doc/ja/tutorial/interfaces.rst  # Timed out
sage -t src/doc/ru/tutorial/tour_algebra.rst  # 2 doctests failed
sage -t src/doc/ru/tutorial/interfaces.rst  # 7 doctests failed
sage -t src/sage/calculus/desolvers.py  # 1 doctest failed
sage -t src/sage/coding/binary_code.pyx  # Timed out
sage -t src/sage/doctest/control.py  # 1 doctest failed
sage -t src/sage/functions/error.py  # 1 doctest failed
sage -t src/sage/functions/orthogonal_polys.py  # 2 doctests failed
sage -t src/sage/groups/class_function.py  # Timed out
sage -t src/sage/interfaces/giac.py  # 49 doctests failed
sage -t src/sage/interfaces/quit.py  # 4 doctests failed
sage -t src/sage/interfaces/maxima_abstract.py  # 3 doctests failed
sage -t src/sage/libs/gap/element.pyx  # 1 doctest failed
sage -t src/sage/libs/glpk/error.pyx  # 2 doctests failed
sage -t src/sage/matrix/matrix1.pyx  # 1 doctest failed
sage -t src/sage/numerical/backends/glpk_backend.pyx  # 2 doctests failed
sage -t src/sage/repl/interpreter.py  # 2 doctests failed
sage -t src/sage/rings/number_field/number_field_element.pyx  # Timed out
sage -t src/sage/symbolic/assumptions.py  # 1 doctest failed
sage -t src/sage/symbolic/expression.pyx  # 1 doctest failed
sage -t src/sage/symbolic/maxima_wrapper.py  # 2 doctests failed
sage -t src/sage/tests/benchmark.py  # 1 doctest failed
sage -t src/sage/tests/books/computational-mathematics-with-sagemath/sol/graphique_doctest.py  # 1 doctest failed
sage -t src/sage/tests/cmdline.py  # 3 doctests failed
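
To separate timeouts from ordinary doctest failures in a summary like the one above, the lines can be tallied mechanically. The following is only a sketch: `parse_summary` and `tally` are hypothetical helper names (not part of Sage), and `SUMMARY` holds just a few of the lines above.

```python
import re
from collections import Counter

# A few summary lines copied from the list above (not the full list).
SUMMARY = """\
sage -t src/doc/ja/tutorial/interfaces.rst  # Timed out
sage -t src/sage/interfaces/giac.py  # 49 doctests failed
sage -t src/sage/interfaces/maxima_abstract.py  # 3 doctests failed
sage -t src/sage/functions/error.py  # 1 doctest failed
"""

LINE_RE = re.compile(r"^sage -t (?P<path>\S+)\s+# (?P<status>.+)$")
FAILED_RE = re.compile(r"^(\d+) doctests? failed$")


def parse_summary(text):
    """Return (path, failed_count_or_None, status) for each summary line."""
    results = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a "sage -t ..." line
        status = m.group("status").strip()
        f = FAILED_RE.match(status)
        results.append((m.group("path"), int(f.group(1)) if f else None, status))
    return results


def tally(results):
    """Count files with failures, total failed doctests, and timeouts."""
    counts = Counter()
    for _path, failed, status in results:
        if failed is not None:
            counts["files_with_failures"] += 1
            counts["failed_doctests"] += failed
        elif status == "Timed out":
            counts["timed_out"] += 1
        else:
            counts["other"] += 1
    return counts


counts = tally(parse_summary(SUMMARY))
```

Filtering the parsed paths by substring (for example `giac`) is one way to set aside the failures that belong to another ticket.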

Examples of the errors (see logs for more):

sage -t src/sage/functions/error.py
**********************************************************************
File "src/sage/functions/error.py", line 149, in sage.functions.error.Function_erf
Failed example:
    merf = maxima(erf(x)).sage().operator()
Expected nothing
Got:
    Maxima crashed -- automatically restarting.


sage -t src/sage/functions/orthogonal_polys.py
**********************************************************************
File "src/sage/functions/orthogonal_polys.py", line 876, in sage.functions.orthogonal_polys.Func_chebyshev_U.__init__
Failed example:
    maxima(chebyshev_U(2,x, hold=True))
Exception raised:
...
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 296, in __call__
        return cls(self, x, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1157, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1469, in __init__
        self._name = parent._create(value, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 501, in _create
        self.set(name, value)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1002, in set
        self._eval_line(cmd)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 788, in _eval_line
        assert line_echo.strip().endswith(line.strip()), 'mismatch:\n' + line_echo + line
    AssertionError: mismatch:
    
    sage5 : chebyshev_u(2,_SAGE_VAR_x)$
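
The assertion in this traceback checks that the text read back from Maxima ends with the command that was just sent. In the message above the echoed part appears to be blank, which is enough to trip the check. A minimal sketch of that condition in isolation; `echo_matches` is a hypothetical name, and the `(%i7) ` prompt is made up for illustration:

```python
def echo_matches(line_echo: str, line: str) -> bool:
    """Same condition as the assert in _eval_line shown in the traceback."""
    return line_echo.strip().endswith(line.strip())


cmd = "sage5 : chebyshev_u(2,_SAGE_VAR_x)$"

# Expected case: the command is echoed back, possibly after a prompt.
ok = echo_matches("(%i7) " + cmd + "\n", cmd)

# Case resembling the failure above: only whitespace was read back,
# so the condition is false and the assert would raise.
blank = echo_matches("\n", cmd)
```

This only illustrates why an empty or out-of-sync echo produces the `mismatch:` message; it says nothing about why the echo was lost on Cygwin.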

comment:220 Changed 3 months ago by mkoeppe

  • Status changed from needs_review to needs_work

comment:221 follow-up: Changed 3 months ago by dimpase

this should be easy to reproduce without Sage. Is there a Docker container created by GH Actions to download and try on Windows?

comment:222 Changed 3 months ago by mkoeppe

The workflow for Cygwin does not use Docker. There is a "sage-local" artifact that one can download.

comment:223 Changed 3 months ago by gh-thierry-FreeBSD

Trying to compile sage 9.1 with ECL 20.4.24 and Maxima 5.44.0 as external packages produces these errors:

build/cythonized/sage/libs/ecl.c:3734:57: error: use of undeclared identifier 'ECL_OPT_TRAP_SIGCHLD'; did you mean 'ECL_OPT_TRAP_SIGILL'?
  __pyx_t_3 = __Pyx_PyInt_From_cl_fixnum(ecl_get_option(ECL_OPT_TRAP_SIGCHLD)); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 180, __pyx_L1_error)
                                                        ^~~~~~~~~~~~~~~~~~~~
                                                        ECL_OPT_TRAP_SIGILL
/usr/local/include/ecl/external.h:972:9: note: 'ECL_OPT_TRAP_SIGILL' declared here
        ECL_OPT_TRAP_SIGILL,
        ^
build/cythonized/sage/libs/ecl.c:4406:57: error: use of undeclared identifier 'ECL_OPT_SIGALTSTACK_SIZE'
  __pyx_t_3 = __Pyx_PyInt_From_cl_fixnum(ecl_get_option(ECL_OPT_SIGALTSTACK_SIZE)); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 208, __pyx_L1_error)
                                                        ^
build/cythonized/sage/libs/ecl.c:4731:18: error: use of undeclared identifier 'ECL_OPT_TRAP_SIGCHLD'; did you mean 'ECL_OPT_TRAP_SIGILL'?
  ecl_set_option(ECL_OPT_TRAP_SIGCHLD, 0);
                 ^~~~~~~~~~~~~~~~~~~~
                 ECL_OPT_TRAP_SIGILL
/usr/local/include/ecl/external.h:972:9: note: 'ECL_OPT_TRAP_SIGILL' declared here
        ECL_OPT_TRAP_SIGILL,
        ^
3 errors generated.
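
Since these options are enum members rather than macros (as the `external.h` note in the compiler output suggests), a plain `#ifdef` would not detect them. One way to find out which identifiers an installed header still declares is to scan its text. This is only a sketch: `HEADER_EXCERPT` is a made-up stand-in for `ecl/external.h`, and `missing_options` is a hypothetical helper, not something Sage's build system provides.

```python
import re

# Made-up excerpt standing in for ecl/external.h; the real header differs.
# Only ECL_OPT_TRAP_SIGILL is confirmed by the compiler note above.
HEADER_EXCERPT = """
typedef enum {
        ECL_OPT_TRAP_SIGILL,
} ecl_option;
"""

# The first two names are the ones the compiler reported as undeclared.
WANTED = [
    "ECL_OPT_TRAP_SIGCHLD",
    "ECL_OPT_SIGALTSTACK_SIZE",
    "ECL_OPT_TRAP_SIGILL",
]


def declared_options(header_text):
    """Collect every ECL_OPT_* identifier that occurs in the header text."""
    return set(re.findall(r"\bECL_OPT_[A-Z0-9_]+\b", header_text))


def missing_options(header_text, wanted):
    """Return the wanted identifiers that the header does not mention."""
    declared = declared_options(header_text)
    return [name for name in wanted if name not in declared]


missing = missing_options(HEADER_EXCERPT, WANTED)
```

In practice one would read the header from the include directory of the ECL installation in use (`/usr/local/include/ecl/external.h` in the log above) instead of using the excerpt.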

comment:225 in reply to: ↑ 221 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

this should be easy to reproduce without Sage. Is there a Docker container created by GH Actions to download and try on Windows?

Has this been reported upstream?

comment:226 in reply to: ↑ 225 Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

this should be easy to reproduce without Sage. Is there a Docker container created by GH Actions to download and try on Windows?

Has this been reported upstream?

gh-spaghettisalat (a.k.a. Marius) is a core upstream developer - but just in case I made https://gitlab.com/embeddable-common-lisp/ecl/-/issues/599

comment:227 follow-up: Changed 3 months ago by arojas

Compiling this branch against our system thread-enabled ecl gives multiple segfaults when running the test suite. Running the tests manually works fine. Running against system ecl used to work fine with the version of this branch from before the 20.4 upgrade.

/usr/lib/python3.8/site-packages/cysignals/signals.cpython-38-x86_64-linux-gnu.so(+0x729d)[0x7f803914129d]
/usr/lib/python3.8/site-packages/cysignals/signals.cpython-38-x86_64-linux-gnu.so(+0x748c)[0x7f803914148c]
/usr/lib/python3.8/site-packages/cysignals/signals.cpython-38-x86_64-linux-gnu.so(+0x9f05)[0x7f8039143f05]
/usr/lib/libc.so.6(+0x3c3e0)[0x7f803a6ad3e0]
/usr/lib/libecl.so.20.4(ecl_extend_hashtable+0x11a)[0x7f7fdc1d55da]
/usr/lib/libecl.so.20.4(+0x203947)[0x7f7fdc1d5947]
/usr/lib/libecl.so.20.4(cl_export2+0x36c)[0x7f7fdc176aac]
/usr/lib/libecl.so.20.4(init_all_symbols+0x29c)[0x7f7fdc071cec]
/usr/lib/libecl.so.20.4(cl_boot+0x433)[0x7f7fdc0709c3]
/usr/lib/python3.8/site-packages/sage/libs/ecl.cpython-38-x86_64-linux-gnu.so(+0xc85e)[0x7f7fdc4f585e]
/usr/lib/python3.8/site-packages/sage/libs/ecl.cpython-38-x86_64-linux-gnu.so(+0xa9f4)[0x7f7fdc4f39f4]
/usr/lib/python3.8/site-packages/sage/libs/ecl.cpython-38-x86_64-linux-gnu.so(+0x81b0)[0x7f7fdc4f11b0]
/usr/lib/libpython3.8.so.1.0(PyModule_ExecDef+0x78)[0x7f803aa06ae8]
/usr/lib/libpython3.8.so.1.0(+0x1cea50)[0x7f803aa06a50]
/usr/lib/libpython3.8.so.1.0(+0x12dd9f)[0x7f803a965d9f]
/usr/lib/libpython3.8.so.1.0(PyVectorcall_Call+0x6f)[0x7f803a976a7f]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x601a)[0x7f803a95b98a]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4cc5)[0x7f803a95a635]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(+0x12e02f)[0x7f803a96602f]
/usr/lib/libpython3.8.so.1.0(_PyObject_CallMethodIdObjArgs+0x126)[0x7f803a979136]
/usr/lib/libpython3.8.so.1.0(PyImport_ImportModuleLevelObject+0x4c0)[0x7f803a977e20]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x469a)[0x7f803a95a00a]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(PyEval_EvalCode+0x23)[0x7f803aa04b03]
/usr/lib/libpython3.8.so.1.0(+0x1d1e0d)[0x7f803aa09e0d]
/usr/lib/libpython3.8.so.1.0(+0x12f098)[0x7f803a967098]
/usr/lib/libpython3.8.so.1.0(PyVectorcall_Call+0x6f)[0x7f803a976a7f]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x601a)[0x7f803a95b98a]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4cc5)[0x7f803a95a635]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(+0x12e02f)[0x7f803a96602f]
/usr/lib/libpython3.8.so.1.0(_PyObject_CallMethodIdObjArgs+0x126)[0x7f803a979136]
/usr/lib/libpython3.8.so.1.0(PyImport_ImportModuleLevelObject+0x4c0)[0x7f803a977e20]
/usr/lib/libpython3.8.so.1.0(+0x165b18)[0x7f803a99db18]
/usr/lib/libpython3.8.so.1.0(PyCFunction_Call+0x7e)[0x7f803a96c8ce]
/usr/lib/python3.8/site-packages/sage/misc/lazy_import.cpython-38-x86_64-linux-gnu.so(+0x7e6a)[0x7f80393abe6a]
/usr/lib/python3.8/site-packages/sage/misc/lazy_import.cpython-38-x86_64-linux-gnu.so(+0xc5f4)[0x7f80393b05f4]
/usr/lib/python3.8/site-packages/sage/misc/lazy_import.cpython-38-x86_64-linux-gnu.so(+0x12f31)[0x7f80393b6f31]
/usr/lib/python3.8/site-packages/sage/structure/sage_object.cpython-38-x86_64-linux-gnu.so(+0x16343)[0x7f8039d9d343]
/usr/lib/python3.8/site-packages/sage/symbolic/expression.cpython-38-x86_64-linux-gnu.so(+0x3495b)[0x7f7fe6dc795b]
/usr/lib/python3.8/site-packages/sage/symbolic/expression.cpython-38-x86_64-linux-gnu.so(+0x3df9e)[0x7f7fe6dd0f9e]
/usr/lib/libpython3.8.so.1.0(PyCFunction_Call+0x7e)[0x7f803a96c8ce]
/usr/lib/python3.8/site-packages/sage/symbolic/expression.cpython-38-x86_64-linux-gnu.so(+0x325f7)[0x7f7fe6dc55f7]
/usr/lib/python3.8/site-packages/sage/symbolic/expression.cpython-38-x86_64-linux-gnu.so(+0x36733)[0x7f7fe6dc9733]
/usr/lib/libpython3.8.so.1.0(PyCFunction_Call+0x7e)[0x7f803a96c8ce]
/usr/lib/libpython3.8.so.1.0(_PyObject_MakeTpCall+0x45c)[0x7f803a95f18c]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x5108)[0x7f803a95aa78]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(_PyFunction_FastCallDict+0x1f6)[0x7f803a91a392]
/usr/lib/python3.8/site-packages/sage/symbolic/expression.cpython-38-x86_64-linux-gnu.so(+0x564f2)[0x7f7fe6de94f2]
/usr/lib/libpython3.8.so.1.0(PyCFunction_Call+0x7e)[0x7f803a96c8ce]
/usr/lib/libpython3.8.so.1.0(_PyObject_MakeTpCall+0x45c)[0x7f803a95f18c]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x5108)[0x7f803a95aa78]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/python3.8/site-packages/sage/structure/parent.cpython-38-x86_64-linux-gnu.so(+0xc274)[0x7f80388a3274]
/usr/lib/python3.8/site-packages/sage/structure/parent.cpython-38-x86_64-linux-gnu.so(+0xd4aa)[0x7f80388a44aa]
/usr/lib/python3.8/site-packages/sage/structure/parent.cpython-38-x86_64-linux-gnu.so(+0xfdd6)[0x7f80388a6dd6]
/usr/lib/python3.8/site-packages/sage/structure/parent.cpython-38-x86_64-linux-gnu.so(+0x339ff)[0x7f80388ca9ff]
/usr/lib/libpython3.8.so.1.0(+0x1965af)[0x7f803a9ce5af]
/usr/lib/python3.8/site-packages/sage/rings/integer_ring.cpython-38-x86_64-linux-gnu.so(+0x16fc7)[0x7f7ffe8ccfc7]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xb2b)[0x7f803a95649b]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(PyEval_EvalCode+0x23)[0x7f803aa04b03]
/usr/lib/libpython3.8.so.1.0(+0x1d1e0d)[0x7f803aa09e0d]
/usr/lib/libpython3.8.so.1.0(+0x12f098)[0x7f803a967098]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x398)[0x7f803a955d08]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0xa22)[0x7f803a954d72]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0xa22)[0x7f803a954d72]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(_PyObject_FastCallDict+0x225)[0x7f803a95e635]
/usr/lib/libpython3.8.so.1.0(_PyObject_Call_Prepend+0x63)[0x7f803a971c13]
/usr/lib/libpython3.8.so.1.0(+0x1f5e09)[0x7f803aa2de09]
/usr/lib/libpython3.8.so.1.0(_PyObject_MakeTpCall+0x45c)[0x7f803a95f18c]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4bf4)[0x7f803a95a564]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(+0x13e442)[0x7f803a976442]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x117b)[0x7f803a956aeb]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyObject_FastCallDict+0x15e)[0x7f803a95e56e]
/usr/lib/libpython3.8.so.1.0(+0x139824)[0x7f803a971824]
/usr/lib/libpython3.8.so.1.0(_PyObject_MakeTpCall+0x500)[0x7f803a95f230]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4bf4)[0x7f803a95a564]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4cc5)[0x7f803a95a635]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4cc5)[0x7f803a95a635]
/usr/lib/libpython3.8.so.1.0(+0x13e356)[0x7f803a976356]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4cc5)[0x7f803a95a635]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x7fa)[0x7f803a954b4a]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0xa22)[0x7f803a954d72]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x19d)[0x7f803a96687d]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108)[0x7f803a9667e8]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x761)[0x7f803a9560d1]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x304)[0x7f803a954654]
/usr/lib/libpython3.8.so.1.0(PyEval_EvalCode+0x23)[0x7f803aa04b03]
/usr/lib/libpython3.8.so.1.0(+0x1d8248)[0x7f803aa10248]
/usr/lib/libpython3.8.so.1.0(+0x1d2483)[0x7f803aa0a483]
/usr/lib/libpython3.8.so.1.0(PyRun_FileExFlags+0x97)[0x7f803a8ca3f8]
/usr/lib/libpython3.8.so.1.0(PyRun_SimpleFileExFlags+0x389)[0x7f803a8c9e8a]
/usr/lib/libpython3.8.so.1.0(Py_RunMain+0x502)[0x7f803aa1d1e2]
/usr/lib/libpython3.8.so.1.0(Py_BytesMain+0x39)[0x7f803a9f93d9]
/usr/lib/libc.so.6(__libc_start_main+0xf2)[0x7f803a698002]
/bin/python3(_start+0x2e)[0x5611ca37504e]
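
The frames are listed innermost first, so the named libecl symbols near the top show where the crash happens: in `ecl_extend_hashtable`, reached from `cl_boot` while `sage.libs.ecl` is being imported. A small sketch for pulling the named symbols of one library out of such a backtrace; `parse_frames` and `named_symbols` are hypothetical helpers, and `BACKTRACE` holds only the top few frames from above.

```python
import posixpath
import re

# Top frames copied from the backtrace above.
BACKTRACE = """\
/usr/lib/libc.so.6(+0x3c3e0)[0x7f803a6ad3e0]
/usr/lib/libecl.so.20.4(ecl_extend_hashtable+0x11a)[0x7f7fdc1d55da]
/usr/lib/libecl.so.20.4(+0x203947)[0x7f7fdc1d5947]
/usr/lib/libecl.so.20.4(cl_export2+0x36c)[0x7f7fdc176aac]
/usr/lib/libecl.so.20.4(init_all_symbols+0x29c)[0x7f7fdc071cec]
/usr/lib/libecl.so.20.4(cl_boot+0x433)[0x7f7fdc0709c3]
/usr/lib/libpython3.8.so.1.0(PyModule_ExecDef+0x78)[0x7f803aa06ae8]
"""

# Format: object(symbol+0xoffset)[0xaddress]; the symbol may be empty.
FRAME_RE = re.compile(
    r"^(?P<obj>[^()]+)\((?P<sym>[^+()]*)\+0x(?P<off>[0-9a-f]+)\)"
    r"\[0x(?P<addr>[0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return (library file name, symbol or None) for each frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            lib = posixpath.basename(m.group("obj"))
            frames.append((lib, m.group("sym") or None))
    return frames


def named_symbols(frames, library_prefix):
    """Named symbols from frames whose library starts with the prefix."""
    return [sym for lib, sym in frames if lib.startswith(library_prefix) and sym]


ecl_symbols = named_symbols(parse_frames(BACKTRACE), "libecl")
```

Frames without a symbol name (such as `+0x203947`) would need debug symbols or a tool like `addr2line` to resolve.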

comment:228 follow-up: Changed 3 months ago by dimpase

what will happen if you replace --disable-threads in the line sdh_configure $SAGE_CONFIGURE_GMP --disable-threads \ of ecl's spkg-install.in with --enable-threads=yes ?

comment:229 in reply to: ↑ 227 Changed 3 months ago by arojas

Replying to arojas:

Compiling this branch against our system thread-enabled ecl gives multiple segfaults when running the test suite. Running the tests manually works fine. Running against system ecl used to work fine with the version of this branch from before the 20.4 upgrade.

Adding to the old patch only the build fixes required to compile against 20.4 (i.e. ECL_OPT_TRAP_SIGCHLD etc. removal) makes it work fine without segfaults. So one of the later modifications to the patch is causing this; I will try to figure out which (doesn't look easy since the older commit history seems to be gone).

comment:230 in reply to: ↑ 228 ; follow-up: Changed 3 months ago by arojas

Replying to dimpase:

what will happen if you replace --disable-threads in the line sdh_configure $SAGE_CONFIGURE_GMP --disable-threads \ of ecl's spkg-install.in with --enable-threads=yes ?

Not sure I understand - this is about distro packages, are you asking to test with sage's bundled ecl package built with threads? (FTR, the segfaults are not reproducible when using system ecl built without threads, so I expect they will happen with sage's ecl if it's built with threads)


comment:231 in reply to: ↑ 230 ; follow-up: Changed 3 months ago by dimpase

Replying to arojas:

Replying to dimpase:

what will happen if you replace --disable-threads in the line sdh_configure $SAGE_CONFIGURE_GMP --disable-threads \ of ecl's spkg-install.in with --enable-threads=yes ?

Not sure I understand - this is about distro packages, are you asking to test with sage's bundled ecl package built with threads? (FTR, the segfaults are not reproducible when using system ecl built without threads, so I expect they will happen with sage's ecl if it's built with threads)

OK, sorry, I misunderstood what you were testing. Regarding the multithreaded ECL 20.4.24 for which you say you have a working version from an "old" branch, are you referring to https://github.com/sagemath/sagetrac-mirror/tree/u/gh-spaghettisalat/ecl-update or some other branch?

comment:232 in reply to: ↑ 231 ; follow-up: Changed 3 months ago by arojas

Replying to dimpase:

OK, sorry, I misunderstood what you were testing. Regarding the multithreaded ECL 20.4.24 for which you say you have a working version from an "old" branch, are you referring to https://github.com/sagemath/sagetrac-mirror/tree/u/gh-spaghettisalat/ecl-update or some other branch?

I mean the 'public/ticket-22191' branch that was attached to this ticket previously. To sum up:

with the current 'public/packages/ecl20' tests segfault if ecl is built with threads, work fine if ecl is built without threads

with the 'public/ticket-22191' plus the ECL_OPT_TRAP_SIGCHLD-removal part of this patch, there are no segfaults with ecl built with threads.

comment:233 Changed 3 months ago by arojas

There are two different types of error: some modules fail with Unhandled SIGFPE, which kills the test run. Those are fixed if I change safe_cl_boot(1, argv) back to cl_boot(1, argv).

For other modules, I get a FloatingPointError: Floating point exception for some tests, but the remaining tests are run.

Both are only reproducible when running the test suite, and work fine when the relevant code is run in a regular Sage session.

comment:234 in reply to: ↑ 232 ; follow-up: Changed 3 months ago by dimpase

Replying to arojas:

Replying to dimpase:

OK, sorry, I misunderstood what you were testing. Regarding the multithreaded ECL 20.4.24 for which you say you have a working version from an "old" branch, are you referring to https://github.com/sagemath/sagetrac-mirror/tree/u/gh-spaghettisalat/ecl-update or some other branch?

I mean the 'public/ticket-22191' branch that was attached to this ticket previously. To sum up:

with the current 'public/packages/ecl20' tests segfault if ecl is built with threads, work fine if ecl is built without threads

with the 'public/ticket-22191'

this branch was for ECL 16.1.3, thus it is not clear to me which multithreaded ECL you tested.

In any case I hope gh-spaghettisalat knows what is going on, I bet it has to do with fork() called by the doctesting framework.

comment:235 in reply to: ↑ 234 Changed 3 months ago by arojas

Replying to dimpase:

this branch was for ECL 16.1.3, thus it is not clear to me which multithreaded ECL you tested.

I tested 20.4:

Replying to arojas:

Adding to the old patch only the build fixes required to compile against 20.4 (ie ECL_OPT_TRAP_SIGCHLD etc removal) makes it work fine without segfaults.

comment:236 Changed 3 months ago by git

  • Commit changed from 0b777377289ba21166ea8d4f647e38e8b6ea1d23 to 8ca1c0e5223d3e94ed36da2bfc22c9e608283074

Branch pushed to git repo; I updated commit sha1. New commits:

8ca1c0eMerge tag '9.2.beta2' into t/22191/public/packages/ecl20

comment:237 follow-up: Changed 3 months ago by gh-spaghettisalat

Compiling this branch against our system thread-enabled ecl gives multiple segfaults when running the test suite.

This was caused by a bug in ECL which has just been fixed (see https://gitlab.com/embeddable-common-lisp/ecl/-/commit/75877dd8f0d534552284ba4380ba65baa74f028f).

comment:238 in reply to: ↑ 237 Changed 3 months ago by arojas

Replying to gh-spaghettisalat:

Compiling this branch against our system thread-enabled ecl gives multiple segfaults when running the test suite.

This was caused by a bug in ECL which has just been fixed (see https://gitlab.com/embeddable-common-lisp/ecl/-/commit/75877dd8f0d534552284ba4380ba65baa74f028f).

Thanks, all good with this branch and a patched, thread-enabled ecl.

comment:239 follow-up: Changed 3 months ago by dimpase

maybe upstream will do a minor release, so that we don't carry all these patches along?

comment:240 Changed 3 months ago by git

  • Commit changed from 8ca1c0e5223d3e94ed36da2bfc22c9e608283074 to f82c716fdf9c6e91a07166d36b6329a15ecfb41d

Branch pushed to git repo; I updated commit sha1. New commits:

f82c716Commit 75877dd8 from upstream

comment:241 follow-up: Changed 3 months ago by dimpase

  • Status changed from needs_work to needs_review

comment:242 in reply to: ↑ 241 Changed 3 months ago by mkoeppe

Replying to dimpase:

tests are running on https://github.com/dimpase/sage/pull/13

Unfortunately the cygwin builds didn't make it to the interesting part

comment:244 follow-up: Changed 3 months ago by dimpase

restarted tests, cygwin-only, on https://github.com/dimpase/sage/pull/14

by the way, I noticed a few places mentioning python2 in .github/workflows. Are they leftovers, to be removed? I'm testing with the following diff applied:

  • .github/workflows/ci-cygwin-minimal.yml

    diff --git a/.github/workflows/ci-cygwin-minimal.yml b/.github/workflows/ci-cygwin-minimal.yml
    index 97734216e8..aba9db76ae 100644
    --- a/.github/workflows/ci-cygwin-minimal.yml
    +++ b/.github/workflows/ci-cygwin-minimal.yml
    @@ -11,7 +11,7 @@ env:
       MAKE: make -j8
       SAGE_NUM_THREADS: 3
       SAGE_CHECK: warn
    -  SAGE_CHECK_PACKAGES: "!cython,!r,!python3,!python2,!nose,!pathpy,!gap,!cysignals,!linbox,!git,!ppl"
    +  SAGE_CHECK_PACKAGES: "!cython,!r,!python3,!nose,!pathpy,!gap,!cysignals,!linbox,!git,!ppl"
       CYGWIN: winsymlinks:native
       CONFIGURE_ARGS: --enable-experimental-packages --enable-download-from-upstream-url
       SAGE_FAT_BINARY: yes

  • .github/workflows/ci-cygwin-standard.yml

    diff --git a/.github/workflows/ci-cygwin-standard.yml b/.github/workflows/ci-cygwin-standard.yml
    index 8268e0e75f..f317a702f4 100644
    --- a/.github/workflows/ci-cygwin-standard.yml
    +++ b/.github/workflows/ci-cygwin-standard.yml
    @@ -11,7 +11,7 @@ env:
       MAKE: make -j8
       SAGE_NUM_THREADS: 3
       SAGE_CHECK: warn
    -  SAGE_CHECK_PACKAGES: "!cython,!r,!python3,!python2,!nose,!pathpy,!gap,!cysignals,!linbox,!git,!ppl"
    +  SAGE_CHECK_PACKAGES: "!cython,!r,!python3,!nose,!pathpy,!gap,!cysignals,!linbox,!git,!ppl"
       CYGWIN: winsymlinks:native
       CONFIGURE_ARGS: --enable-experimental-packages --enable-download-from-upstream-url
       SAGE_FAT_BINARY: yes

  • .github/workflows/update-cygwin-yml.sh

    diff --git a/.github/workflows/update-cygwin-yml.sh b/.github/workflows/update-cygwin-yml.sh
    index 247fe20e18..bc1d195a9f 100755
    --- a/.github/workflows/update-cygwin-yml.sh
    +++ b/.github/workflows/update-cygwin-yml.sh
    @@ -1,2 +1,2 @@
     #!/usr/bin/env bash
    -for X in standard-python2 minimal; do sed 's/\[standard\]/['$X']/g;s/CI cygwin-standard/CI cygwin-'$X'/g;' ci-cygwin-standard.yml > ci-cygwin-$X.yml; done
    +for X in minimal; do sed 's/\[standard\]/['$X']/g;s/CI cygwin-standard/CI cygwin-'$X'/g;' ci-cygwin-standard.yml > ci-cygwin-$X.yml; done
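
For readers less used to sed: update-cygwin-yml.sh derives each variant workflow from ci-cygwin-standard.yml with two global text substitutions. A rough Python equivalent, as a sketch only; `derive_workflow` is a hypothetical name and `sample` is a made-up fragment, not the real workflow file.

```python
def derive_workflow(standard_yml: str, variant: str) -> str:
    """Mirror the two sed substitutions used in update-cygwin-yml.sh."""
    # s/\[standard\]/[<variant>]/g
    text = standard_yml.replace("[standard]", "[" + variant + "]")
    # s/CI cygwin-standard/CI cygwin-<variant>/g
    return text.replace("CI cygwin-standard", "CI cygwin-" + variant)


# Made-up fragment; the real ci-cygwin-standard.yml is much longer.
sample = "name: CI cygwin-standard\n# targets: [standard]\n"
minimal = derive_workflow(sample, "minimal")
```

With `standard-python2` dropped from the loop, `minimal` is the only variant the script still generates.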

comment:245 in reply to: ↑ 244 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

restarted tests, cygwin-only, on https://github.com/dimpase/sage/pull/14

Thanks

by the way, I noticed a few places mentioning python2 in .github/workflows. Are they leftovers, to be removed?

Yes, these changes are OK to do. Separate ticket please

comment:246 in reply to: ↑ 245 Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

by the way, I noticed a few places mentioning python2 in .github/workflows. Are they leftovers, to be removed?

Yes, these changes are OK to do. Separate ticket please

see #30048

comment:247 Changed 3 months ago by dimpase

unfortunately Cygwin GH Actions builds still fail for reasons which have nothing to do with this ticket.

Intermediate builds fail to upload.


comment:248 follow-up: Changed 3 months ago by mkoeppe

Just restart another time

comment:250 in reply to: ↑ 248 Changed 3 months ago by dimpase

Replying to mkoeppe:

Just restart another time

in progress now

comment:251 follow-up: Changed 3 months ago by pbruin

The diff has a line build/pkgs/ecl/patches/ffi_abi_libffi33.oldpatch (renamed from build/pkgs/ecl/patches/ffi_abi_libffi33.patch). Should this patch be removed or left in place?

comment:252 in reply to: ↑ 251 ; follow-up: Changed 3 months ago by dimpase

Replying to pbruin:

The diff has a line build/pkgs/ecl/patches/ffi_abi_libffi33.oldpatch (renamed from build/pkgs/ecl/patches/ffi_abi_libffi33.patch). Should this patch be removed or left in place?

removed (needless to say it has no effect on functionality). we're still not done with this, due to Cygwin errors. I'll post details on them.

comment:253 in reply to: ↑ 252 Changed 3 months ago by pbruin

Replying to dimpase:

Replying to pbruin:

The diff has a line build/pkgs/ecl/patches/ffi_abi_libffi33.oldpatch (renamed from build/pkgs/ecl/patches/ffi_abi_libffi33.patch). Should this patch be removed or left in place?

removed (needless to say it has no effect on functionality). we're still not done with this, due to Cygwin errors. I'll post details on them.

Thanks. By the way, I just opened #30063 for upgrading Maxima (though I am not planning to work on this myself).

comment:254 follow-up: Changed 3 months ago by dimpase

cygwin errors:

pexpect choked? Something else? Perhaps redo using libmaxima?

sage -t src/doc/en/constructions/plotting.rst
**********************************************************************
File "src/doc/en/constructions/plotting.rst", line 53, in doc.en.constructions.plotting
Failed example:
    L = [(i/100.0, maxima.eval('jacobi_sn (%s/100.0,2.0)'%i))
        for i in range(-300,300)]
Expected nothing
Got:
    Maxima crashed -- automatically restarting.
**********************************************************************

the following resembles #8772 - apparently just pexpect choking under heavy load?

sage -t src/doc/en/tutorial/tour_algebra.rst
**********************************************************************
File "src/doc/en/tutorial/tour_algebra.rst", line 219, in doc.en.tutorial.tour_algebra
Failed example:
    de1 = maxima("2*diff(x(t),t, 2) + 6*x(t) - 2*y(t)")
Exception raised:
    Traceback (most recent call last):
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in eval
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in <listcomp>
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 793, in _eval_line
        self._error_check(line, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 934, in _error_check
        self._error_msg(cmd, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 951, in _error_msg
        raise TypeError("Error executing code in Maxima\nCODE:\n\t%s\nMaxima ERROR:\n\t%s"%(cmd, out.replace('-- an error.  To debug this try debugmode(true);','')))
    TypeError: Error executing code in Maxima
    CODE:
        display2d : false;
    Maxima ERROR:
        
    incorrect syntax: Premature termination of input at ;.
    ;
     ^


    During handling of the above exception, another exception occurred:
...

sage -t src/doc/ru/tutorial/tour_algebra.rst
**********************************************************************
File "src/doc/ru/tutorial/tour_algebra.rst", line 202, in doc.ru.tutorial.tour_algebra
Failed example:
    lde1 = de1.laplace("t","s"); lde1
Exception raised:
    Traceback (most recent call last):
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 707, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 1131, in compile_and_execute
        exec(compiled, globs)
      File "<doctest doc.ru.tutorial.tour_algebra[1]>", line 1, in <module>
        lde1 = de1.laplace("t","s"); lde1
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 680, in __call__
        return self._obj.parent().function_call(self._name, [self._obj] + list(args), kwds)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 596, in function_call
        args, kwds = self._convert_args_kwds(args, kwds)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 557, in _convert_args_kwds
        args[i] = self(arg)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 296, in __call__
        return cls(self, x, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1157, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1469, in __init__
        self._name = parent._create(value, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 501, in _create
        self.set(name, value)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1002, in set
        self._eval_line(cmd)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 793, in _eval_line
        self._error_check(line, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 934, in _error_check
        self._error_msg(cmd, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 951, in _error_msg
        raise TypeError("Error executing code in Maxima\nCODE:\n\t%s\nMaxima ERROR:\n\t%s"%(cmd, out.replace('-- an error.  To debug this try debugmode(true);','')))
    TypeError: Error executing code in Maxima
    CODE:
        sage4 : t$
    Maxima ERROR:
        
    incorrect syntax: Premature termination of input at ;.
    ;
     ^
...

sage -t src/doc/ru/tutorial/interfaces.rst
**********************************************************************
File "src/doc/ru/tutorial/interfaces.rst", line 258, in doc.ru.tutorial.interfaces
Failed example:
    f = maxima.eval('ij_entry[i,j] := i/j')
Exception raised:
    Traceback (most recent call last):
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 707, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 1131, in compile_and_execute
        exec(compiled, globs)
      File "<doctest doc.ru.tutorial.interfaces[0]>", line 1, in <module>
        f = maxima.eval('ij_entry[i,j] := i/j')
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in eval
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in <listcomp>
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 782, in _eval_line
        self._sendline(line)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 644, in _sendline
        self._sendstr(string)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1250, in _sendstr
        self._start()
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 609, in _start
        Expect._start(self)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 527, in _start
        self.eval(X)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in eval
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in <listcomp>
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 788, in _eval_line
        assert line_echo.strip().endswith(line.strip()), 'mismatch:\n' + line_echo + line
    AssertionError: mismatch:
    
    nolabels : true;
...

File "src/doc/ru/tutorial/interfaces.rst", line 259, in doc.ru.tutorial.interfaces
Failed example:
    A = maxima('genmatrix(ij_entry,4,4)'); A
Expected:
    matrix([1,1/2,1/3,1/4],[2,1,2/3,1/2],[3,3/2,1,3/4],[4,2,4/3,1])
Got:
    matrix([ij_entry[1,1],ij_entry[1,2],ij_entry[1,3],ij_entry[1,4]],[ij_entry[2,1],ij_entry[2,2],ij_entry[2,3],ij_entry[2,4]],[ij_entry[3,1],ij_entry[3,2],ij_entry[3,3],ij_entry[3,4]],[ij_entry[4,1],ij_entry[4,2],ij_entry[4,3],ij_entry[4,4]])
...

[and more of the same...]

sage -t src/sage/symbolic/assumptions.py
**********************************************************************
File "src/sage/symbolic/assumptions.py", line 37, in sage.symbolic.assumptions
Failed example:
    maxima('features')
Exception raised:
    Traceback (most recent call last):
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 707, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 1131, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.symbolic.assumptions[6]>", line 1, in <module>
        maxima('features')
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 296, in __call__
        return cls(self, x, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1157, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1469, in __init__
        self._name = parent._create(value, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 501, in _create
        self.set(name, value)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1002, in set
        self._eval_line(cmd)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 782, in _eval_line
        self._sendline(line)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 644, in _sendline
        self._sendstr(string)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1250, in _sendstr
        self._start()
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 609, in _start
        Expect._start(self)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 527, in _start
        self.eval(X)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in eval
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1383, in <listcomp>
        for L in code.split('\n') if L != ''])
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 788, in _eval_line
        assert line_echo.strip().endswith(line.strip()), 'mismatch:\n' + line_echo + line
    AssertionError: mismatch:
    
    domain : complex;

...

the following - already mentioned:

sage -t src/sage/symbolic/expression.pyx
**********************************************************************
File "src/sage/symbolic/expression.pyx", line 8735, in sage.symbolic.expression.Expression.arccosh
Failed example:
    maxima('acosh(0.5)')
Expected:
    1.04719755119659...*%i
Got:
    1.047197551196598*%i-1.110223024625157e-16

This is not a big deal, just floating-point noise.
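A quick sanity check in plain Python, as a sketch only: the two numbers are copied from the "Got:" line above, the variable names are illustrative, and the tolerances are my own choice. It checks that the imaginary part agrees with pi/3 (since acosh(0.5) = i*acos(0.5) = i*pi/3) and that the stray real part has magnitude 2^-53, i.e. half the double-precision machine epsilon.

```python
import math

# Values copied from the failing doctest output (Maxima on Cygwin).
got_real = -1.110223024625157e-16
got_imag = 1.047197551196598

# acosh(0.5) = i*acos(0.5) = i*pi/3, so the imaginary part should match pi/3.
imag_ok = math.isclose(got_imag, math.pi / 3, rel_tol=1e-14)

# The spurious real part is 2**-53 in magnitude (half the machine epsilon),
# which is what one rounding error in a double-precision computation looks like.
real_is_noise = math.isclose(abs(got_real), 2.0 ** -53, rel_tol=1e-12)

print(imag_ok, real_is_noise)
```

So the doctest could tolerate this either with a tolerance marker or by matching only the imaginary part; which is preferable is a separate question.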

pexpect, again (?):

sage -t src/sage/tests/books/computational-mathematics-with-sagemath/sol/graphique_doctest.py
**********************************************************************
File "src/sage/tests/books/computational-mathematics-with-sagemath/sol/graphique_doctest.py", line 53, in sage.tests.books.computational-mathematics-with-sagemath.sol.graphique_doctest
Failed example:
    P = desolve_system_rk4(f(x,y), [x,y],ics=[0,10,5], ivar=t, end_points=15)
Exception raised:
    Traceback (most recent call last):
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 707, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/doctest/forker.py", line 1131, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.tests.books.computational-mathematics-with-sagemath.sol.graphique_doctest[22]>", line 1, in <module>
        P = desolve_system_rk4(f(x,y), [x,y],ics=[Integer(0),Integer(10),Integer(5)], ivar=t, end_points=Integer(15))
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/calculus/desolvers.py", line 1490, in desolve_system_rk4
        maxima("load('dynamics)")
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 296, in __call__
        return cls(self, x, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1157, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1469, in __init__
        self._name = parent._create(value, name=name)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/interface.py", line 501, in _create
        self.set(name, value)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 1002, in set
        self._eval_line(cmd)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 782, in _eval_line
        self._sendline(line)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 644, in _sendline
        self._sendstr(string)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/expect.py", line 1250, in _sendstr
        self._start()
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 618, in _start
        self._eval_line('0;')
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 793, in _eval_line
        self._error_check(line, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 934, in _error_check
        self._error_msg(cmd, out)
      File "/cygdrive/d/a/sage/sage/local/lib/python3.7/site-packages/sage/interfaces/maxima.py", line 951, in _error_msg
        raise TypeError("Error executing code in Maxima\nCODE:\n\t%s\nMaxima ERROR:\n\t%s"%(cmd, out.replace('-- an error.  To debug this try debugmode(true);','')))
    TypeError: Error executing code in Maxima
    CODE:
        0;
    Maxima ERROR:
        
    incorrect syntax: tex is not an infix operator
    lisp (defun tex-
                  ^

Changed 3 months ago by dimpase

cygwin-standard run

comment:255 Changed 3 months ago by dimpase

see the attachment for complete ptest.log on Cygwin

comment:256 Changed 3 months ago by dimpase

As we don't see errors on embedded Maxima, but only on pexpect Maxima, I am inclined to say it is OK.

comment:257 Changed 3 months ago by mkoeppe

Maybe not "OK", but I think acceptable for the next beta. But the precise failure mode of the maxima pexpect interface should be investigated. This is clearly a regression from the older ECL.
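As a starting point for that investigation: the assertion that fails in several of the tracebacks in comment:254 is the echo check `line_echo.strip().endswith(line.strip())` in `sage/interfaces/maxima.py`. Below is a minimal standalone sketch of that predicate in plain Python. The helper name `echo_matches` and the sample prompt string `(%i1)` are hypothetical; only the predicate itself is taken from the traceback. Judging from the error message, the echo read back appears to be blank, which is enough to trip the check.

```python
def echo_matches(line_echo, line):
    """Same predicate as the assertion shown in the traceback."""
    return line_echo.strip().endswith(line.strip())


sent = "nolabels : true;"

# Echo arrives as expected (the prompt text here is only an illustration).
in_sync = echo_matches("(%i1) nolabels : true;\n", sent)

# Echo is blank, e.g. because the reader and the subprocess are out of sync.
out_of_sync = echo_matches("\n", sent)

print(in_sync, out_of_sync)
```

This does not say why the echo is missing; it only shows that any desynchronisation between what was sent and what was read back is reported as this `AssertionError`.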

comment:258 in reply to: ↑ 254 ; follow-up: Changed 3 months ago by mkoeppe

Replying to dimpase:

cygwin errors:

pexpect choked? [...]

sage -t src/doc/en/constructions/plotting.rst
**********************************************************************
File "src/doc/en/constructions/plotting.rst", line 53, in doc.en.constructions.plotting
Failed example:
    L = [(i/100.0, maxima.eval('jacobi_sn (%s/100.0,2.0)'%i))
        for i in range(-300,300)]
Expected nothing
Got:
    Maxima crashed -- automatically restarting.
**********************************************************************

the following resembles #8772 - apparently just pexpect choking under heavy load?

Yes, let's reuse that old ticket to investigate this more.

comment:259 follow-up: Changed 3 months ago by dimpase

it could actually be an improvement - maxima runs faster, and on a heavily loaded host this leads to this mess. I will try another round of Cygwin tests, with tests not run in parallel.

comment:260 Changed 3 months ago by mkoeppe

It would be really good to simulate these tests without having to build all of Sage so that this could become part of the CI in https://github.com/spaghettisalat/ecl/pull/1

comment:261 in reply to: ↑ 259 ; follow-up: Changed 3 months ago by dimpase

Replying to dimpase:

it could actually be an improvement - maxima runs faster, and on a heavily loaded host this leads to this mess. I will try another round of Cygwin tests, with tests not run in parallel.

tests running at https://github.com/dimpase/sage/actions/runs/157700338

comment:262 Changed 3 months ago by dimpase

by the way, there are also a number of giac tests failing on Cygwin, with very similar symptoms of pexpect struggling under heavy load (see the attached ptest.log); so I really think there is nothing wrong with ECL on Cygwin per se, it's pexpect that plays up in this particular testing configuration.

comment:263 in reply to: ↑ 258 ; follow-up: Changed 3 months ago by dimpase

Replying to mkoeppe:

Replying to dimpase:

cygwin errors:

pexpect choked? [...]

sage -t src/doc/en/constructions/plotting.rst
**********************************************************************
File "src/doc/en/constructions/plotting.rst", line 53, in doc.en.constructions.plotting
Failed example:
    L = [(i/100.0, maxima.eval('jacobi_sn (%s/100.0,2.0)'%i))
        for i in range(-300,300)]
Expected nothing
Got:
    Maxima crashed -- automatically restarting.
**********************************************************************

the following resembles #8772 - apparently just pexpect choking under heavy load?

Yes, let's reuse that old ticket to investigate this more.

this can be rewritten as follows,

sage: L = [(i/100.0, maxima_calculus.eval('jacobi_sn (%s/100.0,2.0)'%i))
....:         for i in range(-300,300)]

avoiding pexpect, I think. Indeed, it's 5 times faster:

sage: timeit("L = [(i/100.0, maxima_calculus.eval('jacobi_sn (%s/100.0,2.0)'%i)) for i in range(-300,300)]")
5 loops, best of 3: 110 ms per loop
sage: timeit("L = [(i/100.0, maxima.eval('jacobi_sn (%s/100.0,2.0)'%i)) for i in range(-300,300)]")
5 loops, best of 3: 556 ms per loop
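For the record, the ratio from these two timings, as trivial arithmetic in plain Python (the variable names are illustrative; the numbers are the ones printed by `timeit` above, from a single machine):

```python
# Best-of-3 timings reported above, in milliseconds per loop.
pexpect_ms = 556.0   # maxima.eval (pexpect interface)
library_ms = 110.0   # maxima_calculus.eval (library interface)

speedup = pexpect_ms / library_ms
print(f"{speedup:.2f}x")
```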

comment:264 in reply to: ↑ 263 Changed 3 months ago by dimpase

Replying to dimpase:

5 loops, best of 3: 110 ms per loop
sage: timeit("L = [(i/100.0, maxima.eval('jacobi_sn (%s/100.0,2.0)'%i)) for i in range(-300,300)]")
5 loops, best of 3: 556 ms per loop

to improve this, I opened #30071.

comment:265 in reply to: ↑ 261 Changed 3 months ago by dimpase

Replying to dimpase:

Replying to dimpase:

it could actually be an improvement - maxima runs faster, and on a heavily loaded host this leads to this mess. I will try another round of Cygwin tests, with tests not run in parallel.

tests running at https://github.com/dimpase/sage/actions/runs/157700338

and this run hit the time limit, so no interesting tests were performed.

I think we should proceed with merging this ticket anyway, and speed up the Maxima things on #30071.

comment:266 Changed 3 months ago by mkoeppe

  • Status changed from needs_review to positive_review

comment:267 Changed 3 months ago by vbraun

  • Branch changed from public/packages/ecl20 to f82c716fdf9c6e91a07166d36b6329a15ecfb41d
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:268 in reply to: ↑ 239 Changed 2 months ago by gh-spaghettisalat

  • Commit f82c716fdf9c6e91a07166d36b6329a15ecfb41d deleted

Replying to dimpase:

maybe upstream will do a minor release, so that we don't carry all these patches along?

Sorry for the late reply, I've been busy. I agree that a bugfix release is needed. This will probably still take a bit though: I want to take a look at some other bugs first (https://gitlab.com/embeddable-common-lisp/ecl/-/issues/594 and https://gitlab.com/embeddable-common-lisp/ecl/-/issues/586), and then there are still tests to run.
