Opened 6 years ago
Closed 4 years ago
#17650 closed defect (wontfix)
alarm broken on cygwin
Reported by: | gouezel | Owned by: | embray |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | porting: Cygwin | Keywords: | ppl, cygwin |
Cc: | vbraun, jpflori | Merged in: | |
Authors: | Sebastien Gouezel | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/gouezel/alarm_broken_on_cygwin (Commits, GitHub, GitLab) | Commit: | 457bfa5062b248f4c36abeda94e84ed7c0acb877 |
Dependencies: | Stopgaps: |
Description (last modified by )
On cygwin (32 and 64), the alarm mechanism is broken:
sage: alarm(0.1); sum(xrange(100000000)) 4999999950000000
instead of the expected interrupt.
I traced the bug by dichotomy to a ppl component named watchdog. Its role is similar to alarm, involving signals and interrupts. I guess that, due to the peculiarities of cygwin's implementation of signals, watchdog somehow intercepts the python signals and does dot send them back.
watchdog is not an optional component of ppl, so it can not be disabled by a configure flag. However, it is disabled on systems that don't have setitimer
. Cygwin has setitimer
, but a simple hack of ppl's makefile can hide it, thereby disabling watchdog and fixing the sage alarm mechanism. This is done in the proposed patch.
A better fix would be to dig into the ppl and cygwin signal mechanism, of course...
Change History (20)
comment:1 Changed 6 years ago by
comment:2 Changed 6 years ago by
The PPL code seems to be correct, I am stuck on this one...
comment:3 Changed 6 years ago by
- Report Upstream changed from N/A to Reported upstream. No feedback yet.
comment:4 Changed 6 years ago by
- Description modified (diff)
- Report Upstream changed from Reported upstream. No feedback yet. to N/A
comment:5 Changed 6 years ago by
- Branch set to u/gouezel/alarm_broken_on_cygwin
comment:6 Changed 6 years ago by
- Cc vbraun jpflori added
- Commit set to 1a8ecd257246a869941d6407eda677a489e019fa
- Description modified (diff)
- Keywords ppl cygwin added
- Status changed from new to needs_review
New commits:
1a8ecd2 | disable ppl watchdog on cygwin
|
comment:7 Changed 6 years ago by
comment:8 Changed 6 years ago by
- Status changed from needs_review to needs_info
What I don't like about this ticket is that it's not clear that the "ppl watchdog" is not needed. What does it do and why does Sage not need it?
comment:9 follow-up: ↓ 11 Changed 6 years ago by
Excellent question. There is essentially no documentation on this ppl watchdog that I could find. It seems to be a classical watchdog mechanism, i.e., it detects if some computation loops forever, and in this case it interrupts it automatically and resets things to a nicer state. In particular, it should not play any role in non-buggy situations.
It is not necessary to PPL (since it is disabled on some platforms, those which don't have setitimer
), and I do not see how it could be relevant to sage. So, my impression is that it is safe to disable it (note that the patch only disables it on cygwin, so it will not break anything on officially supported platforms). On cygwin, with the patch, sage -t all
does not show any ppl-related failure. By the way, without the patch, sage -t all
hangs forever and eats all available memory, since some tests rely on alarm
to be interrupted! So, in my opinion, the patch gives a definitive improvement on cygwin.
comment:10 Changed 6 years ago by
- Status changed from needs_info to needs_review
comment:11 in reply to: ↑ 9 Changed 6 years ago by
- Status changed from needs_review to needs_work
Replying to gouezel:
note that the patch only disables it on cygwin, so it will not break anything on officially supported platforms
For me, that's actually a reason to be against this patch. Either we need it and we shouldn't disable it on Cygwin, or we don't need it and we can just disable it everywhere.
comment:12 Changed 6 years ago by
I get your point. Unfortunately, I can't say for sure that watchdog is not helpful to ppl in rare situations (for instance, maybe it uses some algorithms whose termination is not guaranteed, counting on watchdog to exit bad situations), so I would rather keep watchdog wherever possible (i.e., not on cygwin, where setitimer is not fully functional).
I leave the ticket as needs_work
until someone more knowledgeable on ppl steps up (or someone who really needs the patch on cygwin pushes for it)
comment:13 Changed 6 years ago by
- Commit changed from 1a8ecd257246a869941d6407eda677a489e019fa to 457bfa5062b248f4c36abeda94e84ed7c0acb877
Branch pushed to git repo; I updated commit sha1. New commits:
457bfa5 | #17650: touch files in ppl spkg_install to avoid reconfiguration
|
comment:14 follow-up: ↓ 16 Changed 6 years ago by
I'm just posting random links potentially related:
According to http://trac.sagemath.org/ticket/10039#comment:49 there used to be a --disable-watchdog
configure flag, too bad it disappeared.
I guess we should open a bug on ppl bugtracker (if one exists).
Unfortunately it seems cygwin does not ship ppl anymore, so we cannot look at what they would have done. It could still be useful to post on the cygwin ml's.
@sebastien: can you provide a minimal C file reproducing the issue? Maybe something inspired in the aforementioned 2008 post can help.
comment:15 Changed 6 years ago by
The post does indeed look related. I don't understand why it was posted on PPL-devel, since it seems to have nothing to do with PPL really.
@sebastien: can you provide a minimal C file reproducing the issue?
That would be great, especially to send to the upstream bug trackers (PPL and/or Cygwin).
comment:16 in reply to: ↑ 14 Changed 5 years ago by
Replying to jpflori:
@sebastien: can you provide a minimal C file reproducing the issue? Maybe something inspired in the aforementioned 2008 post can help.
I just tried, but unfortunately I was not able to reproduce the issue with plain C or C++ files (for instance, taking the file in the link you give and adding ppl headers and initialization does not break the itimer mechanism).
For the record, steps to reproduce the issue using python:
File alarm.pyx
:
import signal def essai(n): signal.setitimer(signal.ITIMER_REAL, 0.5 , 0) return sum(xrange(n))
File alarme_c.cpp
#include <ppl.hh>
File setup.py
from distutils.core import setup from Cython.Build import cythonize from distutils.extension import Extension setup( ext_modules = cythonize(Extension("alarme", ['alarme.pyx', 'alarme_c.cpp'], libraries = ['ppl'])) )
File essai.py
import alarme print alarme.essai(100000000)
Compile the extension with
python setup.py build_ext --inplace
Then python essai.py
results in 4999999950000000
, no interruption.
Commenting out the header inclusion in alarme_c.cpp
, then one gets Alarm clock
instead.
comment:17 Changed 5 years ago by
- Owner changed from (none) to embray
comment:18 Changed 4 years ago by
- Milestone sage-6.5 deleted
comment:19 Changed 4 years ago by
- Milestone set to sage-duplicate/invalid/wontfix
- Status changed from needs_work to positive_review
duplicate of #21190
comment:20 Changed 4 years ago by
- Resolution set to wontfix
- Status changed from positive_review to closed
Closing tickets in the sage-duplicate/invalid/wontfix module with positive_review (i.e. someone has confirmed they should be closed).
Should it be considered as a PPL bug that the methods are not defined in the header? Or a cython bug that this suffices to break the interrupts mechanism? Or a cygwin bug since it does not seem to be a problem on other platforms?