Opened 3 years ago
Last modified 3 years ago
#27769 new defect
sage is killed on startup via sage-cleaner
Reported by: | andy | Owned by: | andy |
---|---|---|---|
Priority: | trivial | Milestone: | |
Component: | scripts | Keywords: | sage-cleaner |
Cc: | jdemeyer | Merged in: | |
Authors: | Andy Howell | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/andy/sage_is_killed_on_startup_via_sage_cleaner (Commits, GitHub, GitLab) | Commit: | 8be49ef21d3f20b80f325189c56b7d196eff21b0 |
Dependencies: | Stopgaps: |
Description
Under rare conditions, sage is killed at startup by sage-cleaner.
This requires:
- Files left over in ~/.sage/temp/HOSTNAME/PID
- A PID listed in ~/.sage/temp/HOSTNAME/PID/spawned_processes that now belongs to a running process started at init time, whose process group ID is 0
When sage-cleaner tries to clean up, it does a "kill 0", which signals every process in its own process group and so kills sage.
Files are only left in ~/.sage/temp/HOSTNAME if sage is not shut down properly, most likely because of a reboot.
The attached patch to sage-cleaner prevents it from doing 'kill 0'.
With the fix, ~/.sage/temp/HOSTNAME/cleaner.log shows:
Starting sage-cleaner with PID 16954
Checking PIDs [809, 31312]
Process 809 is no longer running, so we clean up
Killing 809's spawned jobs
--> Killing 'Singular' with PID 814 and parent PID 809
--> Process group of PID 814 is 0. Not killing process group
Deleting /home/andy/.sage/temp/dokodemo/809
Contents of ~/.sage/temp/dokodemo/809/spawned_processes:
814 Singular
ps auxww|grep 814
root 814 0.0 0.0 0 0 ? S< Apr30 0:00 [loop11]
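The guard that the log describes ("Process group of PID 814 is 0. Not killing process group") can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the function name `safe_killpg` and its injectable `getpgid`/`killpg` parameters are hypothetical and exist only to make the sketch easy to test.

```python
import os
import signal


def safe_killpg(pid, sig=signal.SIGKILL, getpgid=os.getpgid, killpg=os.killpg):
    """Signal the process group of `pid`, unless that would hit our own group.

    Hypothetical helper for illustration. Returns True if a signal was
    sent and False if it was skipped.
    """
    try:
        pgid = getpgid(pid)
    except OSError:
        # The process no longer exists or cannot be inspected.
        return False
    # killpg(0, sig) signals the *caller's* process group, so a PGID of 0
    # must never be passed on. Also refuse our own group explicitly.
    if pgid == 0 or pgid == os.getpgrp():
        return False
    killpg(pgid, sig)
    return True
```

With a stale entry such as `814 Singular`, where PID 814 now has process group 0, `safe_killpg(814)` would return False without sending any signal.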
Attachments (1)
Change History (10)
Changed 3 years ago by
comment:1 follow-up: ↓ 7 Changed 3 years ago by
Just to clarify the development guidelines: one posts the branch with fixes, not the branch where the error occurred. So the fix in the attachment should go into the branch to be posted.
comment:2 Changed 3 years ago by
- Cc jdemeyer added
comment:3 follow-up: ↓ 4 Changed 3 years ago by
But how could the PID be 0? I feel like this patch is fixing the symptom rather than the root cause.
comment:4 in reply to: ↑ 3 Changed 3 years ago by
Replying to jdemeyer:
But how could the PID be 0? I feel like this patch is fixing the symptom rather than the root cause.
It's not actually the PID, but the process group ID (PGID).
sage-cleaner is getting a pid out of a stale spawned_processes file, and retrieving the process group id for that. It just so happens that the pid 814 is owned by root, and its process group id is 0.
ps axo pid,pgrp,user,ppid,comm -q 814
PID PGRP USER PPID COMMAND
814 0 root 2 loop11
sage-cleaner is calling getpgid():
>>> import os
>>> print(os.getpgid(814))
0
This confluence of events is almost as rare as unicorns. My motivation in posting it was that it took a long time for me to figure out what was wrong. I wanted to save others the trouble.
One could argue that the root cause is the failure of sage-cleaner to remove everything under ~/.sage/temp/HOSTNAME/ at startup.
Would that be a better solution?
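For comparison, the alternative raised here (sweeping stale directories at startup instead of signalling anything) might look roughly like this. It is only a sketch under stated assumptions: the helper names `pid_is_running` and `remove_stale_dirs` are hypothetical, and it assumes each per-process directory under the temp root is named after the owning PID, as in ~/.sage/temp/HOSTNAME/PID.

```python
import os
import shutil


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_dirs(temp_root, is_running=pid_is_running):
    """Delete PID-named subdirectories whose owner is gone; kill nothing."""
    removed = []
    for name in sorted(os.listdir(temp_root)):
        path = os.path.join(temp_root, name)
        try:
            pid = int(name)
        except ValueError:
            continue  # not a per-PID directory (for example cleaner.log)
        if pid <= 0 or not os.path.isdir(path) or is_running(pid):
            continue
        shutil.rmtree(path, ignore_errors=True)
        removed.append(name)
    return removed
```

One caveat with this approach: after a reboot, an unrelated process may have reused the old PID, in which case the directory would be kept until that process exits.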
comment:5 Changed 3 years ago by
- Branch develop deleted
- Commit d765ee29175e802cbbb44557471ab081c6e14c33 deleted
- Owner changed from (none) to andy
comment:6 Changed 3 years ago by
- Branch set to u/andy/sage_is_killed_on_startup_via_sage_cleaner
comment:7 in reply to: ↑ 1 Changed 3 years ago by
- Commit set to 8be49ef21d3f20b80f325189c56b7d196eff21b0
Replying to dimpase:
Just to clarify the development guidelines: one posts the branch with fixes, not the branch where the error occurred. So the fix in the attachment should go into the branch to be posted.
Sorry. Hopefully I have done it correctly now. After jdemeyer's comment, I changed my approach. The attachment is not needed. Checked in changes via git trac push.
Thanks,
Andy
comment:8 Changed 3 years ago by
As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).
comment:9 Changed 3 years ago by
- Milestone sage-8.8 deleted
Patch to prevent sage-cleaner from doing a kill 0