Opened 13 months ago

Last modified 12 months ago

#27769 new defect

sage is killed on startup via sage-cleaner

Reported by: andy Owned by: andy
Priority: trivial Milestone:
Component: scripts Keywords: sage-cleaner
Cc: jdemeyer Merged in:
Authors: Andy Howell Reviewers:
Report Upstream: N/A Work issues:
Branch: u/andy/sage_is_killed_on_startup_via_sage_cleaner (Commits) Commit: 8be49ef21d3f20b80f325189c56b7d196eff21b0
Dependencies: Stopgaps:

Description

Under rare conditions, sage is killed at startup by sage-cleaner.

This requires:

  1. Files left over in ~/.sage/temp/HOSTNAME/PID
  2. A PID listed in ~/.sage/temp/HOSTNAME/PID/spawned_processes must currently exist and belong to a process that was started at init time, so that its process group ID is 0.

When sage-cleaner tries to clean up, it does a "kill 0", which signals its own process group and so kills sage.

Files will only be left in ~/.sage/temp/HOSTNAME if sage is not shut down properly, most likely because of a reboot.

The attached patch to sage-cleaner prevents it from doing 'kill 0'.
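A guard of this kind can be sketched as follows. This is not the attached patch: the helper name `kill_spawned_group` and its return strings are hypothetical, and the `getpgid`/`killpg` parameters exist only so the logic can be exercised without signalling real processes. It relies on the Linux behaviour that a process group ID of 0 passed to `killpg` means the caller's own process group.

```python
import os
import signal


def kill_spawned_group(pid, sig=signal.SIGTERM,
                       getpgid=os.getpgid, killpg=os.killpg):
    """Signal the process group of ``pid`` unless that would hit our own group.

    Hypothetical helper, not the code in sage-cleaner.
    Returns "gone", "skipped" or "killed".
    """
    try:
        pgid = getpgid(pid)
    except OSError:
        # The PID no longer exists (or cannot be inspected).
        return "gone"
    # A PGID of 0 (as reported for PID 814 in this ticket) would be read
    # as "the caller's own process group", so never pass it on.
    if pgid == 0 or pgid == getpgid(0):
        return "skipped"
    killpg(pgid, sig)
    return "killed"
```

Comparing against `getpgid(0)` as well as against 0 also covers the case where a recycled PID happens to sit in the cleaner's own process group.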

With the fix, ~/.sage/temp/HOSTNAME/cleaner.log shows:

Starting sage-cleaner with PID 16954
Checking PIDs [809, 31312]
Process 809 is no longer running, so we clean up
Killing 809's spawned jobs
--> Killing 'Singular' with PID 814 and parent PID 809
--> Process group of PID 814 is 0. Not killing process group
Deleting /home/andy/.sage/temp/dokodemo/809

Contents of ~/.sage/temp/dokodemo/809/spawned_processes:

814 Singular

Output of ps auxww|grep 814:

root       814  0.0  0.0      0     0 ?        S<   Apr30   0:00 [loop11]
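To check whether a leftover spawned_processes file contains an entry like this one, something along these lines may help. It is only a sketch: it assumes the file holds one "PID NAME" entry per line, as in the excerpt above, and the function names `read_spawned_processes` and `suspicious_entries` are made up for illustration.

```python
import os


def read_spawned_processes(path):
    """Return a list of (pid, name) pairs from a spawned_processes file.

    Assumes one "PID NAME" entry per line (an assumption based on the
    excerpt in this ticket, not on the sage-cleaner source).
    """
    entries = []
    with open(path) as f:
        for line in f:
            parts = line.split(None, 1)
            if not parts or not parts[0].isdigit():
                continue  # skip blank or malformed lines
            name = parts[1].strip() if len(parts) > 1 else ""
            entries.append((int(parts[0]), name))
    return entries


def suspicious_entries(entries, getpgid=os.getpgid):
    """Return entries whose PID is alive but reports process group 0."""
    flagged = []
    for pid, name in entries:
        try:
            if getpgid(pid) == 0:
                flagged.append((pid, name))
        except OSError:
            pass  # process is gone; nothing to flag
    return flagged
```

On the machine described here, passing the entries from ~/.sage/temp/dokodemo/809/spawned_processes to `suspicious_entries` would be expected to flag PID 814, since its process group ID is 0.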

Attachments (1)

sage-cleaner.patch (746 bytes) - added by andy 13 months ago.
Patch to prevent sage-cleaner from doing a kill 0


Change History (10)

Changed 13 months ago by andy

Patch to prevent sage-cleaner from doing a kill 0

comment:1 follow-up: Changed 13 months ago by dimpase

Just to clarify the development guidelines: one posts the branch with the fixes, not the branch where the error occurred. So the fix in the attachment should go into the branch to be posted.

comment:2 Changed 13 months ago by dimpase

  • Cc jdemeyer added

comment:3 follow-up: Changed 13 months ago by jdemeyer

But how could the PID be 0? I feel like this patch is fixing the symptom rather than the root cause.

comment:4 in reply to: ↑ 3 Changed 13 months ago by andy

Replying to jdemeyer:

But how could the PID be 0? I feel like this patch is fixing the symptom rather than the root cause.

It's not actually the PID, but the process group ID (PGID).

sage-cleaner is getting a pid out of a stale spawned_processes file, and retrieving the process group id for that. It just so happens that the pid 814 is owned by root, and its process group id is 0.

ps axo pid,pgrp,user,ppid,comm -q 814
  PID  PGRP USER      PPID COMMAND
  814     0 root         2 loop11

sage-cleaner is calling getpgid():

>>> import os
>>> print(os.getpgid(814))
0

This confluence of events is almost as rare as unicorns. My motivation in posting it was that it took a long time for me to figure out what was wrong. I wanted to save others the trouble.

One could argue that the root cause is the failure of sage-cleaner to remove everything under ~/.sage/temp/HOSTNAME/ at startup.

Would that be a better solution?
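One possible reading of that alternative is sketched below: delete leftover PID directories whose owning process is gone, without signalling anything listed in spawned_processes. The names `pid_is_running` and `remove_stale_dirs` are hypothetical, and the layout (PID-named subdirectories next to cleaner.log) is assumed from the paths quoted in this ticket.

```python
import os
import shutil


def pid_is_running(pid):
    """Best-effort check whether ``pid`` exists, using signal 0."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def remove_stale_dirs(host_dir, is_running=pid_is_running):
    """Delete PID-named subdirectories of ``host_dir`` whose process is gone.

    Nothing is signalled; directories are only removed.
    Returns the sorted list of PIDs whose directories were deleted.
    """
    removed = []
    for name in os.listdir(host_dir):
        path = os.path.join(host_dir, name)
        if not (name.isdigit() and os.path.isdir(path)):
            continue  # e.g. cleaner.log
        pid = int(name)
        if not is_running(pid):
            shutil.rmtree(path, ignore_errors=True)
            removed.append(pid)
    return sorted(removed)
```

A caveat with this approach is PID reuse: after a reboot, an unrelated process may hold the old sage PID, in which case the directory would be treated as still in use and left in place.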

comment:5 Changed 13 months ago by andy

  • Branch develop deleted
  • Commit d765ee29175e802cbbb44557471ab081c6e14c33 deleted
  • Owner changed from (none) to andy

comment:6 Changed 13 months ago by andy

  • Branch set to u/andy/sage_is_killed_on_startup_via_sage_cleaner

comment:7 in reply to: ↑ 1 Changed 13 months ago by andy

  • Commit set to 8be49ef21d3f20b80f325189c56b7d196eff21b0

Replying to dimpase:

Just to clarify the development guidelines: one posts the branch with the fixes, not the branch where the error occurred. So the fix in the attachment should go into the branch to be posted.

Sorry. Hopefully I have done it correctly now. After jdemeyer's comment, I changed my approach. The attachment is not needed. Checked in changes via git trac push.

Thanks,

Andy

comment:8 Changed 12 months ago by embray

As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).

comment:9 Changed 12 months ago by embray

  • Milestone sage-8.8 deleted
