Opened 10 years ago
Closed 10 years ago
#14292 closed defect (fixed)
Race conditions in doctester
Reported by: | jdemeyer | Owned by: | roed |
---|---|---|---|
Priority: | blocker | Milestone: | sage-5.9 |
Component: | doctest framework | Keywords: | |
Cc: | Merged in: | sage-5.9.beta4 | |
Authors: | Jeroen Demeyer | Reviewers: | Volker Braun |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The writing of the JSON file is prone to race conditions.
Attachments (1)
Change History (22)
comment:1 Changed 10 years ago by
Description: | modified (diff) |
---|
comment:2 follow-up: 3 Changed 10 years ago by
comment:3 Changed 10 years ago by
Replying to roed:
so there aren't multiple processes trying to write to the same file...
There could be when testing the doctester in parallel. Anyway I have a patch ready but it's on sage.math
which is currently down.
comment:4 Changed 10 years ago by
Component: | doctest → doctest framework |
---|---|
Owner: | changed from mvngu to roed |
comment:5 Changed 10 years ago by
OK, I managed to recover the data from sage.math
including this patch. It's not the final version, but it shows the main idea.
comment:6 Changed 10 years ago by
Status: | new → needs_review |
---|
comment:7 Changed 10 years ago by
Status: | needs_review → needs_work |
---|
comment:8 follow-ups: 9 12 Changed 10 years ago by
Its good to have a general solution to this recurring problem. But writing via a temporary file is an ugly workaround for file locking, which is what we really want to do. It is also likely to hide race conditions. Also, on Win32 (and possibly other non-posix platforms) there is no atomic replace. File locking is also useful as a simple synchronization tool for clusters with a shared file system.
comment:9 Changed 10 years ago by
Replying to vbraun:
But writing via a temporary file is an ugly workaround for file locking
Not always. Locking has different goals. Locks should be held as short as possible, which cannot be guaranteed (for example when writing the GAP workspace). My solution also solves the problem of exceptions/crashes during the write and it doesn't need any changes to the reading of files.
comment:10 Changed 10 years ago by
Status: | needs_work → needs_review |
---|
comment:11 Changed 10 years ago by
But we really do want locking when creating the gap workspace instead of 10 processes creating the workspace 10 times and then fighting over the workspace file (last process standing wins).
comment:13 Changed 10 years ago by
Not natively, but e.g. https://github.com/smontanaro/pylockfile looks quite sane to me.
comment:14 Changed 10 years ago by
Volker, I don't object to using that, but I still think this patch here is ready for review. I would rather use file locking in addition to the current patch such that we can have all advantages of both approaches.
comment:15 follow-up: 16 Changed 10 years ago by
I agree that using a temp file is good enough for now. But it would be nice if we could rename write_via_temp
to something that embodies the underlying idea instead of the implementation. Then we could improve the implementation later without having to change the usage. Maybe write_safely
or write_threadsafe
.
comment:16 Changed 10 years ago by
If you prefer write_safely
, that's fine for me. However, I do think that using locking and using a temporary file are two different solutions which are sufficiently different to warrant different names. Given this, perhaps it is better to be explicit about the implementation.
comment:17 Changed 10 years ago by
Well if your goal is to eventually have both proper locking and atomic writing separately then thats fine. But the name should still convey why we are doing that, not how.
Obviously many Sage developers are not familiar with which file system operations are atomic, so they will rightfully be confused when they see code that writes a cache with write_via_temp
. Clearly there is an optimization where you take out that extra step ;-)
How about atomic_write
?
comment:18 Changed 10 years ago by
Do you agree with the patch, modulo the naming of write_via_temp
?
comment:19 Changed 10 years ago by
Yes, I think its great to use a with block and the actual implementation seems fine. With the caveat that Windows can't do atomic replace, but then there is nothing we can do about it.
Changed 10 years ago by
Attachment: | 14292_doctest_race.patch added |
---|
comment:20 Changed 10 years ago by
Reviewers: | → Volker Braun |
---|---|
Status: | needs_review → positive_review |
OK, renamed to atomic_write
.
comment:21 Changed 10 years ago by
Merged in: | → sage-5.9.beta4 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
What exactly goes wrong? Loading and saving stats is done in
DocTestController
, so there aren't multiple processes trying to write to the same file...