Opened 5 years ago
Closed 5 years ago
#24097 closed defect (fixed)
Broken paths in cython()'d modules
Reported by: | embray | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | sage-8.1 |
Component: | misc | Keywords: | sageinspect cython cythonize |
Cc: | jdemeyer, fbissey | Merged in: | |
Authors: | Erik Bray | Reviewers: | Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | 1cf56a5 (Commits, GitHub, GitLab) | Commit: | 1cf56a587a15c89b9b759fa327fc255e7779eb6c |
Dependencies: | #24088 | Stopgaps: |
Description
As discussed on the mailing list, when cythonizing a module, the hard-coded path to that file saved in the module is written as a relative path if the current working directory is somewhere along the path that the module file resides.
That is, if the cython module is at
/home/sage/.sage/temp/cython/mymodule.pyx
and the current working directory is /home/sage
, then the module's path is written as
.sage/temp/cython/mymodule.pyx
.
This breaks sage.misc.sageinspect._extract_embedded_position
which assumes that files generated with cython()
are relative to the SPYX_TMP
variable.
This could be considered a defect in Cython, but it's also easy to work around for now by chdir()-ing into the SPYX_TMP
directory before running cythonize()
.
Change History (38)
comment:1 Changed 5 years ago by
Authors: | → Erik Bray |
---|---|
Branch: | → u/embray/misc/cython-path-bug |
Commit: | → f6dd73f77a132d790f7a0092b25403eb69f50805 |
Priority: | major → critical |
Status: | new → needs_review |
comment:2 Changed 5 years ago by
Dependencies: | → #24088 |
---|
Depends on #24088 only because I can't build without it.
comment:4 Changed 5 years ago by
What does the docker container do differently that causes this bug to only appear there? Or in other words, how could I reproduce this bug?
comment:5 Changed 5 years ago by
OK, after reading your mailing list post again, I understand the problem. It happens when you rundoctest from $HOME
.
But I think the bug is really in _extract_embedded_position
: it shouldn't assume that the path is relative to SPYX_TMP
. In other words, the following patch also fixes the problem for me and I think it is a better solution:
-
src/sage/misc/sageinspect.py
diff --git a/src/sage/misc/sageinspect.py b/src/sage/misc/sageinspect.py index c8e79e0..2628e07 100644
a b def _extract_embedded_position(docstring): 222 222 return None 223 223 224 224 raw_filename = res.group('FILENAME') 225 filename = os.path.join(SAGE_SRC, raw_filename) 226 if not os.path.isfile(filename): 227 from sage.misc.misc import SPYX_TMP 228 filename = os.path.join(SPYX_TMP, '_'.join(raw_filename.split('_')[:-1]), raw_filename) 225 if os.path.isfile(raw_filename): 226 filename = os.path.abspath(raw_filename) 227 else: 228 # Maybe the path is relative to SAGE_SRC 229 sagesrcfile = os.path.join(SAGE_SRC, raw_filename) 230 if os.path.isfile(sagesrcfile): 231 filename = sagesrcfile 232 else: 233 # No idea, keep raw path 234 filename = raw_filename 229 235 lineno = int(res.group('LINENO')) 230 236 original = res.group('ORIGINAL') 231 237 return (original, filename, lineno)
comment:6 follow-ups: 9 10 Changed 5 years ago by
Hmm--I don't particularly like this either. If I understand your point, it's that this code in _extract_embedded_position
should be able to work on arbitrary modules. But then, if it's a relative path, there's really know way of knowing where it's relative to. In fact it's making a wild guess when it checks if it's relative to SAGE_SRC
. How would this handle the case where it's relative to SPYX_TMP
?
This if os.path.isfile(raw_filename): filename = os.path.abspath(raw_filename)
works if the filename is already an absolute path or if the file is a relative path and we just happen to be in the directory it's relative to. For example, you could cython()
a function, then change directories, then try to inspect the function and this would fail.
I think for now my solution is better because it explicitly addresses correct handling of modules that were compiled by Sage's cython()
function.
comment:7 Changed 5 years ago by
Replying to chapoton:
missing
::
in the doc in sage inspect
Oh I see what you're saying. I'll fix that.
comment:8 Changed 5 years ago by
Commit: | f6dd73f77a132d790f7a0092b25403eb69f50805 → a43134ce20db0ba1463efdd3ff539a7dddcca5af |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
a43134c | Fix docstring
|
comment:9 follow-up: 11 Changed 5 years ago by
Replying to embray:
In fact it's making a wild guess when it checks if it's relative to
SAGE_SRC
.
That is not a wild guess. This is correct for functions in the Sage library, which is the most important use case.
comment:10 follow-up: 12 Changed 5 years ago by
Replying to embray:
For example, you could
cython()
a function, then change directories, then try to inspect the function and this would fail.
OK, I see your point and I agree.
But then I have some further comments:
- Instead of changing the directory to
target_dir
and then using some dubious code in_extract_embedded_position
to reconstruct the path, why not change directory to some other directory (saySAGE_LOCAL
) to force Cython to write an absolute path in the first place?
- Only the
cythonize()
call should be inside therestore_cwd()
context, not the wholetry/except
block.
comment:11 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
In fact it's making a wild guess when it checks if it's relative to
SAGE_SRC
.That is not a wild guess. This is correct for functions in the Sage library, which is the most important use case.
Sorry, I was unclear. It's not a "wild guess" and I agree that part of the code is actually fine. All I meant was that it's still a heuristic approach that can in theory be wrong (but probably isn't).
comment:12 follow-ups: 13 14 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
For example, you could
cython()
a function, then change directories, then try to inspect the function and this would fail.OK, I see your point and I agree.
But then I have some further comments:
- Instead of changing the directory to
target_dir
and then using some dubious code in_extract_embedded_position
to reconstruct the path, why not change directory to some other directory (saySAGE_LOCAL
) to force Cython to write an absolute path in the first place?
I considered that, but it seemed like more of a hack around Cython. The current approach feels more explicit, along the lines of how we guess a path might be relative to SAGE_SRC
. I'd be fine with either approach if you have strong feelings about it but I like this better and I don't think it's all that dubious.
- Only the
cythonize()
call should be inside therestore_cwd()
context, not the wholetry/except
block.
Heh, I actually had a small debate with myself about this. I'm not saying I disagree but why do you think it should be inverted?
comment:13 follow-up: 16 Changed 5 years ago by
Replying to embray:
I'm not saying I disagree but why do you think it should be inverted?
Because it is easier to understand if the minimal amount of code is inside the context. Otherwise, you give the impression that the except:
block should be executed also in a specific directory.
comment:14 follow-up: 15 Changed 5 years ago by
Replying to embray:
I considered that, but it seemed like more of a hack around Cython. The current approach feels more explicit, along the lines of how we guess a path might be relative to
SAGE_SRC
. I'd be fine with either approach if you have strong feelings about it but I like this better and I don't think it's all that dubious.
Ideally, Cython should just use absolute paths. That was, no hacks in _extract_embedded_position
are needed.
comment:15 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
I considered that, but it seemed like more of a hack around Cython. The current approach feels more explicit, along the lines of how we guess a path might be relative to
SAGE_SRC
. I'd be fine with either approach if you have strong feelings about it but I like this better and I don't think it's all that dubious.Ideally, Cython should just use absolute paths. That was, no hacks in
_extract_embedded_position
are needed.
Well, yes, but we need this worked around now and I don't want to wait for/depend on a fix to Cython.
comment:16 follow-up: 17 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
I'm not saying I disagree but why do you think it should be inverted?
Because it is easier to understand if the minimal amount of code is inside the context. Otherwise, you give the impression that the
except:
block should be executed also in a specific directory.
True, but you could also argue the other way around it makes less clear exactly what the exception handling pertains to. In this case it's clear enough, but it's also just as clear that there's nothing in the exception handling that's dependent on the current directory.
Perhaps in a case like this the context manager is almost more of a distraction and deserves to be simply a finally:
statement. I just added it because I've written the same context manager almost every time I've wanted some code that temporarily changes directories.
comment:17 follow-up: 19 Changed 5 years ago by
Replying to embray:
Perhaps in a case like this the context manager is almost more of a distraction and deserves to be simply a
finally:
statement.
Actually, that makes a lot of sense here, especially because we already have a try
block.
comment:18 Changed 5 years ago by
Cc: | fbissey added |
---|
comment:19 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
Perhaps in a case like this the context manager is almost more of a distraction and deserves to be simply a
finally:
statement.Actually, that makes a lot of sense here, especially because we already have a
try
block.
Indeed; I'll just drop the context manager for now and maybe it add it back in later.
comment:20 Changed 5 years ago by
On second thought, I ended up keeping the context manager as a utility, as it's useful for implementing the regression test (indeed most of the times I find myself in need of such a utility it's for testing). So I only used it in the doctest for now.
comment:21 Changed 5 years ago by
Branch: | u/embray/misc/cython-path-bug → u/embray/misc/cython-path-bug-2 |
---|---|
Commit: | a43134ce20db0ba1463efdd3ff539a7dddcca5af → e03cf202160df6a8d619363ad7405e9dd443eb1c |
Kept basically the same logic in sageinspect for now, but I think clarified the approach somewhat. Still open to other ideas, including just forcing Cython to write an absolute path.
New commits:
3662dbd | Fixes https://trac.sagemath.org/ticket/24097 and includes a regression test
|
e03cf20 | Make the logic for how this funcion handles relative paths a bit clearer
|
comment:22 Changed 5 years ago by
Commit: | e03cf202160df6a8d619363ad7405e9dd443eb1c → 0a1259e4237f72feae65b6118116f3defa0b5052 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
0a1259e | This didn't work as advertised if the filename relative to SPYX_TMP was not
|
comment:23 follow-up: 24 Changed 5 years ago by
Funny that you put os.chdir(target_dir)
inside the try
. I remember a long discussion about that :-)
comment:24 Changed 5 years ago by
Replying to jdemeyer:
Funny that you put
os.chdir(target_dir)
inside thetry
. I remember a long discussion about that :-)
I thought about that.... I did it because I didn't want to have the discussion again.
comment:26 Changed 5 years ago by
Replying to embray:
So is this fine for now?
I still think that this would be better fixed upstream in Cython (I just wrote to cython-devel about it). How important/urgent is this ticket?
comment:27 Changed 5 years ago by
It is urgent because it's a regression in the tests. With the fix the overall approach is no better/worse than it was before except that it doesn't break due to the user's current directory so it's an obvious improvement.
comment:28 Changed 5 years ago by
In other words, I'm going to change the default behavior of the docker images just to work around a minor regression in the test suite that is demonstrably fixed.
comment:29 Changed 5 years ago by
Reviewers: | → Jeroen Demeyer |
---|
I can live with this as work-around. I still think that it should be fixed in a better way in the future.
Did you verify that your docker image works with the latest version of this ticket?
comment:30 Changed 5 years ago by
Yes, in fact that was the first place I tested it. Feel free to open a follow-up ticket if you think it should be updated later.
comment:31 Changed 5 years ago by
Status: | needs_review → positive_review |
---|
comment:32 Changed 5 years ago by
Status: | positive_review → needs_work |
---|
Sorry, one final detail: I don't like to add more stuff to sage.misc.misc
. Would you agree to add restore_cwd
to sage.misc.sage_ostools
(or some other more specific module)?
comment:33 Changed 5 years ago by
I certainly don't mind moving it (I would suggest breaking up sage.misc.misc
even further eventually). I just wasn't sure where would be better. I don't know about sage_ostools
(do we really need "sage" in the name of that module? It seems superfluous...)
Instead ISTM there's a case for some kind of pathutils
module for filename/path-related utilities.
comment:34 Changed 5 years ago by
Branch: | u/embray/misc/cython-path-bug-2 → u/jdemeyer/misc/cython-path-bug-2 |
---|
comment:35 Changed 5 years ago by
Commit: | 0a1259e4237f72feae65b6118116f3defa0b5052 → 1cf56a587a15c89b9b759fa327fc255e7779eb6c |
---|---|
Status: | needs_work → positive_review |
comment:38 Changed 5 years ago by
Branch: | u/jdemeyer/misc/cython-path-bug-2 → 1cf56a587a15c89b9b759fa327fc255e7779eb6c |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
Marked as critical just because this regression prevents new builds of the docker container from being pushed, since the script that does so requires all tests to pass (though perhaps this requirement could be relaxed somewhat for pre-releases...)
New commits:
Add some dependencies for ipykernel
Add a restore_cwd context manager
Fixes https://trac.sagemath.org/ticket/24097 and includes a regression test