Opened 5 years ago
Closed 5 years ago
#24223 closed defect (fixed)
py3: several string conversion fixes
Reported by: | embray | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.2 |
Component: | python3 | Keywords: | |
Cc: | chapoton, jdemeyer | Merged in: | |
Authors: | Erik Bray | Reviewers: | Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | e9a582e (Commits, GitHub, GitLab) | Commit: | e9a582e0a3a5fe777ea34475200f18288fc3db2c |
Dependencies: | Stopgaps: |
Description
This fixes several (hardly exhaustive) string conversions around Sage such that functions and methods that returned str
on Python 2 also return str
on Python 3 (and the same for functions and methods that take str
as arguments) using the new string utilities from #24222.
This also incorporates/replaces the fixes from #23812.
Change History (41)
comment:1 Changed 5 years ago by
comment:2 Changed 5 years ago by
Commit: | 44a55407569e8919b7598ca367a05469732e78bf → 25d6c32154c88dc4e9f1520d105b586b61c43cdc |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
34ef6dc | Add HAVE_GMPY2 compile-time constant
|
158d984 | Force absolute import in have_module()
|
7f06e71 | Fix documentation
|
3ac146c | Add a Cython compile-time constant for Python major version.
|
f71fd41 | Add new string conversion utilities.
|
a5d5219 | Add support for optional error handling.
|
627d4c7 | Add new string conversion utilities.
|
25d6c32 | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
comment:4 follow-up: 6 Changed 5 years ago by
Status: | needs_review → needs_work |
---|
Something is wrong: there are still the files src/sage/misc/string.*
comment:5 follow-up: 7 Changed 5 years ago by
This is again a ticket with many small independent changes where a "partial positive review" like what I proposed on #24025 would make sense. I didn't like how we handled that, so how should we proceed here?
comment:6 Changed 5 years ago by
Replying to jdemeyer:
Something is wrong: there are still the files
src/sage/misc/string.*
Oops--part of the problem here is I'm moving between two different working trees, one for Python 2 and one for Python 3, so it can get confusing. I'll clean this up.
comment:7 Changed 5 years ago by
Replying to jdemeyer:
This is again a ticket with many small independent changes where a "partial positive review" like what I proposed on #24025 would make sense. I didn't like how we handled that, so how should we proceed here?
I'd be fine if you proposed your changes and tossed it back to me as "needs review", rather than going immediately to "positive review".
comment:8 Changed 5 years ago by
It seems strange to me to use locale.getpreferredencoding()
for strings which do not depend on the locale. I would propose:
- Use
locale.getpreferredencoding()
for strings which actually depend on the locale.
- Use
sys.getfilesystemencoding()
for strings which actually depend on the file system encoding.
- Use UTF-8 otherwise.
comment:9 Changed 5 years ago by
Milestone: | sage-8.1 → sage-8.2 |
---|
comment:10 Changed 5 years ago by
Commit: | 25d6c32154c88dc4e9f1520d105b586b61c43cdc → 56f1c02add22c05178f43c2d14ea4bfa1aae0db9 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
64567aa | Make type checks explicit
|
a358aea | Require a const char*--this is more consistent for use in cases where
|
64a1c2d | Use a less unusual character in the tests
|
1bf6709 | Use r-strings for the docstrings since they contain slashes
|
2bb8396 | Clean up these declarations:
|
6d73a41 | Use more specific type declarations here.
|
56f1c02 | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
comment:11 Changed 5 years ago by
Commit: | 56f1c02add22c05178f43c2d14ea4bfa1aae0db9 → a9ebae6bfffc84a549af6e8da8abc52840b2b74c |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
a285af4 | Add new string conversion utilities.
|
5b3719a | Add support for optional error handling.
|
bf5f893 | Make type checks explicit
|
1bd27fc | Require a const char*--this is more consistent for use in cases where
|
bd2d931 | Use a less unusual character in the tests
|
b32f000 | Use r-strings for the docstrings since they contain slashes
|
ccf7bcd | Clean up these declarations:
|
dec9f3a | Use more specific type declarations here.
|
a9ebae6 | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
comment:13 Changed 5 years ago by
Status: | needs_work → needs_review |
---|
Yeah, IIRC this was just a bunch of miscellaneous string conversions from some other tickets, but redone using the sage.cpython.string
utilities.
comment:15 Changed 5 years ago by
Status: | needs_review → needs_work |
---|
comment:16 follow-up: 17 Changed 5 years ago by
Interesting--when I added str_to_bytes
I predicted we might later want to allow Python 2 unicode
objects to be supported somehow, but put it off because I didn't think there was much code in Sage that already was expected to handle unicode
(except in a few places like unicode art, etc.). But the tests that are failing here clearly show that there are some places it's expected to work.
So I see at least three options:
- Modify
str_to_bytes
to treatunicode
appropriate on Python 2 - Add some new function like
unicode_to_str
that's analogous tostr_to_bytes
(I'm not crazy about this) - Just special case
unicode
on Python 2 where we do want to support it.
I lean towards 1. but I'm open to ideas.
comment:17 Changed 5 years ago by
Replying to embray:
I lean towards 1.
I agree. We should just support bytes
and unicode
(in the Cython sense!) as input to str_to_bytes
regardless of Python version. One complication is that the Python 3 code path of str_to_bytes
uses some functions specific to Python 3, so we cannot just run that code on Python 2.
comment:18 follow-ups: 20 21 Changed 5 years ago by
Right, I'll give it a look though. It will just have to have a slightly different implementation for Python 2. Welcome back btw!
comment:20 Changed 5 years ago by
Replying to embray:
Right, I'll give it a look though. It will just have to have a slightly different implementation for Python 2.
I would rather avoid that. Instead of PyUnicode_EncodeLocale
, you can just pass NULL
as encoding
to PyUnicode_AsEncodedString
. That leaves PyUnicode_AsUTF8
which is really only used to translate the encoding
and errors
arguments. We could require that this is always of type str
(so, not unicode
on Python 2).
comment:21 Changed 5 years ago by
Replying to embray:
I'll give it a look though.
I can also do that. Just let me know, to avoid double work.
comment:22 Changed 5 years ago by
Commit: | a9ebae6bfffc84a549af6e8da8abc52840b2b74c → c393e41ee8f627ca5b76149100039c5664208235 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
9cf556b | Rework str_to_bytes so that it accepts unicode on Python 2 and treats it same as str on Python 3.
|
086e523 | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
728a0bc | Use str_to_bytes for 'unicode' (which will be str on Python 3 and unicode on Python 2)
|
c393e41 | Minor Python 3 fix for sage.libs.flint.fmpz_poly
|
comment:24 Changed 5 years ago by
Dependencies: | #24222 → #24222, #24475 |
---|
comment:25 Changed 5 years ago by
Commit: | c393e41ee8f627ca5b76149100039c5664208235 → 0925f3b3d747e2f935006576a929c317eda7f1f3 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
0925f3b | A few additional string conversion fixes for sage.libs.ecl
|
comment:26 Changed 5 years ago by
Status: | needs_work → needs_review |
---|
comment:27 Changed 5 years ago by
Commit: | 0925f3b3d747e2f935006576a929c317eda7f1f3 → 7e4d3ccff889e902f6e93c42533b68e9991c83b4 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
7e4d3cc | py3: fix pickling of integers and other minor bytes/str fixes
|
comment:28 Changed 5 years ago by
Commit: | 7e4d3ccff889e902f6e93c42533b68e9991c83b4 → 1338ba4db9db82a4e96191fa5b8ecb3f51cf5805 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
998911f | Rework str_to_bytes so that it accepts unicode on Python 2 and treats it same as str on Python 3.
|
c2c898e | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
0637cde | Use str_to_bytes for 'unicode' (which will be str on Python 3 and unicode on Python 2)
|
c0158a9 | Minor Python 3 fix for sage.libs.flint.fmpz_poly
|
f415e2f | A few additional string conversion fixes for sage.libs.ecl
|
1338ba4 | py3: fix pickling of integers and other minor bytes/str fixes
|
comment:29 Changed 5 years ago by
Looks good to me, globally, even if do not feel I understand everything in detail.
Jeroen, do you think we can move forward with this important ticket ?
comment:30 Changed 5 years ago by
I always find it a bit unpleasant to review tickets with non-merged dependencies. So I would rather wait until the next beta.
Two immediate issues that I see:
- In
src/sage/libs/mpmath/utils.pyx
you add an import that you don't use.
- In several files, you
import
stuff fromsage.cpython.string
where you could usecimport
.
comment:31 Changed 5 years ago by
Branch: | u/embray/python3/string-fixes → public/python3-trac24223 |
---|---|
Commit: | 1338ba4db9db82a4e96191fa5b8ecb3f51cf5805 → 00f2776101f58a9c232fc06a3ee3eb2d87fc4d62 |
Dependencies: | #24222, #24475 |
New commits:
00f2776 | Merge branch 'u/embray/python3/string-fixes' in 8.2.b4
|
comment:32 Changed 5 years ago by
Commit: | 00f2776101f58a9c232fc06a3ee3eb2d87fc4d62 → 7d68d8c73f2524d8ceb7af3d74bba57f31a4cbe7 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
7d68d8c | some details of imports
|
comment:33 Changed 5 years ago by
Reviewers: | → Jeroen Demeyer |
---|---|
Status: | needs_review → positive_review |
comment:34 Changed 5 years ago by
Status: | positive_review → needs_work |
---|
This does not work:
-from sage.cpython.string import FS_ENCODING, str_to_bytes +from sage.cpython.string cimport FS_ENCODING, str_to_bytes
FS_ENCODING
is can't be cimported.
comment:35 Changed 5 years ago by
Commit: | 7d68d8c73f2524d8ceb7af3d74bba57f31a4cbe7 → 80b0d1f0c5d761c6804abff6bfbf94a1645105c8 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
80b0d1f | fix import
|
comment:37 Changed 5 years ago by
Branch: | public/python3-trac24223 → u/embray/python3/string-fixes |
---|---|
Commit: | 80b0d1f0c5d761c6804abff6bfbf94a1645105c8 → e9a582e0a3a5fe777ea34475200f18288fc3db2c |
Status: | needs_review → positive_review |
I rebased and squashed some of these changes.
New commits:
cc214fd | Numerous string conversions from bytes to str and from str to bytes for Python 3
|
f7e1153 | Use str_to_bytes for 'unicode' (which will be str on Python 3 and unicode on Python 2)
|
a4d6fe3 | A few additional string conversion fixes for sage.libs.ecl
|
d03a8d4 | py3: fix pickling of integers and other minor bytes/str fixes
|
e9a582e | import -> cimport where appropriate
|
comment:38 Changed 5 years ago by
Status: | positive_review → needs_work |
---|
[sagelib-8.2.beta4] Error compiling Cython file: [sagelib-8.2.beta4] ------------------------------------------------------------ [sagelib-8.2.beta4] ... [sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_prime_modn import FiniteField_prime_modn [sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_givaro import FiniteField_givaro [sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_ntl_gf2e import FiniteField_ntl_gf2e [sagelib-8.2.beta4] from sage.libs.pari.all import pari [sagelib-8.2.beta4] from sage.libs.gmp.all cimport * [sagelib-8.2.beta4] from sage.cpython.string cimport FS_ENCODING, str_to_bytes [sagelib-8.2.beta4] ^ [sagelib-8.2.beta4] ------------------------------------------------------------ [sagelib-8.2.beta4] [sagelib-8.2.beta4] sage/libs/singular/singular.pyx:40:0: 'sage/cpython/string/FS_ENCODING.pxd' not found [sagelib-8.2.beta4] [sagelib-8.2.beta4] Error compiling Cython file: [sagelib-8.2.beta4] ------------------------------------------------------------ [sagelib-8.2.beta4] ... [sagelib-8.2.beta4] lib = SINGULAR_SO [sagelib-8.2.beta4] [sagelib-8.2.beta4] if not os.path.exists(lib): [sagelib-8.2.beta4] raise ImportError("cannot locate Singular library ({})".format(lib)) [sagelib-8.2.beta4] [sagelib-8.2.beta4] lib = str_to_bytes(lib, FS_ENCODING, "surrogateescape") [sagelib-8.2.beta4] ^ [sagelib-8.2.beta4] ------------------------------------------------------------ [sagelib-8.2.beta4] [sagelib-8.2.beta4] sage/libs/singular/singular.pyx:778:28: 'FS_ENCODING' is not a constant, variable or function identifier [sagelib-8.2.beta4] Traceback (most recent call last): [sagelib-8.2.beta4] File "/home/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 1179, in cythonize_one_helper [sagelib-8.2.beta4] return cythonize_one(*m) [sagelib-8.2.beta4] File "/home/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 1161, in cythonize_one [sagelib-8.2.beta4] raise CompileError(None, pyx_file) [sagelib-8.2.beta4] CompileError: sage/libs/singular/singular.pyx
comment:40 Changed 5 years ago by
Status: | needs_work → positive_review |
---|
He didn't. Setting this back to positive review since this otherwise fixes Jeroen's issue.
comment:41 Changed 5 years ago by
Branch: | u/embray/python3/string-fixes → e9a582e0a3a5fe777ea34475200f18288fc3db2c |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
With this, plus a few other small fixes (including fixes from other existing tickets merged in) I was able to get the Sage REPL up and
1 + 1
working.