Opened 3 years ago
Closed 20 months ago
#28540 closed defect (invalid)
Let lookup_global accept unicode as input
Reported by: | SimonKing | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | python3 | Keywords: | python3 |
Cc: | jhpalmieri, gh-mwageringel | Merged in: | |
Authors: | Reviewers: | Markus Wageringel | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/SimonKing/allow_unicode_in_lookup_global (Commits, GitHub, GitLab) | Commit: | 4a33e7797f668270f0a60bb95c0c26cfa3400c55 |
Dependencies: | #28444 | Stopgaps: |
Description (last modified by )
In #28444, bytes_to_str was made more tolerant: It now accepts as input not only bytes but also str. A str input is returned unaltered.
bytes_to_str is also used in sage.structure.factory.lookup_global. It turns out that a pickle created in py3 may not be unpicklable in py2 because the stored name of the global is a unicode, but a bytes or str is expected.
Obvious fix: Try to convert unicode to str in lookup_global.
Change History (14)
comment:1 follow-up: ↓ 2 Changed 3 years ago by
comment:2 in reply to: ↑ 1 Changed 3 years ago by
Replying to nbruin:
That needs to go into a differently-named routine. You're clearly not converting bytes to str here. You're taking *some* input and trying to ensure it's "str".
There is of course a way to convert a unicode object to py2:
s.encode("utf8")
. Which in Py3 will actually result in a bytes object. That's a pretty clear clue that this behaviour does NOT belong in a routine namedbytes_to_str
.
OK, that somehow makes sense to me. So, I will not touch it.
Question: Is the change that I have introduced in #28444 OK, then? At least it made it possible to read a py2 pickle in py3.
comment:3 Changed 3 years ago by
- Description modified (diff)
- Summary changed from Let bytes_to_str accept unicode as input to Let lookup_global accept unicode as input
Rather than stretching byte_to_str beyond its specifications, it may be better to fix the function that actually needs a conversion from unicode to str, namely lookup_glocal. See the new ticket description.
comment:4 Changed 3 years ago by
- Branch set to u/SimonKing/allow_unicode_in_lookup_global
comment:5 Changed 3 years ago by
- Commit set to 4a33e7797f668270f0a60bb95c0c26cfa3400c55
- Status changed from new to needs_review
See #28414 for a situation where the fix from this ticket actually solves a problem.
New commits:
2e0d11e | Pass unpickling options to pickle.load, default encoding 'latin1'. Accept both str and bytes in mtx_unpickle
|
8c105cd | Make str_to_bytes/bytes_to_str accept both str and bytes input.
|
c8a0748 | Add tests for #28444
|
89ac77a | Fix keyword for py3-only test
|
3693024 | Fix doc strings in sage.misc.persist
|
f0828ee | Add a comment regarding the expected data type for an unpickle helper
|
ba41ebe | Fix two typos in a comment
|
4a33e77 | Allow names of type bytes, str and unicode in lookup_global
|
comment:6 Changed 3 years ago by
- Dependencies set to #28444
comment:8 Changed 2 years ago by
- Milestone changed from sage-9.0 to sage-9.1
Ticket retargeted after milestone closed
comment:9 Changed 2 years ago by
- Milestone changed from sage-9.1 to sage-9.2
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
comment:10 Changed 23 months ago by
- Cc gh-mwageringel added
comment:11 Changed 23 months ago by
As we do not support Python 2 anymore, should we close this as outdated? I imagine there is rarely a need to load a py3 pickle in a Python-2-based Sage now.
comment:12 Changed 23 months ago by
- Milestone changed from sage-9.2 to sage-duplicate/invalid/wontfix
- Status changed from needs_work to needs_review
That was my thought too.
comment:13 Changed 23 months ago by
- Reviewers set to Markus Wageringel
- Status changed from needs_review to positive_review
Ok.
comment:14 Changed 20 months ago by
- Resolution set to invalid
- Status changed from positive_review to closed
If this needs to be reopened, use the milestone sage-9.1.1
That needs to go into a differently-named routine. You're clearly not converting bytes to str here. You're taking *some* input and trying to ensure it's "str".
There is of course a way to convert a unicode object to py2:
s.encode("utf8")
. Which in Py3 will actually result in a bytes object. That's a pretty clear clue that this behaviour does NOT belong in a routine namedbytes_to_str
.