Opened 2 years ago
Last modified 2 years ago
#28444 closed defect
Fix backwards incompatibility of unpickling in Python 3 — at Version 7
Reported by: | SimonKing | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | sage-8.9 |
Component: | python3 | Keywords: | unpickling UnicodeError backwards compatibility |
Cc: | | Merged in: | |
Authors: | | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | | Commit: | |
Dependencies: | | Stopgaps: | |
Description (last modified by )
EDIT: I replaced the original ticket description with text from one of my comments, because I now have a much smaller example, together with pickles of the same object created with Python-3 and with Python-2, so that the two can be compared.
The following examples require the optional meataxe package, but I am not yet sure whether meataxe or Python-3 is to blame (I hope it is the former, because I guess that would be easier to fix).
attachment:Py2.sobj and attachment:Py3.sobj result in the following behaviour in Python-3:
    sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-3-5705b555470a> in <module>()
    ----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
        149
        150     ## Load file by absolute filename
    --> 151     with open(filename, 'rb') as fobj:
        152         X = loads(fobj.read(), compress=compress)
        153     try:

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
        150     ## Load file by absolute filename
        151     with open(filename, 'rb') as fobj:
    --> 152         X = loads(fobj.read(), compress=compress)
        153     try:
        154         X._default_filename = os.path.abspath(filename)

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
        967
        968     unpickler = SageUnpickler(io.BytesIO(s))
    --> 969     return unpickler.load()
        970
        971

    UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
    sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
and in Python-2:
    sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
    sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
    sage: __ == _
    True
So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?
Change History (10)
Changed 2 years ago by
comment:1 follow-ups: ↓ 2 ↓ 3 Changed 2 years ago by
Can Sage/Py3 produce the pickle? In that case, you could compare the produced pickles to see how far apart they are. Of course, if an ASCII decoder encounters 0x80 it's justified to not decode it, so it might be interesting to see what Py3 itself makes of it. My guess would be that the byte string should NOT be decoded as ASCII, but as something else. Perhaps unpickle can be configured to use a different decoder. But it would be good to see what generates the non-ASCII symbol and what its meaning is.
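The guess that the unpickler might accept a different decoder can be illustrated with the standard library alone. The following is a minimal sketch, not a diagnosis of the attached files. It assumes a hand-written protocol-2 pickle (PY2_PICKLE, a hypothetical stand-in for what Python-2 writes for the one-byte string '\x80') and uses the encoding argument of pickle.loads, whose default in Python-3 is 'ASCII'. Whether Sage's own unpickler can be given the same argument is not established in this ticket.

```python
import pickle

# Hypothetical sample, written by hand for illustration only:
# PROTO 2, SHORT_BINSTRING (length 1, byte 0x80), BINPUT 0, STOP.
# This is the shape of a Python-2 str pickled with protocol 2.
PY2_PICKLE = b"\x80\x02U\x01\x80q\x00."


def try_default(data):
    """Unpickle with the Python-3 defaults; return the result or the error."""
    try:
        return pickle.loads(data)  # encoding defaults to 'ASCII'
    except UnicodeDecodeError as exc:
        return exc


# Default decoding fails on the non-ASCII byte.
default_result = try_default(PY2_PICKLE)

# 'latin1' maps every byte to the code point of the same value.
as_latin1 = pickle.loads(PY2_PICKLE, encoding="latin1")

# 'bytes' leaves Python-2 str objects undecoded, as bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

print(repr(default_result))
print(repr(as_latin1), repr(as_bytes))
```

With the default encoding the sketch raises an error of the same kind as in the traceback. Which of the two alternatives is appropriate depends on whether the pickled string was meant as text or as raw binary data.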
comment:2 in reply to: ↑ 1 Changed 2 years ago by
Replying to nbruin:
Of course, if an ASCII decoder encounters 0x80 it's justified to not decode it
Then the same should hold for Python-2. It doesn't. Hence, it shouldn't hold for Python-3 either.
comment:3 in reply to: ↑ 1 Changed 2 years ago by
Replying to nbruin:
Can Sage/Py3 produce the pickle?
The problem is that the pickle comes from an old version of an optional Sage package. That's why I use the "unpickle override". But I'll see what I can do.
comment:4 Changed 2 years ago by
The new attachment was created with Python-2 and can be used without "unpickle override", but it requires the optional meataxe spkg.
comment:5 Changed 2 years ago by
I think I now have a very small example. It uses the optional meataxe package. It would be good news if the meataxe wrapper were actually to blame for the unpickling problem, but I am not expert enough to tell (1) whether that is the case and (2) how it could be fixed if so.
Anyway, I have a new version of attachment:Py2.sobj and a new attachment:Py3.sobj. They result in the following behaviour in Python-3:
    sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-3-5705b555470a> in <module>()
    ----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
        149
        150     ## Load file by absolute filename
    --> 151     with open(filename, 'rb') as fobj:
        152         X = loads(fobj.read(), compress=compress)
        153     try:

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
        150     ## Load file by absolute filename
        151     with open(filename, 'rb') as fobj:
    --> 152         X = loads(fobj.read(), compress=compress)
        153     try:
        154         X._default_filename = os.path.abspath(filename)

    /home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
        967
        968     unpickler = SageUnpickler(io.BytesIO(s))
    --> 969     return unpickler.load()
        970
        971

    UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
    sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
and in Python-2:
    sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
    sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
    [1 0 0 0 0 0 0 0]
    [0 0 0 1 1 1 1 1]
    sage: __ == _
    True
So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?
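One way to look for the answer is to list the Python-2 string opcodes in the pickle that carry bytes outside the ASCII range. The following is a rough sketch using the standard pickletools module. The function name non_ascii_py2_strings and the SAMPLE bytes are hypothetical and written by hand for illustration; they are not taken from the attachments. The sketch only inspects the binary opcodes SHORT_BINSTRING and BINSTRING, and it expects raw pickle bytes; as far as I know an .sobj file is compressed, so its contents would have to be decompressed first, a step that is left out here.

```python
import pickletools


def non_ascii_py2_strings(data):
    """Return (offset, opcode name, raw bytes) for every binary
    Python-2 str opcode whose payload contains a byte >= 0x80."""
    hits = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINSTRING", "BINSTRING"):
            # pickletools hands these payloads back decoded as latin-1,
            # so encoding with latin-1 recovers the original bytes.
            raw = arg.encode("latin-1")
            if any(byte >= 0x80 for byte in raw):
                hits.append((pos, opcode.name, raw))
    return hits


# Hypothetical sample: protocol-2 pickle of the pair ('abc', '\x80\xff')
# as two Python-2 str objects: PROTO 2, two SHORT_BINSTRINGs, TUPLE2, STOP.
SAMPLE = b"\x80\x02U\x03abcU\x02\x80\xff\x86."

for offset, name, raw in non_ascii_py2_strings(SAMPLE):
    print(offset, name, raw)
```

Only the second string is reported, since the first one is pure ASCII. The opcodes surrounding a reported offset (for example via pickletools.dis) may then hint at which class stored the binary data.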
comment:6 Changed 2 years ago by
Note that the Python-3 pickle is as much as 25% larger than the Python-2 pickle. Is that regression typical?
comment:7 Changed 2 years ago by
- Description modified (diff)
- Keywords meataxe added
File that cannot be unpickled in Python-3