Opened 2 years ago

Last modified 2 years ago

#28444 closed defect

Fix backwards incompatibility of unpickling in Python 3 — at Version 7

Reported by: SimonKing Owned by:
Priority: blocker Milestone: sage-8.9
Component: python3 Keywords: unpickling UnicodeError backwards compatibility
Cc: Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by SimonKing)

EDIT: I replaced the original ticket description by something that I wrote in a comment, because now I have a much smaller example, and moreover pickles of the same object created with Python-3 and with Python-2, so that one can compare.

The following examples require the optional meataxe package, but I am not sure yet if meataxe is to blame or Python-3 (I hope it is the former, because I guess it would be more easy to fix).

attachment:Py2.sobj​ and attachment:Py3.sobj​ result in the following behaviour in Python-3

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-3-5705b555470a> in <module>()
----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
    149 
    150     ## Load file by absolute filename
--> 151     with open(filename, 'rb') as fobj:
    152         X = loads(fobj.read(), compress=compress)
    153     try:

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
    150     ## Load file by absolute filename
    151     with open(filename, 'rb') as fobj:
--> 152         X = loads(fobj.read(), compress=compress)
    153     try:
    154         X._default_filename = os.path.abspath(filename)

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
    967 
    968     unpickler = SageUnpickler(io.BytesIO(s))
--> 969     return unpickler.load()
    970 
    971 

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]

and in Python-2

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: __ == _
True

So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?

Change History (10)

Changed 2 years ago by SimonKing

File that cannot be unpickled in Python-3

comment:1 follow-ups: Changed 2 years ago by nbruin

Can Sage/Py3 produce the pickle? In that case, you could compare the produced pickles to see how far apart they are. Of course, if an ASCII decoder encounters 0x80 it's justified to not decode it, so it might be interesting to see what py3 makes from it itself. My guess would be that the bytestring should NOT be decoded by ascii, but something else. Perhaps unpickle can be configured to use a different decoder. But it would be good to see what generates the non-ascii symbol and what its meaning is.

comment:2 in reply to: ↑ 1 Changed 2 years ago by SimonKing

Replying to nbruin:

Of course, if an ASCII decoder encounters 0x80 it's justified to not decode it

Then the same should hold for Python-2. It doesn't. Hence, it shouldn't hold for Python-3 either.

Last edited 2 years ago by SimonKing (previous) (diff)

comment:3 in reply to: ↑ 1 Changed 2 years ago by SimonKing

Replying to nbruin:

Can Sage/Py3 produce the pickle?

The problem is that the pickle comes from an old version of an optional Sage package. That's why I use the "unpickle override". But I'll see what I can do.

comment:4 Changed 2 years ago by SimonKing

The new attachment was created with Python-2 and can be used without "unpickle override", but it requires the optional meataxe spkg.

Changed 2 years ago by SimonKing

Pickle of MeatAxe? in Python-2

Changed 2 years ago by SimonKing

Pickle of MeaAxe? matrix created with Python-3

comment:5 Changed 2 years ago by SimonKing

I think now I have a very small example. It uses the optional meataxe package. It would be a good news if actually the meataxe wrapper was to blame for the unpickling problem --- but I am not expert enough to tell whether (1) it is the case and (2) how it could be fixed (if it was the case).

Anyway. I have a new version of attachment:Py2.sobj, and a new attachment:Py3.sobj. It results in the following behaviour in Python-3

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-3-5705b555470a> in <module>()
----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
    149 
    150     ## Load file by absolute filename
--> 151     with open(filename, 'rb') as fobj:
    152         X = loads(fobj.read(), compress=compress)
    153     try:

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
    150     ## Load file by absolute filename
    151     with open(filename, 'rb') as fobj:
--> 152         X = loads(fobj.read(), compress=compress)
    153     try:
    154         X._default_filename = os.path.abspath(filename)

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
    967 
    968     unpickler = SageUnpickler(io.BytesIO(s))
--> 969     return unpickler.load()
    970 
    971 

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]

and in Python-2

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: __ == _
True

So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?

Last edited 2 years ago by SimonKing (previous) (diff)

comment:6 Changed 2 years ago by SimonKing

Note that the Python-3 pickle is as much as 25% larger than the Python-2 pickle. Is that regression typical?

comment:7 Changed 2 years ago by SimonKing

  • Description modified (diff)
  • Keywords meataxe added
Note: See TracTickets for help on using tickets.