Opened 2 years ago

Last modified 2 years ago

#28444 closed defect

Fix backwards incompatibility of unpickling in Python 3 — at Version 11

Reported by: SimonKing
Owned by:
Priority: blocker
Milestone: sage-8.9
Component: python3
Keywords: unpickling UnicodeError backwards compatibility
Cc:
Merged in:
Authors:
Reviewers:
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:


Description (last modified by SimonKing)

EDIT: In the original ticket description, I stated: "I believe that a backwards incompatible change of pickling is a blocker for Python-3 support." In that (and ONLY in that) sense I believe this ticket is a blocker. I replaced the original ticket description with something that I wrote in a comment, because I now have a much smaller example, and moreover pickles of the same object created with Python-3 and with Python-2, so that one can compare them.

The following examples require the optional meataxe package, but I am not sure yet whether meataxe or Python-3 is to blame (I hope it is the former, because I guess it would be easier to fix).

attachment:Py2.sobj and attachment:Py3.sobj result in the following behaviour in Python-3:

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-3-5705b555470a> in <module>()
----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
    149 
    150     ## Load file by absolute filename
--> 151     with open(filename, 'rb') as fobj:
    152         X = loads(fobj.read(), compress=compress)
    153     try:

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
    150     ## Load file by absolute filename
    151     with open(filename, 'rb') as fobj:
--> 152         X = loads(fobj.read(), compress=compress)
    153     try:
    154         X._default_filename = os.path.abspath(filename)

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
    967 
    968     unpickler = SageUnpickler(io.BytesIO(s))
--> 969     return unpickler.load()
    970 
    971 

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]

and in Python-2

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: __ == _
True

So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?
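The error message itself can be reproduced in plain Python 3, without Sage or meataxe. The following is only a minimal sketch under an assumption: that the Python-2 pickle stores some data as a Python-2 `str` whose first byte is 0x80. The attached files are not inspected here, and `PY2_STYLE_PICKLE` and `unpickle_error` are hypothetical names for a hand-written stand-in pickle and a small helper.

```python
import pickle

# Hand-written protocol-2 pickle of the Python-2 str '\x80\x01'.
# This is a hypothetical stand-in, NOT the content of Py2.sobj.
PY2_STYLE_PICKLE = (
    b"\x80\x02"       # PROTO 2
    b"U\x02\x80\x01"  # SHORT_BINSTRING, length 2, bytes 0x80 0x01
    b"q\x00"          # BINPUT 0 (memoize the string)
    b"."              # STOP
)


def unpickle_error(data):
    """Return the UnicodeDecodeError message raised by pickle.loads, or None."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


print(unpickle_error(PY2_STYLE_PICKLE))
```

Python 3 decodes Python-2 `str` opcodes with the ASCII codec unless told otherwise, which would explain a failure at "position 0" of such a string. Whether this is what happens inside the meataxe pickle is an assumption, not something the sketch establishes.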

Change History (14)

Changed 2 years ago by SimonKing

File that cannot be unpickled in Python-3

comment:1 follow-ups: ↓ 2 ↓ 3 Changed 2 years ago by nbruin

Can Sage/Py3 produce the pickle? In that case, you could compare the two pickles to see how far apart they are. Of course, if an ASCII decoder encounters 0x80 it's justified in refusing to decode it, so it might be interesting to see what py3 itself makes of it. My guess would be that the bytestring should NOT be decoded as ASCII, but as something else. Perhaps unpickle can be configured to use a different decoder. But it would be good to see what generates the non-ASCII symbol and what its meaning is.
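One way to see which part of a pickle carries the non-ASCII bytes is to walk its opcodes with the standard `pickletools` module. The sketch below is an illustration only: `non_ascii_py2_strings` is a hypothetical helper, and `sample` is a hand-written stand-in rather than the attached file. If the attachments are compressed (I assume Sage writes `.sobj` files zlib-compressed by default), the raw bytes would first have to be passed through `zlib.decompress`.

```python
import pickletools


def non_ascii_py2_strings(raw_pickle):
    """Yield (position, opcode name, bytes) for Python-2 str opcodes
    whose payload contains bytes outside the ASCII range."""
    for opcode, arg, pos in pickletools.genops(raw_pickle):
        if opcode.name in ("SHORT_BINSTRING", "BINSTRING"):
            # pickletools decodes these payloads as latin-1, so every
            # character corresponds to exactly one original byte.
            if any(ord(ch) > 127 for ch in arg):
                yield pos, opcode.name, arg.encode("latin-1")


# Hand-written protocol-2 pickle of the Python-2 str '\x80\x01'
# (hypothetical stand-in for a real pickle).
sample = b"\x80\x02U\x02\x80\x01q\x00."
print(list(non_ascii_py2_strings(sample)))
```

Running the same helper over the Python-2 and the Python-3 pickle of the same object, or calling `pickletools.dis` on each, would show where the two differ.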

comment:2 in reply to: ↑ 1 Changed 2 years ago by SimonKing

Replying to nbruin:

Of course, if an ASCII decoder encounters 0x80 it's justified to not decode it

Then the same should hold for Python-2. It doesn't. Hence, it shouldn't hold for Python-3 either.
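A possible reason for the asymmetry: in Python 2 a `str` is a byte string, so loading it involves no decoding at all, whereas Python 3 must turn the same opcode into a text `str` and defaults to ASCII. The standard `pickle.loads` accepts an `encoding` argument for exactly this case. The sketch below uses a hand-written stand-in pickle (`PY2_STYLE_PICKLE` is hypothetical); whether Sage's `SageUnpickler` can be given such an argument is not something I have checked.

```python
import pickle

# Hand-written protocol-2 pickle of the Python-2 str '\x80\x01'
# (hypothetical stand-in, not the attached Py2.sobj).
PY2_STYLE_PICKLE = b"\x80\x02U\x02\x80\x01q\x00."

# latin1 maps every byte 0..255 to a code point, so decoding cannot fail.
as_text = pickle.loads(PY2_STYLE_PICKLE, encoding="latin1")

# "bytes" keeps Python-2 str objects as Python-3 bytes objects.
as_bytes = pickle.loads(PY2_STYLE_PICKLE, encoding="bytes")

print(repr(as_text), repr(as_bytes))
```

Which of the two is appropriate depends on what the receiving `__setstate__` or reconstruction function expects: binary matrix data would presumably want `bytes`, while names and labels would want text.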

Last edited 2 years ago by SimonKing

comment:3 in reply to: ↑ 1 Changed 2 years ago by SimonKing

Replying to nbruin:

Can Sage/Py3 produce the pickle?

The problem is that the pickle comes from an old version of an optional Sage package. That's why I use the "unpickle override". But I'll see what I can do.

comment:4 Changed 2 years ago by SimonKing

The new attachment was created with Python-2 and can be used without "unpickle override", but it requires the optional meataxe spkg.

Changed 2 years ago by SimonKing

Pickle of MeatAxe in Python-2

Changed 2 years ago by SimonKing

Pickle of MeatAxe matrix created with Python-3

comment:5 Changed 2 years ago by SimonKing

I think I now have a very small example. It uses the optional meataxe package. It would be good news if the meataxe wrapper were actually to blame for the unpickling problem, but I am not expert enough to tell (1) whether that is the case and (2) how it could be fixed if it is.

Anyway. I have a new version of attachment:Py2.sobj, and a new attachment:Py3.sobj. They result in the following behaviour in Python-3:

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-3-5705b555470a> in <module>()
----> 1 load('/home/king/Projekte/coho/tests/Py2.sobj')

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2824)()
    149 
    150     ## Load file by absolute filename
--> 151     with open(filename, 'rb') as fobj:
    152         X = loads(fobj.read(), compress=compress)
    153     try:

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2774)()
    150     ## Load file by absolute filename
    151     with open(filename, 'rb') as fobj:
--> 152         X = loads(fobj.read(), compress=compress)
    153     try:
    154         X._default_filename = os.path.abspath(filename)

/home/king/Sage/git/py3/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.loads (build/cythonized/sage/misc/persist.c:7270)()
    967 
    968     unpickler = SageUnpickler(io.BytesIO(s))
--> 969     return unpickler.load()
    970 
    971 

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]

and in Python-2

sage: load('/home/king/Projekte/coho/tests/Py2.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: load('/home/king/Projekte/coho/tests/Py3.sobj')
[1 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 1]
sage: __ == _
True

So, the Python-3 pickle can be unpickled in Python-2, but not the other way around. What is the problem?

Last edited 2 years ago by SimonKing

comment:6 Changed 2 years ago by SimonKing

Note that the Python-3 pickle is as much as 25% larger than the Python-2 pickle. Is that regression typical?
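One mechanism that could account for a size increase, offered as an assumption rather than a diagnosis of the meataxe pickle: when Python 3 pickles a `bytes` object with protocol 2, it writes a call to `_codecs.encode` on a latin1 text string so that Python 2 can still read it, and every byte above 0x7f then occupies two bytes in the UTF-8 encoded payload. The sketch compares that with a hand-written Python-2 style pickle of the same data (`py2_style_pickle` is a hypothetical stand-in built by hand).

```python
import pickle
import struct

data = bytes(range(256))  # half of these bytes are above 0x7f

# What Python 3 writes for a bytes object at protocol 2.
py3_pickle = pickle.dumps(data, protocol=2)

# Hand-written equivalent of what Python 2 would write for the same
# byte string: PROTO 2, BINSTRING with a 4-byte length, BINPUT, STOP.
py2_style_pickle = (
    b"\x80\x02T" + struct.pack("<i", len(data)) + data + b"q\x00."
)

# Both representations carry the same payload.
assert pickle.loads(py3_pickle) == data
assert pickle.loads(py2_style_pickle, encoding="bytes") == data

print(len(py2_style_pickle), len(py3_pickle))
```

If this is the cause, the overhead depends on how many bytes of the data are non-ASCII, so the ratio would vary from object to object. It would also fit the observation that Python 2 can load the Python-3 pickle, since `_codecs.encode` exists in both versions.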

comment:7 Changed 2 years ago by SimonKing

  • Description modified (diff)
  • Keywords meataxe added

comment:8 follow-up: ↓ 9 Changed 2 years ago by chapoton

This is in no way a blocker, IMHO.

comment:9 in reply to: ↑ 8 Changed 2 years ago by SimonKing

Replying to chapoton:

This is in no way a blocker, IMHO.

If it is due to meataxe (an optional package), then it is not a blocker. If it is due to the upcoming switch to Python-3, then IMHO it is a blocker to that switch (not to a Python-2 version of Sage, though). Since currently it isn't clear if the example reveals a problem in Python-3 or not, I'd say better safe than sorry.

comment:10 follow-up: ↓ 11 Changed 2 years ago by chapoton

So we agree that this is not a blocker for the upcoming 8.9 release, still py2. This is the usual meaning of blocker. But in this time of transition, we must be clearer about what blocker means.

comment:11 in reply to: ↑ 10 Changed 2 years ago by SimonKing

  • Description modified (diff)

Replying to chapoton:

So we agree that this is not a blocker for the upcoming 8.9 release, still py2. This is the usual meaning of blocker. But in this time of transition, we must be clearer about what blocker means.

In the original ticket description, I explained in what sense I believe it is a blocker, but I accidentally deleted that clarification when I changed the ticket description. Now the statement is back, at the top of the ticket description.
