Opened 8 years ago
Closed 8 years ago
#18848 closed defect (fixed)
do not save() objects we cannot load()
Reported by: | dimpase | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.8 |
Component: | interfaces | Keywords: | |
Cc: | ncohen, vbraun | Merged in: | |
Authors: | Simon King | Reviewers: | Volker Braun |
Report Upstream: | N/A | Work issues: | |
Branch: | bb63cd2 (Commits, GitHub, GitLab) | Commit: | bb63cd205f7150776d04f5e1e7ec02438b8dc384 |
Dependencies: | Stopgaps: |
Description
currently
sage: x=singular(2) sage: x.save("blah")
results in blah.sobj
we cannot load().
cf. sage-devel post
Change History (35)
comment:1 follow-up: 3 Changed 8 years ago by
comment:2 Changed 8 years ago by
Cc: | SimonKing added |
---|
comment:3 Changed 8 years ago by
Cc: | SimonKing removed |
---|
Replying to dimpase:
Blindly doing
save(x.sage(),"blah")
is not quite right, as this would e.g. break saving/loading Python objects (something that works just fine).
Well, each object has a __reduce__
method. In the case of sage.interfaces.interface.InterfaceElement.__reduce__
, this is
def __reduce__(self): return reduce_load, (self.parent(), self._reduce())
and
def _reduce(self): return repr(self)
The latter should change (by default) to
def _reduce(self): return self.sage()
But then, it is still needed to fix something, since SingularElement
has its own __reduce__
method, that explicitly makes sure that the result will be invalid:
def __reduce__(self): """ Note that the result of the returned reduce_load is an invalid Singular object. EXAMPLES:: sage: singular(2).__reduce__() (<function reduce_load at 0x...>, ()) """ return reduce_load, () # default is an invalid object
And finally, one should remove the stupid custom reduce_load
methods. There is a default one:
def reduce_load(parent, x): return parent(x)
but the Singular and GAP interfaces override it to, for example,
def reduce_load(): """ Returns an invalid GAP element. Note that this is the object returned when a GAP element is unpickled. EXAMPLES:: sage: from sage.interfaces.gap import reduce_load sage: reduce_load() <repr(<sage.interfaces.gap.GapElement at 0x...>) failed: ValueError: The session in which this object was defined is no longer running.> sage: loads(dumps(gap(2))) <repr(<sage.interfaces.gap.GapElement at 0x...>) failed: ValueError: The session in which this object was defined is no longer running.> """ return GapElement(None, None)
So, summary of the status quo: Several interfaces explicitly override the default to make sure that pickling does NOT work. And the default currently is not good enough to ensure pickling, since the default relies on string representations.
Summary of the proposed solution: Interfaces should rely on the default pickling, as long as they don't have anything substantial to say about pickling. And the default should rely on the .sage()
method, whose functionality should be improved in some cases.
comment:4 Changed 8 years ago by
Or perhaps one should do a compromise: If self.sage()
throws an error, then we should try if the string representation is good enough to reconstruct the object. If it does then use it, otherwise throw an error upon pickling.
It would be better than nothing. But it could be quite costly to first check, for each interface element to be pickled, if the string representation is good enough.
comment:5 Changed 8 years ago by
PPS: In the case of GAP elements, string representation sometimes is not good enough, and GAP does not support pickling by default. But there was some thread on sage-devel concerning that point. It was mentioned that with the IO package of GAP, which is GPL3, pickling of individual objects can be achieved. Perhaps we can make it a standard package?
comment:6 follow-up: 7 Changed 8 years ago by
PPPS: Both Singular and GAP can dump a complete session. I don't know if that has been wrapped in Sage's interface. But it could be useful for some. Not for pickling individual objects, though.
comment:7 follow-up: 8 Changed 8 years ago by
Replying to SimonKing:
PPPS: Both Singular and GAP can dump a complete session. I don't know if that has been wrapped in Sage's interface. But it could be useful for some. Not for pickling individual objects, though.
one can save GAP's workspace via gap.save_workspace(...)
, but there is no mechanism in Sage to load GAP with such a custom-made workspace; although this is doable.
No idea about Singular though.
comment:8 follow-up: 9 Changed 8 years ago by
Replying to dimpase:
Replying to SimonKing:
PPPS: Both Singular and GAP can dump a complete session. I don't know if that has been wrapped in Sage's interface. But it could be useful for some. Not for pickling individual objects, though.
one can save GAP's workspace via
gap.save_workspace(...)
, but there is no mechanism in Sage to load GAP with such a custom-made workspace; although this is doable.
Haha! So, it is "write only memory"...
No idea about Singular though.
See dump.
comment:9 Changed 8 years ago by
Replying to SimonKing:
Replying to dimpase:
one can save GAP's workspace via
gap.save_workspace(...)
, but there is no mechanism in Sage to load GAP with such a custom-made workspace; although this is doable.Haha! So, it is "write only memory"...
I had a look, and it seems that really it is impossible to load a workspace into a running session (overriding all previously assigned variables of the session). One can only start a new session that is using the saved workspace. Doh...
comment:10 follow-up: 11 Changed 8 years ago by
Something totally different: Why is there a .save()
and .dumps()
method? Since __reduce__
is implemented, save(Foo, "bar")
would be the recommended way to store Foo
in "bar.sobj", and Foo.save("bar")
isn't really adding anything. Can we deprecate these methods?
comment:11 Changed 8 years ago by
Replying to SimonKing:
Something totally different: Why is there a
.save()
and.dumps()
method? Since__reduce__
is implemented,save(Foo, "bar")
would be the recommended way to storeFoo
in "bar.sobj", andFoo.save("bar")
isn't really adding anything. Can we deprecate these methods?
These are not defined on the level of sage.interfaces, but sage.structure.sage_object. So, I guess I better don't touch it...
comment:12 Changed 8 years ago by
Branch: | → u/SimonKing/do_not_save___objects_we_cannot_load__ |
---|
comment:13 Changed 8 years ago by
Authors: | → Simon King |
---|---|
Commit: | → 547f511ff186696ac2617c6324852c59b079c807 |
Status: | new → needs_review |
I think the attached branch is a step to the right direction. It implements most of what I announced above:
sage.interfaces.gap
andsage.interfaces.singular
do not override the pickling protocol fromsage.interfaces.interface
.
-. The default pickling tries to convert the interface element to a Sage object. Unpickling converts this element back to an interface element. If conversion to a Sage object fails with a NotImplementedError
, then the string representation is used, as a last resort.
- I added to
SingularElement._sage_()
the forgotten case of integers.
I am not sure if it solves the problem to the extent possible. For example, one still can not unpickle a string in gap or singular:
sage: loads(dumps(singular('"a"'))) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ... TypeError: (TypeError(SingularError('Singular error:\n ? `a` is undefined\n ? error occurred in or before STDIN line 14: ``',),), <function reduce_load at 0x7f08e40a1500>, (Singular, 'a'))
Reason is: The corresponding Sage object of singular('"a"')
is the string "a"
, but the evaluation singular("a")
fails.
It has failed before. So, I am not sure if it is absolutely critical to fix this. Possible semi-fixes are:
singular('"a"').sage()
should return the string"'a'"
, not just"a"
. But this wouldn't be the correct answer.- Alternatively,
singular("a")
could first try if there is an object in singular whose name is "a", and returnsingular("'a'")
if that fails. That's bad, because think of what happens if the namea
is defined in Singular...
A real fix would be to modify the _reduce()
method so that it recognises whether self
is a string in the respective interface. I.e., there would be a new method is_string()
, and we would do
def _reduce(self): if self.is_string(): return repr(repr(self)) try: return self.sage() except NotImplementedError: return repr(self)
And is_string()
would of course be implemented in each interface, such as testing self.type()=="string"
in the case of SingularElement
or bool(self.IsString())
in the case of GapElement
.
Do you think we should care about strings?
New commits:
547f511 | Improve pickling for InterfaceElement
|
comment:14 Changed 8 years ago by
Commit: | 547f511ff186696ac2617c6324852c59b079c807 → 2445264ab24d28344b7bb850c2bdaef446f1e65c |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
2445264 | Allow pickling for more types of interface elements, and preserve backwards compatibility
|
comment:15 Changed 8 years ago by
With the new commit, many interface elements became picklable that haven't been before. Previously, elements of the r or singular or gap or pari or maxima interface would unpickle as invalid. But, as demonstrated in the new tests, now they can be pickled and unpickled.
One missing case are strings in maxima. In fact, a string in maxima is not converted to a string in Sage, but it is attempted to transform it into an expression, which for what ever reason fails.
Do you know how to recognise a string in maxima? I.e., is there a command in maxima that tells whether a variable is a string?
In any case, I believe the current branch extends the pickling capabilities of interface elements considerably.
comment:16 Changed 8 years ago by
What I don't understand is why Lisp interface does not suffer from this issue:
sage: x=lisp(1) sage: x 1 sage: type(x) <class 'sage.interfaces.lisp.LispElement'> sage: x.save("/tmp/foo") sage: load("/tmp/foo") 1 sage: y=load("/tmp/foo") sage: y==x True
And in fact /tmp/foo.sobj
is loadable in a fresh Sage session.
comment:17 follow-up: 18 Changed 8 years ago by
That works because it actually has something implemented:
sage: explain_pickle(lisp(1).dumps()) pg_reduce_load = unpickle_global('sage.interfaces.interface', 'reduce_load') pg_reduce_load_Lisp = unpickle_global('sage.interfaces.lisp', 'reduce_load_Lisp') pg = unpickle_instantiate(pg_reduce_load_Lisp, ()) pg_reduce_load(pg, '1')
And if you look at reduce_load_Lisp
, it simply returns lisp
so the pickle consists of the instructions "execute lisp(1)
", which indeed reconstructs the object. If you try to do this with more complicated objects, you'll see it go wrong for whatever reason (mainly because the lisp interface is very rudimentary):
sage: a=lisp("'(1 2)") sage: a (1 2) sage: loads(a.dumps()) 1
which is explained by the fact that lisp('(1 2)')
doesn't produce something that equals a
.
comment:18 Changed 8 years ago by
Dima, what do you mean by "this issue"?
Replying to nbruin:
That works because it actually has something implemented:
sage: explain_pickle(lisp(1).dumps()) pg_reduce_load = unpickle_global('sage.interfaces.interface', 'reduce_load') pg_reduce_load_Lisp = unpickle_global('sage.interfaces.lisp', 'reduce_load_Lisp') pg = unpickle_instantiate(pg_reduce_load_Lisp, ()) pg_reduce_load(pg, '1')And if you look at
reduce_load_Lisp
, it simply returnslisp
so the pickle consists of the instructions "executelisp(1)
", which indeed reconstructs the object.
Well, this is how it used to work all the time. Each interface "Foo" has a __reduce__
method that returns reduce_load_Foo
, and reduce_load_Foo()
returns Foo()
. And each ELEMENT (!) of Foo (not Foo itself) also used to have a __reduce__
method. The default was that the __reduce__
method returns reduce_load, (self.parent(), self._reduce())
, and reduce_load(P, R)
did then return P(R)
, thus, unpickling the element.
The issue was that
- often the string representation of an element is not valid input to the interface and
- several sub-classes of
InterfaceElement
overrode__reduce__
andreduce_load
, making absolutely sure that under no circumstances pickling could work (not even when the string representation happened to be suitable for pickling).
That's what my branch changed (plus some improvements for conversion from InterfaceElement to Sage, which is now used for pickling instead of string representation).
comment:19 Changed 8 years ago by
Commit: | 2445264ab24d28344b7bb850c2bdaef446f1e65c → 123f29504557ae471ac9d55b0c5a6fbad587c30f |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
123f295 | Fix doctests
|
comment:20 Changed 8 years ago by
There were tests to be fixed. Actually one of the failing tests tried to make sure that the unpickled element is invalid...
New commits:
123f295 | Fix doctests
|
comment:21 Changed 8 years ago by
I am just wondering: Should we perhaps get rid of reduce_load
? After all, an interface is callable, and hence there is no need to have a function that calls the interface when we can do it directly.
comment:22 Changed 8 years ago by
Commit: | 123f29504557ae471ac9d55b0c5a6fbad587c30f → 595df9606f14cffa7f8131aedb6daeaf1b08c1ce |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
595df96 | Remove reduce_load from sage.interfaces.interface
|
comment:23 Changed 8 years ago by
Commit: | 595df9606f14cffa7f8131aedb6daeaf1b08c1ce → d123d33c89ff4524ecf960d894ea07a1f0e96196 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
d123d33 | Add one doctest
|
comment:24 Changed 8 years ago by
Reviewers: | → Volker Braun |
---|---|
Status: | needs_review → positive_review |
comment:25 follow-up: 26 Changed 8 years ago by
Status: | positive_review → needs_work |
---|
Fails on 32-bit linux:
sage -t --long src/sage/interfaces/gp.py ********************************************************************** File "src/sage/interfaces/gp.py", line 878, in sage.interfaces.gp.GpElement._reduce Failed example: loads(dumps(E)) # indirect doctest Expected: [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, Vecsmall([1]), [Vecsmall([128, -1])], [0, 0, 0, 0, 0, 0, 0, 0]] Got: [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, Vecsmall([1]), [Vecsmall([96, -1])], [0, 0, 0, 0, 0, 0, 0, 0]] ********************************************************************** File "src/sage/interfaces/gp.py", line 881, in sage.interfaces.gp.GpElement._reduce Failed example: E.sage() Expected: [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, [1], [[128, -1]], [0, 0, 0, 0, 0, 0, 0, 0]] Got: [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, [1], [[96, -1]], [0, 0, 0, 0, 0, 0, 0, 0]] ********************************************************************** 1 item had failures: 2 of 4 in sage.interfaces.gp.GpElement._reduce [155 tests, 2 failures, 7.53 s]
comment:26 Changed 8 years ago by
Replying to vbraun:
Fails on 32-bit linux:
I did not change the _sage_()
method for !GPElement, if I recall correctly. Thus, the fact that things changed from yielding [128, -1]
to [96, -1]
has been there before. And now I wonder if that's a bug, or if the different answer just tells something about internal gp data structures that depend on the architecture. In the latter case, we should replace [128, -1]
by [..., -1]
in the expected answer.
comment:27 Changed 8 years ago by
I guess it would be better to ask on sage-nt or sage-algebra, since it appears to be about elliptic curves.
comment:28 follow-up: 32 Changed 8 years ago by
Volker, before I ask sage-nt, can you show me the result of
sage: gp('ellinit([1,2,3,4,5])') [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, Vecsmall([1]), [Vecsmall([128, -1])], [0, 0, 0, 0, 0, 0, 0, 0]]
on 32-bit linux?
If it gives [..., [Vecsmall([96, -1])], ...]
, then the problem is very likely a different internal representation, and we should just replace [128, -1]
by [..., -1]
. If it isn't, and the different answer only occurs upon pickling, then we have a serious problem.
comment:29 Changed 8 years ago by
PS: Or even better, we should just replace the test by
sage: E = gp('ellinit([1,2,3,4,5])') sage: loads(dumps(E)) == E True
I'll commit and push it now.
comment:30 Changed 8 years ago by
Commit: | d123d33c89ff4524ecf960d894ea07a1f0e96196 → bb63cd205f7150776d04f5e1e7ec02438b8dc384 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
bb63cd2 | Make a test architecture independent
|
comment:31 Changed 8 years ago by
Status: | needs_work → needs_review |
---|
comment:32 follow-up: 33 Changed 8 years ago by
Replying to SimonKing:
Volker, before I ask sage-nt, can you show me the result of
sage: gp('ellinit([1,2,3,4,5])') [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, Vecsmall([1]), [Vecsmall([128, -1])], [0, 0, 0, 0, 0, 0, 0, 0]]on 32-bit linux?
If it gives
[..., [Vecsmall([96, -1])], ...]
, then the problem is very likely a different internal representation, and we should just replace[128, -1]
by[..., -1]
. If it isn't, and the different answer only occurs upon pickling, then we have a serious problem.
here you are:
sage: gp('ellinit([1,2,3,4,5])') [1, 2, 3, 4, 5, 9, 11, 29, 35, -183, -3429, -10351, 6128487/10351, Vecsmall([1]), [Vecsmall([96, -1])], [0, 0, 0, 0, 0, 0, 0, 0]]
on
dima@arando:~/sage/dev/sage$ uname -a Linux arando 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:56:26 UTC 2014 i686 i686 i686 GNU/Linux
Indeed, you see [Vecsmall([96, -1])]
there!
(most probably the same machine Volker was referring to).
comment:33 Changed 8 years ago by
Replying to dimpase:
Indeed, you see
[Vecsmall([96, -1])]
there!
OK. In that case, the string representation is machine dependent, and we have to use a
sage: loads(dumps(E)) == E True
test (which is cleaner anyway).
The main point of my test was to show that pickling of the gp interface does not rely on the sage-representation of a gp-element, but on the string representation (whereas by this ticket all other interfaces use the sage-representation when available). So, in the most recent commit, I change the test into
sage: E == gp(E.sage()) False
which hopefully is the same on 32-bit.
In other words: Needs review again...
comment:34 Changed 8 years ago by
Status: | needs_review → positive_review |
---|
comment:35 Changed 8 years ago by
Branch: | u/SimonKing/do_not_save___objects_we_cannot_load__ → bb63cd205f7150776d04f5e1e7ec02438b8dc384 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
Blindly doing
save(x.sage(),"blah")
is not quite right, as this would e.g. break saving/loading Python objects (something that works just fine).Ideas?