Opened 10 years ago
Last modified 4 weeks ago
#13811 needs_info defect
Metaticket: Support Python's __copy__ / __deepcopy__ protocol
Reported by:  Christian Nassau  Owned by:  William Stein 

Priority:  critical  Milestone:  sage9.8 
Component:  pickling  Keywords:  LazyFamily, copy, pickle 
Cc:  Travis Scrimshaw, Michael Jung, Nils Bruin, Kwankyu Lee, William Stein, Markus Wageringel, KarlDieter Crisman  Merged in:  
Authors:  Matthias Koeppe  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  u/mkoeppe/__copy___and___deepcopy___methods_for_all_immutable_sage_objects (Commits, GitHub, GitLab)  Commit:  2bd01ae26f1447cfc35d2fc1901413baab43819d 
Dependencies:  #32454  Stopgaps: 
Description (last modified by )
Most Sage objects are immutable. Nevertheless, copy
and deepcopy
make copies (through pickling/unpickling) for them because we have not provided the classes with __copy__
methods (which should just return the object) and __deepcopy__
methods (which in many cases should just return the object).
sage: a = 0 sage: copy(a) is a False
Another symptom (from the original ticket description): Copying of (immutable!) LazyFamily
objects only works for families that can be pickled:
sage: def unpicklableFamily(): ... x = 10 ... return LazyFamily([1,2,3], lambda n: x*n) sage: f = unpicklableFamily() sage: copy(f) Traceback (most recent call last): ... ValueError: Cannot pickle code objects from closures
The case of __deepcopy__
of immutable containerlike objects (like Family
or Sequence
) is a bit trickier: Is immutability intended to imply that the elements are immutable too? In this ticket, we make no such assumption.
Tickets:
 #32454 trivial
__copy__
and__deepcopy__
methods for number types  #32478 trivial
__copy__
and__deepcopy__
methods for number field elements  #32480 trivial
__copy__
and__deepcopy__
methods for symbolic expressions and functions  #32462 trivial
__copy__
and__deepcopy__
methods forFamily
andSet
 #32453
__copy__
methods for all classes that definecopy
methods  #32476
sage.tensor
,sage.manifolds
:__copy__
,__deepcopy__
methods for all classes that definecopy
methods  #23075
copy(CombinatorialFreeModule.Element)
broken by #22632  #5417 Fix some more deepcopy/caching issues in the quadratic forms code
Related:
Attachments (1)
Change History (66)
comment:1 Changed 10 years ago by
Status:  new → needs_review 

comment:2 Changed 10 years ago by
Authors:  → cnassau 

comment:3 Changed 10 years ago by
Authors:  cnassau → Christian Nassau 

comment:4 followup: 5 Changed 10 years ago by
comment:5 followup: 6 Changed 10 years ago by
Replying to nthiery:
Hi Christian!
Families are (semantically) immutable objects. Why would we want to copy them?
Good point... I stumbled across this issue while I was struggling to create a disjoint union of a dynamically generated family of LazyFamily
objects, while the real problem was that I was using closures with reference to a loop variable. I eventually resolved this issue, and my solution does not requiry any copies now. So I agree that there's no need to create shallow copies.
I do think that user code can expect that copy(any_sage_object)
does not throw an error. Maybe LazyFamily.__copy__(self)
should just return self
? This would be in accordance with
sage: copy(Integers()) is Integers() True
Cheers, Christian
comment:6 Changed 10 years ago by
Replying to cnassau:
I do think that user code can expect that
copy(any_sage_object)
does not throw an error. MaybeLazyFamily.__copy__(self)
should just returnself
? This would be in accordance withsage: copy(Integers()) is Integers() True
Yes, it would make sense to have copy(x) return x for all immutable objects in Sage. I am not sure how to achieve this though: we do not (yet?) have a class for providing code for all immutable objects in Sage, and I would not want to force every relevant class to reimplement a trivial copy method.
For the case at hand (families), maybe one could add a method Parent.copy returning self
? As far as I can tell, parents are required to be immutable anyway. But one should double check this.
Cheers,
Nicolas
comment:7 Changed 9 years ago by
Milestone:  sage5.11 → sage5.12 

comment:8 Changed 9 years ago by
Milestone:  sage6.1 → sage6.2 

comment:9 Changed 8 years ago by
Milestone:  sage6.2 → sage6.3 

comment:10 Changed 8 years ago by
Milestone:  sage6.3 → sage6.4 

comment:11 Changed 8 years ago by
Status:  needs_review → needs_work 

Hi,
As Nicolas suggested in his comnent 6, __copy__
should be implemented as
def __copy__(self): return self
It is standard Python for immutable objects
sage: t = (1,2,3) sage: copy(t) is t True sage: i = 1231491283r sage: copy(i) is i True
Vincent
comment:12 Changed 13 months ago by
Authors:  Christian Nassau 

Cc:  Travis Scrimshaw Michael Jung added 
Description:  modified (diff) 
Milestone:  sage6.4 → sage9.5 
Priority:  minor → critical 
Summary:  LazyFamily cannot be copied if it can't be pickled → __copy__ and __deepcopy__ methods for all immutable Sage objects 
comment:13 Changed 13 months ago by
Description:  modified (diff) 

comment:14 Changed 13 months ago by
Cc:  Nils Bruin Kwankyu Lee added 

comment:15 Changed 13 months ago by
Description:  modified (diff) 

comment:16 followup: 18 Changed 13 months ago by
Immutability of a container object implies all elements of it are also immutable. This is important for hashability. Although there is not a strict adherence to this, it can be an assumption of good input.
comment:18 followup: 31 Changed 13 months ago by
Replying to tscrim:
Immutability of a container object implies all elements of it are also immutable. This is important for hashability. Although there is not a strict adherence to this, it can be an assumption of good input.
That's not quite the case in python, though: tuples are "immutable", but you're free to put mutable objects in them. Because tuples compute their hash by incorporating the hashes of all the constituents, such tuples then fail to be hashable. From that perspective, I think it can be fairly expensive in general to determine if an object is hashable/really immutable in a generic way (concrete implementations can answer these questions much better), so it might not be such a good idea to let copy/deepcopy really depend on it. I don't see why it's important to optimize the operations anyway: if you're interested in optimizing your code, then you should probably be avoiding unnecessary copying yourself.
comment:19 followup: 20 Changed 13 months ago by
I think in __deepcopy__
of an immutable containerlike object, we should first call deepcopy
for all elements; and if the copies are all identical to the source elements, then we can just return self
instead of making a new object. (The positive result of this test can be cached.)
comment:20 followup: 22 Changed 13 months ago by
Replying to mkoeppe:
I think in
__deepcopy__
of an immutable containerlike object, we should first calldeepcopy
for all elements; and if the copies are all identical to the source elements, then we can just returnself
instead of making a new object. (The positive result of this test can be cached.)
But by then: why would you bother? You've already done the hard work. I don't think there's much benefit and you'd make the code more complex. Just make a deep copy. That's what was requested.
comment:21 Changed 13 months ago by
Branch:  → u/mkoeppe/__copy___and___deepcopy___methods_for_all_immutable_sage_objects 

comment:22 Changed 13 months ago by
Commit:  → 51022647e924e985224785d7dfd3948dcf42f29e 

Replying to nbruin:
Replying to mkoeppe:
I think in
__deepcopy__
of an immutable containerlike object, we should first calldeepcopy
for all elements; and if the copies are all identical to the source elements, then we can just returnself
instead of making a new object. (The positive result of this test can be cached.)But by then: why would you bother?
The time for copying is sunk, but the memory isn't
New commits:
65ba0c8  Vector_integer_dense.__copy__: If immutable, just return self

2df991c  Vector_integer_dense.__deepcopy__: New, delegate to __copy__

dc79c29  Sequence_generic.__copy__: If immutable, just return self

5102264  Sequence_generic.__deepcopy__: New

comment:23 Changed 13 months ago by
Here's an implementation for two example classes. Obviously we would have to do the same for many more classes.
comment:24 Changed 13 months ago by
Commit:  51022647e924e985224785d7dfd3948dcf42f29e → c9539d05b98c07fb988d1ca1cb0625eeb66cf7a8 

Branch pushed to git repo; I updated commit sha1. New commits:
c9539d0  Sequence_generic.__deepcopy__: Fixup doctest

comment:25 Changed 13 months ago by
Authors:  → Matthias Koeppe 

Status:  needs_work → needs_review 
Setting to needs review so the patchbot will run
comment:26 Changed 13 months ago by
Commit:  c9539d05b98c07fb988d1ca1cb0625eeb66cf7a8 → d5096b6eb4645bc5579f7318c794972e72ffc92f 

Branch pushed to git repo; I updated commit sha1. New commits:
d5096b6  FreeModuleElement[_generic_{dense,sparse}].__copy__: If immutable, return self

comment:27 Changed 13 months ago by
Commit:  d5096b6eb4645bc5579f7318c794972e72ffc92f → 6fe95340b9d002ead9770b0f51470467403f194b 

comment:28 Changed 13 months ago by
Commit:  6fe95340b9d002ead9770b0f51470467403f194b → 4a902ef9b7c4c252fb5fac5c37462627d37e81f6 

Branch pushed to git repo; I updated commit sha1. New commits:
61ceb67  Vector_mod2_dense.__copy__: If immutable, just return self

c744c2c  Vector_mod2_dense.__deepcopy__: New, delegate to __copy__

c907f7b  Vector_modn_dense.__copy__: If immutable, just return self

4a902ef  Vector_modn_dense.__deepcopy__: New, delegate to __copy__

comment:29 Changed 13 months ago by
Commit:  4a902ef9b7c4c252fb5fac5c37462627d37e81f6 → 90c982f323adaf3187e96d6707fd948c18297122 

comment:30 followup: 40 Changed 13 months ago by
Missing in sage.modules
is now only a __deepcopy__
method for generic dense and sparse vectors. Not sure if we may assume that ring elements are immutable (see #32450)...
comment:31 Changed 13 months ago by
Replying to nbruin:
Replying to tscrim:
Immutability of a container object implies all elements of it are also immutable. This is important for hashability. Although there is not a strict adherence to this, it can be an assumption of good input.
That's not quite the case in python, though: tuples are "immutable", but you're free to put mutable objects in them. Because tuples compute their hash by incorporating the hashes of all the constituents, such tuples then fail to be hashable.
Right, that is what I was saying that there is an assumption on good input (which is something we can check when necessary).
From that perspective, I think it can be fairly expensive in general to determine if an object is hashable/really immutable in a generic way (concrete implementations can answer these questions much better), so it might not be such a good idea to let copy/deepcopy really depend on it. I don't see why it's important to optimize the operations anyway: if you're interested in optimizing your code, then you should probably be avoiding unnecessary copying yourself.
Sometimes you are using an algorithm that is designed to handle something that is designed to work with something that does mutate (say, uses +=
) as it makes a copy to be sure to be safe. Granted, I don't think this occurs so frequently, but it is good to support this case as I don't think it adds that much more to the complexity.
comment:32 Changed 13 months ago by
In any case, adding the __deepcopy__
methods gives some speedup also for mutable vectors:
Before:
sage: %timeit deepcopy(vector(ZZ, 100000)) 396 ms ± 1.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
After:
sage: %timeit deepcopy(vector(ZZ, 100000)) 1.39 ms ± 15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
comment:33 Changed 13 months ago by
Sage even has actual method Integer.__copy__
that makes an actual copy! The implementation is from 15 years ago; perhaps mutable integers similar to mpz were considered at the time.
comment:34 Changed 13 months ago by
Commit:  90c982f323adaf3187e96d6707fd948c18297122 → a2f43eade80a0438f7e839e12e7c45626b3676d2 

Branch pushed to git repo; I updated commit sha1. New commits:
a2f43ea  Integer.__copy__, __deepcopy__: Immutable, so just return self

comment:35 Changed 13 months ago by
Commit:  a2f43eade80a0438f7e839e12e7c45626b3676d2 → 75e43a18114a5d0ab5a6bfc9439ec07dba357e14 

Branch pushed to git repo; I updated commit sha1. New commits:
75e43a1  Rational.__copy__, __deepcopy__: Immutable, so just return self

comment:36 Changed 13 months ago by
Commit:  75e43a18114a5d0ab5a6bfc9439ec07dba357e14 → ecc13bc7e360138323c258e917226ef673ff5b44 

Branch pushed to git repo; I updated commit sha1. New commits:
ecc13bc  NumberFieldElement_quadratic.__copy__, __deepcopy__: Immutable, so just return self

comment:37 Changed 13 months ago by
Making the same change to src/sage/rings/number_field/number_field_element.pyx
gives weird errors:
sage t randomseed=0 src/sage/rings/number_field/number_field_element.pyx ********************************************************************** File "src/sage/rings/number_field/number_field_element.pyx", line 4193, in sage.rings.number_field.number_field_element.NumberFieldElement._matrix_over_base_morphism Failed example: alpha._matrix_over_base_morphism(h) == alpha.matrix(QQ) Exception raised: Traceback (most recent call last): File "/Users/mkoeppe/s/sage/sagerebasing/local/lib/python3.9/sitepackages/sage/doctest/forker.py", line 718, in _run self.compile_and_execute(example, compiler, test.globs) File "/Users/mkoeppe/s/sage/sagerebasing/local/lib/python3.9/sitepackages/sage/doctest/forker.py", line 1137, in compile_and_execute exec(compiled, globs) File "<doctest sage.rings.number_field.number_field_element.NumberFieldElement._matrix_over_base_morphism[3]>", line 1, in <module> alpha._matrix_over_base_morphism(h) == alpha.matrix(QQ) File "sage/rings/number_field/number_field_element.pyx", line 4204, in sage.rings.number_field.number_field_element.NumberFieldElement._matrix_over_base_morphism (build/cythonized/sage/rings/number_field/number_field_element.cpp:34434) raise ValueError("codomain of phi must be parent of self") ValueError: codomain of phi must be parent of self ********************************************************************** File "src/sage/rings/number_field/number_field_element.pyx", line 4470, in sage.rings.number_field.number_field_element.NumberFieldElement.different Failed example: b.different(K=phi) Expected: 4*b^3 Got: 4*alpha^3 **********************************************************************
Are number field elements mutable??
comment:38 Changed 13 months ago by
Commit:  ecc13bc7e360138323c258e917226ef673ff5b44 → 4b260e0c802b8403ea8a7076a727350a0755f021 

Branch pushed to git repo; I updated commit sha1. New commits:
4b260e0  RealIntervalFieldElement.__deepcopy__: New

comment:39 Changed 13 months ago by
Commit:  4b260e0c802b8403ea8a7076a727350a0755f021 → 9f7f45a936a153973cc9690e0d3cd0bfd261d816 

Branch pushed to git repo; I updated commit sha1. New commits:
9f7f45a  FreeModuleElement.__deepcopy__: New

comment:40 Changed 13 months ago by
comment:41 Changed 13 months ago by
Commit:  9f7f45a936a153973cc9690e0d3cd0bfd261d816 → 2bd01ae26f1447cfc35d2fc1901413baab43819d 

comment:42 Changed 13 months ago by
Description:  modified (diff) 

comment:43 followup: 44 Changed 13 months ago by
Obviously a lot of the added trivial __copy__
and __deepcopy__
methods could be replaced by using a mixin class; but I am not sure what it would be called or if it would be better in terms of the clarity of the code.
comment:44 Changed 13 months ago by
Replying to mkoeppe:
Obviously a lot of the added trivial
__copy__
and__deepcopy__
methods could be replaced by using a mixin class; but I am not sure what it would be called or if it would be better in terms of the clarity of the code.
A number of them we probably wouldn't want for the very low level classes as cdef
classes don't allow for multiple inheritance. This might mitigate some of the utility of a mixin class.
comment:45 followup: 49 Changed 13 months ago by
There are some doctest failures now from bad code like this in src/sage/geometry/cone.py
:
# rays has immutable elements rays = [copy(ray) for ray in rays] for i, ray in enumerate(rays): rays[i][0] = pm * (ray[0].abs() + 1)
comment:46 Changed 13 months ago by
Description:  modified (diff) 

comment:47 Changed 13 months ago by
Status:  needs_review → needs_work 

comment:48 Changed 13 months ago by
Cc:  William Stein Markus Wageringel added 

For the pattern in comment:45, with #29101 ("Refined protocol for _element_constructor_
") we would generally be able to replace copy(ray)
by ray.parent()(ray, mutable=True, copy=True)
; but perhaps we should define a shorthand for that.
 #32353 proposes new methods
mutable
,immutable
for this src/sage/combinat/constellation.py
defines.mutable_copy
src/sage/graphs/generic_graph.py
definesGenericGraph.copy
, which has a parameterimmutable=None
: some classes have
copy
methods with optional arguments that could be extended (see also #32453) to allowray.copy(mutable=True)
(proposed in https://groups.google.com/g/sagedevel/c/DNrbtItMVmQ/m/BbMwMqeAQAJ)
Decision time...
comment:49 followup: 52 Changed 13 months ago by
Cc:  KarlDieter Crisman added 

Replying to mkoeppe:
There are some doctest failures now from bad code like this in
src/sage/geometry/cone.py
:# rays has immutable elements rays = [copy(ray) for ray in rays] for i, ray in enumerate(rays): rays[i][0] = pm * (ray[0].abs() + 1)
This pattern  expecting that ray.__copy__
(and thus copy(ray)
) make a mutable copy of an immutable object  appears to come from a decision made many many years ago in Sage (#111, #6522, #15132, ...). This was probably a good idea at the time, but I think it is not compatible with current Python practices.
comment:50 Changed 13 months ago by
Dependencies:  → #32454 

comment:51 Changed 13 months ago by
Status:  needs_work → needs_info 

comment:52 Changed 13 months ago by
This pattern  expecting that
ray.__copy__
(and thuscopy(ray)
) make a mutable copy of an immutable object  appears to come from a decision made many many years ago in Sage (#111, #6522, #15132, ...). This was probably a good idea at the time, but I think it is not compatible with current Python practices.
That may indeed be the case. As you say, at the time the goal was standardized Sage command structure, but if there is a more (modern) Pythonic way to handle this, #6522 alone shows how much gnashing of teeth surrounded some of these changes at the time, so more power to you as long as any syntax changes would be documented.
One should also be sure that things which would "naturally" be seen as copyable are, but other than Sequence
which I am not as familiar with, it seems like e.g. vectors can still be copied unless asked not to be, so seems fine to me.
comment:53 Changed 13 months ago by
Regarding the pair of methods v.mutable()
, v.immutable()
proposed in #32353: v.immutable()
is clear, of course, and shorter than v.copy(mutable=False)
. But I think it is unclear what v.mutable()
is supposed to be if v
is already mutable. Does it return a copy? Then the method's name should say so that it is obvious to users / readers of the code that mutating the result is not only allowed but also safe.
So a pair v.mutable_copy()
, v.immutable_copy()
would be fine, although I'm not sure if it's much better than v.copy(mutable=True)
, v.copy(mutable=False)
comment:54 Changed 13 months ago by
Summary:  __copy__ and __deepcopy__ methods for all immutable Sage objects → Metaticket: __copy__ and __deepcopy__ methods for all immutable Sage objects 

comment:55 Changed 13 months ago by
Description:  modified (diff) 

comment:56 Changed 13 months ago by
Description:  modified (diff) 

comment:57 Changed 13 months ago by
Some more data points on other libraries:
 numpy's
ndarray
s (and subclasses) have acopy
method with an optional parameter that controls the memory layout of the copy; the__copy__
method is equivalent to one of these choices.  sympy defines
copy
methods but nodeepcopy
,__copy__
, or__deepcopy__
methods
comment:58 Changed 13 months ago by
Description:  modified (diff) 

comment:59 Changed 13 months ago by
Description:  modified (diff) 

comment:60 Changed 13 months ago by
Description:  modified (diff) 

comment:61 Changed 13 months ago by
Description:  modified (diff) 

comment:62 Changed 13 months ago by
Summary:  Metaticket: __copy__ and __deepcopy__ methods for all immutable Sage objects → Metaticket: Support Python's __copy__ / __deepcopy__ protocol 

comment:63 Changed 10 months ago by
Milestone:  sage9.5 → sage9.6 

comment:64 Changed 7 months ago by
Milestone:  sage9.6 → sage9.7 

comment:65 Changed 4 weeks ago by
Milestone:  sage9.7 → sage9.8 

Families are (semantically) immutable objects. Why would we want to copy them?
Cheers,