Opened 7 months ago
Last modified 6 months ago
#23807 needs_work defect
different affine patches are the same object in memory
Reported by: | bhutz | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.1 |
Component: | algebraic geometry | Keywords: | |
Cc: | paulfili, atowsley, pbruin | Merged in: | |
Authors: | Ben Hutz | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/bhutz/affine_patch_23807 (Commits) | Commit: | f143fb8c49a9d5ecc19d858fec501e3452ebfae6 |
Dependencies: | Stopgaps: |
Description (last modified by )
Currently
sage: PP = ProjectiveSpace(QQ,1) sage: AA = PP.affine_patch(0) sage: BB = PP.affine_patch(1) sage: AA is BB True
These patches are isomorphic, but should not be the same object.
Finally, since this ticket is discovering modification of global immutable objects, I think it deserves status "major" at the very least.
Change History (16)
comment:1 Changed 7 months ago by
- Branch set to u/bhutz/affine_patch_23807
- Commit set to f143fb8c49a9d5ecc19d858fec501e3452ebfae6
- Status changed from new to needs_review
comment:2 Changed 7 months ago by
I'm not sure that your solution is the best fix for the underlying problem. This discrepency seems strange:
sage: AffineSpace(QQ,2) is AffineSpace(QQ,2) True sage: ProjectiveSpace(QQ,2) is ProjectiveSpace(QQ,2) False
Maybe AffineSpace
shouldn't be unique, for the same reasons ProjectiveSpace
isn't?
comment:3 Changed 7 months ago by
- Description modified (diff)
- Priority changed from minor to major
- Status changed from needs_review to needs_work
Changing AffineScheme
(yes, it's that high up) to UniqueRepresentation
happened on:
https://trac.sagemath.org/ticket/17008
It was already predicted we'd be having unintended consequences from this and indeed here we are.
If we're keeping UniqueRepresentation?, then we must make AfffineSpace? immutable. In affine_patch
we have this line:
AA._default_embedding_index = i
(where AA is either a freshly constructed affine space or an affine space that was passed in). We can't make that change, so
- AA as an optional parameter to
affine_patch
must be removed _default_embedding_index
must be a construction parameter toAffineSpace
(if the default is "None" and the projective closure of the space is asked, a projective space should be created and the right affine patch should be forced into the projective space -- clearly projective spaces are caching their affine patches, not the other way around)- If affine space is going to be
UniqueRep
then it should probably cache projective space rather than the other way around. We already have a problem there now:sage: P2.<x,y,z>=ProjectiveSpace(QQ,2); sage: A=P2.affine_patch(2) sage: A.projective_embedding() == P2 False
(the variable names get lost)
- I would recommend using something along the lines of
names=[repr(a) for a in self.gens()] names.pop(i) AA = AffineSpace(n, self.base_ring(), names = names, default_embedding_index = i)
to get generator names. Given that we cannot make assumptions on what the variable names of the projective space are, I don't think we can do much else than just dropping the variable name whose hyperplane we're removing. If you do want to have a modification, perhaps something like
[repr(a)+"_aff" for a in self.gens()]
or so.
EDIT: I think we're forced to choose between the following:
- Ditch
UniqueRepresentation
onAffineSpace
- Ditch
default_embedding
entirely and accept that one cannot get back projective closures from affine patches (in fact that's already not really possible anyway). - make
ProjectiveSpace
also intoUniqueRepresentation
.
The reason is that with unique rep affine spaces, we're always going to have an issue with
P2a.<x,y,z>=ProjectiveSpace(QQ,2) Aa=P2a.affine_patch(0) P2b.<x,y,z>=ProjectiveSpace(QQ,2) Ab=P2b.affine_patch(0)
With UniqueRepresentation
, we're going to have that Aa is identical to Ab, because there's no info to distinguish the two, so there's no room for Aa
and Ab
to lead back to P2a
and P2b
respectively.
comment:4 follow-up: ↓ 5 Changed 7 months ago by
- Cc paulfili atowsley added
One correction as it affects choice (2): .projective_embedding is just the map, you are trying to compare to the codomain, and that works.
sage: P2.<x,y,z>=ProjectiveSpace(QQ,2); sage: A=P2.affine_patch(2) sage: A.projective_embedding().codomain() == P2 True }}
comment:5 in reply to: ↑ 4 Changed 7 months ago by
Replying to bhutz:
One correction as it affects choice (2): .projective_embedding is just the map, you are trying to compare to the codomain, and that works.
sage: P2.<x,y,z>=ProjectiveSpace(QQ,2); sage: A=P2.affine_patch(2) sage: A.projective_embedding().codomain() == P2 True }}
Thanks! In fact:
sage: A.projective_embedding().codomain() is P2 True
That's what I originally expected to hold. It shows that A caches P2 somewhere, which makes the global modification stuff even worse.
comment:6 follow-up: ↓ 7 Changed 7 months ago by
The affine_patch function and the projective_embedding function update a dictionary object associated to the affine or projective space for caching of these associations. So making those spaces immutable would totally break that functionality.
I personally don't like using unique representation here, but was not involved in the changes you cited that added that. Is it even reasonable to consider undoing that?
Alternatively, it seems perhaps going unique everywhere is the next alternative. But I'm not sure how hard it would be to preserve the connections between embeddings and patches.
comment:7 in reply to: ↑ 6 Changed 7 months ago by
Replying to bhutz:
I personally don't like using unique representation here, but was not involved in the changes you cited that added that. Is it even reasonable to consider undoing that?
Alternatively, it seems perhaps going unique everywhere is the next alternative. But I'm not sure how hard it would be to preserve the connections between embeddings and patches.
I think both are reasonable solutions. I don't like UniqueRepresentation
either, so I'd have a slight preference for the former.
comment:8 Changed 6 months ago by
Here is another unfortunate current behavior
sage: A.<x,y>=AffineSpace(QQ,2) sage: B.<x,y>=AffineSpace(QQ,2) sage: A is B True sage: X = A.subscheme(x-y) sage: Y = B.subscheme(x-y) sage: X is Y False
Similar example, but more complicated in construction
sage: P.<x,y,z>=ProjectiveSpace(QQ,2) sage: P2.<x,y,z> = ProjectiveSpace(QQ,2) sage: X = P.subscheme(x-y) sage: Y = P2.subscheme(x-y) sage: X == Y, X is Y True, False sage: A = X.affine_patch(0) sage: B = Y.affine_patch(0) sage: A == B, A is B, A.ambient_space() is B.ambient_space() (True, False, True) sage: B.projective_embedding().codomain() == Y True sage: B.projective_embedding().codomain() is Y False sage: B.projective_embedding().codomain().ambient_space() is P2 True
comment:9 Changed 6 months ago by
- Cc pbruin added
Adding Peter Bruin, since the effects seen here are a direct result of his modifications on #17008. He might have an idea on how to fix this.
Options I see:
- Undo
UniqueRepresentation
on (some) schemes and deal with ensuing problems in another way
- Go all-the-way on making schemes
UniqueRepresentation
, which is going to take a lot of design consideration because it means they *really* need to be immutable)
I'd expect that anything in between is going to be very hard to reason about (as it turns out: anything involving UniqueRepresentation
is hard to reason/program with in practice, because of Python's imperative streak: people get used to modifying objects)
EDIT: The real problem I see on #17008 is that consistent production of point sets relies on the fact that schemes.generic.scheme.point_homset
is cached. So, in SchemeHomset_generic.__reduce__
we need to do something along the lines of:
if <self> is the result of self.codomain().pointset(...): return PointSet,(self.codomain(),self.domain().ring_this_is_spec_of()) else: return what we did before
in order to give the cache a shot at kicking in. Once we do that, the UniqueRepresentation
from #17008 shouldn't be necessary for the problem encountered there.
Note that this applies to *all* cached routines: once you cache something, you should rewrite the pickler to allow a chance for the cache to do the lookup for you.
comment:10 Changed 6 months ago by
An argument FOR some UniqueRepresentation
: the construct AffineScheme
is really just Spec
. Every ring HAS a spec associated with it, so it would be defendable to make sure that the Spec
of a ring can be obtained from the ring reliably. This could be done by caching the thing on the ring and obtaining it from there, but since it's possible to have a ring without a spec but not its spec without the ring, the spec should probably be allowed to be GCed even when the ring is still around. A UniqueRepresentation
construction does that.
If we go this route, then obviously we need a unique polynomial ring for each affine patch for each projective space that we construct. I doubt we can key polynomial rings with projective spaces ... (that's a memory leak waiting to happen). The possibility for registering a new default embedding on a spec would need to go away.
comment:11 follow-up: ↓ 12 Changed 6 months ago by
Sure, I see that line of reasoning, and that is essentially what the branch on this ticket is doing (fixing the naming of the affine patches).
But if we go this route (which I'm not yet convinced of) then we'd still need to think about what things should be immutable and how that affects the connection between patches and embeddings. Not to mention fixing some of those examples involving subschemes which should also be unique by the same argument.
I'd still like to see peter's response concerning adding uniquerepresentation before making a plan.
comment:12 in reply to: ↑ 11 Changed 6 months ago by
Replying to bhutz:
Sure, I see that line of reasoning, and that is essentially what the branch on this ticket is doing (fixing the naming of the affine patches).
Yes, but see the example in comment:3. Just keying on variable names makes things slightly better, but not sufficient. Even putting the ProjectiveSpace? as part of the construction keys for the affine patches wouldn't save us because
sage: ProjectiveSpace(QQ,1)==ProjectiveSpace(QQ,1) True
UniqueRepresentation
is like a virus ...
But if we go this route (which I'm not yet convinced of) then we'd still need to think about what things should be immutable and how that affects the connection between patches and embeddings. Not to mention fixing some of those examples involving subschemes which should also be unique by the same argument.
I'm not so sure that the same argument applies: While subschemes of affine schemes may be affine again (and hence a spec of something), that's not how they are constructed in your examples. Since they are not primarily "the spec" of a ring to begin with, I don't think there's the same urgency. Obviously, spec of the coordinate ring would hopefully give the subscheme back, but I'm not sure we're supporting that, and I think we could live without that (for now).
That being said, it seems that having some schemes UR
and other not is going to be incredibly painful.
comment:13 follow-up: ↓ 14 Changed 6 months ago by
A useful way to store the patches/embeddings, which allows us to keep UniqueRepresentation
for affine schemes (I am strongly in favour of keeping unique representation for the reason Nils mentioned in comment;10) would be the "structure" construction that is currently used for number fields (introduced by Julian Rüth in #11670).
The idea of "structure" is the following: number fields have unique representation, but the defining data of a number field includes an optional description of how it relates to another number field. In this way, you can create a number field either on its own or (for example) as a subfield of a different number field. See sage/rings/number_field/structure.py
for details.
Analogously, we could use this construction to create, say, both an affine space on its own and a different affine space that is abstractly the same as the first one but is equipped with the extra structure of being an affine patch of a given projective space, or a closed subscheme of another scheme.
One advantage is that this would make it easier to reason about our objects, as well as to pickle them, because it is specified exactly what the defining data of a given scheme is. We do probably have to think about where/how to store the "structure" in such a way that we avoid memory leaks.
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 6 months ago by
Replying to pbruin:
A useful way to store the patches/embeddings, which allows us to keep
UniqueRepresentation
for affine schemes (I am strongly in favour of keeping unique representation for the reason Nils mentioned in comment;10) would be the "structure" construction that is currently used for number fields (introduced by Julian Rüth in #11670).
That construction is a complete memory leak. For instance:
sage: u=id(NumberField(x^2+1,name="a").absolute_field('b')) sage: import gc sage: gc.collect() 65 sage: [type(v) for v in gc.get_objects() if id(v)==u] [<class 'sage.rings.number_field.number_field.NumberField_quadratic_with_category'>]
This leaks for the following reason. If we do:
sage: K.<a>=NumberField(x^2+1) sage: L.<b>=K.absolute_field() sage: L._structure._reduction[1][0] is K True
We see that L
has a strong link to K
. So L
will now keep K
alive. What's more, via sage.rings.number_field.number_field.NumberField_version2._cache
there's now a strong global link to K
as a key in a WeakValueDictionary
with value L
, so as long as L
is alive, K
will be reachable (and not just part of a cycle).
So, if K
somehow references L
then K
and L
will not be collected. And here it is:
sage: K._cache__absolute_field.values()[0] is L True
Note that in principle it's fine for L
and K
to reference each other: the cyclic garbage collector will find those dead cycles just fine. The problem is that by using any of L
and K
as part of a key in a UniqueFactory
or similar, you're getting a strong reference to the cycle. So the key now keeps the value in the WeakValueDictionary
alive.
An important part of the problem is a double caching strategy: UniqueFactory
already provides caching. absolute_field
implements its own cache. With the introduction of structure
the first caching has been made a little better. But that now clashes with the second caching.
The situation might be slightly better if absolute_field
were non-caching, but it could well be that the required computations to arrive at the key under which the other cache operates are now too expensive. One of the reasons why UniqueRepresentation
etc. is so insidious is because it forces a caching strategy in one place, which precludes caching from being used elsewhere.
In this case a solution could be to implement absolute_field
as:
@cached_method def compute_absolute_field_data(...): ... def absolute_field(...): data = self.compute_absolute_field_data(...) L = NumberField(data,...) return L
so that the expensive data gets stored without creating or referencing the field that can be constructed with it.
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 6 months ago by
Replying to nbruin:
Replying to pbruin:
A useful way to store the patches/embeddings, which allows us to keep
UniqueRepresentation
for affine schemes (I am strongly in favour of keeping unique representation for the reason Nils mentioned in comment;10) would be the "structure" construction that is currently used for number fields (introduced by Julian Rüth in #11670).That construction is a complete memory leak.
Ouch. It seems that neither the author nor the reviewers of #11670 (including me) paid attention to memory leaks.
In this case a solution could be to implement
absolute_field
as:@cached_method def compute_absolute_field_data(...): ... def absolute_field(...): data = self.compute_absolute_field_data(...) L = NumberField(data,...) return Lso that the expensive data gets stored without creating or referencing the field that can be constructed with it.
This is already how absolute_field()
is implemented; data
is the absolute defining polynomial, which is computed and cached by the method absolute_polynomial()
. Removing the @cached_method
decorator from NumberField_{generic,relative}.absolute_field()
solves the memory leak in your example:
sage: u=id(NumberField(x^2+1,name="a").absolute_field('b')) sage: import gc sage: gc.collect() 53 sage: [type(v) for v in gc.get_objects() if id(v)==u] []
comment:16 in reply to: ↑ 15 Changed 6 months ago by
Replying to pbruin:
Replying to nbruin:
Replying to pbruin:
A useful way to store the patches/embeddings, which allows us to keep
UniqueRepresentation
for affine schemes (I am strongly in favour of keeping unique representation for the reason Nils mentioned in comment;10) would be the "structure" construction that is currently used for number fields (introduced by Julian Rüth in #11670).That construction is a complete memory leak.
Ouch. It seems that neither the author nor the reviewers of #11670 (including me) paid attention to memory leaks.
Whenever you see a CachedRepresentation
or a UniqueFactory
that has included in its input parameters objects that are similar to what is returned, you can expect memory leaks (and I really mean, you should expect to be able to easily create an example as above). That is one of my main reasons to not like it. It's very difficult to use it in non-trivial examples and not create memory leaks. Basically, input to these constructors should consist of just strings and integers. If you can't do that, it's a good sign you should perhaps not use it, because someone will create a memory leak with it in the near future.
Fix suggested above now at #23851.
New commits:
23807: make affine patches distinct objects