Opened 7 years ago
Closed 7 years ago
#14054 closed enhancement (fixed)
Cythoned UniqueRepresentation
Reported by: | SimonKing | Owned by: | tbd |
---|---|---|---|
Priority: | major | Milestone: | sage-5.8 |
Component: | performance | Keywords: | cython UniqueRepresentation |
Cc: | nthiery | Merged in: | sage-5.8.beta4 |
Authors: | Simon King | Reviewers: | Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #14017, #6495, #14182, #14040, #14011 | Stopgaps: |
Description (last modified by )
UniqueRepresentation
provides a comfortable way to create unique parent structures, and automatically provides a hash and certain comparison methods.
Problems:
- It relies on a metaclass, namely
ClasscallMetaclass
and thus has to be a Python class. That's bad for speed. - It combines two features, namely a cache and unique instance behaviour. But in many some cases we want a cache so that two distinct instances can still be equal.
Here, I suggest to
- separate the two features, by creating a new class
sage.unique_representation.CachedRepresentation
as one base ofUniqueRepresentation
. - create a new cdef class
sage.misc.fast_methods.WithRichCmpById
, that provides hash and rich comparison, as expected for unique instance behaviour.
Apply
Attachments (5)
Change History (94)
Changed 7 years ago by
comment:1 Changed 7 years ago by
- Status changed from new to needs_review
comment:2 Changed 7 years ago by
- Cc nthiery added
Cc to Nicolas as the original author of UniqueRepresentation
.
comment:3 Changed 7 years ago by
I just did some experiments with the tp_hash function from Python's C-API: It is 1/3 faster than a cythoned hash. I'll try to do similar things with the comparison methods.
comment:4 Changed 7 years ago by
While we are at speeding up UniqueRepresentation
, I think we should actually refactor it. Namely, UniqueRepresentation
serves two purposes, namely (1) caching, and (2) comparison and hash by identity.
Some classes misuse UniqueRepresentation
by only using feature (1), overriding comparison in a way that violates the unique representation condition. See my monologue at sage-devel.
I suggest to split the two features.
comment:5 Changed 7 years ago by
FWIW, I finished a more experimental and rather intrusive version of UniqueRepresentation
.
Idea:
- Create a cdef function, that results in faster C code than what Cython makes of
def __hash__(self): return id(self)
- Override tp_hash with this function, for every instance of
UniqueRepresentation
. Likewise for tp_richcompare.
It remains possible to override those parts of comparison that can't be decided by looking at identity (such as "a<b" if "a is not b").
It would be great to just define the fast hash and comparison for UniqueRepresentation
itself, but alas it seems that subclasses forget these settings, whether they override __hash__
or not. See the comments on sage-devel.
I still think it is a good idea to separate UniqueRepresentation
from CachedRepresentation
, but I am not so sure about enforcing the uniqueness behaviour, without the possibility to override it---this wouldn't be pythonic...
Let the patchbot do some work:
Apply trac14054_fast_methods.patch
comment:6 Changed 7 years ago by
From sage-devel, I understand that people think that separating the cache feature from the uniqueness feature of UniqueRepresentation
is a good idea.
However, in my old patch, I was enforcing uniqueness behaviour for instances of UniqueRepresentation
. This isn't pythonic. Hence, I do differently in the new patch version.
Question
In the current implementation, inheritance from UniqueRepresentation
will overload rich comparison (==, >=, !=, etc.) inherited from a base class, but it will not overload comparison (cmp). Do you think that both should be overloaded?
Patchbot reported two failures with the previous patch version. I guess that's because of an additional dependency. So, as soon as I have a decent internet connection, I'll download the latest beta, and rebase on top of it.
Apply trac14054_fast_methods.patch
comment:7 Changed 7 years ago by
Other question: Does provide_hash_by_id
really make sense to have?
comment:8 Changed 7 years ago by
The updated patch should make the coverage script happy, but three tests with 5.7.rc0 currently fail. I am downloading 5.7.rc0 now.
Apply trac14054_fast_methods.patch
comment:9 follow-up: ↓ 10 Changed 7 years ago by
I am not qualified to look over this ticket, but glancing at it I spotted this
308 complete = complete = self.complete()
which looks like a typo.
comment:10 in reply to: ↑ 9 Changed 7 years ago by
Replying to jlopez:
I am not qualified to look over this ticket, but glancing at it I spotted this
308 complete = complete = self.complete()which looks like a typo.
Yes, thank you for spotting it! Will remove it when I rebase the patch against 5.7.rc0.
comment:11 Changed 7 years ago by
- Reviewers set to Travis Scrimshaw
Hey Simon,
Let me know when this is ready for review again.
Best,
Travis
comment:12 follow-up: ↓ 14 Changed 7 years ago by
- Description modified (diff)
The patch should now be ready for review.
Two questions that I'd like to have addressed:
- I introduce
provide_hash_by_id
, but I don't use it. Shall it be deleted? - Is it enough to override the rich comparison methods? Currently, one can have
hash(a)!=hash(b)
butcmp(a,b)==0
. Does this violate the contract of hash functions? Or is it enough thathash(a)!=hash(b)
impliesa!=b
?
Pathbot:
Apply trac14054_fast_methods.patch
comment:13 Changed 7 years ago by
To answer the question about the hash contract:
Apparently dictionaries use the rich comparison methods and not cmp:
sage: class Bla(object): ....: def __hash__(self): ....: return 2 ....: def __cmp__(self, other): ....: return 0 ....: def __eq__(self, other): ....: return self is other ....: def __ne__(self, other): ....: return self is not other ....: sage: a = Bla() sage: b = Bla() sage: a==b False sage: hash(a)==hash(b) True sage: cmp(a,b) 0 sage: D = {a:1}
Since hash(a)==hash(b)
, a and b belong to the same hash bucket of the dictionary. And comparison by cmp tells that they are equal. But we still have:
sage: D[b] --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-8-6cf1ee6b63d5> in <module>() ----> 1 D[b] KeyError: <__main__.Bla object at 0x5064310>
Hence, dictionaries use comparison by ==
, and thus my patch does the right thing, IMHO.
comment:14 in reply to: ↑ 12 ; follow-up: ↓ 15 Changed 7 years ago by
Replying to SimonKing:
- I introduce
provide_hash_by_id
, but I don't use it. Shall it be deleted?
Yes, you should. I'm pretty sure changing type->tp_hash
after calling PyType_Ready()
is not supported by the Python C-API (see the inheritance problems we saw).
- Is it enough to override the rich comparison methods? Currently, one can have
hash(a)!=hash(b)
butcmp(a,b)==0
.
Does this violate the contract of hash functions? Or is it enough that
hash(a)!=hash(b)
impliesa!=b
?
I think Python requires hash(a)!=hash(b)
implies not(a==b)
, which Python does not enforce to be the same thing. In any case, I'm pretty sure that "rich comparison" is fully exhausted before trying to use __cmp__
, which is only there because of backward compatibility.
There's another peculiarity for membership and lookup in python:
sage: class neq(object): ....: def __eq__(self,other): ....: return False ....: def __ne__(self,other): ....: return True ....: sage: a=neq() sage: V={a} sage: a in V True sage: sage: [a == v for v in V] [False]
They explicitly mention this in the documentation: because hash collisions are rare, they first test "is" for dict lookup before trying "==" (it's cheaper).
comment:15 in reply to: ↑ 14 Changed 7 years ago by
Replying to nbruin:
Replying to SimonKing:
- I introduce
provide_hash_by_id
, but I don't use it. Shall it be deleted?Yes, you should.
OK.
Does this violate the contract of hash functions? Or is it enough that
hash(a)!=hash(b)
impliesa!=b
?I think Python requires
hash(a)!=hash(b)
impliesnot(a==b)
,
Right, that's a difference.
In any case, I'm pretty sure that "rich comparison" is fully exhausted before trying to use
__cmp__
, which is only there because of backward compatibility.
Nope! See my post on sage-devel.
If you have a non-cdef class, then it seems __richcmp__
is simply ignored (but methods like __eq__
have precedence over __cmp__
). In a cdef class, __richcmp__
has priority over __cmp__
when deciding binary relations such as ==, <, <=, etc. But for cmp(*,*), __cmp__
will be used, even if there is __richcmp__
!
So, the decisive question is: Do dictionaries compare stuff by cmp(a,b)==0 or by a==b?
It is the latter, and thus making __richcmp__
compatible with __hash__
was the right thing to do.
For now, it needs work, because I will delete the C API hack, and apparently some script of the patchbot has a complaint...
comment:16 Changed 7 years ago by
Done!
It seems that the patchbot complains about an increased startup time. How did that happen? Is it because of a slightly longer mro for UniqueRepresentation
?
Patchbot:
Apply trac14054_fast_methods.patch
comment:17 follow-up: ↓ 18 Changed 7 years ago by
What's your rationale to define 'a < b' etc. when a is b
? I agree that it's likely that your answers are what any ordering override will want, but Python does not mandate any particular properties of semantics of these operators. So I think the proper thing is to just not do anything there. It's going to be a rare occasion anyway, so a little speed gain won't be very noticeable anyway.
The behaviour you are implementing is much easier to explain if you don't override those cases: Then you can just say that WithRichCmpById
provides __hash__
, __eq__
and __ne__
, period.
I think WithEqualityById
would make a better conceptual name that is less dependent on the implementation, by the way. Your motivation here seems to be as much to provide conceptual units in preference of just technical implementation tools as to provide more efficient implementations. Someone writing a Python class may not know what "richcomp" is, and doesn't need to.
comment:18 in reply to: ↑ 17 ; follow-up: ↓ 20 Changed 7 years ago by
Hey,
Replying to nbruin:
What's your rationale to define 'a < b' etc. when
a is b
? I agree that it's likely that your answers are what any ordering override will want, but Python does not mandate any particular properties of semantics of these operators. So I think the proper thing is to just not do anything there. It's going to be a rare occasion anyway, so a little speed gain won't be very noticeable anyway.
So how would you want comparisons in the complex numbers to behave? In python, they return an error:
sage: complex(2+2*i) < complex(2-2*i) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-2-ce2e269601f6> in <module>() ----> 1 complex(Integer(2)+Integer(2)*i) < complex(Integer(2)-Integer(2)*i) TypeError: no ordering relation is defined for complex numbers
However in sage they are not so well-behaved (see #14088):
sage: CC(1+2*i) < CC(2-2*i) True
so there's no prescribed sage way.
As far as the actual comparison code, I think it would be better to do something like
if m == 2: return True elif m == 3: return False else: return m == 1 or m == 5
but this is a micro-optimization, so feel free to ignore this.
The behaviour you are implementing is much easier to explain if you don't override those cases: Then you can just say that
WithRichCmpById
provides__hash__
,__eq__
and__ne__
, period.I think
WithEqualityById
would make a better conceptual name that is less dependent on the implementation, by the way. Your motivation here seems to be as much to provide conceptual units in preference of just technical implementation tools as to provide more efficient implementations. Someone writing a Python class may not know what "richcomp" is, and doesn't need to.
I believe the response to both of these points depends on how we want to do comparison between objects (when there is not a [natural] ordering).
Last thing for now, shouldn't we issue a deprecation warning for FastHashable_class
since it has changed locations?
Thanks,
Travis
comment:19 Changed 7 years ago by
The issue is that if someone wants to define
class C(WithEqualityById): def __cmp__(self,other): return -1
they will find that
sage: a=C() sage: a < a False
which might surprise them, because they thought that WithEqualityById
only affected (in)equality testing and let ordering comparisons fall through to __cmp__
if implemented. With the current code, this is not the case if the two arguments are identical.
There have been extensive discussions about what inequality testing SHOULD be in sage and for the most part the implementation here is staying clear of the topic (which I think is a good thing). There's just this one small optimization that's probably usually OK, but if left out makes for much more predictable behaviour.
comment:20 in reply to: ↑ 18 ; follow-up: ↓ 21 Changed 7 years ago by
Replying to nbruin:
What's your rationale to define 'a < b' etc. when
a is b
? I agree that it's likely that your answers are what any ordering override will want, but Python does not mandate any particular properties of semantics of these operators.
Python does not. But I think "comparison by identity" implies a semantics. And that is: a < a
is False, a <= a
is True.
The behaviour you are implementing is much easier to explain if you don't override those cases: Then you can just say that
WithRichCmpById
provides__hash__
,__eq__
and__ne__
, period.
OK.
I think
WithEqualityById
would make a better conceptual name that is less dependent on the implementation, by the way.
Good idea.
Replying to tscrim:
Last thing for now, shouldn't we issue a deprecation warning for
FastHashable_class
since it has changed locations?
I think I introduced it. But when? Or what ticket? I don't think anyone used it.
comment:21 in reply to: ↑ 20 Changed 7 years ago by
Replying to SimonKing:
The behaviour you are implementing is much easier to explain if you don't override those cases: Then you can just say that
WithRichCmpById
provides__hash__
,__eq__
and__ne__
, period.OK.
I think
WithEqualityById
would make a better conceptual name that is less dependent on the implementation, by the way.Good idea.
Note: I would not like the name "WithCmpById
", because cmp
relies on __cmp__
(which is not touched) even if __richcmp__
exists.
Replying to tscrim:
Last thing for now, shouldn't we issue a deprecation warning for
FastHashable_class
since it has changed locations?I think I introduced it. But when? Or what ticket? I don't think anyone used it.
It was in #11900. We could of course keep it in sage.categories.category_singleton, but I think that's not the right place. Note that I remove the dependency of CategorySingleton
on FastHashable_class
, because the new cythoned hash is slightly faster.
comment:22 Changed 7 years ago by
- Dependencies changed from #14017 to #14017, #6495
- Description modified (diff)
This will not go into 5.7 anyway. Hence, I was rebasing against #6495 (which is a minor change anyway). In addition, I took into account your comments:
- I rename the new class into
WithEqualityById
. - I move
FastHashable_class
back into category_singleton (wrong as this may be...)
I hope this addresses all complaints.
Apply trac14054_fast_methods-5.8.patch
comment:23 Changed 7 years ago by
Oops, I forgot to rename two occurrences of WithRichCmpById
into WithEqualityById
...
Should now pass all tests.
Apply trac14054_fast_methods-5.8.patch
comment:24 Changed 7 years ago by
Strange. I can not replicate the error reported by the patchbot.
comment:25 follow-up: ↓ 26 Changed 7 years ago by
Neither can I by hand or testing the individual files. It has to be something since the testbot can reproduce it... Have you tried running testall by chance? I ran tests in the categories
folder than on free_module.py
and that passed.
It must have something to testing things in the right order...I'm wondering if the problem lies with free_module.tensor_constructor()
being a cached_method and something is being improperly stored...
comment:26 in reply to: ↑ 25 Changed 7 years ago by
Replying to tscrim:
It must have something to testing things in the right order...I'm wondering if the problem lies with
free_module.tensor_constructor()
being a cached_method and something is being improperly stored...
The failing assertion is
File "/mnt/storage2TB/patchbot/Sage/sage-5.8.beta0/local/lib/python/site-packages/sage/categories/modules_with_basis.py", line 1387, in __call__ assert(x.parent() is self.domain())
I can hardly imagine that there is yet another premature deallocation via weak references in play. After all, both x._parent
and self._domain
are just plain strong references.
comment:27 follow-up: ↓ 28 Changed 7 years ago by
This parent mismatch happens in a morphism call. If a morphism gets registered in the wrong slot in a conversion/coercion then it could indeed end up getting mismatched input (shouldn't __call__
be a little more permissive about its arguments by the way? Certainly, an error would be more appropriate than an assert if this is something that actually can go wrong)
Anyway, TripleDict
does store homomorphisms and we've recently seen it's not entirely safe wrt. garbage collections that can happen during update procedures. That can cause homomorphisms to get registered under the wrong domain/codomain keys.
I think it's extremely unlikely that this is happening in this particular situation, but if all other conceivable alternatives are impossible ... (since the corruption would be dependent on a GC happening at just the wrong time, it would be extremely fickle behaviour that is very hard to reproduce).
Of course, if this test is using a map that wasn't retrieved from a TripleDict
or a MonoDict
the above explanation is impossible.
comment:28 in reply to: ↑ 27 ; follow-up: ↓ 29 Changed 7 years ago by
Replying to nbruin:
This parent mismatch happens in a morphism call. If a morphism gets registered in the wrong slot in a conversion/coercion then it could indeed end up getting mismatched input
If it is in a wrong slot (you mean: Addressed by a non-identic copy of a parent?) then it should simply not occur here, because TripleDict
would compare by identity.
(shouldn't
__call__
be a little more permissive about its arguments by the way? Certainly, an error would be more appropriate than an assert if this is something that actually can go wrong)
No, I think an assertion is the right thing to do here. Perhaps one could provide the assertion with an error message that names the two parents that don't match. In that way, we could at least see what two parents are involved.
Note that my patch changes CombinatorialFreeModule
from UniqueRepresentation
to CachedRepresentation
, since it overrides the equality tests. So, it could actually be that the two parents are genuinely non-unique. And since a cached_method is involved here, which does comparison by equality and not identity, we could really be in trouble here.
Of course, it could be that changing to CachedRepresentation
was wrong in this case.
comment:29 in reply to: ↑ 28 Changed 7 years ago by
Replying to SimonKing:
If it is in a wrong slot (you mean: Addressed by a non-identic copy of a parent?) then it should simply not occur here, because
TripleDict
would compare by identity.
No, what I meant is: bucket position to write value into gets determined, dictionary gets changed due to a GC, value gets placed into bucket position determined earlier, which now belongs to an entirely different key triple. Without #13387 we couldn't strictly exclude that from happening, but it would need extreme bad luck.
Your hypothesis sounds much more probable.
comment:30 follow-up: ↓ 31 Changed 7 years ago by
By the way: One reason for removing FastHashable_class
from sage.categories.category_singleton is the fact that the hash inherited from the cythoned version of UniqueRepresentation
is faster than what was provided by FastHashable_class
.
comment:31 in reply to: ↑ 30 ; follow-up: ↓ 32 Changed 7 years ago by
Replying to SimonKing:
By the way: One reason for removing
FastHashable_class
from sage.categories.category_singleton is the fact that the hash inherited from the cythoned version ofUniqueRepresentation
is faster than what was provided byFastHashable_class
.
I didn't want it to not be removed, but I thought it was suppose to have a deprecation warning saying the class had moved/changed namespaces?
comment:32 in reply to: ↑ 31 ; follow-ups: ↓ 33 ↓ 37 Changed 7 years ago by
Replying to tscrim:
I didn't want it to not be removed, but I thought it was suppose to have a deprecation warning saying the class had moved/changed namespaces?
Would you mind to just leave it in its original space (being an orphan, though)?
Otherwise: How does one deprecate a class that has no init method?
comment:33 in reply to: ↑ 32 Changed 7 years ago by
Replying to SimonKing:
Otherwise: How does one deprecate a class that has no init method?
"Had" no init method, I should say.
Anyway, if I move it to a different file, then the import statement "from sage.categories.category_singleton import FastHashable_class
" should result in a deprecation warning. How can this be achieved?
comment:34 Changed 7 years ago by
- Dependencies changed from #14017, #6495 to #14017, #6495, #14159
Let's test whether the safer use of callbacks for weak references to Homsets stored in a TripleDict
from #14159 will fix the problem here.
Apply http://trac.sagemath.org/sage_trac/raw-attachment/ticket/14054/trac14054_fast_methods-5.8.patch
comment:35 Changed 7 years ago by
Patchbot,
Apply trac14054_fast_methods-5.8.patch
comment:36 Changed 7 years ago by
Hooray, I can reproduce the assertion error (it only occurs with the patch from here)! So, that should make it possible to debug it.
comment:37 in reply to: ↑ 32 ; follow-ups: ↓ 38 ↓ 39 Changed 7 years ago by
Replying to SimonKing:
Would you mind to just leave it in its original space (being an orphan, though)?
Otherwise: How does one deprecate a class that has no init method?
What I've done is turned the class into a function and have that function return the desired class after issuing the deprecation warning. I guess an alternative would be to implement a custom call which does the same as above.
How are you reproducing the assertion error?
comment:38 in reply to: ↑ 37 Changed 7 years ago by
comment:39 in reply to: ↑ 37 Changed 7 years ago by
Replying to tscrim:
Replying to SimonKing: What I've done is turned the class into a function and have that function return the desired class after issuing the deprecation warning.
Perhaps like this:
sage: from sage.misc.superseded import deprecated_function_alias sage: g = deprecated_function_alias(13109, number_of_partitions)
comment:40 Changed 7 years ago by
I hesitate to do things like
from sage.misc.superseded import deprecated_function_alias from sage.misc.fast_methods import FastFashable_class as FH_class FastHashable_class = deprecated_function_alias(14054,FH_class)
in sage.categories.category_singleton:
- It adds an unneeded import statement
FastHashable_class
is a base class, and hence one would like to inherit from it. But doing the above,FastHashable_class
would not be a class but a function in sage.categories.category_singleton.- In its now deprecated use, it was necessary to cimport (not just import!) the class. That will be impossible, no matter what way of deprecation warning we use, because deprecation warnings won't work at compile time.
Is there a "lazy import statement with deprecation"? Then, one would not actually import FastHashable_class
in category_singleton, but from sage.categories.category_singleton import FastHashable_class
would result in a deprecation warning. And we can't cope with the compile time cimport problem anyway.
comment:41 follow-up: ↓ 42 Changed 7 years ago by
Replying to SimonKing:
Perhaps like this:
sage: from sage.misc.superseded import deprecated_function_alias sage: g = deprecated_function_alias(13109, number_of_partitions)
That might work, but I haven't tried it (in principle, I don't see why it wouldn't). What I've done is like this:
def DeprecatedClass(self, arg1, arg2, ..., *old_args, **old_kwds): from sage.misc.superseded import deprecation deprecation(14054, "DepC is deprecated. Use path.to.new_mod.NewClass instead") # Do whatever needs to be done to convert to NewClass's inputs from path.to.new_mod import NewClass return NewClass(arg1, arg2, ..., *old_args, **old_kwds)
However that's a good point about the inheritance...perhaps something like
class Dep(NewClass): def __init__(self, arg1, arg2, ..., *args, **kwds): from sage.misc.superseded import deprecation deprecation(14054, "DepC is deprecated. Use path.to.new_mod.NewClass instead") NewClass.__init__(self, arg1, arg2, ..., *args, **kwds)
?
comment:42 in reply to: ↑ 41 ; follow-up: ↓ 43 Changed 7 years ago by
Replying to tscrim:
However that's a good point about the inheritance...perhaps something like
class Dep(NewClass): def __init__(self, arg1, arg2, ..., *args, **kwds): from sage.misc.superseded import deprecation deprecation(14054, "DepC is deprecated. Use path.to.new_mod.NewClass instead") NewClass.__init__(self, arg1, arg2, ..., *args, **kwds)?
Well, that would still mean that we need to actually import it.
I am tempted to ask what sage-devel has to say about this issue.
comment:43 in reply to: ↑ 42 ; follow-up: ↓ 44 Changed 7 years ago by
Replying to SimonKing:
Well, that would still mean that we need to actually import it.
I thought if we turn it into a python class, we wouldn't need to cimport it and it would only be imported if the old class was actually used? (To me honest, part of me is still wondering if we even really need a deprecation warning since it is such a low-level base class.)
I am tempted to ask what sage-devel has to say about this issue.
That might be for the best.
comment:44 in reply to: ↑ 43 Changed 7 years ago by
Replying to tscrim:
Replying to SimonKing:
Well, that would still mean that we need to actually import it.
I thought if we turn it into a python class, we wouldn't need to cimport it
Its original use requires cimport. Namely, it had no __init__
, and setting the hash value requires writing into the cdef attribute _hash
, thus we need cimport. Now, I added an init method---hence, now an import is enough.
(To me honest, part of me is still wondering if we even really need a deprecation warning since it is such a low-level base class.)
Same here...
comment:45 follow-up: ↓ 46 Changed 7 years ago by
Back to the failing assertion:
By inserting information on the mismatching parents, I find that indeed we get two distinct instances of the same parent Free module generated by {1, 2, 3, 4} over Integer Ring
.
So, how is it possible to create these distinct instances? Aha! They have indeed been created in different ways. Namely, by printing the _reduction
attribute that is set during classcall, I found that x.parent() is created with the arguments
(Integer Ring, {1, 2, 3, 4}), {'category': Category of modules with basis over Integer Ring, 'prefix': 'y'})
but self.domain() is created using
(Integer Ring, {1, 2, 3, 4}), {'category': Category of modules with basis over Integer Ring, 'prefix': 'G'})
comment:46 in reply to: ↑ 45 Changed 7 years ago by
Replying to SimonKing:
(Integer Ring, {1, 2, 3, 4}), {'category': Category of modules with basis over Integer Ring, 'prefix': 'y'})(Integer Ring, {1, 2, 3, 4}), {'category': Category of modules with basis over Integer Ring, 'prefix': 'G'})
That's astounding! There is no example in modules_with_basis.py which uses prefix 'y' or prefix 'G'. So, no idea where the two distinct but equal instances comes from.
comment:47 Changed 7 years ago by
Argh, sorry for my preceding post. It would have been needed to look at sage/combinat/free_module.py, not at sage/categories/modules_with_basis.py, where the assertion error is raised.
comment:48 Changed 7 years ago by
We have
def _repr_term(self, term): """ TESTS:: sage: F = CombinatorialFreeModule(ZZ, [1,2,3], prefix="F") sage: G = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix="G") sage: f = F.monomial(1) + 2 * F.monomial(2) sage: g = 2*G.monomial(3) + G.monomial(4) sage: tensor([f, g]) # indirect doctest 2*F[1] # G[3] + F[1] # G[4] + 4*F[2] # G[3] + 2*F[2] # G[4] """
and
def _latex_term(self, term): """ TESTS:: sage: F = CombinatorialFreeModule(ZZ, [1,2,3], prefix='x') sage: G = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix='y') sage: f = F.monomial(1) + 2 * F.monomial(2) sage: g = 2*G.monomial(3) + G.monomial(4) sage: latex(tensor([f, g])) # indirect doctest 2x_{1} \otimes y_{3} + x_{1} \otimes y_{4} + 4x_{2} \otimes y_{3} + 2x_{2} \otimes y_{4} """
Apparently, this is where the two equal free modules generated by {1, 2, 3, 4} over Integer Ring with prefix 'G' respectively 'y' are defined.
Now, I see what is happening.
We have two distinct but equal parents, and the tensor construction uses a strong cache (hence, has a side-effect across tests) using comparison by equality. But later, we want comparison by identity. Boom. I'll ask sage-combinat-devel for advice.
comment:49 Changed 7 years ago by
Yessss! The following is without the patch. Hence, what we see here is an existing bug, that did not show up by mere coincidence (different ordering of the doctests)!!
I can trigger it on the command line (with sage-5.6.rc0 without the patch from here):
sage: F = CombinatorialFreeModule(ZZ, [1,2,3], prefix="F") sage: G = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix="G") sage: f = F.monomial(1) + 2 * F.monomial(2) sage: g = 2*G.monomial(3) + G.monomial(4) sage: tensor([f, g]) # indirect doctest 2*F[1] # G[3] + F[1] # G[4] + 4*F[2] # G[3] + 2*F[2] # G[4] sage: F = CombinatorialFreeModule(ZZ, [1,2,3], prefix='x') sage: G = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix='y') sage: f = F.monomial(1) + 2 * F.monomial(2) sage: g = 2*G.monomial(3) + G.monomial(4) sage: latex(tensor([f, g])) # indirect doctest Traceback (most recent call last): ... /home/simon/SAGE/prerelease/sage-5.6.rc0/local/lib/python2.7/site-packages/sage/categories/modules_with_basis.pyc in __call__(self, *args) 1385 after = args[self._position+1:len(args)] 1386 x = args[self._position] -> 1387 assert(x.parent() is self.domain()) 1388 1389 if self._is_module_with_basis_over_same_base_ring: AssertionError:
comment:50 follow-up: ↓ 51 Changed 7 years ago by
So then this ticket does not depend on #14159, correct?
comment:51 in reply to: ↑ 50 Changed 7 years ago by
- Dependencies changed from #14017, #6495, #14159 to #14017, #6495
- Status changed from needs_review to needs_work
- Work issues set to Fix non-uniqueness trouble of combinatorial free modules
comment:52 Changed 7 years ago by
Back to the deprecation: Volker Braun stated on sage-devel: "Deprecation is for user-visible functionality. No need to deprecate hash implementation details, otherwise it'll be total and immediate development standstill."
So, I think no deprecation for removing FastHashable_class
should be needed.
comment:53 Changed 7 years ago by
Alright. It's good to know exactly when we don't need a deprecation.
comment:54 Changed 7 years ago by
I found:
- Before this patch,
CombinatorialFreeModule
did inherit fromUniqueRepresentation
, but destroyed the uniqueness property. - With my changes to
UniqueRepresentation
,CombinatorialFreeModule
behaves as follows, if one keeps inheriting fromUniqueRepresentation
:sage: G = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix="G") sage: y = CombinatorialFreeModule(ZZ, [1,2,3,4], prefix='y') sage: G is y False sage: G == y False sage: G <= y True sage: y <= G True
Hence, while the uniqueness of the oldUniqueRepresentation
was destroyed, it is preserved in the new version. - Since uniqueness is preserved, the bug exposed in comment:49 has actually been fixed!
- However, since I wanted to preserve the old behaviour, I made
CombinatorialFreeModule
inherit fromCachedRepresentation
, not fromUniqueRepresentation
. In this way, the original bug was re-established. And since by coincidence the order of tests changed, the bug surfaced.
I am now trying if anything breaks if one returns to the old layout of CombinatorialFreeModule
.
comment:55 follow-up: ↓ 56 Changed 7 years ago by
Very minor nitpick: I think providing <=
and >=
even for identical inputs is not quite the correct thing to do. Python has objects for which those are false or for which those raise an exception:
sage: a=float('NaN') sage: a <= a False sage: b=complex(1,2) sage: b<=b TypeError: no ordering relation is defined for complex numbers
In python 3 this is even the default:
Python 3.2.3 (default, Jun 8 2012, 05:40:07) [GCC 4.6.3 20120306 (Red Hat 4.6.3-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> a=object() >>> a<=a TypeError: unorderable types: object() <= object() >>> a==a True >>> a is a True
I think not providing (a <= a) == True
for UniqueRepresentation
is less likely to cause surprises in the future.
comment:56 in reply to: ↑ 55 Changed 7 years ago by
Replying to nbruin:
I think not providing
(a <= a) == True
forUniqueRepresentation
is less likely to cause surprises in the future.
OK, this should be the next thing I'll test.
comment:57 Changed 7 years ago by
Using UniqueRepresentation
on CombinatorialFreeModule
works mostly fine. There were only some error in tests that were testing the absence of the unique parent condition (so, these are tests against behaviour I would like to fix!), and also there was some error in my new coercion tutorial.
comment:58 Changed 7 years ago by
- Dependencies changed from #14017, #6495 to #14017, #6495, #14182
- Status changed from needs_work to needs_review
I think I fixed all the remaining issues. This includes a fix to my recently added coercion tutorial (the number of inherited methods in two examples changes).
Concerning combinatorial free modules: They used to violate the unique parent condition, even though they inherited from UniqueRepresentation
. I have demonstrated that this was a bug, when combining the failure to be unique with a cache on the tensor product construction. It is automatically fixed with the changes to UniqueRepresentation
introduced in this patch, and gave rise to a new doctest.
Needs review!
Apply trac14054_fast_methods-5.8.patch
comment:59 Changed 7 years ago by
PS: The new dependency #14182 is because of the change to the coercion tutorial.
Apply trac14054_fast_methods-5.8.patch
comment:60 Changed 7 years ago by
- Status changed from needs_review to positive_review
Looks good to me now.
Thank you Simon,
Travis
comment:61 Changed 7 years ago by
- Work issues Fix non-uniqueness trouble of combinatorial free modules deleted
comment:62 Changed 7 years ago by
- Dependencies changed from #14017, #6495, #14182 to #14017, #6495, #14182, #14040
This should be rebased to #14040.
comment:63 Changed 7 years ago by
Done. I only needed to change the expected result for one example in the coercion tutorial: The number of available methods for matrix spaces changes with #14040.
comment:64 Changed 7 years ago by
- Merged in set to sage-5.8.beta3
- Resolution set to fixed
- Status changed from positive_review to closed
comment:65 Changed 7 years ago by
- Merged in sage-5.8.beta3 deleted
- Milestone changed from sage-5.8 to sage-5.9
- Resolution fixed deleted
- Status changed from closed to new
This causes a doctest failure on Solaris SPARC 32-bit (Skynet machine mark
):
sage -t --long -force_lib devel/sage/sage/combinat/rigged_configurations/rigged_configurations.py ********************************************************************** File "/home/buildbot/build/sage/mark-1/mark_full/build/sage-5.8.beta3/devel/sage-main/sage/combinat/rigged_configurations/rigged_configurations.py", line 48: sage: RC.list() Expected: [ (/) <BLANKLINE> (/) <BLANKLINE> (/) , (/) <BLANKLINE> -1[ ]-1 <BLANKLINE> (/) , (/) <BLANKLINE> 0[ ]0 <BLANKLINE> -1[ ]-1 , -1[ ]-1 <BLANKLINE> 0[ ]0 <BLANKLINE> (/) , -1[ ]-1 <BLANKLINE> 1[ ]1 <BLANKLINE> -1[ ]-1 , 0[ ]0 <BLANKLINE> -1[ ]-1 -1[ ]-1 <BLANKLINE> 0[ ]0 ] Got: [ (/) <BLANKLINE> (/) <BLANKLINE> (/) , (/) <BLANKLINE> -1[ ]-1 <BLANKLINE> (/) , -1[ ]-1 <BLANKLINE> 0[ ]0 <BLANKLINE> (/) , -1[ ]-1 <BLANKLINE> 1[ ]1 <BLANKLINE> -1[ ]-1 , (/) <BLANKLINE> 0[ ]0 <BLANKLINE> -1[ ]-1 , 0[ ]0 <BLANKLINE> -1[ ]-1 -1[ ]-1 <BLANKLINE> 0[ ]0 ] **********************************************************************
comment:66 Changed 7 years ago by
OK. Since the list is not sorted, I guess the cleanest solution is to test against sorted(RC.list())
.
comment:67 follow-up: ↓ 87 Changed 7 years ago by
- Description modified (diff)
- Milestone changed from sage-5.9 to sage-5.8
- Status changed from new to needs_review
Can this really not go into 5.8, if the tests pass with the second patch?
Apply trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.patch
Changed 7 years ago by
Sort output of RiggedConfigurations?.list(), to make it reproducible on different machines
comment:68 Changed 7 years ago by
Trivial rebase to #14011.
comment:69 Changed 7 years ago by
Now that is strange. With the sorted output, one gets another error, this time on several patchbots.
It seems that I messed up something. Try to fix it now...
comment:70 Changed 7 years ago by
FWIW I also get the same error as the patchbot.
comment:71 Changed 7 years ago by
- Description modified (diff)
Now it should work^{(T)}.
Apply trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.2.patch
comment:72 Changed 7 years ago by
This fails to apply for me:
patching file sage/combinat/rigged_configurations/rigged_configurations.py Hunk #3 FAILED at 930 1 out of 3 hunks FAILED -- saving rejects to file sage/combinat/rigged_configurations/rigged_configurations.py.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir errors during apply, please fix and refresh trac_14054-fix-rigged-list.patch
comment:73 Changed 7 years ago by
- Dependencies changed from #14017, #6495, #14182, #14040 to #14017, #6495, #14182, #14040, #14011
Argh. Jeroen forgot to update the list of dependencies, it seems.
comment:74 Changed 7 years ago by
Rebasing really has been trivial. The error was about a missing newline at the end of the file...
Apply trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.2.patch
comment:75 Changed 7 years ago by
Note that the tests in rigged_configurations do pass for me, with
trac11490-coercion_tutorial.patch 14182_whitespace.patch trac_13618-rings_doc_real-ts.patch trac_13618-rings_doc_complex-ts.patch trac_13618-rings_doc_others-ts.patch trac_14040_housekeeping.patch trac_14040_repr_option.patch trac_14040_doctest.patch trac_14040_review.patch trac14054_fast_methods-5.8.patch trac_12313-mono_dict-combined-random-sk.patch trac_13387-combined.patch trac_13387-guard_gc.patch trac_14159_weak_value_triple_dict.patch trac_14159_use_cdef_get.patch trac12951_cached_extension.patch trac_14214-cython_homset.patch trac_14214-backup_cache.patch trac_14063-remove_cc_compositions-ts.patch trac_13688-finite_sets_cardinality_override-ts.patch trac_14138.patch trac_14138-doctests.patch trac_13605-partition_options-ts.patch trac_14011-Sphinx_roles-fh.patch trac_14054-fix-rigged-list.2.patch
applied on top of sage-5.8.beta0. But they fail if I have some more patches applied.
In other words, I get the impression that sorting of the output of RiggedConfigurations.list()
is broken.
comment:76 follow-up: ↓ 78 Changed 7 years ago by
Arrgh. Can it be that sage.combinat.rigged_configurations.rigged_configuration_element.RiggedConfigurationElement
has no sorting implemented? Travis, you are the author. How can one implement comparison of these things?
Jeroen, what can we do, since apparently sorting the output is not available here? I would say that it is not the fault (for a reasonable definition of "fault") of my patch that the test in rigged_configurations is unstable.
comment:77 Changed 7 years ago by
Would it be better to give them an (somewhat unnatural) ordering or to change the doctest?
Actually...why not just make it #random
with a note saying the output order can be random since there is no ordering?
comment:78 in reply to: ↑ 76 Changed 7 years ago by
Replying to SimonKing:
I would say that it is not the fault (for a reasonable definition of "fault") of my patch that the test in rigged_configurations is unstable.
True, but with doctests there is no alternative but to shoot the messenger (i.e. the person adding the patch causing the doctest failure always gets the blame)
comment:79 Changed 7 years ago by
Replying to tscrim:
Would it be better to give them an (somewhat unnatural) ordering or to change the doctest?
Actually...why not just make it
#random
with a note saying the output order can be random since there is no ordering?
I would be fine with making it random, and test that the length of the output (or another easy invariant) is correct.
Is there a mathematically meaningful ordering? From the code, I guess that rigged configuration elements rely on list of tableaus, and if the tableaus can be ordered then one might have some kind of lexicographical ordering, perhaps. But this should be on a different ticket (and perhaps would not make sense at all), so that I would prefer to change the test.
Jeroen, what is your opinion on that matter?
comment:80 Changed 7 years ago by
Thankfully I'm here to take the bullet. :P
I don't know of any mathematical meaning of sorting rigged configurations. I don't think there is one...
The reason why there's different (unsorted) outputs is because at the backend of the iterator in TransitiveIdeal
, there is a (python) set
working. If this is changed to a list, there should be no real loss of speed, but the iteration should be completely consistent across all machines (and for anything else which uses TransitiveIdeal
for iteration and does not have a valid sorting).
Changed 7 years ago by
Mark output of RiggedConfigurations.list()
random and test against cardinality, to make it reproducible on different machines.
comment:81 follow-up: ↓ 85 Changed 7 years ago by
I have updated the patch, marking the test random and testing that the length of the list coincides with the cardinality.
Apply trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.2.patch
comment:82 Changed 7 years ago by
PS: With the new patch, the tests will also pass (for me, at least) when I apply the remaining patches in my queue...
comment:83 Changed 7 years ago by
- Status changed from needs_review to positive_review
Everything looks good to me as well. I don't know why the patchbot just timed-out so I gave it a kick.
For patchbot:
Apply: trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.2.patch
comment:84 Changed 7 years ago by
Hooray, and the patchbot doesn't complain either, except for the fact that a new module is loaded during start-up. But this can hardly be avoided in this case, and the patchbot does not observe the slightest increase in startup time :-)
comment:85 in reply to: ↑ 81 Changed 7 years ago by
Replying to SimonKing:
I have updated the patch, marking the test random and testing that the length of the list coincides with the cardinality.
Apply trac14054_fast_methods-5.8.patch trac_14054-fix-rigged-list.2.patch
Just for the record: in similar situations, I occasionally used the following idiom:
sage: sorted(..., key=str) ...
Which makes the output deterministic (well, assumming repr is ...) even if there is no natural total ordering on the objects to be sorted.
comment:86 Changed 7 years ago by
I forgot to say: great job getting this done! Thanks!
comment:87 in reply to: ↑ 67 ; follow-up: ↓ 88 Changed 7 years ago by
Replying to SimonKing:
Can this really not go into 5.8, if the tests pass with the second patch?
Well, it didn't pass tests :-)
But seriously: is there any particular reason this patch should be in sage-5.8? I'd rather not take the risk and postpone it for sage-5.9.beta0.
comment:88 in reply to: ↑ 87 Changed 7 years ago by
comment:89 Changed 7 years ago by
- Merged in set to sage-5.8.beta4
- Resolution set to fixed
- Status changed from positive_review to closed
Note that the patch also needed to change some auxiliary class
CartanType_simple_finite
, which is used to unpickle some old data. It used to inherit fromobject
, but for an incompatibility of Cython types it has to inherit fromUniqueRepresentation
instead.For the timings, I use
MatrixSpace
, which inherits fromUniqueRepresentation
:With sage-5.6.rc0 plus #14017:
Adding the patch from here:
Hence, the time drops by 2/3. I did run make ptest successfully. Needs review!