Opened 5 years ago
Last modified 5 weeks ago
#25662 needs_work defect
Wrong contains in PartitionDiagrams
Reported by: | mantepse | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | combinatorics | Keywords: | |
Cc: | alauve, tscrim, zabrocki | Merged in: | |
Authors: | Reviewers: | Daniel Krenn | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/mantepse/wrong_contains_in_partitiondiagrams (Commits, GitHub, GitLab) | Commit: | ab58ad8dbc68c29df337f19fbfba4544f2237f46 |
Dependencies: | #25659, #25462, #25642 | Stopgaps: |
Description
In 8.3 beta 7:
sage: import sage.combinat.diagram_algebras as da sage: PD = da.PartitionDiagrams(3/2) sage: PD.list() [{{-2, -1, 1, 2}}, {{-2, 1, 2}, {-1}}, {{-2, 2}, {-1, 1}}, {{-2, -1, 2}, {1}}, {{-2, 2}, {-1}, {1}}] sage: [[-2,-1],[2,1]] in PD True
The fix is to remove the individual __contains__
methods, and correct the generic one, and optimize afterwards. However, to avoid a mess, I'd like to wait for the dependencies.
Change History (25)
comment:1 Changed 5 years ago by
comment:2 Changed 5 years ago by
Branch: | → u/mantepse/wrong_contains_in_partitiondiagrams |
---|
comment:3 Changed 4 years ago by
Commit: | → c32cbed705a207cf13c246e8bdda87f4a90bf680 |
---|
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
3d5f553 | Merge branch 'public/combinat/speedup_set_partitions-25462' of git://trac.sagemath.org/sage into t/25659/make_braueralgebra_faster
|
c7fdf9b | Initial Cython version of the PerfectMatchings iterator.
|
dd4441a | Added fixed point free involution generator and convert to SP. Fixing bug in PerfectMatchingsIterator.
|
034589d | Optimizing generation using fixed-point-free involutions.
|
7942eab | Using FPF involution iterator and skipping sorting.
|
80b13f4 | Use the new backend iterators in diagram_algebras.py.
|
b486103 | add reference
|
77d6902 | Minor tweaks.
|
14ed798 | Merge branch 'public/combinat/speedup_brauer_algebra-25659' of git://trac.sagemath.org/sage into t/25662/wrong_contains_in_partitiondiagrams
|
c32cbed | fix doctests destroyed by merge
|
comment:4 Changed 4 years ago by
Status: | new → needs_review |
---|
comment:7 Changed 4 years ago by
Reviewers: | → Daniel Krenn |
---|---|
Status: | needs_review → needs_work |
Started reviewing:
-
+ obj = self._element_constructor_(obj)
I think here self(obj)
is the correct way to do as this includes the shortcuts for elements with the "correct" parent.
2.
+ if self.order not in ZZ: + # check that k and -k are in the same block + k = ZZ(self.order + ZZ(1)/ZZ(2))
I have no glue when the order is not in ZZ (apperently it can have half values, but is this always the case? (I am a bit scared about this code...)
So, either someone tells me, where this is explained or someone else takes a look (I'd prefer the latter).
Rest looks good AFAICS.
comment:8 Changed 4 years ago by
Commit: | c32cbed705a207cf13c246e8bdda87f4a90bf680 → fee85a739034b72b7fa39ac4d9be2a77bbfa5648 |
---|
comment:9 follow-up: 10 Changed 4 years ago by
Wow, that was quick - thank you!
OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly, self(obj)
will invoke __call__
, which in turn calls _element_constructor_
, right? I did not check whether there is a performance penality.
The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.
By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.
Martin
New commits:
e7d2738 | Merge branch 'u/mantepse/wrong_contains_in_partitiondiagrams' of git://trac.sagemath.org/sage into HEAD
|
fee85a7 | replace self._element_constructor_(obj) with self(obj)
|
comment:10 Changed 4 years ago by
Replying to mantepse:
OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly,
self(obj)
will invoke__call__
, which in turn calls_element_constructor_
, right? I did not check whether there is a performance penality.
The thing is that _element_constructor_
is not called if the element has already the parent self
(maybe there are some additional shortcuts if the element_class
already fits). So, therefore this is the way to convert an element. I do not think that the performance is here an issue (and I am very certain that one could construct an example, where your code does not work). So therefore, I am in favor of changing it.
The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.
Ok. So then the code is somehow fine for me (because if for whatever reason the order is not half-integer, then there will be an error and not a false result).
It might be worth considering restructuring the code so that if self.order in ZZ: return True
so that it is clear that the code ends here in this case and only otherwise we do something. (This is also the style on the lines before).
By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.
Clear, this is not for discussion here.
comment:11 Changed 4 years ago by
Status: | needs_work → needs_review |
---|
comment:12 Changed 4 years ago by
Commit: | fee85a739034b72b7fa39ac4d9be2a77bbfa5648 → ab58ad8dbc68c29df337f19fbfba4544f2237f46 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
ab58ad8 | restructure flow
|
comment:15 follow-up: 19 Changed 4 years ago by
Status: | positive_review → needs_work |
---|
There are a number of things wrong or misleading. First, self(foo)
redirects to self._element_constructor_(foo)
unless foo
is already an element of the parent self
. I generally disagree with testing self(foo)
and catching an error first. This is very slow for things that inherit from AbstractPartitionDiagram
. I believe you also get None in PartitionDiagrams(0)
returning True
whereas the current version, this is False
.
I think what you should do is actually properly fix the __contains__
that are there. This might include a better check
method too.
comment:16 follow-up: 17 Changed 4 years ago by
As with your comment on the permutation code, I am completely lost.
Maybe you could provide a patch yourself, so I can see what you mean?
Patching the individual __contains__
is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.
comment:17 Changed 4 years ago by
Replying to mantepse:
As with your comment on the permutation code, I am completely lost.
Maybe you could provide a patch yourself, so I can see what you mean?
No because there is no patch to provide. What I am saying is that I do not think your proposed change is the way forward with reasoning why. Although I did see one typo above that I have fixed.
Patching the individual
__contains__
is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.
It is definitely the way to go because it has better separation-of-concerns (checks are localized to each part), it is faster, and it should not be hard to do. If you want to make it work, then fix it properly.
comment:18 Changed 4 years ago by
Authors: | Martin Rubey |
---|
OK, I'll abandon this for the moment, I don't understand what you want me to do (except for implementing a method for each diagram algebra separately, which I believe is wrong). I'll do the quick fix in the permutations file, but this one is too much work right now. Exams coming up.
comment:19 follow-up: 20 Changed 4 years ago by
Replying to tscrim:
There are a number of things wrong or misleading. First,
self(foo)
redirects toself._element_constructor_(foo)
unlessfoo
is already an element of the parentself
. I generally disagree with testingself(foo)
and catching an error first. This is very slow for things that inherit fromAbstractPartitionDiagram
. I believe you also getNone in PartitionDiagrams(0)
returningTrue
whereas the current version, this isFalse
.I think what you should do is actually properly fix the
__contains__
that are there. This might include a bettercheck
method too.
I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)
comment:20 Changed 4 years ago by
Replying to dkrenn:
Replying to tscrim:
There are a number of things wrong or misleading. First,
self(foo)
redirects toself._element_constructor_(foo)
unlessfoo
is already an element of the parentself
. I generally disagree with testingself(foo)
and catching an error first. This is very slow for things that inherit fromAbstractPartitionDiagram
. I believe you also getNone in PartitionDiagrams(0)
returningTrue
whereas the current version, this isFalse
.I think what you should do is actually properly fix the
__contains__
that are there. This might include a bettercheck
method too.I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)
The __contains__
method should be fast as a general rule. Moreover, general code is usually more complicated as it has poorer separation-of-conerns. This is why we have subclasses. We also already have implementations that mostly work. I see this proposal as 1 step forward, 2 steps back.
comment:21 follow-up: 22 Changed 4 years ago by
A few questions, comments and a clarification:
1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.
Fun fact: removing the "specialised" TemperleyLiebDiagrams.__contains__
results in a factor 0.8 speedup of the following test:
sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list() sage: %timeit sum(1 for d in l if d in td)
I am guessing that this is because it first checks whether the candidate is in BrauerDiagrams
, thus duplicating a lot of work.
2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the __contains__
from TemperleyLiebDiagrams
and PlanarDiagrams
, too.)
In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from partition_algebra.py
to this module, using the same generic __contains__
method.
I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.
3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.
comment:22 follow-up: 23 Changed 4 years ago by
Replying to mantepse:
A few questions, comments and a clarification:
1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.
Martin, I know you care about speed. Patronizing me is not useful to the discussion.
Fun fact: removing the "specialised"
TemperleyLiebDiagrams.__contains__
results in a factor 0.8 speedup of the following test:sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list() sage: %timeit sum(1 for d in l if d in td)I am guessing that this is because it first checks whether the candidate is in
BrauerDiagrams
, thus duplicating a lot of work.
Probably. That is an interesting data point.
2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the
__contains__
fromTemperleyLiebDiagrams
andPlanarDiagrams
, too.)
IMO that is even more a step in the wrong direction.
In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from
partition_algebra.py
to this module, using the same generic__contains__
method.
I feel that is a bit of a misrepresentation of those __contains__
as we are deprecating those classes.
I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.
You are taking things that should be in separate classes and pulling them to one class. The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.
3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.
That does not make it invalid. It just means we need to fix a bug. We are only arguing about how to fix the bug. IMO, your fix makes the code less clear, harder to optimize in the future (as any optimization would undo your change), and likely generically makes the code slower (but I have not run timings yet).
There is also very little duplication of code. The only part that is duplicated is the part that converts a non (abstract) partition diagram to an object of the parent. If you are worried about that, you can make that into a function. However, the meat of the issue has no duplication.
I wonder if the only thing that is going wrong is this part:
# check that k and -k are in the same block
comment:23 follow-up: 24 Changed 4 years ago by
Replying to tscrim:
Martin, I know you care about speed. Patronizing me is not useful to the discussion.
reciprocally.
In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from
partition_algebra.py
to this module, using the same generic__contains__
method.I feel that is a bit of a misrepresentation of those
__contains__
as we are deprecating those classes.
No. Those classes (rooks, planar rooks and permutations) are migrated to the better framework in diagram_algebra.py
, only the old implementations are deprecated.
You are taking things that should be in separate classes and pulling them to one class.
No. This is absolutely not what the patch does. I am only removing code.
Besides:
I believe you also get
None
inPartitionDiagrams(0)
returningTrue
[...]
No. I agree it is a good idea to check this corner case!
Anyway. My offer is as follows:
- remove the
__contains__
here - do #25637, so that all algebras are in this class
- discuss and implement a better
__contains__
machinery - but after having migrated the other classes frompartition_algebra.py
todiagram_algebra.py
.
A final question:
I wonder if the only thing that is going wrong is this part:
# check that k and -k are in the same block
I don't really know. The logic is (for my brain) scattered over too many places.
The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.
However, I think it would make sense to swap the __contains__
and check
methods: the meat of the check could be in __contains__
of the parent classes, and the check
method could be generic in AbstractPartitionDiagram
.
The advantage is speed, the cost is that one probably looses some information for the error message. For example, TemperleyLieb
is the intersection of Brauer
and Planar
. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A generic check
could probably only say "This is not a Temperley-Lieb diagram".
The other way to gain performance and remove duplication of code is probably to remove AbstractSetPartition
.
comment:24 Changed 4 years ago by
Replying to mantepse:
Replying to tscrim:
You are taking things that should be in separate classes and pulling them to one class.
No. This is absolutely not what the patch does. I am only removing code.
Yes it is. That is precisely what you are doing. Everything goes into the base class's __contains__
, so that is the only method that is being called instead of the subclasses' __contains__
.
Anyway. My offer is as follows:
- remove the
__contains__
here
This is the sticking point and what I disagree with doing. How about we fix the actual bug?
- do #25637, so that all algebras are in this class
#25637 is basically independent of this and those old classes are meant to be deprecated. So that is irrelevant to this discussion.
- discuss and implement a better
__contains__
machinery - but after having migrated the other classes frompartition_algebra.py
todiagram_algebra.py
.
This is not a good strategy. Fix the problem at hand. It sounds like you are suggesting to remove code just to add it back in. If you are not sure and feel it warrants a full design discussion, then let us fix the bug using the current framework, which should be easy, and then have the discussion.
A final question:
I wonder if the only thing that is going wrong is this part:
# check that k and -k are in the same blockI don't really know. The logic is (for my brain) scattered over too many places.
Are you saying you are unable to debug this with the current implementation?
The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.
However, I think it would make sense to swap the
__contains__
andcheck
methods: the meat of the check could be in__contains__
of the parent classes, and thecheck
method could be generic inAbstractPartitionDiagram
.
I generally agree with this, but your proposed change actually goes the opposite direction of this.
The advantage is speed, the cost is that one probably looses some information for the error message. For example,
TemperleyLieb
is the intersection ofBrauer
andPlanar
. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A genericcheck
could probably only say "This is not a Temperley-Lieb diagram".
In properly optimized code, there would be no speed advantage to having a generic method. Generic code needs to work generically, and so it cannot take advantage of special structure.
The other way to gain performance and remove duplication of code is probably to remove
AbstractSetPartition
.
Removing ABCs is usually independent of performance, and it generally involves more code duplication because there is less commonality with related classes.
comment:25 Changed 5 weeks ago by
Milestone: | sage-8.3 |
---|
I do not understand these doctests:
Why is
PA([])
supposed to give the identity, butD([])
throw an error? Is it because ofUnitDiagramMixin
magic?Put differently, the
check
method ofAbstractPartitionDiagram
explicitely allows the empty diagram, irrespective of order. Should[] in PartitionDiagrams(2)
return true or false?