Opened 5 years ago

# Wrong contains in PartitionDiagrams

Reported by: Owned by: mantepse major combinatorics alauve, tscrim, zabrocki Daniel Krenn N/A u/mantepse/wrong_contains_in_partitiondiagrams ab58ad8dbc68c29df337f19fbfba4544f2237f46 #25659, #25462, #25642

### Description

In 8.3 beta 7:

```sage: import sage.combinat.diagram_algebras as da
sage: PD = da.PartitionDiagrams(3/2)
sage: PD.list()
[{{-2, -1, 1, 2}},
{{-2, 1, 2}, {-1}},
{{-2, 2}, {-1, 1}},
{{-2, -1, 2}, {1}},
{{-2, 2}, {-1}, {1}}]
sage: [[-2,-1],[2,1]] in PD
True
```

The fix is to remove the individual `__contains__` methods, and correct the generic one, and optimize afterwards. However, to avoid a mess, I'd like to wait for the dependencies.

### comment:1 Changed 5 years ago by mantepse

I do not understand these doctests:

``` sage: import sage.combinat.diagram_algebras as da
sage: R.<x> = QQ[]
sage: PA = PartitionAlgebra(2, x, R, 'P')
sage: PA([]) == PA.one()
True
sage: D = da.DiagramAlgebra(2, x, R, 'P', da.PartitionDiagrams(2))
sage: D([]) == D.one()
Traceback (most recent call last):
...
ValueError: invalid input of []
```

Why is `PA([])` supposed to give the identity, but `D([])` throw an error? Is it because of `UnitDiagramMixin` magic?

Put differently, the `check` method of `AbstractPartitionDiagram` explicitely allows the empty diagram, irrespective of order. Should `[] in PartitionDiagrams(2)` return true or false?

### comment:2 Changed 5 years ago by mantepse

Branch: → u/mantepse/wrong_contains_in_partitiondiagrams

### comment:3 Changed 4 years ago by git

Commit: → c32cbed705a207cf13c246e8bdda87f4a90bf680

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

 ​3d5f553 `Merge branch 'public/combinat/speedup_set_partitions-25462' of git://trac.sagemath.org/sage into t/25659/make_braueralgebra_faster` ​c7fdf9b `Initial Cython version of the PerfectMatchings iterator.` ​dd4441a `Added fixed point free involution generator and convert to SP. Fixing bug in PerfectMatchingsIterator.` ​034589d `Optimizing generation using fixed-point-free involutions.` ​7942eab `Using FPF involution iterator and skipping sorting.` ​80b13f4 `Use the new backend iterators in diagram_algebras.py.` ​b486103 `add reference` ​77d6902 `Minor tweaks.` ​14ed798 `Merge branch 'public/combinat/speedup_brauer_algebra-25659' of git://trac.sagemath.org/sage into t/25662/wrong_contains_in_partitiondiagrams` ​c32cbed `fix doctests destroyed by merge`

### comment:4 Changed 4 years ago by mantepse

Status: new → needs_review

ping?

ping ping?

### comment:7 Changed 4 years ago by dkrenn

Reviewers: → Daniel Krenn needs_review → needs_work

Started reviewing:

1. ```+            obj = self._element_constructor_(obj)
```

I think here `self(obj)` is the correct way to do as this includes the shortcuts for elements with the "correct" parent.

2.

```+        if self.order not in ZZ:
+            # check that k and -k are in the same block
+            k = ZZ(self.order + ZZ(1)/ZZ(2))
```

I have no glue when the order is not in ZZ (apperently it can have half values, but is this always the case? (I am a bit scared about this code...)

So, either someone tells me, where this is explained or someone else takes a look (I'd prefer the latter).

Rest looks good AFAICS.

### comment:8 Changed 4 years ago by git

Commit: c32cbed705a207cf13c246e8bdda87f4a90bf680 → fee85a739034b72b7fa39ac4d9be2a77bbfa5648

Branch pushed to git repo; I updated commit sha1. New commits:

 ​e7d2738 `Merge branch 'u/mantepse/wrong_contains_in_partitiondiagrams' of git://trac.sagemath.org/sage into HEAD` ​fee85a7 `replace self._element_constructor_(obj) with self(obj)`

### comment:9 follow-up:  10 Changed 4 years ago by mantepse

Wow, that was quick - thank you!

OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly, `self(obj)` will invoke `__call__`, which in turn calls `_element_constructor_`, right? I did not check whether there is a performance penality.

The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.

By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.

Martin

New commits:

 ​e7d2738 `Merge branch 'u/mantepse/wrong_contains_in_partitiondiagrams' of git://trac.sagemath.org/sage into HEAD` ​fee85a7 `replace self._element_constructor_(obj) with self(obj)`

### comment:10 in reply to:  9 Changed 4 years ago by dkrenn

OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly, `self(obj)` will invoke `__call__`, which in turn calls `_element_constructor_`, right? I did not check whether there is a performance penality.

The thing is that `_element_constructor_` is not called if the element has already the parent `self` (maybe there are some additional shortcuts if the `element_class` already fits). So, therefore this is the way to convert an element. I do not think that the performance is here an issue (and I am very certain that one could construct an example, where your code does not work). So therefore, I am in favor of changing it.

The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.

Ok. So then the code is somehow fine for me (because if for whatever reason the order is not half-integer, then there will be an error and not a false result).

It might be worth considering restructuring the code so that `if self.order in ZZ: return True` so that it is clear that the code ends here in this case and only otherwise we do something. (This is also the style on the lines before).

By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.

Clear, this is not for discussion here.

### comment:11 Changed 4 years ago by mantepse

Status: needs_work → needs_review

### comment:12 Changed 4 years ago by git

Branch pushed to git repo; I updated commit sha1. New commits:

 ​ab58ad8 `restructure flow`

Done!

### comment:14 Changed 4 years ago by dkrenn

Status: needs_review → positive_review

Thanks, LGTM.

### comment:15 follow-up:  19 Changed 4 years ago by tscrim

Status: positive_review → needs_work

There are a number of things wrong or misleading. First, `self(foo)` redirects to `self._element_constructor_(foo)` unless `foo` is already an element of the parent `self`. I generally disagree with testing `self(foo)` and catching an error first. This is very slow for things that inherit from `AbstractPartitionDiagram`. I believe you also get `None in PartitionDiagrams(0)` returning `True` whereas the current version, this is `False`.

I think what you should do is actually properly fix the `__contains__` that are there. This might include a better `check` method too.

Last edited 4 years ago by tscrim (previous) (diff)

### comment:16 follow-up:  17 Changed 4 years ago by mantepse

As with your comment on the permutation code, I am completely lost.

Maybe you could provide a patch yourself, so I can see what you mean?

Patching the individual `__contains__` is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.

### comment:17 in reply to:  16 Changed 4 years ago by tscrim

As with your comment on the permutation code, I am completely lost.

Maybe you could provide a patch yourself, so I can see what you mean?

No because there is no patch to provide. What I am saying is that I do not think your proposed change is the way forward with reasoning why. Although I did see one typo above that I have fixed.

Patching the individual `__contains__` is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.

It is definitely the way to go because it has better separation-of-concerns (checks are localized to each part), it is faster, and it should not be hard to do. If you want to make it work, then fix it properly.

### comment:18 Changed 4 years ago by mantepse

Authors: Martin Rubey

OK, I'll abandon this for the moment, I don't understand what you want me to do (except for implementing a method for each diagram algebra separately, which I believe is wrong). I'll do the quick fix in the permutations file, but this one is too much work right now. Exams coming up.

### comment:19 in reply to:  15 ; follow-up:  20 Changed 4 years ago by dkrenn

There are a number of things wrong or misleading. First, `self(foo)` redirects to `self._element_constructor_(foo)` unless `foo` is already an element of the parent `self`. I generally disagree with testing `self(foo)` and catching an error first. This is very slow for things that inherit from `AbstractPartitionDiagram`. I believe you also get `None in PartitionDiagrams(0)` returning `True` whereas the current version, this is `False`.

I think what you should do is actually properly fix the `__contains__` that are there. This might include a better `check` method too.

I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)

### comment:20 in reply to:  19 Changed 4 years ago by tscrim

There are a number of things wrong or misleading. First, `self(foo)` redirects to `self._element_constructor_(foo)` unless `foo` is already an element of the parent `self`. I generally disagree with testing `self(foo)` and catching an error first. This is very slow for things that inherit from `AbstractPartitionDiagram`. I believe you also get `None in PartitionDiagrams(0)` returning `True` whereas the current version, this is `False`.

I think what you should do is actually properly fix the `__contains__` that are there. This might include a better `check` method too.

I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)

The `__contains__` method should be fast as a general rule. Moreover, general code is usually more complicated as it has poorer separation-of-conerns. This is why we have subclasses. We also already have implementations that mostly work. I see this proposal as 1 step forward, 2 steps back.

### comment:21 follow-up:  22 Changed 4 years ago by mantepse

A few questions, comments and a clarification:

1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.

Fun fact: removing the "specialised" `TemperleyLiebDiagrams.__contains__` results in a factor 0.8 speedup of the following test:

```sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list()
sage: %timeit sum(1 for d in l if d in td)
```

I am guessing that this is because it first checks whether the candidate is in `BrauerDiagrams`, thus duplicating a lot of work.

2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the `__contains__` from `TemperleyLiebDiagrams` and `PlanarDiagrams`, too.)

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from `partition_algebra.py` to this module, using the same generic `__contains__` method.

I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.

3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.

### comment:22 in reply to:  21 ; follow-up:  23 Changed 4 years ago by tscrim

A few questions, comments and a clarification:

1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.

Martin, I know you care about speed. Patronizing me is not useful to the discussion.

Fun fact: removing the "specialised" `TemperleyLiebDiagrams.__contains__` results in a factor 0.8 speedup of the following test:

```sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list()
sage: %timeit sum(1 for d in l if d in td)
```

I am guessing that this is because it first checks whether the candidate is in `BrauerDiagrams`, thus duplicating a lot of work.

Probably. That is an interesting data point.

2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the `__contains__` from `TemperleyLiebDiagrams` and `PlanarDiagrams`, too.)

IMO that is even more a step in the wrong direction.

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from `partition_algebra.py` to this module, using the same generic `__contains__` method.

I feel that is a bit of a misrepresentation of those `__contains__` as we are deprecating those classes.

I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.

You are taking things that should be in separate classes and pulling them to one class. The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.

That does not make it invalid. It just means we need to fix a bug. We are only arguing about how to fix the bug. IMO, your fix makes the code less clear, harder to optimize in the future (as any optimization would undo your change), and likely generically makes the code slower (but I have not run timings yet).

There is also very little duplication of code. The only part that is duplicated is the part that converts a non (abstract) partition diagram to an object of the parent. If you are worried about that, you can make that into a function. However, the meat of the issue has no duplication.

I wonder if the only thing that is going wrong is this part:

```# check that k and -k are in the same block
```

### comment:23 in reply to:  22 ; follow-up:  24 Changed 4 years ago by mantepse

Martin, I know you care about speed. Patronizing me is not useful to the discussion.

reciprocally.

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from `partition_algebra.py` to this module, using the same generic `__contains__` method.

I feel that is a bit of a misrepresentation of those `__contains__` as we are deprecating those classes.

No. Those classes (rooks, planar rooks and permutations) are migrated to the better framework in `diagram_algebra.py`, only the old implementations are deprecated.

You are taking things that should be in separate classes and pulling them to one class.

No. This is absolutely not what the patch does. I am only removing code.

Besides:

I believe you also get `None` in `PartitionDiagrams(0)` returning `True` [...]

No. I agree it is a good idea to check this corner case!

Anyway. My offer is as follows:

• remove the `__contains__` here
• do #25637, so that all algebras are in this class
• discuss and implement a better `__contains__` machinery - but after having migrated the other classes from `partition_algebra.py` to `diagram_algebra.py`.

A final question:

I wonder if the only thing that is going wrong is this part:

```# check that k and -k are in the same block
```

I don't really know. The logic is (for my brain) scattered over too many places.

The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

However, I think it would make sense to swap the `__contains__` and `check` methods: the meat of the check could be in `__contains__` of the parent classes, and the `check` method could be generic in `AbstractPartitionDiagram`.

The advantage is speed, the cost is that one probably looses some information for the error message. For example, `TemperleyLieb` is the intersection of `Brauer` and `Planar`. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A generic `check` could probably only say "This is not a Temperley-Lieb diagram".

The other way to gain performance and remove duplication of code is probably to remove `AbstractSetPartition`.

### comment:24 in reply to:  23 Changed 4 years ago by tscrim

You are taking things that should be in separate classes and pulling them to one class.

No. This is absolutely not what the patch does. I am only removing code.

Yes it is. That is precisely what you are doing. Everything goes into the base class's `__contains__`, so that is the only method that is being called instead of the subclasses' `__contains__`.

Anyway. My offer is as follows:

• remove the `__contains__` here

This is the sticking point and what I disagree with doing. How about we fix the actual bug?

• do #25637, so that all algebras are in this class

#25637 is basically independent of this and those old classes are meant to be deprecated. So that is irrelevant to this discussion.

• discuss and implement a better `__contains__` machinery - but after having migrated the other classes from `partition_algebra.py` to `diagram_algebra.py`.

This is not a good strategy. Fix the problem at hand. It sounds like you are suggesting to remove code just to add it back in. If you are not sure and feel it warrants a full design discussion, then let us fix the bug using the current framework, which should be easy, and then have the discussion.

A final question:

I wonder if the only thing that is going wrong is this part:

```# check that k and -k are in the same block
```

I don't really know. The logic is (for my brain) scattered over too many places.

Are you saying you are unable to debug this with the current implementation?

The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

However, I think it would make sense to swap the `__contains__` and `check` methods: the meat of the check could be in `__contains__` of the parent classes, and the `check` method could be generic in `AbstractPartitionDiagram`.

I generally agree with this, but your proposed change actually goes the opposite direction of this.

The advantage is speed, the cost is that one probably looses some information for the error message. For example, `TemperleyLieb` is the intersection of `Brauer` and `Planar`. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A generic `check` could probably only say "This is not a Temperley-Lieb diagram".

In properly optimized code, there would be no speed advantage to having a generic method. Generic code needs to work generically, and so it cannot take advantage of special structure.

The other way to gain performance and remove duplication of code is probably to remove `AbstractSetPartition`.

Removing ABCs is usually independent of performance, and it generally involves more code duplication because there is less commonality with related classes.

### comment:25 Changed 5 weeks ago by mkoeppe

Milestone: sage-8.3
Note: See TracTickets for help on using tickets.