Opened 4 years ago

Closed 4 years ago

#19197 closed enhancement (fixed)

LatticePoset: add breadth()

Reported by: jmantysalo Owned by:
Priority: major Milestone: sage-6.9
Component: combinatorics Keywords:
Cc: Merged in:
Authors: Jori Mäntysalo Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: 9c05f49 (Commits) Commit: 9c05f49e95504acc1a37ea942764886c53f57198
Dependencies: Stopgaps:

Description

Add a function to compute the breadth of a lattice.
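
Expected usage, as a sketch (hedged: assuming the method lands on finite lattices under the name breadth(); a chain should have breadth 1):

sage: Posets.ChainPoset(5).breadth()
1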

Change History (34)

comment:1 Changed 4 years ago by jmantysalo

  • Branch set to u/jmantysalo/latticeposet__add_breadth__

comment:2 follow-ups: Changed 4 years ago by ncohen

  • Commit set to cd99bd6b82aaece6bd535d6fdfd01f7bfb28a8c5

Hello Jori,

Here is a way to improve the algorithm a bit:

Let us say that a set S is "locally_minimal" if the join of the elements in S is different from the join of the elements in <any proper subset of S>. What you are looking for is the locally_minimal set of maximum cardinality.

Why do you iterate on antichains? You iterate on antichains because you know that in *any* "locally_minimal" set S, any 2-subset of S must be locally minimal. And the locally minimal sets of cardinality 2 are precisely the 2-element antichains.

Why wouldn't you go further? Indeed, the same works for anything greater than 2: in any locally minimal set, *any* proper subset is also locally minimal.

Your implementation, however, does not use that. You test all antichains, but when testing those of size 5 you do not use the information obtained from those of size 4.

So here is a way out: enumerate all locally minimal sets directly, in increasing order of size. This can be done with subsets_with_hereditary_property [1], to which you can feed a function that detects if a given set of points is locally minimal (when you remove only one element).

This should do the trick, and *use* the information that <some given set of size 3 is not locally minimal> when trying to figure out one of size 4.

Note: It may be slower for small cases, but better beyond them.
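
A sketch of the suggested approach (hedged: L stands for a finite lattice here, and the calling convention of subsets_with_hereditary_property is an assumption based on its documentation):

from sage.combinat.subsets_hereditary import subsets_with_hereditary_property

def locally_minimal(A):
    # A (a list of elements) is locally minimal if removing any single
    # element changes the join.
    if len(A) <= 1:
        return True
    j = L.join(A)
    return all(L.join(A[:i] + A[i+1:]) != j for i in range(len(A)))

# Subsets are produced in increasing order of size, so the last is largest.
l = list(subsets_with_hereditary_property(locally_minimal, L.list()))
print(len(l[-1]))  # candidate value for the breadth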

Additionally, your code may spend a lot of time hashing your elements: it may be better to work directly with the integer ID of the poset's elements (and to access the matrix directly).
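
For example (a hedged sketch; _hasse_diagram and join_matrix() are internal Sage APIs, so take the exact names as assumptions):

H = L._hasse_diagram   # vertices are the integers 0, ..., n-1
jn = H.join_matrix()   # jn[a, b] is the integer id of join(a, b)
j = jn[0, 1]           # join of the elements with ids 0 and 1, no hashing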

Also, could you be a bit more formal in the definition of 'breadth'? I did not follow it at first (English is not a well-parenthesized language). It would also be cool if you could point toward a textbook/paper that defines it: I found it in "Semimodular Lattices: Theory and Applications", where it is said to originate from "Lattice Theory" by Birkhoff.

Nathann

[1] http://doc.sagemath.org/html/en/reference/combinat/sage/combinat/subsets_hereditary.html


New commits:

cd99bd6 Added function breadth() to lattices.

comment:3 Changed 4 years ago by git

  • Commit changed from cd99bd6b82aaece6bd535d6fdfd01f7bfb28a8c5 to 8d81130c3ea697cbfe3ce178dd5a7fac052e78d3

Branch pushed to git repo; I updated commit sha1. New commits:

8d81130 Sphinx does not like me.

comment:4 Changed 4 years ago by jmantysalo

Technical correction committed. Now the code should work and Sphinx should not complain.

I will try Nathann's suggestion later. I would have marked this as needs_work, but the system does not allow it.

comment:5 in reply to: ↑ 2 ; follow-up: Changed 4 years ago by jmantysalo

Replying to ncohen:

So here is a way out: enumerate all locally minimal sets directly, in increasing order of size. This can be done with subsets_with_hereditary_property [1], to which you can feed a function that detects if a given set of points is locally minimal (when you remove only one element).

This should do the trick, and *use* the information that <some given set of size 3 is not locally minimal> when trying to figure out one of size 4.

OK... So you think something like

f = lambda ac: (self.is_antichain_of_poset(ac) and
                self.join(ac) not in [self.join(set(ac).difference(set([e]))) for e in ac])
l = list(subsets_with_hereditary_property(f, self))
len(l[-1])

? Seems faster, but still gets stuck on the 132-element Posets.TamariLattice(6).canonical_label(). Btw, what is a faster way to iterate over "list L minus an element, for every element"?

comment:6 in reply to: ↑ 5 Changed 4 years ago by ncohen

OK... So you think something like

There is no need to check that it is an antichain. Also, if you do not write it as a lambda function you have more freedom and do not need to create sets at each test.

Btw, what is a faster way to iterate over "list L minus an element, for every element"?

What's wrong with L[:i] + L[i+1:] ?
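
That is (a minimal illustration):

L = [1, 2, 3, 4]
for i in range(len(L)):
    print(L[:i] + L[i+1:])   # L with the i-th element removed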

Nathann

comment:7 Changed 4 years ago by jmantysalo

Just a random thought: can the thinking be reversed? I mean that for a certificate A there is the element j = join(A). What kind of element can that be? Can we somehow go back from it/those, i.e. have an algorithm j -> A instead of A -> j?

If this is not possible, is this problem NP-hard?

comment:8 in reply to: ↑ 2 Changed 4 years ago by jmantysalo

Replying to ncohen:

Also, could you be a bit more formal in the definition of 'breadth'? I did not follow it at first (English is not a well-parenthesized language). It would also be cool if you could point toward a textbook/paper that defines it: I found it in "Semimodular Lattices: Theory and Applications", where it is said to originate from "Lattice Theory" by Birkhoff.

True. "... of the breadth of a lattice L ... as the least positive integer b = b(L) such that any meet x_1 ∧ ... ∧ x_n (n > b) is always a meet of a subset of b of the x_i." This is on p. 99. Not a very intuitive definition... So let's try from scratch:

Let E be an n-element subset of the lattice, let j be the join of E, and let the join of every proper subset of E be different from j. The breadth of the lattice is the number of elements in a largest such subset E.

? Hmm... not the best possible.
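
One way to write that formally (my notation, not Birkhoff's):

\mathrm{br}(L) = \max \{\, |E| \,:\, E \subseteq L,\ \bigvee S \neq \bigvee E \text{ for every proper subset } S \subsetneq E \,\}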

comment:9 Changed 4 years ago by ncohen

To me the first definition you gave is good. The one from the branch, however, was harder to parse. If you find this first definition not very intuitive, it may be because it defines it as "the smallest integer" instead of as "a largest integer". If that is the problem you can probably adapt it.

Nathann

comment:10 Changed 4 years ago by jmantysalo

Here is, if I thought this through right, the smallest lattice with breadth 4:

n = 4
l = [[0, i] for i in range(1, n+1)]

l += [[i, i+n] for i in range(1, n+1)]
l += [[i, i+n+1] for i in range(1, n)]
l += [[n, n+1]]

l += [[i, i+n] for i in range(n+1, 2*n+1)]
l += [[i, i+n+1] for i in range(n+1, 2*n)]
l += [[2*n, 2*n+1]]

l += [[i, 3*n+1] for i in range(2*n+1, 3*n+1)]

L = Poset(([], l)).completion_by_cuts().canonical_label()
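
A sanity check, assuming the breadth() method from this branch:

sage: L.breadth()
4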

This is not easily generalized. n=5 will not make a lattice of breadth 5.

However, can this be used as part of a test for a lattice to have breadth 4? If br(L)=4, there must be an antichain A of 4 elements with join j. The element j must cover at least four elements. Those covers cannot be the set A itself. They cannot directly cover A either, because it is just impossible to build the lattice so that three elements from A would not already have join j.

So, to check: take one element at a time and try to place it as j in this structure. If the element covers fewer than four elements, it is not the right one. If it does, take all four-element subsets of the lower covers of the j-candidate. For those... somehow continue a kind of backtracking.

We can remove all doubly irreducible elements, as removing them won't change the breadth, except from 2 to 1 in a diamond-like structure.

comment:11 Changed 4 years ago by ncohen

I do not know what you are looking for, but if I wanted to build posets of arbitrarily high breadth I'd try Boolean lattices.
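
For instance (hedged, assuming the breadth() method from this branch; the n atoms of BooleanLattice(n) form a locally minimal set of size n):

sage: posets.BooleanLattice(5).breadth()
5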

Nathann

comment:12 Changed 4 years ago by jmantysalo

I am thinking about a better algorithm to compute the breadth.

As an example: for a lattice to have breadth k, there must be an element that covers at least k elements. It does not suffice that the lattice has an antichain of k elements.

But I'll continue thinking and try to make some example code.
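
Something like this, perhaps (a hedged sketch; the helper name may_have_breadth_k is made up, and _hasse_diagram is an internal attribute):

def may_have_breadth_k(L, k):
    # In the Hasse diagram the in-degree of a vertex counts its lower
    # covers, so breadth k requires a vertex with in-degree at least k.
    H = L._hasse_diagram
    return any(H.in_degree(v) >= k for v in H)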

comment:13 Changed 4 years ago by git

  • Commit changed from 8d81130c3ea697cbfe3ce178dd5a7fac052e78d3 to 33610ad2c4327679b8dfc1d3af445fe4d9080e4a

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

33610ad Added function for lattice breadth.

comment:14 Changed 4 years ago by jmantysalo

  • Status changed from new to needs_review

Ready for review. I did some timings; for example, for Posets.TamariLattice(8), which has 1430 elements, it takes 4.5 seconds to compute the breadth. I know that there are places for optimization, but at least this should be working code, fast enough to be usable.

comment:15 follow-ups: Changed 4 years ago by ncohen

Hello Jori,

Here is a first-pass review while on the bus.

  • I was very surprised to see a 'distance' argument in the depth-first-search function. You do not compute the 'distance' when running a depth-first-search. At least not the usual notion of distance in graph theory. We should probably remove it, as it is *highly* dangerous to advertise it this way O_o
  • In your code, please use breadth-first-search instead.
  • Definition of 'elems': this line often tests for containment in 'too_close'. Do not run containment tests in a list; use a 'set' for that: containment is faster.
  • Anyway, you probably should do this differently, i.e. with breadth_first_search(report_distance=True) while filtering 'too close' elements according to their distance.
  • I still believe that it would be faster to use 'subsets_with_hereditary_property' inside of your code. It would also make the code slightly clearer, as the test you perform on each antichain would be an independent function.
  • Why do you deal with breadth 2 differently from the rest? That could mean a couple more operations, but compared to everything else that should not mean much.

Besides that, congratulations for this algorithm. It works, and it works well.

Nathann

comment:16 in reply to: ↑ 15 ; follow-up: Changed 4 years ago by jmantysalo

Replying to ncohen:

  • I was very surprised to see a 'distance' argument in the depth-first-search function. You do not compute the 'distance' when running a depth-first-search. At least not the usual notion of distance in graph theory. We should probably remove it, as it is *highly* dangerous to advertise it this way O_o

For me the parameter name distance feels very clear. But I could also like max_depth. I think that it is useful to have a function that tells, for example, which cities you can reach using at most three buses.

  • In your code, please use breadth-first-search instead.

Why? Doesn't it use more memory?

  • Definition of 'elems': this line often tests for containment in 'too_close'. Do not run containment tests in a list; use a 'set' for that: containment is faster.

OK.

  • Anyway, you probably should do this differently, i.e. with breadth_first_search(report_distance=True) while filtering 'too close' elements according to their distance.

I can test and see what happens. But I must make some test lattices, maybe some ordinal sum of ordinal products.

  • I still believe that it would be faster to use 'subsets_with_hereditary_property' inside of your code.

I am not sure about that. If I am right, it will make all 2-element antichains, then all 3-element antichains, and so on. There will be many of them, and only one is needed. (But it must be the right one, of course...)

  • Why do you deal with breadth 2 differently from the rest?

There was some complication with breadth 2. Maybe it is not needed now in the near-final version.

And then, this depends on what one is doing. I guess it makes no difference for one big lattice, but for 10^4 small ones it might be a different thing.

Besides that, congratulations for this algorithm. It works, and it works well.

Thanks! And thanks for the comments.

comment:17 in reply to: ↑ 16 ; follow-ups: Changed 4 years ago by ncohen

Hello,

For me the parameter name distance feels very clear.

And it is not the graph-theoretic distance in the graph, right? Plus there may be vertices at distance 2 from your start vertex that *never* get explored because you set distance=3. Do we also agree there? This is due to the fact that a vertex is at 'distance' k from the source vertex in a DFS if "there exists a path of length k from the source vertex to it" (possibly a *very long* path) and that if this vertex is once discovered at 'distance' 10 000 it may not be 'rediscovered' again at distance 2.

Why? Doesn't it use more memory?

No reason to. It may trigger additional copies of the 'queue' list (unless we implement a deque), but that's all. Additionally, if this cost worries you then you probably shouldn't use list comprehensions in your code :-P

I am not sure about that. If I am right, it will make all 2-element antichains, then all 3-element antichains and so on. There will be many of them, only one is needed. (But the one must be the right one, of course...)

With your algorithm: if the antichain {a,b,c} is not "locally minimal", then your code will test all antichains which contain the antichain {a,b,c} too. With subset_with_hereditary_property it would be filtered out.

Nathann

comment:18 in reply to: ↑ 17 Changed 4 years ago by ncohen

For me the parameter name distance feels very clear.

Said differently: if I make no mistake, the set of vertices returned by depth_first_search(v,distance=k) is not defined in theory, and in practice is architecture-dependent.

Nathann

comment:19 in reply to: ↑ 17 ; follow-up: Changed 4 years ago by jmantysalo

More comments later (getting late here), but for this one:

Replying to ncohen:

Plus there may be vertices at distance 2 from your start vertex that *never* get explored because you set distance=3. Do we also agree there? This is due to the fact that a vertex is at 'distance' k from the source vertex in a DFS if "there exists a path of length k from the source vertex to it" (possibly a *very long* path) and that if this vertex is once discovered at 'distance' 10 000 it may not be 'rediscovered' again at distance 2.

No. The vertex was not discovered at distance 10 000, because the search along that path was stopped at distance 3.

I can see nothing complicated here, except maybe in the terms to use. Just "Go back when there are no more unseen vertices as neighbours." is changed to "Go back when there are no more unseen vertices as neighbours, or when the maximum distance has already been reached."

comment:20 in reply to: ↑ 19 ; follow-up: Changed 4 years ago by ncohen

I can see nothing complicated here, except maybe in the terms to use.

Let us say that you have a path of length 9 on {0,1,...,9}, and that you add a path of length 2 from 0 to 1, i.e. {0,a,1}.

If you want all vertices at distance at most 2 through a DFS starting from 0, you may discover a first (at distance 1), then 1 (at distance 2). You will not explore 2 (because it would be at distance 3).

When next you will test edge (0,1) and re-discover 1 (this time at distance 1), you will stop there because '1' has already been discovered.

Thus you will never discover vertex '2'.

If, on the other hand, the algorithm discovered 1 first, then you would proceed to 2 without any problem.

This is what is wrong.
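
A hedged illustration of this example (assuming Graph.depth_first_search accepts the distance argument discussed above; which vertices get returned depends on the exploration order, and that is exactly the bug):

sage: G = Graph({0: ['a', 1], 'a': [1], 1: [2]})
sage: set(G.depth_first_search(0, distance=2))  # may or may not contain 2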

Nathann


comment:21 in reply to: ↑ 20 ; follow-up: Changed 4 years ago by jmantysalo

Replying to ncohen:

When next you will test edge (0,1) and re-discover 1 (this time at distance 1), you will stop there because '1' has already been discovered.

True. Now I understand.

So vertices at exactly the limit distance should be handled differently. They must be on a different list, from which they will be moved to the "normal" already_seen list if they are re-discovered at a smaller distance.

comment:22 in reply to: ↑ 21 Changed 4 years ago by ncohen

So vertices at exactly the limit distance should be handled differently. They must be on a different list, from which they will be moved to the "normal" already_seen list if they are re-discovered at a smaller distance.

That's going too far. I don't think that we need to 'fix' that, it probably shouldn't even exist. What you want to use in this case is a BFS. Or, actually, Graph.shortest_path_lengths (whose name is awful).

Nathann

comment:23 Changed 4 years ago by jmantysalo

I opened a new ticket for the DFS distance-bug: #19227.

comment:24 in reply to: ↑ 15 Changed 4 years ago by jmantysalo

Replying to ncohen:

  • Definition of 'elems': this line often test for containment in 'too_close'. Do not run containment tests in a list. use a 'set' for that: containment is faster.

True. This gave a slight speed benefit.

  • Anyway, you probably should do this differently, i.e. with breadth_first_search(report_distance=True) while filtering 'too close' elements according to their distance.

False. After

elems = [e[0] for e in H.breadth_first_search(j, neighbors=H.neighbors_in, report_distance=True) if e[1] > B-2]

it took practically forever to compute the breadth of TamariLattice(6), whereas

too_close = set(H.breadth_first_search(j, neighbors=H.neighbors_in, distance=B-2))
elems = [e for e in H.order_ideal([j]) if e not in too_close]

did it in 40 milliseconds.

  • I still believe that it would be faster to use 'subsets_with_hereditary_property' inside of your code.

Don't know. Please note that this code tries the maximal breadth first and then goes down one by one. subsets_with_hereditary_property would work in the other direction. I guess that it would get stuck with too many pairs, triples and so on.

  • Why do you deal with breadth 2 differently from the rest? That could mean a couple more operations, but compared to everything else that should not mean much.

True. This special case only slowed down code like sum(L.breadth() for L in L10), where L10 is a list of 10-element lattices. And even that was a very slight loss.

Anyway, I did some tests with product lattices and so on. This code seems to be "fast enough". For example, it takes 4.4 seconds to create a Tamari lattice, and 4.6 seconds to compute its breadth. So this should not be a bottleneck.

comment:25 Changed 4 years ago by git

  • Commit changed from 33610ad2c4327679b8dfc1d3af445fe4d9080e4a to b3ae7c4d82d67adb484d53fdecec4256b196e6a1

Branch pushed to git repo; I updated commit sha1. New commits:

b3ae7c4 Some simplifications.

comment:26 follow-up: Changed 4 years ago by ncohen

  • Reviewers set to Nathann Cohen

Helloooooooooooo,

I've got only one thing to add to this code: self[e] looks dangerous to me, just because I do not know what exactly it does. The __getitem__ function defined on posets seems to be inherited and (very) generic. In particular, I suspect that self[e] may be as "efficient" as list(self)[e]. I think that you should prefer _element_to_vertex(x) in this situation.
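
That is, something along these lines (a sketch of FinitePoset's internal conversion helpers; hedged, since these are private APIs):

v = self._element_to_vertex(x)   # element -> integer vertex of the Hasse diagram
x = self._vertex_to_element(v)   # integer vertex -> element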

Nathann

comment:27 Changed 4 years ago by git

  • Commit changed from b3ae7c4d82d67adb484d53fdecec4256b196e6a1 to 9c05f49e95504acc1a37ea942764886c53f57198

Branch pushed to git repo; I updated commit sha1. New commits:

9c05f49 From self[e] to self._vertex_to_element(e).

comment:28 in reply to: ↑ 26 Changed 4 years ago by jmantysalo

Replying to ncohen:

I think that you should prefer _element_to_vertex(x) in this situation.

Corrected. (To _vertex_to_element of course, not _element_to_vertex.)

comment:29 Changed 4 years ago by ncohen

  • Status changed from needs_review to positive_review

Oops, right. Good to go then. Good work, really.

Nathann

comment:30 follow-up: Changed 4 years ago by ncohen

By the way: it seems to me that, theoretically speaking, having breadth >=b is equivalent to:

sage: G1 = my_lattice.hasse_diagram().transitive_closure()
sage: G2 = posets.BooleanLattice(b).hasse_diagram().transitive_closure()
sage: G1.subgraph_search(G2, induced=True) is not None
True

But well. Does not seem like it would help on the computational side.

Nathann

comment:31 in reply to: ↑ 30 ; follow-up: Changed 4 years ago by jmantysalo

Replying to ncohen:

By the way: it seems to me that, theoretically speaking, having breadth >=b is equivalent to:

sage: G1 = my_lattice.hasse_diagram().transitive_closure()
sage: G2 = posets.BooleanLattice(b).hasse_diagram().transitive_closure()
sage: G1.subgraph_search(G2, induced=True) is not None
True

And that is the same as L.has_isomorphic_subposet(Posets.BooleanLattice(b)). I guess you are right. (But I'm not sure - must think about this.)

I made an example of the smallest lattice with breadth 4 with completion_by_cuts() and did not notice that the result was just BooleanLattice(4). Oops.

And that means that my example of a lattice with breadth 4 is stupid, as it hides the structure in a dig6 string.

comment:32 in reply to: ↑ 31 ; follow-up: Changed 4 years ago by ncohen

Hello,

And that is the same as L.has_isomorphic_subposet(Posets.BooleanLattice(b))

Oh, sorry. I tend to turn everything into graphs, and work with that.

I made an example of the smallest lattice with breadth 4 with completion_by_cuts() and did not notice that the result was just BooleanLattice(4). Oops.

And that means that my example of a lattice with breadth 4 is stupid, as it hides the structure in a dig6 string.

I see. Well, you can fix this in any other ticket if you wish anyway.

Nathann

comment:33 in reply to: ↑ 32 Changed 4 years ago by jmantysalo

Replying to ncohen:

I see. Well, you can fix this in any other ticket if you wish anyway.

I'll do that after #19123 on #19190.

comment:34 Changed 4 years ago by vbraun

  • Branch changed from u/jmantysalo/latticeposet__add_breadth__ to 9c05f49e95504acc1a37ea942764886c53f57198
  • Resolution set to fixed
  • Status changed from positive_review to closed