Opened 7 years ago

Closed 3 years ago

#14126 closed enhancement (fixed)

Count Number of Linear Extensions of a Poset

Reported by: csar Owned by: sage-combinat
Priority: major Milestone: sage-7.3
Component: combinatorics Keywords: days45
Cc: kdilks, jmantysalo Merged in:
Authors: Jori Mäntysalo Reviewers: Frédéric Chapoton
Report Upstream: N/A Work issues:
Branch: 24bc331 (Commits) Commit: 24bc3319d440b0172666a2cace440e20986e91b6
Dependencies: Stopgaps:

Description

Posets appear to have no mechanism to count the number of linear extensions other than computing all the linear extensions and finding the length of that list. Using recursion, one can count the number of linear extensions without having to generate all the linear extensions.

Attachments (1)

trac_14126-count_linear_extensions_of_a_poset-csar.patch (2.5 KB) - added by csar 7 years ago.

Download all attachments as: .zip

Change History (45)

comment:1 Changed 7 years ago by csar

  • Status changed from new to needs_review

comment:2 Changed 7 years ago by csar

  • Status changed from needs_review to needs_work

comment:3 Changed 7 years ago by csar

Patch does not apply and it's not clear why.

comment:4 follow-up: Changed 7 years ago by ncohen

  • Authors set to aschilling, nthiery, hivert

(this code should probably be stored in from sage.combinat.posets.linear_extensions. Besides, your code may be faster for small cases but can also require an exponential memory, which can be a problem too).

comment:5 in reply to: ↑ 4 Changed 7 years ago by csar

Replying to ncohen:

(this code should probably be stored in from sage.combinat.posets.linear_extensions. Besides, your code may be faster for small cases but can also require an exponential memory, which can be a problem too).

I've moved the code to sage.categories.finite_poset, which I realise is not what you've said, and added a note in the docstring saying one may prefer to count linear extensions of P by using P.linear_extensions().cardinality(), which uses the iterator belonging to LinearExtensionsOfPoset and avoids all the cacheing.

comment:6 follow-up: Changed 7 years ago by ncohen

O_o

Well, shouldn't the two codes that compute the number of linear extensions be at the same place ?... A choice could be madebetween the two with an argument to cardinality().

Nathann

comment:7 follow-up: Changed 7 years ago by kdilks

We originally made this function not aware that linear extensions had an iterator which would compute P.linear_extensions().cardinality() without computing all of P.linear_extensions().

In it's current form, I believe this code iterates over all linear extensions roughly same way the iterator does. So they should be approximately the same speed, and if this code is better, then those changes should be applied directly to the iterator.

However, Stembridge's posets package is still much faster at counting linear extensions. He iterates over all linear extensions approximately the same way we do, but when you just want a count of linear extensions, he has separate code that counts maximal chains in J(P). There's a fast way to count chains that doesn't involve iterating over all of them. One might think computing J(P) is expensive, but one only needs to know the covering relations, and we could come up with a faster implementation than P.order_ideals_lattice() for this purpose (that method computes a poset whose elements are sets of elements in P, and we should be able to more quickly make an isomorphic poset on [1..n]).

I think it's worth keeping this ticket open for development of this alternate way of counting linear extensions. Once we have a satisfactory implementation, we can change P.linear_extensions().cardinality() to call this special code for enumeration, rather than using the linear_extensions iterator.

comment:8 Changed 7 years ago by kdilks

But for what it's worth, I was successfully able to apply the latest version of the patch.

comment:9 in reply to: ↑ 6 ; follow-up: Changed 7 years ago by csar

Replying to ncohen:

O_o

Well, shouldn't the two codes that compute the number of linear extensions be at the same place ?... A choice could be madebetween the two with an argument to cardinality().

Nathann

Fair point. cardinality() doesn't seem to live in sage.combinat.posets.linear_extensions, though. It looks like it comes from the FiniteEnumeratedSets category. Is it legitimate to give LinearExtensionsOfPoset a cardinality function with an argument to choose between the _cardinality_from_iterator from FiniteEnumeratedSets and counting recursively? My understanding of how categories are meant to work does not extend this far.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 7 years ago by ncohen

Hellooooooooo !!

Fair point. cardinality() doesn't seem to live in sage.combinat.posets.linear_extensions, though. It looks like it comes from the FiniteEnumeratedSets category.

Oh. Right. So it builds all linear extensions then counts them -_-

Is it legitimate to give LinearExtensionsOfPoset a cardinality function with an argument to choose between the _cardinality_from_iterator from FiniteEnumeratedSets and counting recursively? My understanding of how categories are meant to work does not extend this far.

My understanding of categories is that when they prevent you from doing something you need then categories should be changed to allow you to do whatever you wanted to do in the first place.

Technically I do not think that categories would mind an optional argument to .cardinality. I guess that they would not notice it, so that's all good.

This being said, if you are at the Sage Days 45 right now Nicolas Thiery should be around. And he's THE guy you should bug for anything related to categories :-P

Nathann

comment:11 in reply to: ↑ 10 Changed 7 years ago by nthiery

Replying to ncohen:

Fair point. cardinality() doesn't seem to live in sage.combinat.posets.linear_extensions, though. It looks like it comes from the FiniteEnumeratedSets category.

Oh. Right. So it builds all linear extensions then counts them -_-

To be precise, for an object in FiniteEnumeratedSets?, cardinality is set by default to _cardinality_from_iterator. So it indeed goes through all linear extensions. However the counting is done along the way, without storing them. So the memory complexity is roughly constant (well, linear in the size of one linear extension).

Is it legitimate to give LinearExtensionsOfPoset a cardinality function with an argument to choose between the _cardinality_from_iterator from FiniteEnumeratedSets and counting recursively? My understanding of how categories are meant to work does not extend this far.

+1

My understanding of categories is that when they prevent you from doing something you need then categories should be changed to allow you to do whatever you wanted to do in the first place.

Certainly: in any object oriented system, abstract classes are here to provide you with default implementations. By design they are meant to be overridden whenever there is a better algorithm in a special case and the application in mind makes it worth it to implement that algorithm

Categories are nothing but a way to organize those abstract classes.

Cheers,

Nicolas

comment:12 Changed 6 years ago by csar

I will give LinearExtensionsOfPoset the choice of which method to count linear extensions.

comment:13 Changed 6 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:14 Changed 6 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:15 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:16 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:17 Changed 4 years ago by jmantysalo

  • Cc jmantysalo added

comment:18 in reply to: ↑ 7 Changed 4 years ago by jmantysalo

Replying to kdilks:

There's a fast way to count chains that doesn't involve iterating over all of them.

What is it?

If there are several ways to do this, none of which is best from all sides, shouldn't we then have algorithm-keyword for the function? And in any version I guess that the poset should first be splitted to connected parts.

Is there something wrong with making a function with dumb implementation first? Then we would have a reference implementation, docstring and already thinked interface and place for the function.

comment:19 Changed 4 years ago by jmantysalo

  • Branch set to u/jmantysalo/count-lin-ext

comment:20 Changed 4 years ago by git

  • Commit set to f908696aa7c63cefbc382af326cfe085e0dfa522

Branch pushed to git repo; I updated commit sha1. New commits:

f908696Added counting of linear extensions without listing them.

comment:21 Changed 4 years ago by jmantysalo

I just tested how much we could save by just mechanically copying the code.

After P6 = Posets(6).list() it took 0.45 seconds to compute sum([len(P.linear_extensions().list()) for P in P6]), and 0.11 second to sum([P.linear_extensions().count() for P in P6]). For 14-element Posets.TamariLattice(4) it took 1.42 seconds to compute list of linear extensions and 0.24 seconds to just count them. Is this speedup useful?

For general posets we could split it to connected components first. For lattices we could check if it is vertically decomposable. But is there some other ways to speed up this? Maybe somehow find chains of doubly irreducible elements?

comment:22 Changed 4 years ago by kdilks

I just need to sit down and spend some time re-figuring out exactly how the algorithm works (I had it coded a while ago, I thought, but lost it). Maybe this weekend.

The idea behind counting maximal chains without enumerating them is you have some function f on elements of your poset which gives 'number of maximal chains with x as its maximal element'. Initialize it to be 1 on minimal elements, you can recursively compute f(x) = sum_{y covered by x} f(y), and then sum up f(x) on maximal elements.

With linear extensions, you think of them as being maximal chains in the lattice of order ideals. I forget exactly how the voodoo works, but there's some way that you can use the fact that covering relations in the lattice of order ideals arise from covering relations in the original poset, and iterate over a table of covering relations in your original poset without computing the full lattice of order ideals.

This would be many many orders of magnitude faster. I don't have access to Maple anymore to get timings with Stembridge's code, but I know it has no problems counting number of linear extensions of things like a product of 3 chains of length 4 (64 elements) in a fraction of a second. Using the code in this ticket, computing number of linear extensions of [2]x[3]x[3] with just 18 elements has been running for a few minutes.

comment:23 Changed 4 years ago by jmantysalo

Well, counting linear extensions is #P-problem, I think. Of course it's complexity can still be O(1.001^n)...

But I will wait and see. At least my quick and dirty implementation gives a way to check the results of better code.

Last edited 4 years ago by jmantysalo (previous) (diff)

comment:24 Changed 4 years ago by jmantysalo

Pinging this one...

About faster algorithms: I am quite sure that we can do better in some cases. For example if you do a series-parallel decomposition, then counting can be done to indecomposable parts and just summed up with simpler formula.

But still, this is proven to be #P.

comment:25 Changed 3 years ago by chapoton

  • Authors changed from aschilling, nthiery, hivert to Anne Schilling, Nicolas M. Thiéry, Florent Hivert

comment:26 Changed 3 years ago by jmantysalo

?? Current version is written by me. (But of course it is trivial modification that just basically still counts all linear extensions.)

comment:27 Changed 3 years ago by chapoton

I was just cleaning a few invalid author fields. Please change it.

comment:28 Changed 3 years ago by jmantysalo

  • Authors changed from Anne Schilling, Nicolas M. Thiéry, Florent Hivert to Jori Mäntysalo

comment:29 Changed 3 years ago by kdilks

I am still intending on figuring out the 'right' way to do this. After recently realizing I can make the method be on the Hasse diagram, and can assume that I'm working with a naturally labelled poset on 0...n-1 with a dictionary of covering relations, I should be able to make it work.

comment:30 Changed 3 years ago by jmantysalo

I don't quite understand. AFAIK it is proved that counting linear extensions is in #P, and according to http://link.springer.com/chapter/10.1007%2F11537311_39 there is an algorithm that lists them in linear time.

We can trivially reduce time by doing series-parallel decomposition first. It won't help on average.

So do you have some good algorithm in mind? Being in #P does not mean that k could not be improved in O(k^n).

comment:31 Changed 3 years ago by kdilks

The #P result you keep citing only applies to generating a list all linear extensions, not counting them. For example, I can tell you that the number of linear extensions of an antichain of size n is n!, independent of any complexity issues that arise when trying to generate all n! permutations.

The algorithm is used in Stembridge's Posets package. The general idea is that linear extensions are maximal chains in the lattice of order ideals.

If you want to enumerate (not generate) maximal chains, then you can use fact that if f(x) is the number of saturated chains from a minimal element to x, then f(y) = sum f(x) over all x covered by y.

If we wanted to keep the code simple, we could just create the lattice of order ideals J(P) using Sage. But then the elements of this poset would be subsets, and we'd have another (potentially expensive) computation to make the Hasse diagram and get a naturally labelled poset on 0...(|I|-1) (where |I| is the number of order ideals). I think I tried this implementation in the past, and it was still an order of magnitude or two slower than the corresponding Maple code.

However, there should be some wizardry that lets us quickly create a dictionary on 0 .. (|I|-1) corresponding to the up edges in J(P). That wizardry is what I'm trying to figure out.

This algorithm works whether or not P is connected. However, there would be some significant speed-up for posets with a large number of connected components if we just counted the number of linear extensions on each connected component, and combined the results appropriately (if f(P) is the number of linear extensions on P, and P=P_1 U ... U P_k , then f(P)= multinomial(|P|,{|P_1|,...,|P_k|}) * product f(P_i) .

For example, say we look at the anti-chain of 11 elements. Currently, we have to iterate over close to 40 million permutations. Using the wizardry, we would just have to create a dictionary of 2048 elements corresponding to up-edges in the Boolean lattice and iterate over it. And if we split into connected components...getting 11! is almost automatic.

comment:32 Changed 3 years ago by jmantysalo

?? Isn't the concept of #P about counting something? And http://link.springer.com/article/10.1007/BF00383444 seems to say that too. It is trivial to say that listing them takes more than polynomial time.

Of course there are easy answers for some posets. For the ordinal sum of P and Q the count is just product.

But I will wait for the code and see.

E: Something in lattice of ideals - but every ideal corresponds to an antichain, and counting antichains is in #P I think.

Last edited 3 years ago by jmantysalo (previous) (diff)

comment:33 Changed 3 years ago by kdilks

Whatever it is, asymptotics aren't particularly relevant when in practice we're not going past |P|>100.

Even without any wizardry, P.order_ideals_lattice().chain_polynomial().leading_coefficient() (the naive implementation I mentioned) is significantly faster than P.linear_extensions().cardinality() (which defaults to using the iterator and generates all of them). Twice as fast on the antichain of 8 elements, ten times as fast on P=Posets.TamariLattice(4) (14 elements, 20243 linear extensions), ~600 times as fast on P=Posets.IntegerCompositions(5) (16 elements, 1680384 linear extensions).

comment:34 Changed 3 years ago by jmantysalo

Very interesting! Must think about this.

And as_ideals=False will get still more speed.

comment:35 Changed 3 years ago by jmantysalo

This got more speed with even very basic implementation of maximal chains counting:

def countmax(P):
    L = P.level_sets()
    c = {}
    for l in L[0]:
        c[l] = 1
    for lev in L[1:]:
        for l in lev:
            c[l] = sum(c[i] for i in P.lower_covers(l))
    return sum(c[i] for i in P.maximal_elements())

If we have better way to make a order ideals lattice, it is a place for new ticket. It would be nice feature in itself.

Hmm... Assuming we have n-element poset P and corresponding L=J(P), how we modify L when we add element n+1 as a minimal element to P? I guess the algorithm should be done by thinking this.

comment:36 Changed 3 years ago by jmantysalo

I guess this is worth reading about order ideals lattice: http://www.sciencedirect.com/science/article/pii/S0166218X00002584

comment:37 Changed 3 years ago by kdilks

After much staring at Maple code, I now mostly understand the algorithm to generate a naturally labelled poset isomorphic to the order ideal lattice given a naturally labelled poset.

The idea is similar to what you mentioned. Run a loop that takes the lattice of order ideals for P restricted to {1,...,k}, and think about what happens when you add k+1. You need to know which order ideal I_{k+1} corresponds to the principal order ideal generated by k+1 (but with k+1 removed, since we haven't added it to the poset yet).

Now, you create a list K of the order ideals that lie above I_{k+1} in your partial lattice of order ideals. Each of these is going to yield a new order ideal to add to your list (the corresponding ideal I plus the new k+1). Make sure that K is ordered so that our lattice of order ideals stayed naturally labelled. The relations you need to add are I is covered by I plus k+1, and for every relation A covers B in K, you're going to have to add A plus k+1 covers B plus k+1.

The main subtlety is in creating and maintaining a helper list that will eventually tell you where I_{k+1} is by the time you reach the k+1st step. Call the list L, indexed by your poset elements, initialize it with all 1s. As you go along, after k iterations, the entry L[j] tells you which order ideal corresponds to I_j restricted to 1...k. If you add something incomparable to j, then I_j restricted to 1...k+1 doesn't change. If you add something comparable to j, then I_j restricted to 1...k+1 is one of the things in K plus k+1. So if we started with N order ideals, and I_j restricted to 1...k was the rth thing in K, then I_j restricted to 1...k+1 will be the N+rth thing, and we update L accordingly.

Since it might take me some time to actually sit down and implement this, I agree that it makes sense to just change the P.linear_extensions().cardinality() method to something simple like P.order_ideals_lattice().chain_polynomial().leading_coefficient() (maybe with an optional parameter allowing one to default back to the iterator). Then I can work on a separate ticket for generating a naturally labelled poset isomorphic to the order ideal lattice, which P.linear_extensions().cardinality() could eventually utilize.

comment:38 follow-up: Changed 3 years ago by jmantysalo

Does that mean that order_ideals_lattice() will have an option to get a lattice with plain integers as elements? If so, we should now deprecate as_ideal keyword, as it is plain Boolean value. Or at least it sounds funny to have as_ideal with possible values True, False and 'integers'.

comment:39 Changed 3 years ago by git

  • Commit changed from f908696aa7c63cefbc382af326cfe085e0dfa522 to 24bc3319d440b0172666a2cace440e20986e91b6

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

24bc331Add cardinality() to linear extensions of poset.

comment:40 Changed 3 years ago by jmantysalo

  • Status changed from needs_work to needs_review

Shall we then accept this solution, and later think about optimizing this in two different ways?

Series-parallel decomposition could be a good thing anyways.

comment:41 in reply to: ↑ 38 ; follow-up: Changed 3 years ago by kdilks

Replying to jmantysalo:

Does that mean that order_ideals_lattice() will have an option to get a lattice with plain integers as elements? If so, we should now deprecate as_ideal keyword, as it is plain Boolean value. Or at least it sounds funny to have as_ideal with possible values True, False and 'integers'.

Yes. It probably makes sense to deprecate as_ideals, and change it to a keyword like representative that defaults to ideal, but can also be antichains or integers.

I guess I'd want to quantify how much overhead calculating connected components and a series-parallel decomposition of the original Hasse diagram introduces (my guess is a little and lot, respectively) before making it part of every single computation. I feel like unless you explicitly construct something as an ordinal/disjoint sum of smaller posets (and thus should already know the decomposition), almost every poset you'd throw at this where you'd really need optimal performance is going to be connected and not have a series-parallel decomposition.

comment:42 in reply to: ↑ 41 Changed 3 years ago by jmantysalo

Replying to kdilks:

Yes. It probably makes sense to deprecate as_ideals, and change it to a keyword like representative that defaults to ideal, but can also be antichains or integers.

That could be done now; a little less time for new users to learn a feature to be deprecated.

I guess I'd want to quantify how much overhead calculating connected components and a series-parallel decomposition of the original Hasse diagram introduces (my guess is a little and lot, respectively) before making it part of every single computation. I feel like unless you explicitly construct something as an ordinal/disjoint sum of smaller posets (and thus should already know the decomposition), almost every poset you'd throw at this where you'd really need optimal performance is going to be connected and not have a series-parallel decomposition.

True, but then whole decomposition will be just one check to see that the poset can not be decomposed.

And in any case that should a place for it's own ticket.

comment:43 Changed 3 years ago by chapoton

  • Milestone changed from sage-6.4 to sage-7.3
  • Reviewers set to Frédéric Chapoton
  • Status changed from needs_review to positive_review

ok, looks good to me.

comment:44 Changed 3 years ago by vbraun

  • Branch changed from u/jmantysalo/count-lin-ext to 24bc3319d440b0172666a2cace440e20986e91b6
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.