#28273 closed enhancement (fixed)

allow laziness in DisjointUnionEnumeratedSet

Reported by: gh-mwageringel Owned by:
Priority: major Milestone: sage-8.9
Component: combinatorics Keywords:
Cc: tscrim, nthiery Merged in:
Authors: Markus Wageringel Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 65dddde (Commits) Commit: 65dddde951aaab5cab1c1819c61b5e185ea092e9
Dependencies: Stopgaps:

Description

This ticket changes DisjointUnionEnumeratedSet to be lazy in the case where a disjoint union is taken over a lazy family of sets. For this, __contains__ is implemented in LazyFamily.

This started as a discussion in 28106#comment:15. In particular, this will solve the memory issues of the doctests in sage.combinat.tableau as StandardTableaux(50).cardinality() is more efficient now.

Change History (10)

comment:1 Changed 14 months ago by gh-mwageringel

  • Authors set to Markus Wageringel
  • Branch set to u/gh-mwageringel/28273
  • Cc tscrim added
  • Commit set to 576261ad89cfddd8288025dbfd06ba508874b16c
  • Status changed from new to needs_review

New commits:

576261a28273: implement LazyFamily.__contains__ for lazy disjoint unions

comment:2 Changed 14 months ago by tscrim

  • Cc nthiery added
  • Reviewers set to Travis Scrimshaw
  • Status changed from needs_review to needs_work

I am now seeing a problem with this (and probably what I saw before as mentioned on #28106): that is if you pass a mutable object L, then it will lead to inconsistencies when you change L. For instance, if you pass in a list. So my suggestion would be to do

-                self._facade_for = tuple(family)
+                if isinstance(family, Family):
+                    self._facade_for = family
+                else:
+                    self._facade_for = tuple(family)

So I think we should make sure that case is covered. Other than that, I think this is good to go.

comment:3 Changed 14 months ago by git

  • Commit changed from 576261ad89cfddd8288025dbfd06ba508874b16c to cb6751b3425876644f67b6a764fad274c71401e9

Branch pushed to git repo; I updated commit sha1. New commits:

cb6751b28273: add clarification about family

comment:4 Changed 14 months ago by gh-mwageringel

  • Status changed from needs_work to needs_review

Thank you for the feedback. As far as I can tell, it is an error to subclass DisjointUnionEnumeratedSets and initialize it with something mutable. For example, the following raises a type error when initialized with a list.

class Foo(DisjointUnionEnumeratedSets):
    def __init__(self, family):
        DisjointUnionEnumeratedSets.__init__(self, family,
                category=FiniteEnumeratedSets(), facade=True, keepkey=False)

a = Foo([FiniteEnumeratedSet([1,2,3]),
         FiniteEnumeratedSet([4,5,6])])
# ... TypeError: unhashable type: 'list'

The reason for this is that DisjointUnionEnumeratedSets inherits from UniqueRepresentation which makes use of caching and therefore needs hashable arguments.

When creating a disjoint union directly (without subclassing) as in DisjointUnionEnumeratedSets([FiniteEnumeratedSet([1])]), then DisjointUnionEnumeratedSets.__classcall_private__ asserts that the family is converted to a Family. In subclasses, this method is not called during initialization though, so they must ensure that they correctly initialize their super class with a hashable type like family or tuple.

I added a comment about this to the code in order to make this clearer. I also checked that DisjointUnionEnumeratedSets.__init__ is always called with a family, within the Sage library, which seems to be the case.

Edit: The requirement of family being a Family also applies to several other places in disjoint_union_enumerated_sets where it is assumed that _family supports certain methods.

Last edited 14 months ago by gh-mwageringel (previous) (diff)

comment:5 follow-up: Changed 14 months ago by tscrim

That the data passed to DisjointUnionEnumeratedSets is hashable is not a requirement of UniqueRepresentation. Subclasses, say S, are free to pass lists or other mutable things as long as the data that defines S uniquely identifies it (which is fully independent, e.g., Tableaux). So what you are saying is not quite accurate. In the example you gave, it is using the default input, but what if the input was just an integer n and the family became [range(n)]*n.

Now I believe it is the subclass's responsibility to know and properly handle what data it passes to its superclass(es). Since nothing within the Sage codebase was relying on that tuple call, then we are okay (and it was an undocumented feature, so subject to change without notice). So I think the correct thing to say here is:

# Note that family is not copied when it is a finite enumerated sets,
#   thus, any subclass must ensure that it does not mutate this input.

comment:6 Changed 14 months ago by git

  • Commit changed from cb6751b3425876644f67b6a764fad274c71401e9 to 65dddde951aaab5cab1c1819c61b5e185ea092e9

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

65dddde28273: add clarification about family

comment:7 in reply to: ↑ 5 Changed 14 months ago by gh-mwageringel

Replying to tscrim:

That the data passed to DisjointUnionEnumeratedSets is hashable is not a requirement of UniqueRepresentation. Subclasses, say S, are free to pass lists or other mutable things as long as the data that defines S uniquely identifies it (which is fully independent, e.g., Tableaux). So what you are saying is not quite accurate. In the example you gave, it is using the default input, but what if the input was just an integer n and the family became [range(n)]*n.

Oh, I agree, you are absolutely right. I changed the comment to what you suggested. Thank you.

comment:8 Changed 14 months ago by tscrim

  • Status changed from needs_review to positive_review

Great, thank you. Looks good. I appreciate all your work.

comment:9 Changed 14 months ago by embray

Thanks for fixing this. I was surprised too when I saw that this code was generating so many subsets up-front but wasn't sure how to fix it.

Regardless, I still think the change in #28106 makes sense to do as well, if anyone wants to give it another look. I made sure this time that it's working as intended.

comment:10 Changed 14 months ago by vbraun

  • Branch changed from u/gh-mwageringel/28273 to 65dddde951aaab5cab1c1819c61b5e185ea092e9
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.