Opened 8 years ago

Closed 8 years ago

#14228 closed enhancement (fixed)

Caching of data needed for computations in k_dual

Reported by: SimonKing Owned by: sage-combinat
Priority: major Milestone: sage-5.8
Component: combinatorics Keywords:
Cc: sage-combinat, tscrim, nthiery, nbruin, zabrocki Merged in: sage-5.8.beta4
Authors: Travis Scrimshaw Reviewers: Simon King
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #13605, #14225 Stopgaps:

Description (last modified by SimonKing)

#12313 brought a severe slow-down to sage.combinat.sf.k_dual. The purpose of task #13991 is to mitigate this. The purpose of this ticket is to deal with one aspect of the slow-down.

Certain data of partitions should be cached, namely WeylGroup and the Young subgroup. And sage.combinat.sf.k_dual constructs a couple of partitions, but they are only weakly cached, so that they eventually are created repeatedly. Hence, they should be cached, too.

Apply

Attachments (4)

trac_14288-partition_caches.patch (8.8 KB) - added by SimonKing 8 years ago.
trac_14228-partition_caches.patch (8.8 KB) - added by SimonKing 8 years ago.
Disregard the first patch, it has a wrong ticket number.
trac_14228-sym_speedup-ts.patch (20.6 KB) - added by tscrim 8 years ago.
Fixed doctests
trac_14228-reviewer.patch (2.1 KB) - added by SimonKing 8 years ago.

Download all attachments as: .zip

Change History (39)

Changed 8 years ago by SimonKing

comment:1 Changed 8 years ago by SimonKing

  • Dependencies set to #13605
  • Status changed from new to needs_review

Changed 8 years ago by SimonKing

Disregard the first patch, it has a wrong ticket number.

comment:2 Changed 8 years ago by SimonKing

  • Description modified (diff)

Sorry for the "wrong" patch name.

Apply trac_14228-partition_caches.patch

comment:3 Changed 8 years ago by tscrim

  • Authors changed from Simon King to Simon King Travis Scrimshaw
  • Cc sage-combinat nthiery added

I'm wondering how much of a speedup we'd get if we made IntegerListsLex a cached representation (currently it takes lambda functions as arguments which makes it impossible to hash).

sage: P = Partitions(4, max_part=2)
sage: P
Partitions of the integer 4 satisfying constraints max_part=2
sage: type(P)
sage.combinat.integer_list.IntegerListsLex_with_category
sage: P is Partitions(4, max_part=2)
False

So we alternatively want to make a new partitions parent specifically for this which just wraps the iterator given by IntegerListsLex...

I also restructured it slightly by creating an abstract base class for the actual bases for easier maintenance.

Anyways, I've attached my patch.

comment:4 Changed 8 years ago by SimonKing

  • Dependencies changed from #13605 to #13605, #14225

There is one hunk of my patch that only applies with #14225. Hence, new dependency.

comment:5 follow-up: Changed 8 years ago by SimonKing

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

comment:6 Changed 8 years ago by tscrim

There was one failing doctest in the first version of my patch. Because of the structural changes, the error thrown is changed (although it fundamentally is the same error).

comment:7 in reply to: ↑ 5 ; follow-up: Changed 8 years ago by tscrim

Replying to SimonKing:

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

The big thing that I did was pulled out the common code for the bases into an ABC. It looks scarier than it is. I also avoided calling Partition() directly in case what was being passed in was a list, in which case it should be slower than constructing an element from the appropriate parent. Although it's faster if we are passing in a Partition object, but this might be a marginal speed change... I'll do some investigating to see what's being passed around.

I can handle the merging.

comment:8 Changed 8 years ago by SimonKing

Arrgh. I tried to post this twice, but trac didn't accept because you posted too. A minute later, trac didn't seem to accept posts at all. Let's try again.

Some timings:

  • With #13605 and #14225:
    sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
    sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
    sage: %time TestSuite(F).run()
    CPU times: user 11.17 s, sys: 0.09 s, total: 11.26 s
    Wall time: 11.42 s
    sage: %time TestSuite(F).run()
    CPU times: user 9.80 s, sys: 0.09 s, total: 9.89 s
    Wall time: 9.93 s
    
  • With #13605 and #14225 and trac_14228-sym_speedup-ts.patch
    sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
    sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
    sage: %time TestSuite(F).run()
    CPU times: user 9.14 s, sys: 0.04 s, total: 9.18 s
    Wall time: 9.19 s
    sage: %time TestSuite(F).run()
    CPU times: user 7.05 s, sys: 0.02 s, total: 7.08 s
    Wall time: 7.08 s
    
  • With #13605 and #14225 and trac_14228-partition_caches.patch:
    sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
    sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
    sage: %time TestSuite(F).run()
    CPU times: user 5.94 s, sys: 0.04 s, total: 5.98 s
    Wall time: 6.09 s
    sage: %time TestSuite(F).run()
    CPU times: user 3.49 s, sys: 0.03 s, total: 3.52 s
    Wall time: 3.52 s
    

Unfortunately, our two patches are incompatible. How to get the best of both worlds?

comment:9 in reply to: ↑ 7 Changed 8 years ago by SimonKing

Replying to tscrim:

Replying to SimonKing:

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

The big thing that I did was pulled out the common code for the bases into an ABC. It looks scarier than it is. I also avoided calling Partition() directly in case what was being passed in was a list, in which case it should be slower than constructing an element from the appropriate parent. Although it's faster if we are passing in a Partition object, but this might be a marginal speed change... I'll do some investigating to see what's being passed around.

I can handle the merging.

Great, thank you. Looking at your patch, it seems that your code includes parts of the changes that I suggested (but differently in the case of the Weyl group), and I think other parts of my code (namely the cached method _Partitions) make sense to be used in your patch, too.

comment:10 Changed 8 years ago by tscrim

I'll let you know what happens some of my other possible speedups as well. Expect something within a few hours.

comment:11 follow-up: Changed 8 years ago by tscrim

  • Authors changed from Simon King Travis Scrimshaw to Simon King, Travis Scrimshaw
  • Cc nbruin added

Okay, I've merged the patches (I didn't incorperate the Weyl group) and made IntegerListsLex a cached representation (this was desirable anyways). However I realized that PartitionsGreatestLE does exactly what Partitions(n, max_part=k) does except it already is a unique representation, so I'm using that instead. I saw a very small difference between caching this in a function and just explicitly calling this, but I feel safer not having this be strongly cached (however feel free to ignore my opinion as I'm not a python/sage memory expert).

Here's some data. Without patch (with #14225):

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(4,>
sage: %time TestSuite(F).run()
CPU times: user 7.79 s, sys: 0.22 s, total: 8.01 s
Wall time: 8.39 s
sage: %time TestSuite(F).run()
CPU times: user 4.82 s, sys: 0.04 s, total: 4.86 s
Wall time: 5.07 s

With:

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(4,>
sage: %time TestSuite(F).run()
CPU times: user 7.17 s, sys: 0.09 s, total: 7.26 s
Wall time: 7.52 s
sage: %time TestSuite(F).run()
CPU times: user 4.26 s, sys: 0.03 s, total: 4.29 s
Wall time: 4.58 s

Edit - I still have not been able to see the approx 2x speedup that you got with just your patch, so I'm still doing something relatively slow...

Last edited 8 years ago by tscrim (previous) (diff)

comment:12 in reply to: ↑ 11 Changed 8 years ago by SimonKing

Replying to tscrim:

... but I feel safer not having this be strongly cached... ... Edit - I still have not been able to see the approx 2x speedup that you got with just your patch, ...

I suppose that's why...

UniqueRepresentation is weakly cached. Hence, if you call the same Partitions, Weyl groups or permutation groups, you will get different instances, unless you keep a strong reference. Since the same Partitions (and Weyl groups or permutation groups, too) occur in different methods, it won't help to keep the reference inside of a method. Instead, one needs a cache that is shared by the methods.

It would probably be bad to have a global cache. That's why in my patch I suggested to have the cache local to the parent that uses the data. In that way, the data can still be garbage collected (it is not a global cache!), as soon as the parent using the data is done.

It seems to me that we will not get up to speed, unless we cache the few partitions that are local to these k-bounded thingies.

comment:13 Changed 8 years ago by SimonKing

Your patch seems to perform quite well:

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
sage: %time TestSuite(F).run()
CPU times: user 5.78 s, sys: 0.12 s, total: 5.90 s
Wall time: 5.92 s
sage: %time TestSuite(F).run()
CPU times: user 3.18 s, sys: 0.03 s, total: 3.21 s
Wall time: 3.22 s

with

trac11490-coercion_tutorial.patch
14182_whitespace.patch
trac_13618-rings_doc_real-ts.patch
trac_13618-rings_doc_complex-ts.patch
trac_13618-rings_doc_others-ts.patch
trac_14040_housekeeping.patch
trac_14040_repr_option.patch
trac_14040_doctest.patch
trac_14040_review.patch
trac14054_fast_methods-5.8.patch
trac_12313-mono_dict-combined-random-sk.patch
trac_13387-combined.patch
trac_13387-guard_gc.patch
trac_14159_weak_value_triple_dict.patch
trac_14159_use_cdef_get.patch
trac12951_cached_extension.patch
trac_14214-cython_homset.patch
trac_14214-backup_cache.patch
trac_14063-remove_cc_compositions-ts.patch
trac_13688-finite_sets_cardinality_override-ts.patch
trac_14138.patch
trac_14138-doctests.patch
trac_13605-partition_options-ts.patch
trac_14011-Sphinx_roles-fh.patch
trac_14054-fix-rigged-list.2.patch
trac_11410-zero_one_sequence_partitions-pod.v2.patch
trac_11410-zero_one_sequence_partitions-review-ts.patch
trac_14225-partition_classcall_private.patch
trac_14228-sym_speedup-ts.patch

Let's see if caching the few permutations gives us yet some more speed.

comment:14 Changed 8 years ago by SimonKing

Question about your merged patch: In partition.py, you've put a cached_method in front of one of the two young_subgroup methods and one of the two young_soubgroup_generator methods.

Was that intended? I thought it would be better to keep a pointer to the resulting permutation groups, not to some lists.

comment:15 Changed 8 years ago by tscrim

I tried with the cached method in the parent as well and got approximately the same timings:

sage: %time TestSuite(F).run()
CPU times: user 7.23 s, sys: 0.10 s, total: 7.33 s
Wall time: 7.46 s
sage: %time TestSuite(F).run()
CPU times: user 4.38 s, sys: 0.01 s, total: 4.40 s
Wall time: 4.63 s

Hmm...strange we're getting such different timings. I'm running 5.8.beta1 with the following patches:

trac_14040_doctest.patch
trac_14040_housekeeping.patch
trac_14040_repr_option.patch
trac_14040_review.patch
trac_14085-root_system-ambient_spaces-nt.patch
trac_14000.patch
trac_14000-doctests.patch
trac_14174-coxeter_matrix-type_H.patch
trac_14063-remove_cc_compositions-ts.patch
trac_8920-alphabet.patch
trac_7886_conjugacy_classes_combined.patch
trac_7886-conjugacy_classes-review-ts.patch
trac_14177-graph-color_by_labels-nt.patch
trac_14176-polyhedrons-operators-nt.patch
14182_whitespace.patch
trac_14130-gyw-bs.patch
trac_13605-partition_options-ts.patch
trac_14111-qsym_tutorial-sb.patch
trac_14111-qsym_tutorial-review-ts.patch
trac_12543-import_statements-vd.patch
trac14054_fast_methods-5.8.patch
trac_14054-fix-rigged-list.2.patch
trac_2023-dynkin_graphs-ts.patch
trac_11410-zero_one_sequence_partitions-pod.v2.patch
trac_11410-zero_one_sequence_partitions-review-ts.patch
trac_14225-partition_classcall_private.patch
trac_14228-sym_speedup-ts.patch

I thought that's what your patch did. Whoops! I'll change it on the next version. There's still a few more things I want to try.

comment:16 Changed 8 years ago by SimonKing

Did you do tests with your patch?

I get errors such as

    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[17]>", line 1, in <module>
        Partitions(Integer(4), outer=[Integer(3),Integer(1),Integer(1)]).list()###line 138:
    sage: Partitions(4, outer=[3,1,1]).list()
      File "classcall_metaclass.pyx", line 279, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (sage/misc/classcall_metaclass.c:932)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/partition.py", line 4006, in __classcall_private__
        return IntegerListsLex(n, **kwargs)
      File "classcall_metaclass.pyx", line 279, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (sage/misc/classcall_metaclass.c:932)
      File "cachefunc.pyx", line 991, in sage.misc.cachefunc.WeakCachedFunction.__call__ (sage/misc/cachefunc.c:5120)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python2.7/weakref.py", line 56, in __getitem__
        o = self.data[key]()
    TypeError: unhashable type: 'list'

Hence, it seems that IntegerListsLex? should have a __classcall_private__ or something like that. UniqueRepresentation will only work if the defining data are hashable.

comment:17 Changed 8 years ago by tscrim

I just did them and removed the CachedRepresentation since that no longer will affect the timing (since I'm using PartitionsGreatestLE which already is a UniqueRepresentation). I'm also removing the (near) duplicate young_subgroup* methods.

comment:18 Changed 8 years ago by tscrim

  • Description modified (diff)

Okay, I also added a little micro-optimization by explicitly calling the element_class of the stored k bounded partitions parent which circumvents some the additional/duplicate _element_constructor_ tests and doctests in modified files all pass. I don't know what else to do (or we can do) at this point. With your patches, it seems we're back close to the original speed.

comment:19 Changed 8 years ago by SimonKing

With the patch that you posted last, I get

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
sage: %time TestSuite(F).run()
CPU times: user 5.77 s, sys: 0.06 s, total: 5.83 s
Wall time: 5.87 s
sage: %time TestSuite(F).run()
CPU times: user 3.19 s, sys: 0.04 s, total: 3.24 s
Wall time: 3.24 s

Additionally using a local cache for Partitions or Weyl group gives no further improvement.

So, I suggest we use the merged patch as basis for a our reviews. Is it cross review? Does this patch contain my code?

comment:20 Changed 8 years ago by tscrim

I believe only the cached method on young_subgroup() and the change in the import statements in partition.py.

comment:21 Changed 8 years ago by SimonKing

  • Authors changed from Simon King, Travis Scrimshaw to Travis Scrimshaw
  • Reviewers set to Simon King

OK. So, I am supposed to review it, then.

comment:22 Changed 8 years ago by SimonKing

The change to PartitionsGreatestLE gives the following errors in sage.combinat.sf.kschur. I know that this module is deprecated, but nevertheless, I think the following attribute error (occurring in several tests) should be fixed:

File "/home/simon/SAGE/prerelease/sage-5.8.beta0/devel/sage-main/sage/combinat/sf/kschur.py", line 42:
    sage: s(ks3([3,2,1]))
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_1[5]>", line 1, in <module>
        s(ks3([Integer(3),Integer(2),Integer(1)]))###line 42:
    sage: s(ks3([3,2,1]))
      File "parent.pyx", line 914, in sage.structure.parent.Parent.__call__ (sage/structure/parent.c:7826)
      File "coerce_maps.pyx", line 82, in sage.structure.coerce_maps.DefaultConvertMap_unique._call_ (sage/structure/coerce_maps.c:3656)
      File "coerce_maps.pyx", line 77, in sage.structure.coerce_maps.DefaultConvertMap_unique._call_ (sage/structure/coerce_maps.c:3558)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/free_module.py", line 1469, in _element_constructor_
        return self.monomial(self._basis_keys(x))
      File "parent.pyx", line 910, in sage.structure.parent.Parent.__call__ (sage/structure/parent.c:7772)
      File "parent.pyx", line 2288, in sage.structure.parent.Parent.convert_map_from (sage/structure/parent.c:15651)
      File "parent.pyx", line 2295, in sage.structure.parent.Parent.discover_convert_map_from (sage/structure/parent.c:15816)
      File "parent.pyx", line 2106, in sage.structure.parent.Parent.coerce_map_from (sage/structure/parent.c:14358)
      File "parent.pyx", line 2927, in sage.structure.parent._register_pair (sage/structure/parent.c:21520)
      File "parent.pyx", line 2901, in sage.structure.parent.EltPair.__hash__ (sage/structure/parent.c:21168)
      File "category_object.pyx", line 788, in sage.structure.category_object.CategoryObject.__hash__ (sage/structure/category_object.c:8150)
      File "sage_object.pyx", line 154, in sage.structure.sage_object.SageObject.__repr__ (sage/structure/sage_object.c:1769)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/partition.py", line 5653, in _repr_
        return "Partitions of %s having parts less than or equal to %s"%(self.n, self.k)
      File "parent.pyx", line 669, in sage.structure.parent.Parent.__getattr__ (sage/structure/parent.c:6277)
      File "misc.pyx", line 205, in sage.structure.misc.getattr_from_other_class (sage/structure/misc.c:1406)
    AttributeError: 'PartitionsGreatestLE_with_category' object has no attribute 'n'

For the patchbot:

Apply trac_14228-sym_speedup-ts.patch

comment:23 Changed 8 years ago by SimonKing

  • Status changed from needs_review to needs_work

comment:24 Changed 8 years ago by SimonKing

PS: According to the traceback of this error, it uses the hash of CategoryObject---which is extremely slow, as it involves computing the string representation. So, does PartitionsGreatestLE_with_category not have a reasonable hash?

comment:25 Changed 8 years ago by tscrim

I'll implement a custom hash for it and fix those attribute errors shortly.

Changed 8 years ago by tscrim

Fixed doctests

comment:26 Changed 8 years ago by tscrim

  • Status changed from needs_work to needs_review

Sorry that took so long. I went out for dinner in between. I fixed the doctests in kschur.py, it was trying to pass in a NonNegativeIntergers() (which is currently not really supported by Partitions but probably should be), so I just removed that. I also implemented a custom hash which may not generate the best hash values, but it should do the job.

comment:27 Changed 8 years ago by tscrim

I also forgot:

For patchbot:

Apply: trac_14228-sym_speedup-ts.patch

comment:28 Changed 8 years ago by SimonKing

It seems that the coverage script is complaining. Indeed, KBoundedQuotientBasis.__init__ needs a test, and perhaps documentation what it is doing.

You now define a hash for PartitionsGreatestLE. But I see that it inherits from unique representation. Hence, it should have a blazingly fast hash anyway! So, I wonder where that hash (and comparison!) has disappeared!

I reckon that one should do

class PartitionsGreatestLE(UniqueRepresentation, IntegerListsLex):

and not

class PartitionsGreatestLE(IntegerListsLex, UniqueRepresentation):

I think it is documented somewhere that UniqueRepresentation should be the first base class given.

My plan is to add a reviewer patch, removing the custom hash and bringing UniqueRepresentation's hash forward, and adding documentation plus test for the new base class.

Changed 8 years ago by SimonKing

comment:29 Changed 8 years ago by SimonKing

  • Description modified (diff)

comment:30 Changed 8 years ago by SimonKing

I added a reviewer patch, as I indicated in my previous post. Hope you agree with my changes. I will finish the review as soon as all tests pass.

Apply trac_14228-sym_speedup-ts.patch trac_14228-reviewer.patch

comment:31 Changed 8 years ago by SimonKing

The tests pass (including the review patch), Travis' patch looks fine to me, and things in k_dual become a lot faster. Hence, it is a positive review here. I will now see how the new timings compare with pre-#12313 and post-#12313, so that hopefully #13991 can be resolved as fixed.

comment:32 Changed 8 years ago by SimonKing

  • Status changed from needs_review to positive_review

comment:33 Changed 8 years ago by tscrim

Looks good to me too. Thank you Simon.

Best,
Travis

comment:34 Changed 8 years ago by zabrocki

  • Cc zabrocki added

comment:35 Changed 8 years ago by jdemeyer

  • Merged in set to sage-5.8.beta4
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.