Opened 6 years ago

Closed 5 years ago

Last modified 4 years ago

#13589 closed defect (fixed)

Controlling C3 to solve once for all the Method Resolution Order issues for category classes

Reported by: nthiery Owned by: nthiery
Priority: major Milestone: sage-5.12
Component: categories Keywords: method resolution order, C3
Cc: sage-combinat, SimonKing Merged in: sage-5.12.beta0
Authors: Nicolas M. Thiéry, Simon King Reviewers: Simon King, Florent Hivert
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #12894, #12876, #11935, #12895, #10193 Stopgaps:

Description (last modified by nthiery)

Teaser
------

Python handles multiple inheritance by computing, for each class,
a linear extension of all its super classes (the Method Resolution
Order, MRO). The MRO is calculated recursively from local
information (the *ordered* list of the direct super classes), with
the so-called ``C3`` algorithm. This algorithm can fail if the local
information is not consistent; worst, there exist hiearchies of
classes with provably no consistent local information.

For large hierarchy of classes, like those derived from categories in
Sage, maintaining consistent local information by hand does not scale
and leads to unpredictable ``C3`` failures (the dreaded "could not
find a consistent method resolution order"); a maintenance nightmare.

This patch implements a final solution to this problem. Namely, it
allows for building automatically the local information from the bare
class hierarchy in such a way that guarantees that the ``C3``
algorithm will never fail.

Err, but you said that this was provably impossible? Well, not if
one relaxes a bit the hypotheses, but that's not something one
would want to do by hand :-)

Details
-------

Please see the extensive documentation at the top of the file
sage/misc/c3_controlled.py in the attached patch.

Content of the patch
--------------------

- Implement controlled C3 in sage.misc.c3_controlled.

- Implement a total order in Category, and have Category use
  C3 controlled by this order instead of plain C3.

- Tweak the current total order to minimize changes in the order of
  categories.

- Update doctests w.r.t. remaining changes of in the order of
  categories.

- Remove a coupld doctests displaying "all_super_categories" that did not bring useful information to the user nor intesting test, yet needed to be constantly updated; nothing but a good source of conflicts.

- Rewrite doctests in sage.misc.c3 to be independent of categories
  since those do not use anymore this implementation of C3.

- Resolve some ambiguities to make the code more independent of the
  order of categories. In particular, FiniteCoxeterGroups prefer
  __iter__ and some_elements from CoxeterGroups to that of
  FiniteGroups.

- Update the section in the primer about order of categories.

- Provide further tools in ``sage.misc.c3_controlled`` to
  experiment with C3 and friends.

- Extract category_sample from category_graph

Credits
-------

This patch is a followup to a study of the C3 algorithm together with
Florent Hivert, and to discussions with Simon King and his
implementation of C3.

Apply

Attachments (7)

c3-fix-nt.patch (12.3 KB) - added by nthiery 6 years ago.
trac_13589-categories-c3_under_control-category_sample-nt.patch (4.5 KB) - added by nthiery 6 years ago.
trac13589_cmp_key_attribute.patch (8.9 KB) - added by SimonKing 5 years ago.
For speed reasons, make _cmp_key a lazy attribute, not a (cached) method
trac13589_improve_startuptime.patch (10.3 KB) - added by SimonKing 5 years ago.
Several attempts to improve startuptime
trac13589_cython_cmp_key.patch (14.2 KB) - added by SimonKing 5 years ago.
Another attempt to make the startup faster
trac13589_cython_cmp_key-review-nt.patch (9.6 KB) - added by nthiery 5 years ago.
trac_13589-categories-c3_under_control-nt.patch (111.3 KB) - added by nthiery 5 years ago.

Download all attachments as: .zip

Change History (94)

comment:1 Changed 6 years ago by nthiery

  • Description modified (diff)

comment:2 Changed 6 years ago by nthiery

  • Dependencies changed from #13501 to #13501, #12894

comment:3 Changed 6 years ago by elixyre

Hello,

Could you update your patch, there is several hunk in the last sage versions (i work without combinat to use NCSF...)

patching file sage/categories/category.py
Hunk #6 FAILED at 1090
Hunk #7 FAILED at 1200
Hunk #9 FAILED at 2003
Hunk #10 FAILED at 2084
4 out of 11 hunks FAILED -- saving rejects to file sage/categories/category.py.rej
abort: patch failed to apply

Furthermore, it could be useful to add that quickly in sage. The graded Hopf algebras seems to be the last bastion before recurrent MRO errors.

Thanks, Jean-Baptiste

comment:4 Changed 6 years ago by SimonKing

I currently have

trac_14159_weak_value_triple_dict.patch
trac_14159_use_cdef_get.patch
trac_13184_sage_5.9.beta.patch
trac_14287-rebased.patch
trac_14217_base_functionality.patch
trac_12876_category_abstract_classes_for_hom.patch
trac11935_weak_pickling_by_construction-nt.patch
trac_11935-weak_pickling_by_construction-review-ts.patch
trac_14249-coercion_without_an_element.patch
trac_12894-classcall_setter-nt.patch
trac_12895-subcategory-methods-nt.patch
trac_12895-review.patch

on top of sage-5.9.rc0 (these all have positive review or are even merged in sage-5.10.beta), and then the patch fails to apply like this:

Füge trac_13589-categories-c3_under_control-nt.patch zur Seriendatei hinzu
Wende trac_13589-categories-c3_under_control-nt.patch an
Wende Patch auf Datei sage/categories/category.py an
FEHLSCHLAG von Teilstück #1 in Zeile 94
Teilstück #6 wurde erfolgreich in Zeile 1105 mit Unschärfe 1 angewandt (16 Zeilen verschoben).
FEHLSCHLAG von Teilstück #7 in Zeile 1289
Teilstück #9 wurde erfolgreich in Zeile 2152 mit Unschärfe 1 angewandt (58 Zeilen verschoben).
2 von 11 Teilstücken sind FEHLGESCHLAGEN -- speichere Ausschuss in Datei sage/categories/category.py.rej
Patch schlug fehl und Fortsetzung unmöglich (versuche -v)
Patch schlug fehl, Fehlerabschnitte noch im Arbeitsverzeichnis
Fehler beim Anwenden. Bitte beheben und trac_13589-categories-c3_under_control-nt.patch aktualisieren

So, there is some improvement with respect to what Jean-Baptiste reports. Nevertheless, it seems that dependencies should be stated, and probably the patch needs rebasing.

comment:5 Changed 6 years ago by SimonKing

In particular, the patch uses some CategoryWithAxiom, which is not defined here or in the given dependencies.

comment:6 Changed 6 years ago by SimonKing

I have not been able to find CategoryWithAxiom or category with axiom on trac.

comment:7 Changed 6 years ago by nthiery

Yes, this patch still needs a bit of work. It should be ready tuesday or so. You can have a look at the text in the patch where I describe the purpose and principle of the patch, but don't waste time with a more detailed review at this point!

Thanks!

comment:8 Changed 6 years ago by SimonKing

Hi Nicolas,

OK, I thought this is next, after #11935.

Best regards,

Simon

comment:9 Changed 6 years ago by nthiery

#12895 was next! And now I have to run behind :-) Thanks for all your review work! I'll pile up some stuff for you soon and let you know :-)

comment:10 Changed 6 years ago by nthiery

  • Dependencies changed from #13501, #12894 to #13501, #12894, #12895
  • Status changed from new to needs_review

comment:11 Changed 6 years ago by nthiery

Hi Simon,

The updated patch should be roughly close to completion. Most if not all tests should pass (they did when I was working on the patch in git; I may have screwed up my export back to mercurial, and/or some dependencies).

I still need to scan once again through the patch to check that everything is 100% doctested, and I also want to reread the explanations in sage.misc.c3_controlled. I'll do that tomorrow. But I think you can start reviewing it in particular checking whether the whole logic makes sense to you. Let me know if/when you start working on it so that we avoid conflicts.

Thanks!

Nicolas

comment:12 Changed 6 years ago by nthiery

  • Dependencies changed from #13501, #12894, #12895 to #13501, #12894, #12876, #11935, #12895

comment:13 Changed 6 years ago by nthiery

For info: I am running the tests now and will report when I wake up.

comment:14 Changed 6 years ago by nthiery

All long tests passed on my machine with 5.10beta4 and the following patches applied:

trac_14612-gyw_test_speedup-ts.patch
trac_14574-folded.patch
trac_13735_fix_repr_lincomb.patch
trac_14123-binary-trees-maps-rebase-cs.patch
trac_12876_category_abstract_classes_for_hom.patch
trac11935_weak_pickling_by_construction-nt.patch
trac_11935-weak_pickling_by_construction-review-ts.patch
trac_12895-subcategory-methods-nt.patch
trac_13589-categories-c3_under_control-nt.patch

comment:15 Changed 6 years ago by nthiery

Ok, patchbot is happy too except for coverage in c3_controlled, doctest continuations, startup time and startup modules. The two first ones will be easy fixes. I'll investigate the two others this morning.

comment:16 Changed 6 years ago by nthiery

  • Description modified (diff)

comment:17 Changed 6 years ago by nthiery

Hi Simon,

The patch is now completely ready for review:

  • I fixed the coverage and continuation issue.
  • I cythoned sage.misc.c3_controlled; hopefuly this will fix the startup time regression
  • I went through the whole module, improved the doc and threw away some scories.
  • I don't know why the bot complains about the non existent modules sage.categories.inspect and itertools. I guess its just confused. As for sage.misc.c3_controlled, well yes, it's new :-)
  • I am running all long tests, and will report soon.

Thanks for your upcoming review!

Off to work on the main functorial construction patch!

Nicolas

comment:18 Changed 6 years ago by nthiery

Oops, I had forgotten a little improvement I wanted to do in the implementation of the total order. It looks a tiny bit less hacky now and could be a tiny bit faster.

All test pass. Running long tests now.

comment:19 Changed 6 years ago by nthiery

Gosh, I had fumbled my export and uploaded the wrong patch. Fixed!

comment:20 Changed 6 years ago by nthiery

  • Description modified (diff)

comment:21 Changed 6 years ago by nthiery

Shoot, the Cythonisation has broken one longtest failure in sage.misc.c3_controlled. I am investigating this; the rest can be reviewed in the mean time.

The cythonization has not improved the startup time. It's not yet clear to me what can be causing the slower startup time. To be investigated ...

comment:22 follow-up: Changed 6 years ago by SimonKing

Some random remarks:

  • Why is Category._cmp_key a cached method and not a lazy attribute?
  • Why is CategoryWithParameters._cmp_key a method and not a lazy attribute or at least a cached method?
  • Why has this example
    sage: Groups().example().algebra(ZZ).categories()
    ...
    
    been completely removed from sage/categories/groups.py? Similarly sage: Modules(Integers(9)).all_super_categories() from sage/categories/modules.py etc.
  • In the documentation of primer.py:
    However this must be considered as an *implementation detail*: `C_1` 
    and `C_2` are incomparable categories, then the order in which they 
    appear must be mathematically irrelevant:
    
    I guess there is "If" missing after the first colon.
  • In sage/misc/c3_controlled.pyx, line 123: Should be "classes that an object inherits from.", not "classes that an object inherit from."
  • ... line 139, "However, this has several inconvenients:" I guess this should be "However, this has several drawbacks" or "However, this is inconvenient in several regards" or so.

Do I understand correctly: As you outline in lines 148-166, the creation of classes will become slower (O(n^3) instead of O(n^2) for getting the MRO, etc) if one explicitly puts the desired MRO into a long list of bases. This would certainly be a reason for an increased startup time and other regressions. Therefore, in a first step, you compute short lists of bases that ensure that the C3 algorithm still reconstruct the intended MRO. However, is this additional step (namely: Computing lists of bases) takes some time, not affecting the startup time?

I still have to read the actual code (and not just the documentation). One question, though, which is in the spirit of #11935. In sage.categories.category, you have

        (result, bases) = C3_sorted_merge([cat._all_super_categories 
                                           for cat in self._super_categories] + 
                                          [self._super_categories], 
                                          key = attrcall('_cmp_key')) 
        self._super_categories_for_classes = bases 
        return [self] + result 

I guess in many cases the result will be the same up to the base rings. Shouldn't we think of a way to avoid duplication of work? I could imagine that here is the reason for the startup time regression.

comment:23 in reply to: ↑ 22 Changed 6 years ago by nthiery

Replying to SimonKing:

Some random remarks:

  • Why is Category._cmp_key a cached method and not a lazy attribute?
  • Why is CategoryWithParameters._cmp_key a method and not a lazy attribute or at least a cached method?

To be discussed. If I remember correctly, it's slightly easier to use super in cached methods than attributes, but I guess both would work.

  • Why has this example
    sage: Groups().example().algebra(ZZ).categories()
    ...
    
    been completely removed from sage/categories/groups.py? Similarly sage: Modules(Integers(9)).all_super_categories() from sage/categories/modules.py etc.

Because they neither bring useful test or information, and need to be updated whenever the category order changes which is a good source of conflicts.

  • In the documentation of primer.py:
    However this must be considered as an *implementation detail*: `C_1` 
    and `C_2` are incomparable categories, then the order in which they 
    appear must be mathematically irrelevant:
    
    I guess there is "If" missing after the first colon.
  • In sage/misc/c3_controlled.pyx, line 123: Should be "classes that an object inherits from.", not "classes that an object inherit from."
  • ... line 139, "However, this has several inconvenients:" I guess this should be "However, this has several drawbacks" or "However, this is inconvenient in several regards" or so.

Thanks, I'll fix that! Probably on Monday, together with the fix for the failing long test.

Do I understand correctly: As you outline in lines 148-166, the creation of classes will become slower (O(n^3) instead of O(n^2) for getting the MRO, etc) if one explicitly puts the desired MRO into a long list of bases. This would certainly be a reason for an increased startup time and other regressions. Therefore, in a first step, you compute short lists of bases that ensure that the C3 algorithm still reconstruct the intended MRO. However, is this additional step (namely: Computing lists of bases) takes some time, not affecting the startup time?

Please read further :-) That would be O(n^3) if one was brute forcing the complete mro in the list of bases. Luckily it's more clever than that! New bases are only added when absolutely necessary; in fact, in the current situation it turns out that no base is actually added even for non trivial categories like Fields or GradedHopfAlgebrasWithBasis?.

I still have to read the actual code (and not just the documentation). One question, though, which is in the spirit of #11935. In sage.categories.category, you have

        (result, bases) = C3_sorted_merge([cat._all_super_categories 
                                           for cat in self._super_categories] + 
                                          [self._super_categories], 
                                          key = attrcall('_cmp_key')) 
        self._super_categories_for_classes = bases 
        return [self] + result 

I guess in many cases the result will be the same up to the base rings. Shouldn't we think of a way to avoid duplication of work? I could imagine that here is the reason for the startup time regression.

This code is only called if all_super_categories is called. And by #11935 this should happen only once for all base ring in the same category. (unless one calls explicitly all_super_categories). I have not kept the timings under hand, but I did not see a difference in our usual elliptic curves benchmark.

Cheers,

Nicolas

Last edited 6 years ago by nthiery (previous) (diff)

comment:24 follow-up: Changed 6 years ago by SimonKing

For the record, the only failing example is this:

File "devel/sage/sage/misc/c3_controlled.pyx", line 266, in sage.misc.c3_controlled
Failed example:
    for l in L:                                          # long time
        x = HierarchyElement(10, l.to_poset())
        assert x.mro            == list(P)
        assert x.mro_controlled == list(P)
        assert x.all_bases_len() == 15
        assert x.all_bases_controlled_len() == 19
        try:
            x.mro_standard
            assert False
        except:
            pass
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 466, in _run
        self.execute(example, compiled, test.globs)
      File "/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 825, in execute
        exec compiled in globs
      File "<doctest sage.misc.c3_controlled[35]>", line 6, in <module>
        assert x.all_bases_controlled_len() == Integer(19)
    AssertionError

Indeed, on the command line, I get

sage: for l in L:
....:     print "l =",l
....:     x = HierarchyElement(10, l.to_poset())
....:     print x.all_bases_controlled_len()
....:     
l = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
19
l = [10, 9, 8, 7, 6, 5, 4, 3, 1, 2]
19
l = [10, 9, 8, 7, 6, 5, 4, 2, 3, 1]
19
l = [10, 9, 8, 7, 6, 5, 4, 2, 1, 3]
19
l = [10, 9, 8, 7, 6, 5, 4, 1, 3, 2]
19
l = [10, 9, 8, 7, 6, 5, 4, 1, 2, 3]
19
l = [10, 9, 8, 7, 6, 5, 3, 4, 2, 1]
18
...
l = [10, 7, 9, 8, 5, 6, 4, 1, 3, 2]
20
l = [10, 7, 9, 8, 5, 6, 4, 1, 2, 3]
20
l = [10, 7, 9, 8, 5, 6, 3, 4, 2, 1]
18
l = [10, 7, 9, 8, 5, 6, 3, 4, 1, 2]
18
...
l = [10, 7, 9, 8, 4, 5, 6, 1, 3, 2]
20
l = [10, 7, 9, 8, 4, 5, 6, 1, 2, 3]
20
l = [10, 7, 9, 8, 4, 5, 1, 6, 3, 2]
17
l = [10, 7, 9, 8, 4, 5, 1, 6, 2, 3]
17
l = [10, 7, 9, 6, 8, 5, 4, 3, 2, 1]
19
...
l = [10, 7, 9, 6, 8, 5, 4, 1, 2, 3]
19
l = [10, 7, 9, 6, 8, 5, 3, 4, 2, 1]
16
l = [10, 7, 9, 6, 8, 5, 3, 4, 1, 2]
16
l = [10, 7, 9, 6, 8, 4, 5, 3, 2, 1]
18
l = [10, 7, 9, 6, 8, 4, 5, 3, 1, 2]
18
l = [10, 7, 9, 6, 8, 4, 5, 2, 3, 1]
18
l = [10, 7, 9, 6, 8, 4, 5, 2, 1, 3]
18
l = [10, 7, 9, 6, 8, 4, 5, 1, 3, 2]
18
l = [10, 7, 9, 6, 8, 4, 5, 1, 2, 3]
18
l = [10, 7, 9, 6, 8, 4, 2, 5, 3, 1]
17
l = [10, 7, 9, 6, 8, 4, 2, 5, 1, 3]
17
l = [10, 7, 9, 6, 4, 8, 5, 3, 2, 1]
19
l = [10, 7, 9, 6, 4, 8, 5, 3, 1, 2]
19
...
l = [10, 7, 4, 8, 9, 6, 5, 1, 2, 3]
19
l = [10, 7, 4, 8, 9, 6, 2, 5, 3, 1]
16
l = [10, 7, 4, 8, 9, 6, 2, 5, 1, 3]
16
l = [10, 7, 4, 8, 9, 5, 6, 3, 2, 1]
18
l = [10, 7, 4, 8, 9, 5, 6, 3, 1, 2]
18
...
l = [10, 7, 4, 8, 5, 9, 6, 1, 2, 3]
17
l = [10, 7, 4, 8, 5, 9, 1, 6, 3, 2]
16
l = [10, 7, 4, 8, 5, 9, 1, 6, 2, 3]
16
sage:

I am not surprise that some posets are easier to control than others. Why do you expect that all_bases_controlled_len is the same in all cases?

comment:25 in reply to: ↑ 24 Changed 6 years ago by nthiery

Replying to SimonKing:

I am not surprise that some posets are easier to control than others. Why do you expect that all_bases_controlled_len is the same in all cases?

The fact that the number of bases to be added does not depend on the linear extension is certainly specific to this poset. But before cythonisation this used to be the case. So I need to investigate what went wrong!

comment:26 Changed 6 years ago by nthiery

Hi Simon!

It turns out that I had just fooled myself because of a typo in the test. Even for this example, the number of bases to be added *does* depend on the linear extension.

So all is good, the python/cython implementations agree.

The updated patch:

  • fixes the typos you mentionned
  • reworks a bit the text to make it clearer that the code implements an optimized "add bases" trick which does not have the drawbacks of the brute force approach.
  • fixes the incorrect doctest, and gather some stats on the number of bases to be added for each linear extension
  • mentions the removed doctests, and the rationale for removing them, in the patch header

There remains to decide between a lazy attribute or a cached method for _cmp_key. Any idea on how to investigate the startup time welcome.

Cheers,

Nicolas

comment:27 Changed 6 years ago by nthiery

  • Description modified (diff)

comment:28 follow-up: Changed 6 years ago by nthiery

Hi Simon,

While playing with larger hierarchy of classes for the functorial construction patch, I stumbled on one execution path which was not treated correctly. I'll post an updated patch shortly.

Changed 6 years ago by nthiery

comment:29 in reply to: ↑ 28 ; follow-up: Changed 6 years ago by nthiery

Replying to nthiery:

While playing with larger hierarchy of classes for the functorial construction patch, I stumbled on one execution path which was not treated correctly. I'll post an updated patch shortly.

Ok, the updated patch includes the (hopefuly) now correct implementation together with relevant tests. At this occasion, I declared a couple more variables for cython and added some debugging code (commented out by default).

You can look at :attachment:c3-fix-nt.patch if you just want to see the changes.

I guess last time I wrote such a long function was when I played around with F5! It would be a good candidate for a computer assisted proof of correctness or for automatic test generation.

comment:30 Changed 6 years ago by nthiery

I forgot to mention: all long tests pass on my machine.

comment:31 in reply to: ↑ 29 ; follow-up: Changed 6 years ago by SimonKing

Replying to nthiery:

I guess last time I wrote such a long function was when I played around with F5! It would be a good candidate for a computer assisted proof of correctness or for automatic test generation.

Do we have those things (I mean "computer assisted correctness proofs", not "F5") in Sage?

I am travelling this week. So, I will probably not be able to finish the review right now.

comment:32 in reply to: ↑ 31 Changed 6 years ago by nthiery

Do we have those things (I mean "computer assisted correctness proofs", not "F5") in Sage?

Nope. But we have experts in Orsay in the office next to ours :-)

I am travelling this week. So, I will probably not be able to finish the review right now.

Ok.

comment:33 Changed 6 years ago by nthiery

  • Dependencies changed from #13501, #12894, #12876, #11935, #12895 to #12894, #12876, #11935, #12895

comment:35 Changed 6 years ago by nthiery

The updated patch includes a method category_sample which saves a couple lines and which I needed anyway later on. trac_13589-categories-c3_under_control-category_sample-nt.patch shows the diff.

comment:36 Changed 6 years ago by nthiery

Arr, I can't wait until we have a more semantic way to specify which patches to apply; this is way too error prone to trivial syntax errors ...

Apply: trac_13589-categories-c3_under_control-nt.patch

comment:37 Changed 6 years ago by nthiery

  • Description modified (diff)

comment:38 Changed 5 years ago by nthiery

  • Description modified (diff)

comment:39 Changed 5 years ago by SimonKing

  • Dependencies changed from #12894, #12876, #11935, #12895 to #12894, #12876, #11935, #12895, #10193

comment:40 Changed 5 years ago by SimonKing

Apply: trac_13589-categories-c3_under_control-nt.patch

comment:41 Changed 5 years ago by SimonKing

Nicolas and I just discussed: _cmp_key should at least be a cached method, not a plain method, so that it plays nicely with super(...). However, we may try whether lazy attributes would work, because Nicolas calls super(...) only to compute the value, but the value that should eventually be used does not depend on the class.

comment:42 Changed 5 years ago by SimonKing

This is a minimal example of what Nicolas wants to do:

sage: class A(object):
    @lazy_attribute
    def x(self):
        print "computing the attribute with A"
        return 1
....:     
sage: class B(A):
    @lazy_attribute
    def x(self):
        print "this is lazy attribute with B"
        r= super(B,self).x
        self.y = r
        return r
....:     
sage: b = B()
sage: b.x
this is lazy attribute with B
computing the attribute with A
1
sage: b.y
1

So, it seems to work with lazy attribute, and this will be much faster than calling a method (repeatedly). Trying to change it now.

Changed 5 years ago by SimonKing

For speed reasons, make _cmp_key a lazy attribute, not a (cached) method

comment:43 Changed 5 years ago by SimonKing

  • Description modified (diff)

Apply: trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch

comment:44 Changed 5 years ago by SimonKing

To me, the code looks fine. Patchbot does not report any errors. However, it reports a significant increase of 2.5% of startup time.

How can this be analysed?

comment:45 Changed 5 years ago by SimonKing

Something findings:

  • C3_sorted_merge was called 121 times with the current patch during startup.
  • C3_sorted_merge is called on different Groupoids, which should not happen, because all groupoids have the same super categories. Solution: Make it CategoryWithParameters. Then, C3_sorted_merge is only called 93 times during startup.
  • There are some optimizations possible in C3_sorted_merge:
sage: L1 = Fields().all_super_categories()
sage: L2 = Algebras(QQ).all_super_categories()
sage: cython("""
def test1(list L):
    cdef list out = L[::-1]
def test2(list L):
    cdef list out = list(reversed(L))
""")
....: 
sage: %timeit test1(L1)
1000000 loops, best of 3: 541 ns per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 2.25 us per loop

and

sage: cython("""
....: def test1(list L):
....:     cdef set S = set(x for x in L)
....: def test2(list L):
....:     cdef set S = set([x for x in L])
....: """)
....: 
sage: %timeit test1(L1)
100000 loops, best of 3: 3.91 us per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 3.99 us per loop
sage: %timeit test1(L1)
100000 loops, best of 3: 3.89 us per loop
sage: %timeit test1(L1)
100000 loops, best of 3: 3.89 us per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 4.06 us per loop

In both examples above, test2 is what is currently done in C3_sorted_merge, and test1 is apparently what should be done. I am preparing a patch now.

comment:46 Changed 5 years ago by SimonKing

Another observation: key(O) is called repeatedly, even though its result is already stored in O_key, so, it should be used consistently.

comment:47 Changed 5 years ago by SimonKing

  • Description modified (diff)

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch

comment:48 Changed 5 years ago by SimonKing

On my computer, running startuptime once, it seemed to me that the problem was close to being solved. However, the patchbot still sees a slow-down of 2.5% with very high confidence.

So, let's try to find further tweaks.

comment:49 Changed 5 years ago by SimonKing

Here is another potential improvement:

sage: cython("""
....: cpdef inline sort_key(object x):
....:     return x._cmp_key
....: def test(list L, key):
....:     cdef list out = sorted(L, key=key, reverse=True)
....: """)
....: 
sage: class C(object):   
....:     def __init__(self, n):
....:         self._cmp_key = n
....:         
sage: L = [C(ZZ.random_element(1,10^6)) for _ in range(10000)]
sage: from operator import attrgetter
sage: sk = attrgetter('_cmp_key')
sage: %timeit test(L,sort_key)
100 loops, best of 3: 10.8 ms per loop
sage: %timeit test(L,sk)
100 loops, best of 3: 12.9 ms per loop
sage: %timeit test(L,sort_key)
100 loops, best of 3: 10.8 ms per loop
sage: %timeit test(L,sk)
100 loops, best of 3: 12.9 ms per loop

So, in C3_controlled, we could implement this cpdef inline function as the default key for comparison.

In addition, we could decide that the key must return a tuple, because this is the format of Category._cmp_key. But this would mean we loose flexibility.

comment:50 Changed 5 years ago by SimonKing

Let's do some statistics.

This is what I get in three runs of sage -startuptime, when only the dependencies of this ticket are applied.

    29.550   1376.061          4  sage.all
Total time (sum over exclusive time): 1455.386ms
    30.054   1389.055          4  sage.all
Total time (sum over exclusive time): 1469.172ms
    29.410   1376.077          4  sage.all
Total time (sum over exclusive time): 1455.600ms

Adding the first patch from here

    28.707   1390.722          4  sage.all
Total time (sum over exclusive time): 1470.469ms
    28.887   1397.835          4  sage.all
Total time (sum over exclusive time): 1477.698ms
    28.699   1409.829          4  sage.all
Total time (sum over exclusive time): 1489.124ms

Is that really a significant slow-down? I don't think so.

And when applying the other patches (including an update of the last, that I did not post yet):

    29.099   1386.676          4  sage.all
Total time (sum over exclusive time): 1466.655ms
    29.705   1391.458          4  sage.all
Total time (sum over exclusive time): 1471.162ms
    35.740   1480.115          4  sage.all
Total time (sum over exclusive time): 1574.091ms

What does this give us?

comment:51 Changed 5 years ago by SimonKing

I've updated the patch, FWIW.

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch

comment:52 Changed 5 years ago by SimonKing

Any idea what we shall do about the regression at startup?

comment:53 Changed 5 years ago by SimonKing

Here is another possibility to speed things up: Replace itertools.count() by a cpdef method, defined in some auxiliary file.

Timing:

sage: cython("""
cdef long counter = 0
cpdef count1():
    global counter
    counter += 1
    return counter
def count2():
    cdef long c = 0
    while True:
        c += 1
        yield c
""")
sage: C = count1
sage: %timeit a = C()
10000000 loops, best of 3: 73.8 ns per loop
sage: C = count2()
sage: %timeit a = C.next()
1000000 loops, best of 3: 252 ns per loop
sage: import itertools
sage: C = itertools.count()
sage: %timeit a = C.next()
1000000 loops, best of 3: 231 ns per loop

Note that this would also make the startup_module plugin happy, since it complains about the new import of itertools during startup.

Changed 5 years ago by SimonKing

Several attempts to improve startuptime

comment:54 follow-up: Changed 5 years ago by nthiery

Thanks for the timings! The startup gain should be of roughly 50*(231-74)ns which will be anyway far less than a ms and negligible. I would thus tend to stick to the standard Python module itertools, and ignore the little indicator from the startup module plugin.

As for the time increase: I will post a poll on sage-devel later today.

comment:55 in reply to: ↑ 54 Changed 5 years ago by SimonKing

Replying to nthiery:

Thanks for the timings! The startup gain should be of roughly 50*(231-74)ns which will be anyway far less than a ms and negligible. I would thus tend to stick to the standard Python module itertools, and ignore the little indicator from the startup module plugin.

Too late... I already posted an update of my patch.

Well, I'd say let the patchbot try, at least.

As for the time increase: I will post a poll on sage-devel later today.

Thank you.

I've put the cpdef counter into sage.misc.c3_controlled---I think this is a good place, given that it is needed when applying the controlled c3 in categories.

With the original patch, one had Category._cmp_key_generator.next(), hence, two attribute lookups and a function call. With the new patch, one just has one function call, which also seems to be faster than the function call to next().

Let us keep our fingers crossed concerning the findings of the patchbot...

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch

comment:56 Changed 5 years ago by SimonKing

In the computation of _cmp_key, it says:

    atoms = ("FacadeSets", "Finite", "Infinite", "EnumeratedSets", "FiniteDimensionalModulesWithBasis", "GradedModules", "ModulesWithBasis", "AdditiveMagmas", "Magmas", "Semigroups", "Monoids", "FinitePermutationGroups", "Rngs", "Domains")
    classname = self.__class__.__name__
    flag = 0
    for i, axiom in enumerate(atoms):
        if classname.startswith(axiom):
            flag = flag | (1 << i)

First a Python question: Will the tuple atoms be constructed repeatedly in this method? If this is the case, it might be better to move it to module level.

Second question: With the exception of "Finite" and "FiniteDimensionalModulesWithBases?", the atoms are pairwise incompatible, in the sense of "if classname starts with one of the atoms, then it will certainly not start with any other atom". Hence, it seems possible to rewrite it such that the for loop is quit after the first case in which the if clause is true.

But again, probably a speed gain in a method that is called only once will be negligible.

comment:57 Changed 5 years ago by SimonKing

We have to save a total of about 70 ms in 93 function calls. This is not much, and perhaps it could be obtained by cythoning the computation of _cmp_key.

Also, you do

assert not isinstance(self, JoinCategory)

and later comment

for cat in self._super_categories: # not self.super_categories() to avoid join categories! 

So, is the assertion not needed, after all?

comment:58 Changed 5 years ago by SimonKing

  • Description modified (diff)

Are the patch bots out of work? I've not seen any non-grey blob recently.

Anyway. I implemented a Cython version of the _cmp_key attribute. It is similar to lazy attribute, but avoids some overhead, because it is known that it is applied to instances that allow for attribute assignment.

Certainly there can not be a big speed-up. We are talking here about the attempt to make 90 function calls 1 ms faster in each run. I'd really be interested to know what the startup_time plugin finds. So, let us hope that the patch bots resume work soon...

Of course, it could be that we will eventually disregard my startup time patches, should it turn out that they don't help.

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch trac13589_cython_cmp_key.patch

comment:59 Changed 5 years ago by SimonKing

I did some timings. My benchmark is: Delete and recreated Rings()._cmp_key in a tight loop.

With only the first two patches, I get:

sage: C = Rings()
sage: C._cmp_key
(6016, 12)
sage: timeit("if C._cmp_key: del C._cmp_key", number=100000)
100000 loops, best of 3: 12.9 µs per loop
sage: C._cmp_key
(6016, 300059)

With all four patches, I get:

sage: C = Rings()
sage: C._cmp_key
(6016, 12)
sage: timeit("if C._cmp_key: del C._cmp_key", number=100000)
100000 loops, best of 3: 901 ns per loop
sage: C._cmp_key
(6016, 300059)

Since the time needed to delete the attribute should be the same, it follows that the last two patches save 12 µs for each creation of the _cmp_key attribute. Better than nothing, but by far not enough to sum up to 70 ms.

comment:60 Changed 5 years ago by SimonKing

I have cythoned the _cmp_key attribute of CategoryWithParameters as well. The raw improvement is:

Without the first two patches:

sage: C = Algebras(FractionField(QQ['x']))
sage: timeit("if C._cmp_key: del C._cmp_key", number=100000)
100000 loops, best of 3: 5.38 µs per loop

With all patches:

sage: C = Algebras(FractionField(QQ['x']))
sage: timeit("if C._cmp_key: del C._cmp_key", number=100000)
100000 loops, best of 3: 1.28 µs per loop

But I think the startup time is what we are really interested in. My starting point is sage-5.11.b3 with these patches applied:

trac_14471_dynamic_class_hash.patch
trac_14471-review.patch
trac_14516-crystals_speedup-ts.2.patch
trac_14722-lazy_import_at_startup-nt.patch

I am giving the total time provided by sage -startuptime, in 5 consecutive runs:

Total time (sum over exclusive time): 1485.052ms
Total time (sum over exclusive time): 1499.287ms
Total time (sum over exclusive time): 1499.921ms
Total time (sum over exclusive time): 1492.086ms
Total time (sum over exclusive time): 1493.375ms

After applying the first two patches, I get:

Total time (sum over exclusive time): 1495.342ms
Total time (sum over exclusive time): 1504.880ms
Total time (sum over exclusive time): 1502.199ms
Total time (sum over exclusive time): 1497.281ms
Total time (sum over exclusive time): 1508.277ms

And after applying the other two patches, I get:

Total time (sum over exclusive time): 1499.909ms
Total time (sum over exclusive time): 1498.731ms
Total time (sum over exclusive time): 1499.289ms
Total time (sum over exclusive time): 1502.179ms
Total time (sum over exclusive time): 1489.887ms

I think one sees a tendency. But we'd really need to see what the startup-time plugin has to say!

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch trac13589_cython_cmp_key.patch

comment:61 Changed 5 years ago by SimonKing

I don't know about statistics, but here it goes:

sage: T1 = stats.TimeSeries([1485.052,1499.287,1499.921,1492.086,1493.375])
sage: T2 = stats.TimeSeries([1495.342,1504.880,1502.199,1497.281,1508.277])
sage: T3 = stats.TimeSeries([1499.909,1498.731,1499.289,1502.179,1489.887])
sage: T1.mean(), T1.variance()
(1493.9442000000001, 36.77894170000064)
sage: T2.mean(), T2.variance()
(1501.5958, 28.378941700000148)
sage: T3.mean(), T3.variance()
(1497.999, 22.281242000000503)

I guess five runs are simply not enough to get anything significant, but the means seem to indicate that the last two patches reduce the regression by 50%. Admittedly, what is 4 ms, if the variance is 22 ms?

comment:62 Changed 5 years ago by SimonKing

Let's try a bit more. I created time series T0,T1,T2 for 20 startup-time tests, for only the dependencies, the first two patches or all patches. I got time series

sage: sage: T0.list()
[1489.553,
 1480.942,
 1478.854,
 1486.503,
 1475.071,
 1475.266,
 1479.731,
 1476.892,
 1496.149,
 1489.795,
 1485.942,
 1492.043,
 1484.346,
 1482.017,
 1488.826,
 1473.604,
 1490.681,
 1488.502,
 1488.512,
 1485.275]
sage: T1.list()
[1549.77,
 1533.565,
 1530.685,
 1532.984,
 1530.423,
 1528.377,
 1534.03,
 1522.763,
 1534.116,
 1531.709,
 1529.705,
 1561.183,
 1534.724,
 1524.297,
 1533.473,
 1540.683,
 1531.19,
 1528.275,
 1524.075,
 1527.196]
sage: T2.list()
[1499.464,
 1493.059,
 1506.909,
 1486.872,
 1493.657,
 1496.276,
 1496.01,
 1493.497,
 1497.991,
 1492.52,
 1488.484,
 1515.637,
 1483.794,
 1496.187,
 1492.245,
 1496.936,
 1492.311,
 1490.689,
 1495.659,
 1490.345]

with means and standard deviations

sage: T0.mean()
1484.4252000000001
sage: T0.standard_deviation()
6.349568941854885
sage: T1.mean()
1533.16115
sage: T1.standard_deviation()
8.900689998533823
sage: T2.mean()
1494.9271000000003
sage: T2.standard_deviation()
6.912493905203355

I have no idea how to compute from these data whether there is a significant increase of x% of startup time.

Last edited 5 years ago by SimonKing (previous) (diff)

comment:63 Changed 5 years ago by SimonKing

  • Authors changed from Nicolas M. Thiéry to Nicolas M. Thiéry, Simon King
  • Work issues set to Add tests in c3_controlled

Hooray!!!!! Did you see what the patchbot plugins have to tell??

  • Doctest coverage: +Missing doctests misc/c3_controlled.pyx 18 / 20 = 90%. That's my fault, I need to add tests for the functions I introduced.
  • Startup modules: Of course, there is now c3_controlled. No surprise and now problem
  • Startup time:
    -Main:   1.5331 sec (30 samples, std_dev=0.0508)
    -Ticket: 1.5186 sec (30 samples, std_dev=0.0525)
    +real	0m1.482s
    +user	0m1.243s
    +sys	0m0.227s
    +9, 1.5900211334228516, 1.6482288837432861, 1.5097029209136963, 1.4862918853759766, 1.623434066772461, 1.4829378128051758, 1.582334041595459, 1.6098928451538086, 1.554724931716919, 1.5065560340881348, 1.495615005493164, 1.5109539031982422, 1.5232598781585693, 1.4945759773254395, 1.4797320365905762, 1.6103341579437256, 1.5885298252105713, 1.5899219512939453, 1.4962730407714844, 1.5993151664733887, 1.5715739727020264, 1.5021510124206543, 1.5274248123168945, 1.4769189357757568, 1.6029491424560547, 1.625540018081665, 1.6075868606567383, 1.5096728801727295, 1.6023800373077393]
    
    -Average decrease of 0.014 secs or 0.94%.
    +Main:   1.5629 sec (30 samples, std_dev=0.0471)
    +Ticket: 1.5551 sec (30 samples, std_dev=0.0561)
    +
    +Average decrease of 0.0079 secs or 0.5%.
    
    -No statistically significant difference.
    +With 32% confidence, startup time decreased by at least 0.25%
    +With 44% confidence, startup time decreased by at least 0.1%
    

In other words, the problem is solved!

I think I can give your patch a positive review. My patches are big enough so that I should add me as author, and someone else needs to review them. And I will soon add the missing tests in C3_controlled.

comment:64 Changed 5 years ago by SimonKing

Concerning a small function I added:

sage: from operator import attrgetter
sage: f = attrgetter("_cmp_key")
sage: C = Rings()
sage: %timeit a = f(C)
1000000 loops, best of 3: 439 ns per loop
sage: from sage.misc.c3_controlled import category_sort_key
sage: %timeit a = category_sort_key(C)
10000000 loops, best of 3: 154 ns per loop

That's why I added it (and: It is cpdef inline, hence, if Cython is clever, the speed difference with attrgetter will be even better).

Changed 5 years ago by SimonKing

Another attempt to make the startup faster

comment:65 Changed 5 years ago by SimonKing

  • Work issues Add tests in c3_controlled deleted

I have added the two missing tests, updating the cython_cmp_key patch.

Apply trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch trac13589_improve_startuptime.patch trac13589_cython_cmp_key.patch

comment:66 Changed 5 years ago by SimonKing

  • Description modified (diff)

comment:67 follow-up: Changed 5 years ago by SimonKing

Now there is a second patchbot commenting: It finds 0.5% regression (not 5%)!

comment:68 in reply to: ↑ 67 Changed 5 years ago by SimonKing

Replying to SimonKing:

Now there is a second patchbot commenting: It finds 0.5% regression (not 5%)!

I misread. 0.5% is not significant. Only 0.25% is significant. So, virtually nothing.

comment:69 Changed 5 years ago by nthiery

Oh wow!

Congrats Simon :-)

I was not expecting we could really do anything about that, and had just sent an e-mail to sage-devel; counter e-mail sent!

Thanks a lot, I'll review your patches!

comment:70 Changed 5 years ago by nthiery

  • Description modified (diff)

comment:71 Changed 5 years ago by nthiery

I started to work on the things we had discussed over the phone. I attached the preliminary patch to see what the patchbot says.

comment:72 Changed 5 years ago by SimonKing

One typo in the review patch: "It is sematically equivalent" should be "It is semantically equivalent". Also I think you mean

:func:`operator.attrgetter```("cmp_key")``

not

:func:`operator.attrgetter```(category)``

I wonder: You changed the default sort function from category_sort_key to identity (as we agreed on by phone), but you do not explicitly use the now non-default category_sort_key when calling c3_sorted_merge in sage.categories.category, isn't it? That should be fixed.

comment:73 Changed 5 years ago by SimonKing

Worse: Compilation of sage.misc.c3_controlled fails with

Error compiling Cython file:
------------------------------------------------------------
...

        sage: from sage.misc.c3_controlled import category_sort_key
        sage: category_sort_key(Rings()) is Rings()._cmp_key
        True
    """
    return C._cmp_key
           ^
------------------------------------------------------------

sage/misc/c3_controlled.pyx:377:12: undeclared name not builtin: C

comment:74 Changed 5 years ago by SimonKing

After trivially changing it, building Sage works, but it crashes at startup with

/home/simon/SAGE/prerelease/sage-5.11.beta3/local/lib/python2.7/site-packages/sage/categories/category.pyc in _all_super_categories(self=Category of category_singleton)
    871              Category of rngs,
    872              Category of semirings,
    873              Category of monoids,
    874              Category of semigroups,
    875              Category of magmas,
    876              Category of commutative additive groups,
    877              Category of commutative additive monoids,
    878              Category of commutative additive semigroups,
    879              Category of additive magmas,
    880              Category of sets,
    881              Category of sets with partial maps,
    882              Category of objects]
    883         """
    884         (result, bases) = C3_sorted_merge([cat._all_super_categories
    885                                            for cat in self._super_categories] +
--> 886                                           [self._super_categories])
        self._super_categories = [Category of rngs, Category of semirings]
    887         self._super_categories_for_classes = bases
    888         return [self] + result
    889 
    890     @lazy_attribute
    891     def _all_super_categories_proper(self):
    892         r"""
    893         All the proper super categories of this category.
    894 
    895         Since :trac:`11943`, the order of super categories is
    896         determined by Python's method resolution order C3 algorithm.
    897 
    898         .. seealso:: :meth:`all_super_categories`
    899 
    900         .. note:: this attribute is likely to eventually become a tuple.
    901 

/home/simon/SAGE/prerelease/sage-5.11.beta3/local/lib/python2.7/site-packages/sage/misc/c3_controlled.so in sage.misc.c3_controlled.C3_sorted_merge (sage/misc/c3_controlled.c:4377)()

KeyError: Category of sets

But this can be fixed by using key=category_sort_key in C3_sorted_merge.

Hence, with the following diff, Sage starts:

  • sage/categories/category.py

    diff --git a/sage/categories/category.py b/sage/categories/category.py
    a b  
    883883        """
    884884        (result, bases) = C3_sorted_merge([cat._all_super_categories
    885885                                           for cat in self._super_categories] +
    886                                           [self._super_categories])
     886                                          [self._super_categories],
     887                                          key = category_sort_key)
    887888        self._super_categories_for_classes = bases
    888889        return [self] + result
    889890
  • sage/misc/c3_controlled.pyx

    diff --git a/sage/misc/c3_controlled.pyx b/sage/misc/c3_controlled.pyx
    a b  
    374374        sage: category_sort_key(Rings()) is Rings()._cmp_key
    375375        True
    376376    """
    377     return C._cmp_key
     377    return category._cmp_key
    378378
    379379cdef class CmpKey:
    380380    r"""

comment:75 Changed 5 years ago by nthiery

Oops, sorry for the wasted time; I posted this in a rush and it was half baked with the last round of changes not tested ... Here is an updated patch which fixes the mentionned issues.

Changed 5 years ago by nthiery

comment:76 Changed 5 years ago by SimonKing

The patch looks good. I only wonder about the findings of the startup-time plugin.

comment:77 Changed 5 years ago by SimonKing

Why the heck is no patchbot giving a result? It is now 15 hours after attaching the review patch!

comment:78 Changed 5 years ago by SimonKing

  • Status changed from needs_review to positive_review
  • Work issues set to commit message for review patch

Hooray, patchbot is back, and says:

+With 60% confidence, startup time increased by at least 0.5%
+With 96% confidence, startup time increased by at least 0.25%
+With 99.1% confidence, startup time increased by at least 0.1%

60% confidence is not significant, hence, we only have a significant regression of 0.25%, which I deem acceptable and a price worth paying.

So, I'm already putting it to positive review, but please add a proper commit message to the review patch!

comment:79 Changed 5 years ago by nthiery

Excellent! I'll fold the patches together and double check the commit message tomorrow.

comment:80 Changed 5 years ago by nthiery

  • Description modified (diff)
  • Work issues commit message for review patch deleted

comment:81 Changed 5 years ago by nthiery

Patchbot getting confused again about which patch to apply.

Apply: trac_13589-categories-c3_under_control-nt.patch

Getting really tired of this ...

comment:82 Changed 5 years ago by darij

Without the colon perhaps?

apply trac_13589-categories-c3_under_control-nt.patch

(No idea how it works, I just want all the stuff depending on this to be checked.)

comment:83 Changed 5 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:84 Changed 5 years ago by jdemeyer

  • Merged in set to sage-5.12.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:85 Changed 4 years ago by jdemeyer

While looking through the Sage sources, I noticed the following doctest, coming from this ticket:

    sage: for l in L:
    ....:     x = HierarchyElement(10, l.to_poset())
    ....:     try:
    ....:         x.mro_standard
    ....:         assert False
    ....:     except:
    ....:         pass
    ....:     assert x.mro            == list(P)
    ....:     assert x.mro_controlled == list(P)
    ....:     assert x.all_bases_len() == 15
    ....:     stats.append(x.all_bases_controlled_len()-x.all_bases_len())

What is the purpose of the assert False here? I am asking because

try:
    foo()
    assert False
except:
    pass

is entirely equivalent to

try:
    foo()
except:
    pass

comment:86 follow-up: Changed 4 years ago by jdemeyer

bump

comment:87 in reply to: ↑ 86 Changed 4 years ago by nthiery

Replying to jdemeyer:

bump

Thanks for spotting, that was an oversight indeed! See #17106 (needs review).

Note: See TracTickets for help on using tickets.