Opened 7 years ago

Closed 7 years ago

#14912 closed defect (fixed)

UniqueRepresentation tutorial could use more love

Reported by: darij Owned by:
Priority: major Milestone: sage-6.1
Component: documentation Keywords: documentation, structure
Cc: SimonKing, tscrim Merged in:
Authors: Simon King Reviewers: Darij Grinberg, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: u/vbraun/unique_rep_tutorial (Commits) Commit:
Dependencies: #14888 Stopgaps:

Description (last modified by tscrim)

Sadly not my love, unless someone teaches me enough of the stuff that I understand it myself.

http://www.sagemath.org/doc/reference/structure/sage/structure/unique_representation.html

One thing I don't understand is:

the preprocessing on the arguments should be idempotent. Namely, If MyClass2.__classcall__ calls CachedRepresentation.__classcall__(<some_arguments>), then it should accept <some_arguments> as its own input, and pass it down unmodified to CachedRepresentation.__classcall__().

What is meant by "accept <some_arguments> as its own input"?

Something that possibly should be added is some explanation of the difference between __classcall__ and __classcall_private__. Maybe it doesn't belong there, but then again I have no idea where it belongs as it seems not documented at all...

Also there seems to be a typo:

The UniqueRepresentation and UniqueFactory classes provide two alternative implementations of this design pattern. Both implementations have their own merits. UniqueRepresentation is very easy to use: a class just needs to derive from it, or make sure some of its super classes does. Also, it groups together the class and the factory in a single gadget; in the example above, one would want to do:

I think it's UniqueFactory?, not UniqueRepresentation?, which is easy to use etc.

Apply:

Attachments (7)

trac_14912-typos-dg.patch (6.0 KB) - added by darij 7 years ago.
Just a couple typos fixed. Feel free to qfold into your patch. This is not a review, sorry (I don't have the competence to do it; I barely understood the UniqueRepresentation? part and I know about nothing on Python's OOP).
trac14912-unique_doc.patch (42.4 KB) - added by SimonKing 7 years ago.
Combined patch, including Darij's fixes
trac_14912-comments-dg.patch (22.3 KB) - added by darij 7 years ago.
almost a review
trac_14912-link_tweaks-ts.patch (2.3 KB) - added by tscrim 7 years ago.
trac_14912-more-ts-dg.patch (19.7 KB) - added by darij 7 years ago.
includes Travis' link tweaks. proofread by Travis
trac_14912-doc-fixes-sk.patch (2.2 KB) - added by SimonKing 7 years ago.
trac_14912-fix_leaking_print-ts.patch (729 bytes) - added by tscrim 7 years ago.

Download all attachments as: .zip

Change History (80)

comment:1 Changed 7 years ago by darij

  • Description modified (diff)

comment:2 Changed 7 years ago by darij

  • Description modified (diff)

comment:3 in reply to: ↑ description Changed 7 years ago by SimonKing

Replying to darij:

I think it's UniqueFactory?, not UniqueRepresentation?, which is easy to use etc.

You couldn't be more mistaken, I think. In the wast majority of cases, if you have defined a class C,

class C(base1, base2):
    code

then you only need to change it into

class C(UniqueRepresentation, base1, base2):
    code

and have caching and pickling for free. Setting up a UniqueFactory is a lot more work.

comment:4 follow-up: Changed 7 years ago by darij

Something else is wrong then, because the example directly under the paragraph I quoted is about GF, which is a UniqueFactory?.

comment:5 in reply to: ↑ 4 Changed 7 years ago by SimonKing

Replying to darij:

Something else is wrong then, because the example directly under the paragraph I quoted is about GF, which is a UniqueFactory?.

OK, if this is the case, then indeed it is a bad example.

One question, though: What do you mean by UniqueRepresentation tutorial? Is there a separate tutorial, or is it just the pages for sage.structure.factory and sage.structure.unique_representation, respectively?

comment:6 Changed 7 years ago by darij

Oops, what I meant is really just the docstring for UniqueRepresentation in sage/structure/unique_representation.py.

comment:7 follow-up: Changed 7 years ago by SimonKing

I think both sage.structure.unique_representation and sage.structure.factory need more care.

Certainly it is a severe fault that GF(...) appears as example of UniqueRepresentation.

And I think both parts of the documentation should give hints on when to use UniqueRepresentation and when to use UniqueFactory.

For example, a UniqueFactory can return different implementations of the same algebraic structure, depending on the input data. This would be impossible (or at least not so easily possible) with UniqueRepresentation. And you would not be able to use UniqueRepresentation with a cdef class---you can define your class in a pyx file, but it has to be class Foo(UniqueRepresentation, ...), not cdef class Foo(UniqueRepresentation, ...), simply since UniqueRepresentation uses meta-classes, which is not possible in a cdef class, if I am not mistaken (and of course, multiple inheritance does not work yet in Cython, and after all UniqueRepresentation is a python class.

These are reasons for using a factory.

On the other hand, using UniqueRepresentation would automatically provide your class with very fast hash and comparison methods implemented in Cython (UniqueRepresentation is a Python class, but also inherits from the Cython class sage.misc.fast_methods.WithEqualityById). If you use a factory instead, you'd need to take care of hash and comparison all by yourself---and this means: It is possible that a UniqueFactory returns objects that are not unique parents!! And moreover, using UniqueRepresentation is very easy---in most cases, it is enough to take an existing class and add UniqueRepresentation as base.

So, these are reasons for using a unique representation.

comment:8 in reply to: ↑ 7 Changed 7 years ago by SimonKing

Replying to SimonKing:

Certainly it is a severe fault that GF(...) appears as example of UniqueRepresentation.

Or not? I now see that GF(...) appears in order to demonstrate the design pattern that is common to both unique representation and unique factory. And it is clearly stated that one would like to have something like

isinstance(GF(p), GF)

which is not possible with a unique factory but is the case for UniqueRepresentation.

comment:9 Changed 7 years ago by SimonKing

I think it is a good idea to have GF mentioned both in the documentation of unique representation and unique factory, since it explains one detail one can base one's choice on.

Namely, as an example of a behaviour one can not get with UniqueRepresentation:

sage: type(GF(5))
<class 'sage.rings.finite_rings.finite_field_prime_modn.FiniteField_prime_modn_with_category'>
sage: type(GF(25,'x'))
<class 'sage.rings.finite_rings.finite_field_givaro.FiniteField_givaro_with_category'>
sage: type(GF(5^10,'x'))
<class 'sage.rings.finite_rings.finite_field_ext_pari.FiniteField_ext_pari_with_category'>
sage: type(GF(2^20,'x'))
<class 'sage.rings.finite_rings.finite_field_ntl_gf2e.FiniteField_ntl_gf2e_with_category'>

comment:10 Changed 7 years ago by SimonKing

Questions:

  • Neither sage.structure.factory nor sage.structure.unique_representation have a meaningful documentation at module level. Would it make sense to give documentation at module level that helps for choosing among the two models of creating a unique parent behaviour? Or should this be only provided by CachedRepresentation and UniqueRepresentation resp. by UniqueFactory?
  • You mentioned that you find the distinction between __classcall__ and __classcall_private__ unclear. There is an example clarifying the distinction, but it is hidden in the documentation of _clear_cache_(), which is an underscore method and is thus not visible in the reference manual. Do you think that the example from _clear_cache_() is clear? Then we could promote it (or a similar example) to a visible place.

comment:11 follow-up: Changed 7 years ago by darij

I fear the doc of _clear_cache_() is beyound my comprehension... Is the difference just in the fact that a __classcall_private__ method will not be inherited by subclasses while a __classcall__ will be? (But I thought all methods would be inherited by subclasses?)

comment:12 in reply to: ↑ 11 Changed 7 years ago by SimonKing

Replying to darij:

I fear the doc of _clear_cache_() is beyound my comprehension... Is the difference just in the fact that a __classcall_private__ method will not be inherited by subclasses while a __classcall__ will be? (But I thought all methods would be inherited by subclasses?)

Your question really is about sage.misc.classcall_metaclass.ClasscallMetaclass, which is where methods such as __classcall__, __classget__, __classcontains__ and __classcall_private__ take effect.

Here, we look at sage.misc.classcall_metaclass.ClasscallMetaclass.__call__, which implements the creation of instances of a class. Of course, if a class C defines __classcall_private__ and a class D inherits from C, then D also has a __classcall_private__ method. However, you can find this method in C.__dict__ but not in D.__dict__.

And the rule is: If an instance of a class (here: C or D) is created by calling ClasscallMetaclass.__class__, then

  • it will be checked whether the class has a __classcall_private__ in its own __dict__. If this is the case, then __classcall_private__ will be called. Otherwise,
  • it is checked whether the class has a __classcall__ method (which is not necessarily in __dict__, as it could be inherited from a base class). If this is the case, then __classcall__ will be called.

I just said "we look at ClasscallMetaclass.__call__". Well, in fact we look at CLasscallMetaclass.__cinit__, because this is where the choice between __classcall__ and __classcall_private__ is made.

What does all this mean for this ticket?

Question: Do you think it would be enough to state in the documentation of CachedRepresentation that it has the ClasscallMetaclass and that it provides a __classcall__ implementing the cache, and that one can use __classcall_private__ or __classcall__ as explained in the docs of ClasscallMetaclass to overload the default way of caching?

Or I guess it would still be a good idea to have examples in place, without the need to read the ClasscallMetaclass docs...

comment:13 Changed 7 years ago by SimonKing

  • Authors set to Simon King
  • Status changed from new to needs_review

I have attached a patch. Do you like the new documentation better? What points would you suggest to improve further?

Note that I added documentation to both sage.structure.factory and sage.structure.unique_representation, also adding new tests.

comment:14 Changed 7 years ago by SimonKing

I have replaced my patch by an updated version. The new patch version addresses your complaint about "idempotent preprocessing", elaborating a bit more and adding an example that shows what goes wrong if the pre-processing is not idempotent.

comment:15 Changed 7 years ago by SimonKing

  • Status changed from needs_review to needs_work

I am just re-reading my patch. I find several typos/wrong grammar. In addition, in one of my examples I have a non-idempotent pre-processing. I should change it. Also, by mistake, in one of my examples one finds the version number of my Sage installation---this should of course replaced by "...".

comment:16 Changed 7 years ago by SimonKing

  • Status changed from needs_work to needs_review

I hope all problems are solved with the new patch version...

Changed 7 years ago by darij

Just a couple typos fixed. Feel free to qfold into your patch. This is not a review, sorry (I don't have the competence to do it; I barely understood the UniqueRepresentation? part and I know about nothing on Python's OOP).

comment:17 follow-up: Changed 7 years ago by darij

Thanks for the patch, which made several things clear to me (probably not enough to be able to decide whether to use factories or unique rep -- but at least a good idea how to). I fear somebody who actually knows some OOP will have to review it. The above attachment fixes a couple typos.

Last edited 7 years ago by darij (previous) (diff)

comment:18 in reply to: ↑ 17 Changed 7 years ago by SimonKing

Hi Darij,

Thank you for fixing the typos!

Replying to darij:

Thanks for the patch, which made several things clear to me (probably not enough to be able to decide whether to use factories or unique rep -- but at least a good idea how to). ... I don't have the competence to do it; I barely understood the UniqueRepresentation part and I know about nothing on Python's OOP

In a way, this means that you are fully qualified as a reviewer. This is about documentation, and thus if you think that some points are still not addressed then you should speak up! What information do you find missing? Is it

  • the definition of cached/unique representation?
  • the design pattern: Why should one want to have cached/unique representation?
  • how to implement a cached/unique representation?
  • Python background: What is a metaclass? What happens during instantiation of a class?
  • how to avoid pitfalls?
  • else?

comment:19 Changed 7 years ago by darij

One thing I'd be happy to know is what a staticmethod is -- but that's hardly a question for this patch.

And yes, some doc on metaclasses and the instantiation process (I'm confused about what goes in new, what in init and what in classcall) would be great.

As for the "why" of cached/unique representation, for me it's pretty clear already, and I think the definition in your doc is very readable.

Thanks a lot for the patience!

Changed 7 years ago by SimonKing

Combined patch, including Darij's fixes

comment:20 follow-up: Changed 7 years ago by SimonKing

  • Description modified (diff)

I have folded the two patches, and I have added a note on "static methods", including a link to the Python docs.

I did not elaborate on __new__ versus __init__ versus __classcall__, yet.

Apply trac14912-unique_doc.patch

comment:21 in reply to: ↑ 20 Changed 7 years ago by SimonKing

Replying to SimonKing:

I have folded the two patches, and I have added a note on "static methods", including a link to the Python docs.

... which was supposed to imply the question whether you find this helpful.

comment:22 Changed 7 years ago by tscrim

Hey Darij,

This is primarily a ping message asking Darij if he's happy with this ticket, but...

The @staticmethod is a python thing, which makes a method accessible via ClassName.foo and doesn't take a first argument of a type as opposed to @classmethod. If anything, this should belong with the ClasscallMetaclass documentation, but that's outside the scope of this ticket.

Best,
Travis

comment:23 follow-up: Changed 7 years ago by darij

Sorry for letting this slip off my to-do list!

I'm pretty sure that I've understood the important points of the doc now except for some that concern pickling (I don't know how it works and frankly I didn't plan on learning that). I'd feel better if I knew what exactly __classcall__ does and (as I said) how it is distinguished from __new__ and __init__ (and if it calls them, in which order), but that might as well be a different patch.

Is the comment I've added correct and is it relevant?

I've replaced "extension class" by "Cython extension-type class", since they seem to be called either "extension types" or "cdef classes" but never "extension classes".

I'd like more details on this:

104	    In addition, it is required that a unique factory instance is provided 
105	    with a name that allows to find its definition.

I assume this refers to the string parameter, e. g., in F = MyFactory("__main__.F"); but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just F = MyFactory("F") ? What if it is defined inside a class?

Not sure about this:

And 
 	408	with a factory, it is possible to create the resulting instance by arguments 
 	409	that are different from the key used for caching.

Doesn't that also work with CachedRepresentation? if one preprocesses by declaring __classcall_private__?

@Travis: thank you, too.

Last edited 7 years ago by darij (previous) (diff)

Changed 7 years ago by darij

almost a review

comment:24 follow-up: Changed 7 years ago by darij

  • Cc tscrim added
  • Description modified (diff)
  • Milestone set to sage-5.13
  • Reviewers set to Darij Grinberg

Thanks to Travis's insistence and also his help, I've more or less reviewed this one at last. The only part I wasn't able to understand is:

.. NOTE::

    For technical reasons, it is needed that ``__classcall__`` respectively
    ``__classcall_private__`` are "static methods", i.e., they are callable
    objects that do not bind to an instance or class. For example, a
    :class:`~sage.misc.cachefunc.cached_function` can be used here, because it
    is callable, but does not bind to an instance or class, because it has no
    ``__get__()`` method. A usual Python function, however, has a
    ``__get__()`` method and would thus under normal circumstances bind to an
    instance or class, and thus the instance or class would be passed to the
    function as the first argument. To prevent a callable object from being
    bound to the instance or class, one can prepend the ``@staticmethod``
    decorator to the definition; see :class:`staticmethod`.

Well, maybe it's just that I have no idea what __get__ methods do.

for the patchbot:

apply trac14912-unique_doc.patch​ trac_14912-comments-dg.patch

comment:25 in reply to: ↑ 23 ; follow-up: Changed 7 years ago by SimonKing

Replying to darij:

Sorry for letting this slip off my to-do list!

Same here...

I'm pretty sure that I've understood the important points of the doc now except for some that concern pickling (I don't know how it works and frankly I didn't plan on learning that). I'd feel better if I knew what exactly __classcall__ does and (as I said) how it is distinguished from __new__ and __init__ (and if it calls them, in which order), but that might as well be a different patch.

In a nutshell: If C is a class with ClasscallMetaclass, then __classcall__ (or __classcall_private__) is what is executed when you do C(*args, **kwds).

Your __classcall__ can do anything. Really. It could return an instance of C, but it could also return something totally different.

Consequence: __new__ or __init__ are only called if you did not forget to call them inside of your __classcall__. If you look at what CachedRepresentation.__classcall__ does, you see that indeed __new__ and __init__ are called explicitly.

I'd like more details on this:

104	    In addition, it is required that a unique factory instance is provided 
105	    with a name that allows to find its definition.

I assume this refers to the string parameter, e. g., in F = MyFactory("__main__.F");

Exactly.

but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just F = MyFactory("F") ?

It is exactly as it is stated: A name is needed that allows to find the factory's definition. Hence, if you put F into the global name space, then F = MyFactory("F") is fine. If F is in a module sage.foo.bar, then it is F = MyFactory("sage.foo.bar.F").

What if it is defined inside a class?

If the class is sage.bar.foo.MyClass and you want to use a factory for the attribute F of this class, then I guess it is `F = MyFactory?("sage.bar.foo.MyClass?.F"). Not tested, though.

Not sure about this:

And 
 	408	with a factory, it is possible to create the resulting instance by arguments 
 	409	that are different from the key used for caching.

Doesn't that also work with CachedRepresentation? if one preprocesses by declaring __classcall_private__?

Yes. This could be clarified.

comment:26 in reply to: ↑ 24 ; follow-up: Changed 7 years ago by SimonKing

Hi Darij,

the review patch looks fine. Disclaimer: I did not test whether the documentation builds fine after the changes introduced by the review patch. IIRC, it did build (and look) fine with my patch.

Replying to darij:

Thanks to Travis's insistence and also his help, I've more or less reviewed this one at last. The only part I wasn't able to understand is:

.. NOTE::

    For technical reasons, it is needed that ``__classcall__`` respectively
    ``__classcall_private__`` are "static methods", i.e., they are callable
    objects that do not bind to an instance or class. For example, a
    :class:`~sage.misc.cachefunc.cached_function` can be used here, because it
    is callable, but does not bind to an instance or class, because it has no
    ``__get__()`` method. A usual Python function, however, has a
    ``__get__()`` method and would thus under normal circumstances bind to an
    instance or class, and thus the instance or class would be passed to the
    function as the first argument. To prevent a callable object from being
    bound to the instance or class, one can prepend the ``@staticmethod``
    decorator to the definition; see :class:`staticmethod`.

Well, maybe it's just that I have no idea what __get__ methods do.

OK. Then one should add a pointer to the Python references. Travis, Do you have an idea what page to point to?

comment:27 in reply to: ↑ 25 Changed 7 years ago by darij

Thanks for these clarifications!

Replying to SimonKing:

Replying to darij:

I'd like more details on this:

104	    In addition, it is required that a unique factory instance is provided 
105	    with a name that allows to find its definition.

I assume this refers to the string parameter, e. g., in F = MyFactory("__main__.F");

Exactly.

but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just F = MyFactory("F") ?

It is exactly as it is stated: A name is needed that allows to find the factory's definition. Hence, if you put F into the global name space, then F = MyFactory("F") is fine. If F is in a module sage.foo.bar, then it is F = MyFactory("sage.foo.bar.F").

What if it is defined inside a class?

If the class is sage.bar.foo.MyClass and you want to use a factory for the attribute F of this class, then I guess it is `F = MyFactory?("sage.bar.foo.MyClass?.F"). Not tested, though.

Yeah, these things could very well be explained in the doc itself.

Not sure about this:

And 
 	408	with a factory, it is possible to create the resulting instance by arguments 
 	409	that are different from the key used for caching.

Doesn't that also work with CachedRepresentation? if one preprocesses by declaring __classcall_private__?

Yes. This could be clarified.

Kind-of; on the other hand, the preprocessing probably needs to be idempotent for pickling to work well, so we have a serious restriction...

comment:28 in reply to: ↑ 26 Changed 7 years ago by tscrim

Replying to SimonKing:

Well, maybe it's just that I have no idea what __get__ methods do.

OK. Then one should add a pointer to the Python references. Travis, Do you have an idea what page to point to?

Hey Simon,

My thought would be this: http://docs.python.org/2/howto/descriptor.html (I showed this to Darij the other day).

Changed 7 years ago by tscrim

comment:29 Changed 7 years ago by tscrim

  • Description modified (diff)

I'm prepared to set this to positive review (barring my tweaks being review), but I'd like to know what else is desired to be added and where (if anything).

For patchbot;

Apply: trac14912-unique_doc.patch​, trac_14912-comments-dg.patch​, trac_14912-link_tweaks-ts.patch​

comment:30 Changed 7 years ago by darij

  • Description modified (diff)

Oops, I've just edited Travis's patch instead of qnewing my own. So glad I soon won't have the hg workflow anymore to trip over...

Anyway, here is my "review patch". It is not so much as a review but a more informative documentation of what exactly to do with UniqueFactory. Now, "more informative" does not necessarily mean "correct", so I'd much prefer someone more experienced than me to look this through (particularly, but not only, checking my interpretation of other_keys).

I still don't grok the note about the __classcall__ method being static in the unique representation doc, but I didn't really have the time to RTFM about descriptors. If the note is clear to you, Travis, just say so and it will plug this hole in my review.

for the patchbot:

apply trac14912-unique_doc.patch​ trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch

comment:31 follow-up: Changed 7 years ago by chapoton

beware of unvisible characters !

apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch

comment:32 in reply to: ↑ 31 Changed 7 years ago by SimonKing

Replying to chapoton:

beware of unvisible characters !

Indeed. The third patch introduces some blank space.

Notice that the objects created by the factory inherit neither from 
the factory class, nor from the factory instance (which is not even a 
class). In fact, they don't even know about the factory that created 
them! 

Apart from the blank space after the "!" (AFAIK, there is no blank space before "!", ":" and ";", in contrast to French), I am not happy with the statement that the instances don't even know about the factory that created them:

sage: F = GF(5)
sage: F._factory_data
(<class 'sage.rings.finite_rings.constructor.FiniteFieldFactory'>,
 (5, 13, 'beta3'),
 (5, None, None, 'modn', '{}', 5, 1, True),
 {})
sage: _[0] is GF
True

This actually is how pickling is implemented (namely: By keeping a reference to the creating factory, plus the arguments used to call the factory).

comment:33 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to needs_work

The docbuild gives warnings:

dochtml.log:[structure] /scratch/release/merger/sage-5.13.beta5/local/lib/python2.7/site-packages/sage/structure/unique_representation.py:docstring of sage.structure.unique_representation:127: WARNING: Block quote ends without a blank line; unexpected unindent.
dochtml.log:[structure] /scratch/release/merger/sage-5.13.beta5/local/lib/python2.7/site-packages/sage/structure/unique_representation.py:docstring of sage.structure.unique_representation:129: ERROR: Unexpected indentation.

comment:34 follow-up: Changed 7 years ago by darij

  • Status changed from needs_work to needs_review

Fixed false claim about factory-created objects being oblivious of their making (thanks Simon!). Couldn't find the blank space after "!" claimed by Simon, but removed some other useless whitespaces. HOPEFULLY fixed Jeroen's docbuild warnings. Made some mini-fixes to unique_representation, including changing "argument" to "arguments" in the __reduce__ docstring.

Thanks Frederic! (I'd be happier if I new where those symbols are coming from in the first place. Copypasting the names of the attachments from html maybe?)

for the patchbot:

apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch

Changed 7 years ago by darij

includes Travis' link tweaks. proofread by Travis

comment:35 Changed 7 years ago by tscrim

Everything looks good to me, so I'm going to set this to positive review. However feel free to object.

For patchbot:

Apply: trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch

comment:36 Changed 7 years ago by tscrim

  • Reviewers changed from Darij Grinberg to Darij Grinberg, Travis Scrimshaw
  • Status changed from needs_review to positive_review

comment:37 Changed 7 years ago by tscrim

Thank you Simon and Darij for your work on this!

comment:38 Changed 7 years ago by jdemeyer

  • Status changed from positive_review to needs_work
sage -t devel/sage/sage/structure/unique_representation.py
**********************************************************************
File "devel/sage/sage/structure/unique_representation.py", line 455, in sage.structure.unique_representation
Failed example:
    type(Kp)
Expected:
    <class 'sage.rings.finite_rings.finite_field_ext_pari.FiniteField_ext_pari_with_category'>
Got:
    <class 'sage.rings.finite_rings.finite_field_pari_ffelt.FiniteField_pari_ffelt_with_category'>
**********************************************************************
File "devel/sage/sage/structure/unique_representation.py", line 814, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    n == id(SymmetricGroup(17))
Expected:
    False
Got:
    True
**********************************************************************

comment:39 Changed 7 years ago by jdemeyer

  • Dependencies set to #14888

The first "error" is due to #14888.

comment:40 Changed 7 years ago by jdemeyer

The second error happens even with a clean version of Sage 5.12.

comment:41 Changed 7 years ago by SimonKing

Hm. In a freshly build sage-5.13.b4, I get

        sage: n = id(SymmetricGroup(17))
        sage: import gc
        sage: _ = gc.collect()
        sage: n == id(SymmetricGroup(17))
        False

but I get "True" if the same is done as a doc test.

So, rather than relying on the id, I will create a test that demonstrates that this group becomes garbage collected.

comment:42 Changed 7 years ago by SimonKing

For example:

sage: S = SymmetricGroup(17)
sage: n = id(S)
sage: import gc
sage: _ = gc.collect()
sage: n in [id(x) for x in gc.get_objects()]
True
sage: del S
sage: _ = gc.collect()
sage: n in [id(x) for x in gc.get_objects()]
False

That's a better test.

comment:43 Changed 7 years ago by SimonKing

WTF??

Again, if I run the same thing as doctest, then the answer in the last line is "True", not "False". Hence, the symmetric group gets collected in an interactive session, but not in a test!

comment:44 Changed 7 years ago by SimonKing

I am still shocked that garbage collection works differently in doc tests than in an interactive session, but anyway: Here is an example that works both interactively in in a doc test.

        sage: class SomeClass(UniqueRepresentation):
        ....:     def __init__(self, i):
        ....:         print "creating new instance for argument %s"%i
        ....:         self.i = i
        ....:     def __del__(self):
        ....:         print "deleting instance for argument %s"%self.i
        ....:         
        sage: O = SomeClass(1)
        creating new instance for argument 1
        sage: O is SomeClass(1)
        True
        sage: O is SomeClass(2)
        creating new instance for argument 2
        deleting instance for argument 2
        False
        sage: del O
        deleting instance for argument 1
        sage: O = SomeClass(1)
        creating new instance for argument 1

Changed 7 years ago by SimonKing

comment:45 Changed 7 years ago by SimonKing

  • Description modified (diff)
  • Status changed from needs_work to needs_review

Fixed! Please review!

Apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch trac_14912-doc-fixes-sk.patch

comment:46 in reply to: ↑ 34 Changed 7 years ago by SimonKing

Replying to darij:

Couldn't find the blank space after "!" claimed by Simon,

Sorry, my mistake. In order to find trailing whitespace, I went to the html page of the patch and marked some lines, thus highlighting all characters. And it did look like a highlighted blank spaces after the exclamation mark. But when I looked at the patch itself, it was fine.

Best regards, Simon

comment:47 Changed 7 years ago by jdemeyer

I will not review the patch, but at least doctests pass now.

comment:48 follow-up: Changed 7 years ago by darij

I like how your new doctest showcases automatic garbage collection, but what do you think about keeping the old doctest in with a # not tested and a comment on how it usually works in the console and demonstrates the non-invariance of the id?

(In my very limited understanding, the doctest environment tries hard at replacing all kind of randomness by determinism, so I'm not surprised that two ids end up being the same even if they are created differently.)

Last edited 7 years ago by darij (previous) (diff)

comment:49 in reply to: ↑ 48 Changed 7 years ago by jdemeyer

Replying to darij:

(In my very limited understanding, the doctest environment tries hard at replacing all kind of randomness by determinism, so I'm not surprised that two ids end up being the same even if they are created differently.)

Nobody has any control on the memory locations of allocated objects. The doctests do use set_random_seed(), but I very much doubt that this has any influence on id.

comment:50 Changed 7 years ago by SimonKing

I would recommend against using my old suggestion --- "id" is an implementation detail and depends on randomness, beyond anything that is controllable by random seeds. So, even when it is marked as # not tested, I think it would not be very helpful to show a test against "id".

comment:51 Changed 7 years ago by jdemeyer

The unpredictability of id boils down to the unpredictability of malloc(): if one does

char *a = malloc(s);
free(a);
char *b = malloc(s);
free(b);

it is impossible to say whether a and b will be equal.

comment:52 Changed 7 years ago by jdemeyer

Any chance for a final review? It would be nice to get this into Sage 5.13.

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:53 Changed 7 years ago by tscrim

Hey Simon,

With sage-5.13.beta2, I get:

sage -t unique_representation.py
**********************************************************************
File "unique_representation.py", line 822, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    O is SomeClass(2)
Expected:
    creating new instance for argument 2
    deleting instance for argument 2
    False
Got:
    creating new instance for argument 2
    False
**********************************************************************
File "unique_representation.py", line 826, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    del O
Expected:
    deleting instance for argument 1
Got:
    <BLANKLINE>
**********************************************************************
File "unique_representation.py", line 828, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    O = SomeClass(1)
Expected:
    creating new instance for argument 1
Got:
    <BLANKLINE>
**********************************************************************
1 item had failures:
   3 of  89 in sage.structure.unique_representation.CachedRepresentation
    [229 tests, 3 failures, 4.18 s]
----------------------------------------------------------------------
sage -t unique_representation.py  # 3 doctests failed
----------------------------------------------------------------------

with the following patches applied:

trac_15327-qfold-dg.patch
trac_15327-fixes-dg.patch
trac_15305-coercion_tensor_products-ts.patch
trac13394-weak_value_dictionary.patch
trac_15311-hall_algebras-ts.patch
trac_10963-more_functorial_constructions-nt.patch
trac_10963-more_functorial_constructions-graded-modules-fix-nt.patch
trac_10963_doctest_correction-fc.patch
trac_14102-nonsymmetric-macdonald.patch
trac9107_nesting_nested_classes.patch
trac_9107_fix_cross_reference.patch
trac_14685_bug_aorder_lazypowerseries.patch
trac_15150-ncsym-ts.patch
trac_15174-qfold1.patch
trac_15174-final-touches-dg.patch
trac_15174-review-ts.patch
trac14912-unique_doc.patch
trac_14912-comments-dg.patch
trac_14912-more-ts-dg.patch
trac_14912-doc-fixes-sk.patch

I don't think there's anything from the other patches which would cause this to fail... Can someone check this on a later version of Sage?

comment:54 follow-up: Changed 7 years ago by darij

With the git version of beta5, I get:

sage -t src/sage/structure/unique_representation.py
deleting instance for argument 1
    [229 tests, 0.45 s]

Weird but supposedly OK?

Could this be the garbage collector triggering __del__ *after* the doctest? If so, one could easily fix this by explicitly trashing all instances in the doctest.

Last edited 7 years ago by darij (previous) (diff)

comment:55 Changed 7 years ago by jdemeyer

Doctests also pass for me.

comment:56 in reply to: ↑ 54 Changed 7 years ago by SimonKing

Replying to darij:

Could this be the garbage collector triggering __del__ *after* the doctest? If so, one could easily fix this by explicitly trashing all instances in the doctest.

Isn't deletion of this instance supposed to be immediate? After all, Python's cyclic garbage is not involved in this example, since the to-be-deleted objects are not part of any reference cycle.

Weird. Any idea about (1) the reason and (2) how to fix it?

Last edited 7 years ago by SimonKing (previous) (diff)

comment:57 follow-up: Changed 7 years ago by darij

I'm talking about the instance which is *not explicitly* deleted in the doctest, but just left around to dangle.

comment:58 in reply to: ↑ 57 Changed 7 years ago by SimonKing

Replying to darij:

I'm talking about the instance which is *not explicitly* deleted in the doctest, but just left around to dangle.

OK, but I am also talking about the failure reported by travis:

File "unique_representation.py", line 826, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    del O
Expected:
    deleting instance for argument 1
Got:
    <BLANKLINE>
**********************************************************************
File "unique_representation.py", line 828, in sage.structure.unique_representation.CachedRepresentation
Failed example:
    O = SomeClass(1)
Expected:
    creating new instance for argument 1
Got:
    <BLANKLINE>

Here, the instance is explicitly deleted, but nonetheless it stays in cache. No idea why.

comment:59 Changed 7 years ago by tscrim

Hey Darij,

Could you push your branch to trac so I can look at it? I can't seem to import Hg patches currently. This way we can see if it's something that's been taken care of by a beta version bump.

Thanks,
Travis

comment:60 Changed 7 years ago by darij

Hi Travis,

it's on branch u/darij/do-not-fork/14912 now. I don't want to put this into public namespace so as not to confuse this ticket; I am not sure if it's a good idea right now to turn #14912 into a git patch.

Greets,
Darij

comment:61 Changed 7 years ago by tscrim

Hmmm... with the branch, I get the same output. I'm thinking we shouldn't have a print statement displaying like that.

Jeroen, do you also get the print statement when running doctests on the file? I'm building 5.13.beta5 now and can test tonight.

comment:62 Changed 7 years ago by jdemeyer

I get

Doctesting 1 file.
sage -t --long devel/sage/sage/structure/unique_representation.py
deleting instance for argument 1
    [229 tests, 6.90 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 7.1 seconds
    cpu time: 1.1 seconds
    cumulative wall time: 6.9 seconds

Changed 7 years ago by tscrim

comment:63 Changed 7 years ago by tscrim

  • Description modified (diff)

Okay, I fixed the leaking print statement by explicitly deleting the object in the doctest. It's a slight hack fix, but since this is only for testing/illustrative purposed, I think its okay.

For patchbot:

Apply: trac14912-unique_doc.patch, trac_14912-comments-dg.patch, trac_14912-more-ts-dg.patch, trac_14912-doc-fixes-sk.patch, trac_14912-fix_leaking_print-ts.patch​

comment:64 Changed 7 years ago by SimonKing

If the patchbot says that it is fine, then it is fine with me as well. But I am not a reviewer...

comment:65 Changed 7 years ago by tscrim

The patchbot has not come around, but can someone just double-check that with my latest patch, the statement isn't printed anymore after running the doctest? With that I think it's positive review (at least for me). Thanks.

comment:66 Changed 7 years ago by tscrim

ping It's a simple double-check to make sure the print statement isn't shown when running sage -t on the file. That's all I've changed.

comment:67 Changed 7 years ago by darij

The patches apply fine on 6.0rc0, and the print no longer leaks.

But I've got another question on this paragraph:

.. NOTE::

    For technical reasons, it is needed that ``__classcall__`` respectively
    ``__classcall_private__`` are "static methods", i.e., they are callable
    objects that do not bind to an instance or class. For example, a
    :class:`~sage.misc.cachefunc.cached_function` can be used here, because it
    is callable, but does not bind to an instance or class, because it has no
    ``__get__()`` method. A usual Python function, however, has a
    ``__get__()`` method and would thus under normal circumstances bind to an
    instance or class, and thus the instance or class would be passed to the
    function as the first argument. To prevent a callable object from being
    bound to the instance or class, one can prepend the ``@staticmethod``
    decorator to the definition; see :class:`staticmethod`.

    For more on Python's ``__get__()`` method, see:
    http://docs.python.org/2/howto/descriptor.html

In src/sage/groups/perm_gps.py, I see:

    @weak_cached_function
    def __classcall__(cls, *args, **kwds):
        """
        This makes sure that domain is a FiniteEnumeratedSet before it gets passed
        on to the __init__ method.

        EXAMPLES::

            sage: SymmetricGroup(['a','b']).domain() #indirect doctest
            {'a', 'b'}
        """
        domain = kwds.pop('domain', None)
        if domain is not None:
            if domain not in FiniteEnumeratedSets():
                domain = FiniteEnumeratedSet(domain)
            kwds['domain'] = domain
        return super(PermutationGroup_unique, cls).__classcall__(cls, *args, **kwds)

This is NOT decorated with @staticmethod. Isn't this contradictory to the NOTE above?

comment:68 follow-up: Changed 7 years ago by tscrim

There's this line in unique_representation.py:

    @weak_cached_function # automatically a staticmethod
    def __classcall__(cls, *args, **options):

so apparently it's not a contradiction. Although I don't specifically know why it's used (in src/sage/groups/perm_gps/permgroup_named.py) instead of a @staticmethod.

comment:69 Changed 7 years ago by darij

  • Status changed from needs_review to positive_review

Ah, that clears things up; thanks a lot.

Positive_review!!!1111

comment:70 in reply to: ↑ 68 ; follow-up: Changed 7 years ago by SimonKing

Replying to tscrim:

There's this line in unique_representation.py:

    @weak_cached_function # automatically a staticmethod
    def __classcall__(cls, *args, **options):

so apparently it's not a contradiction. Although I don't specifically know why it's used (in src/sage/groups/perm_gps/permgroup_named.py) instead of a @staticmethod.

There is some argument pre-processing: kwds['domain'] might change. And this changed argument is used as a cache key when calling super(PermutationGroup_unique, cls).__classcall__(cls, *args, **kwds). Apparently the author of this __classcall__ found it desirable that the instance is pulled from the cache when the original value of kwds['domain'] is provided next time, without losing time by the preprocessing.

comment:71 in reply to: ↑ 70 Changed 7 years ago by tscrim

Replying to SimonKing:

There is some argument pre-processing: kwds['domain'] might change. And this changed argument is used as a cache key when calling super(PermutationGroup_unique, cls).__classcall__(cls, *args, **kwds). Apparently the author of this __classcall__ found it desirable that the instance is pulled from the cache when the original value of kwds['domain'] is provided next time, without losing time by the preprocessing.

Ah I see. Thanks for clarifying and for your work on this.

Thanks to Darij and Jeroen for help in reviewing.

comment:72 Changed 7 years ago by vbraun

  • Branch set to u/vbraun/unique_rep_tutorial

comment:73 Changed 7 years ago by vbraun

  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.