Opened 4 years ago

Closed 4 years ago

#25179 closed enhancement (fixed)

New function to get attribute without binding and use it instead of __func__

Reported by: jdemeyer Owned by:
Priority: major Milestone: sage-8.3
Component: categories Keywords:
Cc: nthiery Merged in:
Authors: Jeroen Demeyer Reviewers: Erik Bray
Report Upstream: N/A Work issues:
Branch: 854eb48 (Commits, GitHub, GitLab) Commit: 854eb48e0fff08c46e7afa1979c89e052569a07c
Dependencies: Stopgaps:

Status badges

Description (last modified by jdemeyer)

There are various instances in Sage of code like

someclass.ParentMethods.method.__func__

This is because we typically want to get the actual function from the ParentMethods body without binding to ParentMethods. This may apply to more general descriptors besides functions.

To solve this, I propose a new function rawattr() which is like getattr() but without binding.

Change History (51)

comment:1 follow-up: Changed 4 years ago by nthiery

Hi Jeroen,

I agree that the idiom is not great :-) (though it makes explicit that something non trivial is happening)

Do you have a way to implement your proposition without having to specify a super class or metaclass in each and every XXXMethods class?

comment:2 in reply to: ↑ 1 Changed 4 years ago by jdemeyer

Replying to nthiery:

Do you have a way to implement your proposition without having to specify a super class or metaclass in each and every XXXMethods class?

I don't think so. My idea would be to replace

class ParentMethods:

by

class ParentMethods(Namespace):

Of course, the current code works so it's not required to do that change everywhere. I would suggest to do the change only when needed.

comment:3 Changed 4 years ago by jdemeyer

  • Branch set to u/jdemeyer/parentmethods__use_a_metaclass_which_does_not_bind_methods

comment:4 Changed 4 years ago by git

  • Commit set to 39bb476c3cbc1a850a40030cf87ec2c5f0924c36

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a674ad0NamespaceMeta for namespace classes
39bb476Use Namespace for ParentMethods/ElementMethods where needed

comment:5 follow-up: Changed 4 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Status changed from new to needs_review

These are the minimal changes needed to get rid of the existing __func__ accesses related to ParentMethods and ElementMethods.

comment:6 follow-up: Changed 4 years ago by nthiery

Thanks for investigating this. I definitely see the point of trying to improve the idiom which expose an implementation detail about bound/unbound methods. However I am reluctant with the current solution. Somehow it's too unlocal and non explicit. The risk is that a developper will see the idiom (without __func__) in the target class, try to replicate it, without noticing that there is something special to be done in the original class.

I would be more in favor of a local and explicit idiom like:

    foo = unbind(Bars.ParentMethods.foo)

or

    foo = get_unbound(Bars.ParentMethod, "foo")

where we would use unbind to hide the implementation details and support any kind of object, not just methods.

What do you think? Opinions anyone else?

comment:7 in reply to: ↑ 6 ; follow-up: Changed 4 years ago by jdemeyer

Replying to nthiery:

where we would use unbind to hide the implementation details and support any kind of object, not just methods.

This wouldn't simply because you cannot "unbind" arbitrary objects. Imagine

class X:
    @property
    def answer(self):
        return 42

There is no way to unbind 42 to the property object X.answer.

get_unbound would work: it's essentially a one-liner to implement that because it's exactly what the C API function _PyType_Lookup does. It gets the bare attribute from a class and it doesn't fancy stuff like __get__ or __getattr__ or metaclass shenanigans.

comment:8 Changed 4 years ago by jdemeyer

I'll try to implement that on a different branch.

comment:9 Changed 4 years ago by git

  • Commit changed from 39bb476c3cbc1a850a40030cf87ec2c5f0924c36 to 3c0b12710cf13f7b777f35c8f7bffa6f1920e10e

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

bf4c33dNew function rawattr to get attributes without binding
3c0b127Use rawattr where needed

comment:10 Changed 4 years ago by jdemeyer

  • Status changed from needs_review to needs_work

comment:11 Changed 4 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from ParentMethods: use a metaclass which does not bind methods to New function to get attribute without binding and use it instead of __func__

comment:12 Changed 4 years ago by git

  • Commit changed from 3c0b12710cf13f7b777f35c8f7bffa6f1920e10e to 680a1c301279809e26925aba9088b8e869069e03

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ffb07a2New function rawattr to get attributes without binding
c182d50Use rawattr where needed
ff1c4d7f1
680a1c3f2

comment:13 Changed 4 years ago by git

  • Commit changed from 680a1c301279809e26925aba9088b8e869069e03 to 8598cc198db762dab37475ca3ac57e2f407dd392

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8165f15New function rawattr to get attributes without binding
8598cc1Use rawattr where needed

comment:14 Changed 4 years ago by jdemeyer

  • Status changed from needs_work to needs_review

comment:15 follow-up: Changed 4 years ago by embray

This is neat, but it seems like extreme overkill for a relatively minor issue. Were there any other use cases for this you had in mind? Because the current use case is handled much more simply by other means.

comment:16 in reply to: ↑ 5 Changed 4 years ago by embray

Replying to jdemeyer:

These are the minimal changes needed to get rid of the existing __func__ accesses related to ParentMethods and ElementMethods.

I'm not sure what you mean by this. While I'm not particularly fond of the solution in #24955 either, it's not clear how this is any better, much less simpler.

comment:17 follow-up: Changed 4 years ago by embray

I'm also -1 on immense mucking around with Python interpreter internals for seemingly little need/benefit. Although I agree it may be a useful capability to have and might be worth proposing upstream.

As a different way of replacing the current idiom for copying methods from another class's ParentMethods, I had also floated the idea in #24955 of a class decorator that does the same thing. This avoids the need for a metaclass or inheritence--just something like @inherit_methods_from(<cls>, [<list of method names>]). Maybe "inherit" is not the best word choice though if we want to avoid allusion to class inheritance.

comment:18 in reply to: ↑ 15 ; follow-up: Changed 4 years ago by jdemeyer

Replying to embray:

Were there any other use cases for this you had in mind?

There are various instances of obj.__dict__[method] in Sage which could be replaced by rawattr(obj, method). But I haven't looked too closely.

comment:19 in reply to: ↑ 17 Changed 4 years ago by jdemeyer

Replying to embray:

something like @inherit_methods_from(<cls>, [<list of method names>]). Maybe "inherit" is not the best word choice though if we want to avoid allusion to class inheritance.

This would only help to implement fake inheritance, it wouldn't help to check for inheritance for example.

What I like about rawattr() is that it's very general: it works both on instances (which would give bound methods) and on classes (which would give unbound methods). Also, it's not specific to methods: it works with arbitrary descriptors (say, lazy_attribute). These are two advantages compared to six.get_unbound_function.

comment:20 in reply to: ↑ 18 ; follow-up: Changed 4 years ago by embray

Replying to jdemeyer:

Replying to embray:

Were there any other use cases for this you had in mind?

There are various instances of obj.__dict__[method] in Sage which could be replaced by rawattr(obj, method). But I haven't looked too closely.

So what's wrong with

def getattr_raw(obj, attr, default=None):
    return obj.__dict__.get(attr, default)

?

(I think the real implementation is a little more complicated than that but not by much). Incidentally I prefer a name that still contains getattr making it clearer that it's like getattr but more direct. The problem with "rawattr" IMO is there's no verb in the name. That's a triviality though.

comment:21 in reply to: ↑ 20 ; follow-up: Changed 4 years ago by jdemeyer

Replying to embray:

So what's wrong with

def getattr_raw(obj, attr, default=None):
    return obj.__dict__.get(attr, default)

?

I'm not sure I understand your question.

If your question is "why don't you simply implement rawattr (whatever the name) like this?", then my answer is:

  1. It doesn't get attributes from subclasses.
  1. It only looks up the attribute in the instance, not in the class.

If your question is "why does rawattr need to support subclasses?", then my answer is: that's what normal attribute lookup also does.

If your question is "why does rawattr look both in the instance and in the class?", then my answer is: it makes it convenient to replace __func__ regardless of whether the method was bound or unbound.

comment:22 Changed 4 years ago by jdemeyer

To put it differently: I wanted rawattr() to behave as much as possible like getattr(), just without binding.

comment:23 follow-up: Changed 4 years ago by nthiery

Hi Erik, Jeroen,

The rawattr approach sounds more explicit, versatile and local than the class decorator.

Nice point to have an idiom analoguous to one that many developers will already know (getattr); it helps build quickly a mental model. So maybe a good name could be something like getattr_unbounded, or get_unbounded_attr.

I don't have a strong opinion for whether it should lookup just the class or also the super classes. In the use cases I met the former is sufficient, but the latter would not hurt. Performance does not matter, since the code will be executed just a couple times when reading the file. The latter hides an implementation detail (where exactly the method is implemented), enabling to move it up when appropriate; that's good. And if we decide to rely on the analogy with getattr, we might as well be as analogous as possible.

Cheers,

Nicolas

comment:24 in reply to: ↑ 7 Changed 4 years ago by nthiery

Replying to jdemeyer:

Replying to nthiery:

where we would use unbind to hide the implementation details and support any kind of object, not just methods.

This wouldn't simply because you cannot "unbind" arbitrary objects.

Ah, right, good point:-) I somehow confused myself with documentation lookup where 'a.foo?' actually gets the object by name and therefore can retrieve the unbound object.

comment:25 in reply to: ↑ 21 ; follow-ups: Changed 4 years ago by embray

Replying to jdemeyer:

If your question is "why does rawattr look both in the instance and in the class?", then my answer is: it makes it convenient to replace __func__ regardless of whether the method was bound or unbound.

I don't understand this statement. Maybe you meant a different question? If you want to get the function from a method (bound or unbound) that doesn't have anything to do with the class afaik.

I agree with the usefulness of such a utility, and I also agree that if we have it then it's a good solution for copying a method from one class to another. I would still prefer a different way to do that is some kind of more explicit pseudo-inheritance. But just as a straightforward technical solution to "I want to get some attribute off an object without going through descriptors" I agree this is useful.

I just don't know why the implementation has to be so complicated. My point above was that a utility like that can be written just as easily in pure Python (and has been--there's plenty of precedence for it). Slightly slower, maybe, but I'd want to actually see a performance critical application to weigh against the complexity of the implementation (I'd be less bothered if it weren't messing with apparently undocumented internals).

comment:26 in reply to: ↑ 23 ; follow-up: Changed 4 years ago by embray

Replying to nthiery:

Hi Erik, Jeroen,

The rawattr approach sounds more explicit, versatile and local than the class decorator.

Nice point to have an idiom analoguous to one that many developers will already know (getattr); it helps build quickly a mental model. So maybe a good name could be something like getattr_unbounded, or get_unbounded_attr.

I don't like anything referring to "bound" since that's very specific to methods. It could also apply to other descriptors (e.g. I have written descriptors that are "bound" to a specific instance of a class when accessed through the instance). But that's just one pattern for descriptors and isn't necessarily applicable in all cases.

I can't recall where, but I've seen this spelled raw_getattr() before. Similarly, Sphinx has a safe_getattr(), but that serves a different purpose.

comment:27 in reply to: ↑ 26 Changed 4 years ago by embray

Replying to embray:

Replying to nthiery:

Hi Erik, Jeroen,

The rawattr approach sounds more explicit, versatile and local than the class decorator.

Nice point to have an idiom analoguous to one that many developers will already know (getattr); it helps build quickly a mental model. So maybe a good name could be something like getattr_unbounded, or get_unbounded_attr.

I don't like anything referring to "bound" since that's very specific to methods. It could also apply to other descriptors (e.g. I have written descriptors that are "bound" to a specific instance of a class when accessed through the instance). But that's just one pattern for descriptors and isn't necessarily applicable in all cases.

This is partly why I'm also confused about how this function is documented:

+    Like ``getattr(obj, name)`` but without binding the attribute to
+    the object. This can be used to easily get unbound methods or other
+    descriptors.

There's nothing about getattr() that has to do with "binding the attribute to the object". That wording is really confusing to me and makes me not understand what this is supposed to do. It's somewhat applicable to methods but not to all attributes by any means.

comment:28 Changed 4 years ago by embray

I think I know what you mean now--I think you're just referring general passing of the instance to the descriptor's __get__ as a "binding" operation. That makes more sense, but could be worded more clearly.

comment:29 follow-up: Changed 4 years ago by embray

I think my problem just comes down to how I read "binding the attribute to the object" (and maybe it's my reading that's wrong). I think I would have worded this just subtly differently like

"Like getattr() but without invoking the binding behavior of descriptors under normal attribute access."

Last edited 4 years ago by embray (previous) (diff)

comment:30 in reply to: ↑ 29 Changed 4 years ago by nthiery

Replying to embray:

I think my problem just comes down to how I read "binding the attribute to the object" (and maybe it's my reading that's wrong).

I guess my own reading is influenced by the first line of the Descriptor How To:

    In general, a descriptor is an object attribute with “binding behavior”

No idea what the average reading will be though.

I think I would have worded this just subtly differently like

"Like getattr() but without invoking the binding behavior of descriptors under normal attribute access."

Sounds good to me!

comment:31 in reply to: ↑ 25 Changed 4 years ago by jdemeyer

Replying to embray:

Replying to jdemeyer:

If your question is "why does rawattr look both in the instance and in the class?", then my answer is: it makes it convenient to replace __func__ regardless of whether the method was bound or unbound.

I don't understand this statement.

Suppose X is a class. I meant that we should support both rawattr(X, name) (getting the attribute from X itself) and rawattr(X(), name) (getting the attribute from X().__class__)

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:32 in reply to: ↑ 25 Changed 4 years ago by jdemeyer

Replying to embray:

I just don't know why the implementation has to be so complicated.

I don't think that the implementation is complicated at all. I actually think it would be more complicated in pure Python since pure Python has no equivalent for _PyType_Lookup.

Also note that most of the complications come from the support for old-style instances and classes. Once you remove that, it's basically three lines of code.

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:33 in reply to: ↑ 25 ; follow-up: Changed 4 years ago by jdemeyer

Replying to embray:

I just don't know why the implementation has to be so complicated.

Do you mean that the functionality of raw_getattr() should be simplified (that would make it less analogous to getattr) or that the implementation of that functionality can be simplified? Or is it just that you consider it complicated only because it's implemented in C?

comment:34 Changed 4 years ago by git

  • Commit changed from 8598cc198db762dab37475ca3ac57e2f407dd392 to 854eb48e0fff08c46e7afa1979c89e052569a07c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b605557New function raw_getattr to get attributes without binding
854eb48Use raw_getattr where needed

comment:35 Changed 4 years ago by jdemeyer

Renamed rawattr -> raw_getattr

comment:36 in reply to: ↑ 33 ; follow-up: Changed 4 years ago by embray

Replying to jdemeyer:

Replying to embray:

I just don't know why the implementation has to be so complicated.

Do you mean that the functionality of raw_getattr() should be simplified (that would make it less analogous to getattr) or that the implementation of that functionality can be simplified? Or is it just that you consider it complicated only because it's implemented in C?

Sorry, yes, I should clarify "complicated". In part I mean that it's implemented in C, though I don't believe that alone makes it complicated. I'm more concerned about the over-reliance on undocumented internals.

I could just as easily write this function in pure Python, and I would be interested in a performance comparison between a pure python version and this C version, but I could try doing that myself.

To be clear I'm otherwise +1 on this--that's my only misgiving at this point.

comment:37 Changed 4 years ago by tscrim

FYI - I have a new use-case for this on #25146 where I want to treat functions as class-level attributes, but Python by default binds them as methods.

comment:38 in reply to: ↑ 36 Changed 4 years ago by jdemeyer

Replying to embray:

I'm more concerned about the over-reliance on undocumented internals.

Well, the internals of old-style instances and classes aren't likely going to change in Python 2.7.16. And _PyType_Lookup and _PyObject_GetDictPtr are indeed private, but pretty stable: they have existed for a long time and still do in Python master.

comment:39 Changed 4 years ago by jdemeyer

I am starting to think that a super-like syntax might be better: something like namespace(obj).attr instead of raw_getattr(obj, "attr").

Advantages:

  1. It allows specifying the attribute name in a more natural way, not as string.
  1. namespace(obj) is an object which can be created once (and even stored or passed around) in case that you need to access many raw attributes of the same object.
  1. Usable as class decorator to create namespace "classes" (whether this would still be a class or not remains to be seen).

The main disadvantage is that it would be more complicated to implement and slower.

comment:40 follow-up: Changed 4 years ago by embray

For Python 3.3+ (and I think for practical purposes we're not interested in Python versions below 3.3) it looks like you want to use PyObject_GenericGetDict. This does practically the same thing you're doing with _PyObject_GetDictPtr but it also correctly implements the semantics of PEP-412.

For Python 2.7 I agree it should be safe-enough since it's not going to change.

As far as _PyType_Lookup is concerned, while I'm still wary of using any function that's not part of the limited API, I see now that we were already using it anyways so I'm not going to bother about it here, and I don't really see a better alternative--ISTM this might be a good candidate for adding to the documented API since it is important for implementing custom attribute access on extension types.

comment:41 Changed 4 years ago by embray

I like the idea of the namespace type, but it's also useful to have the stand-alone raw_getattr. namespace would then just be built on top of that.

comment:42 in reply to: ↑ 40 Changed 4 years ago by jdemeyer

Replying to embray:

for practical purposes we're not interested in Python versions below 3.3

Did you forget that 2.7 is still the default and only supported version?

EDIT: I guess you meant 3.x with x < 3.

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:43 Changed 4 years ago by embray

Yes, Python 3 versions below 3.3 (and really 3.6, but I'm thinking about what Python 3 versions are still in some distros...)

comment:44 Changed 4 years ago by jdemeyer

I don't think that there is anything wrong with using _PyObject_GetDictPtr. It does exactly what I want. I don't need to create the dict when it's not there and I don't need exceptions thrown when the object does not support __dict__.

And it only works on Python 3, so it would only complicate the code.

If you insist, I will use it though.

comment:45 follow-ups: Changed 4 years ago by embray

No, it's okay. I just think it goes without saying that internal functions should be avoided. This whole thing could be written in pure Python and it seems like an overwrought pre-mature optimization not to. OTOH I feel like it's a fault on the Python side that a) this function doesn't just come built-in--sometimes it is useful, especially for introspection, to bypass some of the high-level attribute machinery, and b) that this can't currently be easily implemented by third-party C code without reliance on internal functions. Which in turn gives more credence to the idea that this should be a built-in.

I think I would like to propose that to python-ideas if you don't mind, and use your code as an example.

comment:46 in reply to: ↑ 45 Changed 4 years ago by jdemeyer

Replying to embray:

This whole thing could be written in pure Python

Yes, it could be written in Python. It could also be written in C.

You seem to find it a problem that it's implemented in C but I don't understand why. I would argue that the C implementation is conceptually simpler than the potential Python implementation because Python doesn't have the equivalent of _PyType_Lookup.

comment:47 in reply to: ↑ 45 Changed 4 years ago by jdemeyer

Replying to embray:

I think I would like to propose that to python-ideas if you don't mind, and use your code as an example.

Feel free. But do remove the Python 2 stuff if you quote my code.

comment:48 Changed 4 years ago by embray

Python is practically pseudocode IMO. It's almost always conceptually simpler than C (especially if you're also relying on undocumented internals to do something).

comment:49 Changed 4 years ago by embray

  • Reviewers set to Erik Bray
  • Status changed from needs_review to positive_review

comment:50 Changed 4 years ago by davidloeffler

  • Milestone changed from sage-8.2 to sage-8.3

comment:51 Changed 4 years ago by vbraun

  • Branch changed from u/jdemeyer/parentmethods__use_a_metaclass_which_does_not_bind_methods to 854eb48e0fff08c46e7afa1979c89e052569a07c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.