Opened 12 years ago

Closed 12 years ago

Last modified 6 years ago

#10467 closed enhancement (fixed)

Improve lookup of private attributes

Reported by: Simon King Owned by: tbd
Priority: major Milestone: sage-4.6.2
Component: categories Keywords: private attribute lookup
Cc: Merged in: sage-4.6.2.alpha1
Authors: Simon King Reviewers: Robert Bradshaw
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by Simon King)

By a private attribute, I mean an attribute whose name starts with two underscores and does not end with an underscore. Such an attribute is used, e.g., in the default __repr__ method of Sage objects.

It should be reasonable to assume that private attributes belong to a particular instance, not to its class or its category; after all, they are subject to Python's name mangling.

In particular, if P is a parent/element, then P.__foo is either defined for the instance P or will not be available from P.category().parent_class resp. from P.category().element_class.

It turns out that this assumption holds thorought Sage: When one lets the __getattr__ methods of Parent/Element? immediately raise an AttributeError when a private attribute is requested, then all doc tests still pass.

That is what my patch mainly does. In addition, it removes several occurences of an idiom like

if hasattr(self,'foo'):
    return self.foo

and replaces it with

try:
    return self.foo
except AttributeError:
    pass

which saves half of the computation time when the attribute actually exists.

The advantage is a considerable speedup. Here are two examples (the last is actually a serious computation):

Without the patch:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 74.7 µs per loop
sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 23.09 s, sys: 0.02 s, total: 23.11 s
Wall time: 23.67 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

With the patch:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 40.5 µs per loop
sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 14.43 s, sys: 0.03 s, total: 14.46 s
Wall time: 14.53 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

The patch also adds doctests in Parent.__getattr__, Element.__getattr__ and several methods of SageObject.

Attachments (4)

10467attribute_lookup.patch (11.0 KB) - added by Simon King 12 years ago.
Speedup for attribute lookup of Parents and Elements
10467-attribute-referee.patch (1.3 KB) - added by Robert Bradshaw 12 years ago.
10467-attribute-referee.2.patch (1.4 KB) - added by Simon King 12 years ago.
10467-attribute-referee.3.patch (1.4 KB) - added by Jeroen Demeyer 12 years ago.
Same patch, fixed commit message properly

Download all attachments as: .zip

Change History (26)

comment:1 Changed 12 years ago by Simon King

Description: modified (diff)
Status: newneeds_review

comment:2 Changed 12 years ago by Simon King

On sage-algebra, Nicolas Thiéry wrote:

"Just please make a quick benchmark (and post the result on track) to verify that the extra check does not induce an overhead for usual lookups (probably not)."

Here are two data points.

(1) I compared "sage -tp 4 sage/" before and after applying the patch. It is 1810.3 seconds without the patch, but 1792.5 seconds with the patch. So, on average, it became quicker.

(2) The timings improve if a non-existing private attribute is requested. So, let us test a situation where an existing categorical attribute is requested. For example, the method summation method of the rational field is only defined in its category.

Without the patch, I obtained

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 553 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 563 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 559 ns per loop

With the patch, I obtain

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 530 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 514 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 539 ns per loop

So, that's a situation where it might expect a slight deceleration with the patch, but in fact it is as quick as before.

comment:3 Changed 12 years ago by Simon King

I found yet another way to speed things up!

Typically, in Python one requests an attribute and catches an AttributeError (this is also done in Python's hasattr, IIRC). That happens very often during computations. Hence, it is very important that raising the error happens quickly.

The __getattr__ methods of sage.structure.parent.Parent and sage.structure.element.Element call a Python method sage.structure.parent.raise_attribute_error, that first creates the error message and then raises the AttributeError. Interpreted Python is too slow for that purpose!

Therefore I updated my patch (will post it in a few minutes). Trick: Instead of

def raise_attribute_error(self, name)

it is now

cdef inline raise_attribute_error(self,name)

The effect of this little trick, compared with the benchmarks above, is:

(1) Saves almost 20% compared with the old patch, and more than 50% compared with unpatched Sage:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 33.8 µs per loop

(2) Saves almost 10% compared with the old patch and almost 50% compared with unpatched Sage:

sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 13.30 s, sys: 0.02 s, total: 13.32 s
Wall time: 13.38 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

(3) No significant difference in the third benchmark:

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 526 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 524 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 527 ns per loop

comment:4 Changed 12 years ago by Simon King

The (updated) patch avoids looking up private attributes in the category, avoids some needless "hasattr" calls, and gains additional speed by cythenising "raise_attribute_error".

comment:5 Changed 12 years ago by Simon King

Status: needs_reviewneeds_work
Work issues: Test for raise_attribute_error

I need to further update the patch: Of course, after cythonising raise_attribute_error, the doctest of that function fails, because one needs a Python function calling it.

Changed 12 years ago by Simon King

Attachment: 10467attribute_lookup.patch added

Speedup for attribute lookup of Parents and Elements

comment:6 Changed 12 years ago by Simon King

Status: needs_workneeds_review
Work issues: Test for raise_attribute_error

The patch is updated, the doctests should now pass.

comment:7 Changed 12 years ago by Robert Bradshaw

Status: needs_reviewneeds_work

Mostly looks good. My only request is that in the repr and interface_init functions that an attribute error not be caught over the call into the sub-function, as that is not the intended behavior (and, speaking from first-hand experience, leads to un-intuitive to debug errors, particularly in methods like repr).

Changed 12 years ago by Robert Bradshaw

comment:8 Changed 12 years ago by Robert Bradshaw

Status: needs_workneeds_review

Positive review to everything but my own (tiny) patch.

comment:9 in reply to:  7 Changed 12 years ago by Simon King

Replying to robertwb:

Mostly looks good. My only request is that in the repr and interface_init functions that an attribute error not be caught over the call into the sub-function, ...

You are right, that needs to be changed.

I applied your referee patch and (to be on the safe side) am running doctests. If it works then (if I understood correctly) I am entitled to change it into positive review, inserting your name as reviewer, right?

Best regards,

Simon

comment:10 Changed 12 years ago by Simon King

Reviewers: Robert Bradshaw
Status: needs_reviewpositive_review

All tests pass with the referee patch. Hence, I change it to "positive review" and insert Robert as reviewer.

To the patchbot/release managaer:

Apply 10467attribute_lookup.patch 10467-attribute-referee.patch

Cheers,

Simon

comment:11 Changed 12 years ago by Robert Bradshaw

Yes, that's all kosher. Thanks. BTW, the buildbot would pick up both patches just fine (though it only runs on needs-review tickets).

comment:12 Changed 12 years ago by Jeroen Demeyer

Milestone: sage-4.6.1sage-4.6.2

comment:13 Changed 12 years ago by Simon King

Will the patch be merged, after meanwhile 5 weeks?

comment:14 in reply to:  13 Changed 12 years ago by Jeroen Demeyer

Status: positive_reviewneeds_work

You should give the patch a proper commit message (use hg qrefresh -e for that). Make sure the ticket number appears on the first line.

Changed 12 years ago by Simon King

comment:15 Changed 12 years ago by Simon King

Status: needs_workpositive_review

Apply 10467attribute_lookup.patch 10467-attribute-referee.2.patch

Note: My patch had a proper commit message, and I hope that the referee does not mind that I added a commit message to the referee patch.

Changed 12 years ago by Jeroen Demeyer

Same patch, fixed commit message properly

comment:16 Changed 12 years ago by Jeroen Demeyer

Merged in: sage-4.6.2.alpha1
Resolution: fixed
Status: positive_reviewclosed

comment:17 Changed 6 years ago by Jeroen Demeyer

I'm undoing a large part of this patch of #20686. See that ticket for the rationale.

comment:18 Changed 6 years ago by Nicolas M. Thiéry

Hi, I am sitting next to Jeroen.

__getattr__ as implemented in this ticket makes it so that an error is raised in the following situation:

    sage: C = EuclideanDomains()
    sage: P.<x> = QQ[]
    sage: C.element_class.__foo = 'bar'
    sage: x.parent() in C
    True
    sage: x.__foo

However, when using plain Python inheritance, no error is raised in this equivalent situation:

    sage: C = Semigroups()
    sage: S = C.example()
    sage: S.category() is C
    True
    sage: C.element_class.__foo = 'bar'
    sage: x = S.an_element()
    sage: x.__foo
    'bar'

Given that the role of __getattr__ is to emulate Python inheritance as closely as possible, I agree with Jeroen that the above behavior of raising an error is not desirable. Especially since it has a cost.

Or would there be a use case for this behavior? I don't expect to see private methods in element_class'es; they probably would not work anyway.

Cheers,

Nicolas

Last edited 6 years ago by Jeroen Demeyer (previous) (diff)

comment:19 Changed 6 years ago by Jeroen Demeyer

Just checking that I can post a comment.

comment:20 Changed 6 years ago by Jeroen Demeyer

Keywords: just testing added

comment:21 Changed 6 years ago by Jeroen Demeyer

Component: performancecategories
Keywords: just testing removed

comment:22 Changed 6 years ago by Jeroen Demeyer

I can post here, but not on #20686...

Note: See TracTickets for help on using tickets.