Opened 9 years ago

Closed 8 years ago

Last modified 3 years ago

#10467 closed enhancement (fixed)

Improve lookup of private attributes

Reported by: SimonKing Owned by: tbd
Priority: major Milestone: sage-4.6.2
Component: categories Keywords: private attribute lookup
Cc: Merged in: sage-4.6.2.alpha1
Authors: Simon King Reviewers: Robert Bradshaw
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by SimonKing)

By a private attribute, I mean an attribute whose name starts with two underscores and does not end with an underscore. Such an attribute is used, e.g., in the default __repr__ method of Sage objects.

It should be reasonable to assume that private attributes belong to a particular instance, not to its class or its category; after all, they are subject to Python's name mangling.

In particular, if P is a parent/element, then P.__foo is either defined for the instance P or will not be available from P.category().parent_class resp. from P.category().element_class.

It turns out that this assumption holds thorought Sage: When one lets the __getattr__ methods of Parent/Element? immediately raise an AttributeError when a private attribute is requested, then all doc tests still pass.

That is what my patch mainly does. In addition, it removes several occurences of an idiom like

if hasattr(self,'foo'):
    return self.foo

and replaces it with

try:
    return self.foo
except AttributeError:
    pass

which saves half of the computation time when the attribute actually exists.

The advantage is a considerable speedup. Here are two examples (the last is actually a serious computation):

Without the patch:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 74.7 µs per loop
sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 23.09 s, sys: 0.02 s, total: 23.11 s
Wall time: 23.67 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

With the patch:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 40.5 µs per loop
sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 14.43 s, sys: 0.03 s, total: 14.46 s
Wall time: 14.53 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

The patch also adds doctests in Parent.__getattr__, Element.__getattr__ and several methods of SageObject.

Attachments (4)

10467attribute_lookup.patch (11.0 KB) - added by SimonKing 9 years ago.
Speedup for attribute lookup of Parents and Elements
10467-attribute-referee.patch (1.3 KB) - added by robertwb 9 years ago.
10467-attribute-referee.2.patch (1.4 KB) - added by SimonKing 8 years ago.
10467-attribute-referee.3.patch (1.4 KB) - added by jdemeyer 8 years ago.
Same patch, fixed commit message properly

Download all attachments as: .zip

Change History (26)

comment:1 Changed 9 years ago by SimonKing

  • Description modified (diff)
  • Status changed from new to needs_review

comment:2 Changed 9 years ago by SimonKing

On sage-algebra, Nicolas Thiéry wrote:

"Just please make a quick benchmark (and post the result on track) to verify that the extra check does not induce an overhead for usual lookups (probably not)."

Here are two data points.

(1) I compared "sage -tp 4 sage/" before and after applying the patch. It is 1810.3 seconds without the patch, but 1792.5 seconds with the patch. So, on average, it became quicker.

(2) The timings improve if a non-existing private attribute is requested. So, let us test a situation where an existing categorical attribute is requested. For example, the method summation method of the rational field is only defined in its category.

Without the patch, I obtained

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 553 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 563 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 559 ns per loop

With the patch, I obtain

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 530 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 514 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 539 ns per loop

So, that's a situation where it might expect a slight deceleration with the patch, but in fact it is as quick as before.

comment:3 Changed 9 years ago by SimonKing

I found yet another way to speed things up!

Typically, in Python one requests an attribute and catches an AttributeError (this is also done in Python's hasattr, IIRC). That happens very often during computations. Hence, it is very important that raising the error happens quickly.

The __getattr__ methods of sage.structure.parent.Parent and sage.structure.element.Element call a Python method sage.structure.parent.raise_attribute_error, that first creates the error message and then raises the AttributeError. Interpreted Python is too slow for that purpose!

Therefore I updated my patch (will post it in a few minutes). Trick: Instead of

def raise_attribute_error(self, name)

it is now

cdef inline raise_attribute_error(self,name)

The effect of this little trick, compared with the benchmarks above, is:

(1) Saves almost 20% compared with the old patch, and more than 50% compared with unpatched Sage:

sage: P.<x> = QQ[]
sage: timeit('a=repr(x)')
625 loops, best of 3: 33.8 µs per loop

(2) Saves almost 10% compared with the old patch and almost 50% compared with unpatched Sage:

sage: R.<x,y> = InfinitePolynomialRing(QQ)
sage: I = R.ideal([x[1]^2+y[2]*y[3], x[2]*y[1]*x[3]-y[1]*y[2]])
sage: %time I.groebner_basis()
CPU times: user 13.30 s, sys: 0.02 s, total: 13.32 s
Wall time: 13.38 s
[y_2*y_1^3 + y_2*y_1^2, y_2^2*y_1 - y_2*y_1^2, y_3*y_1 - y_2*y_1, x_1*y_2*y_1^2 + x_1*y_2*y_1, x_1^2 + y_2*y_1, x_2*y_2*y_1 - x_1*y_2*y_1, x_2*x_1*y_3 - y_2*y_1, x_3*y_2*y_1 - x_1*y_2*y_1, x_3*x_1*y_2 - y_2*y_1, x_3*x_2*y_1 - y_2*y_1]

(3) No significant difference in the third benchmark:

sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 526 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 524 ns per loop
sage: timeit('a=QQ.summation',number=50000)
50000 loops, best of 3: 527 ns per loop

comment:4 Changed 9 years ago by SimonKing

The (updated) patch avoids looking up private attributes in the category, avoids some needless "hasattr" calls, and gains additional speed by cythenising "raise_attribute_error".

comment:5 Changed 9 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Test for raise_attribute_error

I need to further update the patch: Of course, after cythonising raise_attribute_error, the doctest of that function fails, because one needs a Python function calling it.

Changed 9 years ago by SimonKing

Speedup for attribute lookup of Parents and Elements

comment:6 Changed 9 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues Test for raise_attribute_error deleted

The patch is updated, the doctests should now pass.

comment:7 follow-up: Changed 9 years ago by robertwb

  • Status changed from needs_review to needs_work

Mostly looks good. My only request is that in the repr and interface_init functions that an attribute error not be caught over the call into the sub-function, as that is not the intended behavior (and, speaking from first-hand experience, leads to un-intuitive to debug errors, particularly in methods like repr).

Changed 9 years ago by robertwb

comment:8 Changed 9 years ago by robertwb

  • Status changed from needs_work to needs_review

Positive review to everything but my own (tiny) patch.

comment:9 in reply to: ↑ 7 Changed 9 years ago by SimonKing

Replying to robertwb:

Mostly looks good. My only request is that in the repr and interface_init functions that an attribute error not be caught over the call into the sub-function, ...

You are right, that needs to be changed.

I applied your referee patch and (to be on the safe side) am running doctests. If it works then (if I understood correctly) I am entitled to change it into positive review, inserting your name as reviewer, right?

Best regards,

Simon

comment:10 Changed 9 years ago by SimonKing

  • Reviewers set to Robert Bradshaw
  • Status changed from needs_review to positive_review

All tests pass with the referee patch. Hence, I change it to "positive review" and insert Robert as reviewer.

To the patchbot/release managaer:

Apply 10467attribute_lookup.patch 10467-attribute-referee.patch

Cheers,

Simon

comment:11 Changed 9 years ago by robertwb

Yes, that's all kosher. Thanks. BTW, the buildbot would pick up both patches just fine (though it only runs on needs-review tickets).

comment:12 Changed 9 years ago by jdemeyer

  • Milestone changed from sage-4.6.1 to sage-4.6.2

comment:13 follow-up: Changed 8 years ago by SimonKing

Will the patch be merged, after meanwhile 5 weeks?

comment:14 in reply to: ↑ 13 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_work

You should give the patch a proper commit message (use hg qrefresh -e for that). Make sure the ticket number appears on the first line.

Changed 8 years ago by SimonKing

comment:15 Changed 8 years ago by SimonKing

  • Status changed from needs_work to positive_review

Apply 10467attribute_lookup.patch 10467-attribute-referee.2.patch

Note: My patch had a proper commit message, and I hope that the referee does not mind that I added a commit message to the referee patch.

Changed 8 years ago by jdemeyer

Same patch, fixed commit message properly

comment:16 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.6.2.alpha1
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:17 Changed 3 years ago by jdemeyer

I'm undoing a large part of this patch of #20686. See that ticket for the rationale.

comment:18 Changed 3 years ago by nthiery

Hi, I am sitting next to Jeroen.

__getattr__ as implemented in this ticket makes it so that an error is raised in the following situation:

    sage: C = EuclideanDomains()
    sage: P.<x> = QQ[]
    sage: C.element_class.__foo = 'bar'
    sage: x.parent() in C
    True
    sage: x.__foo

However, when using plain Python inheritance, no error is raised in this equivalent situation:

    sage: C = Semigroups()
    sage: S = C.example()
    sage: S.category() is C
    True
    sage: C.element_class.__foo = 'bar'
    sage: x = S.an_element()
    sage: x.__foo
    'bar'

Given that the role of __getattr__ is to emulate Python inheritance as closely as possible, I agree with Jeroen that the above behavior of raising an error is not desirable. Especially since it has a cost.

Or would there be a use case for this behavior? I don't expect to see private methods in element_class'es; they probably would not work anyway.

Cheers,

Nicolas

Last edited 3 years ago by jdemeyer (previous) (diff)

comment:19 Changed 3 years ago by jdemeyer

Just checking that I can post a comment.

comment:20 Changed 3 years ago by jdemeyer

  • Keywords just testing added

comment:21 Changed 3 years ago by jdemeyer

  • Component changed from performance to categories
  • Keywords just testing removed

comment:22 Changed 3 years ago by jdemeyer

I can post here, but not on #20686...

Note: See TracTickets for help on using tickets.