Opened 10 years ago

Closed 4 years ago

#12508 closed enhancement (invalid)

Optimize zero test in dict_addition.pyx

Reported by: hivert Owned by: hivert
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: combinatorics Keywords:
Cc: stumpc5, sage-combinat Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #12510 Stopgaps:

Status badges

Description

The code currently says:

        a = D[iter(D).next()]
        if hasattr(a,'parent') and hasattr(a.parent(),'zero'):
            zero = a.parent().zero()
        else:
            zero = 0
    for_removal = [key for key in D if D[key] == zero]
    for key in for_removal:
        del D[key]    

If it's robust, using bool(x) that is __nonzero__ could be faster.

Change History (14)

comment:1 follow-up: Changed 10 years ago by stumpc5

did you do some tests?

Within Sage, I get

sage: D = dict( (i,i) for i in range(100000) ) sage: zero = 0 sage: %timeit for_removal = [key for key in D if D[key] == zero] 25 loops, best of 3: 23.2 ms per loop sage: %timeit for_removal = [key for key in D if D[key].nonzero() ] 25 loops, best of 3: 36.8 ms per loop

But calling nonzero in python is much slower than is cython, isn't it?

comment:2 in reply to: ↑ 1 ; follow-up: Changed 10 years ago by hivert

Replying to stumpc5:

did you do some tests?

Actually not Yet... However:

sage: D = dict( (i,i) for i in srange(100000) )
sage: zero = 0
sage: %timeit for_removal = [key for key in D if D[key] == zero]
25 loops, best of 3: 15 ms per loop
sage: %timeit for_removal = [key for key in D if D[key].__nonzero__() ]
25 loops, best of 3: 27.5 ms per loop
sage: %timeit for_removal = [key for key in D if not D[key] ]
25 loops, best of 3: 8.78 ms per loop

Two remarks:

  • Using sage integer (ie srange) is more realistic than python int.
  • I actually meant bool(D[key]) which calls implicitely __nonzero__ in an optimized way, rather than calling explicitely it.

Of course, we still have to check how it behave in Cython.

Cheers,

Florent

comment:3 in reply to: ↑ 2 Changed 10 years ago by hivert

  • I actually meant bool(D[key]) which calls implicitely __nonzero__ in an optimized way, rather than calling explicitely it.

If you don't know it, here is what I call by optimized way:

sage: class bla(object):
....:      def __nonzero__(self): 
....:           print "bla.__nonzero__ called"
....:           return True
....: 
sage: b = bla()
sage: if b: print "ok"
....: 
bla.__nonzero__ called
ok

But:

sage: b.__nonzero__ = lambda self: False
sage: if b: print "ok"
....: 
bla.__nonzero__ called
ok

So the lookup in b.__dict__ is bypassed (see special method lookup), which allows for much faster non zero test than calling __nonzero__ explicitely. I'm pretty sure Cython optimize that too.

comment:4 follow-up: Changed 10 years ago by hivert

  • Owner changed from sage-combinat to hivert

By the way, I didn't put you in cc because I wanted you to fix that. I thought you might be interested in those kind of optimizations. If you want to fix it, please tell me, otherwise I'll do it myself.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 10 years ago by stumpc5

Replying to hivert:

By the way, I didn't put you in cc because I wanted you to fix that. I thought you might be interested in those kind of optimizations. If you want to fix it, please tell me, otherwise I'll do it myself.

I will look at the special method lookup these days, but go ahead providing a patch if you want (can we always be certain that bool( D[key] ) will indeed return what we want it to return? ).

comment:6 in reply to: ↑ 5 ; follow-up: Changed 10 years ago by hivert

  • Dependencies set to #12510

Replying to stumpc5:

can we always be certain that bool( D[key] ) will indeed return what we want it to return?

It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category FreeModules(). Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 10 years ago by stumpc5

Replying to hivert:

Replying to stumpc5: It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category FreeModules(). Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

Things are getting forward, that's great to hear!

comment:8 in reply to: ↑ 7 Changed 9 years ago by stumpc5

Replying to stumpc5:

Replying to hivert:

Replying to stumpc5: It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category FreeModules(). Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

Things are getting forward, that's great to hear!

I just saw this patch in my list, so I give it a ping to see if this was solved somewhere else, or if it is still to be done (most likely by you, Florent ?)...

comment:9 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:10 Changed 8 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:11 Changed 8 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:12 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:13 Changed 4 years ago by chapoton

  • Milestone changed from sage-6.4 to sage-duplicate/invalid/wontfix
  • Status changed from new to needs_review
  • dict_addition.pyx was deprecated in #20680 in favor of blas.sum
  • there one finds src/sage/data_structures/blas_dict.pyx: for_removal = [key for key in result if not result[key]]

So let us close this ticket now.

comment:14 Changed 4 years ago by jdemeyer

  • Resolution set to invalid
  • Status changed from needs_review to closed
Note: See TracTickets for help on using tickets.