Opened 10 years ago

Closed 4 years ago

# Optimize zero test in dict_addition.pyx

Reported by: Owned by: hivert hivert major sage-duplicate/invalid/wontfix combinatorics stumpc5, sage-combinat N/A #12510

### Description

The code currently says:

```        a = D[iter(D).next()]
if hasattr(a,'parent') and hasattr(a.parent(),'zero'):
zero = a.parent().zero()
else:
zero = 0
for_removal = [key for key in D if D[key] == zero]
for key in for_removal:
del D[key]
```

If it's robust, using `bool(x)` that is `__nonzero__` could be faster.

### comment:1 follow-up: ↓ 2 Changed 10 years ago by stumpc5

did you do some tests?

Within Sage, I get

sage: D = dict( (i,i) for i in range(100000) ) sage: zero = 0 sage: %timeit for_removal = [key for key in D if D[key] == zero] 25 loops, best of 3: 23.2 ms per loop sage: %timeit for_removal = [key for key in D if D[key].nonzero() ] 25 loops, best of 3: 36.8 ms per loop

But calling nonzero in python is much slower than is cython, isn't it?

### comment:2 in reply to: ↑ 1 ; follow-up: ↓ 3 Changed 10 years ago by hivert

did you do some tests?

Actually not Yet... However:

```sage: D = dict( (i,i) for i in srange(100000) )
sage: zero = 0
sage: %timeit for_removal = [key for key in D if D[key] == zero]
25 loops, best of 3: 15 ms per loop
sage: %timeit for_removal = [key for key in D if D[key].__nonzero__() ]
25 loops, best of 3: 27.5 ms per loop
sage: %timeit for_removal = [key for key in D if not D[key] ]
25 loops, best of 3: 8.78 ms per loop
```

Two remarks:

• Using sage integer (ie `srange`) is more realistic than python `int`.
• I actually meant `bool(D[key])` which calls implicitely `__nonzero__` in an optimized way, rather than calling explicitely it.

Of course, we still have to check how it behave in Cython.

Cheers,

Florent

### comment:3 in reply to: ↑ 2 Changed 10 years ago by hivert

• I actually meant `bool(D[key])` which calls implicitely `__nonzero__` in an optimized way, rather than calling explicitely it.

If you don't know it, here is what I call by optimized way:

```sage: class bla(object):
....:      def __nonzero__(self):
....:           print "bla.__nonzero__ called"
....:           return True
....:
sage: b = bla()
sage: if b: print "ok"
....:
bla.__nonzero__ called
ok
```

But:

```sage: b.__nonzero__ = lambda self: False
sage: if b: print "ok"
....:
bla.__nonzero__ called
ok
```

So the lookup in `b.__dict__` is bypassed (see special method lookup), which allows for much faster non zero test than calling `__nonzero__` explicitely. I'm pretty sure Cython optimize that too.

### comment:4 follow-up: ↓ 5 Changed 10 years ago by hivert

• Owner changed from sage-combinat to hivert

By the way, I didn't put you in cc because I wanted you to fix that. I thought you might be interested in those kind of optimizations. If you want to fix it, please tell me, otherwise I'll do it myself.

### comment:5 in reply to: ↑ 4 ; follow-up: ↓ 6 Changed 10 years ago by stumpc5

By the way, I didn't put you in cc because I wanted you to fix that. I thought you might be interested in those kind of optimizations. If you want to fix it, please tell me, otherwise I'll do it myself.

I will look at the special method lookup these days, but go ahead providing a patch if you want (can we always be certain that bool( D[key] ) will indeed return what we want it to return? ).

### comment:6 in reply to: ↑ 5 ; follow-up: ↓ 7 Changed 10 years ago by hivert

• Dependencies set to #12510

can we always be certain that bool( D[key] ) will indeed return what we want it to return?

It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category `FreeModules()`. Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

### comment:7 in reply to: ↑ 6 ; follow-up: ↓ 8 Changed 10 years ago by stumpc5

Replying to stumpc5: It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category `FreeModules()`. Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

Things are getting forward, that's great to hear!

### comment:8 in reply to: ↑ 7 Changed 9 years ago by stumpc5

Replying to stumpc5: It's Python's specification, so it definitely should return what we want. Now the best way to be sure that it does is to add it in the generic tests for object in the category `FreeModules()`. Since now, thanks to Simon all rings conform with categories, adding a test here should catch all problems.

Things are getting forward, that's great to hear!

I just saw this patch in my list, so I give it a ping to see if this was solved somewhere else, or if it is still to be done (most likely by you, Florent ?)...

### comment:9 Changed 8 years ago by jdemeyer

• Milestone changed from sage-5.11 to sage-5.12

### comment:10 Changed 8 years ago by vbraun_spam

• Milestone changed from sage-6.1 to sage-6.2

### comment:11 Changed 8 years ago by vbraun_spam

• Milestone changed from sage-6.2 to sage-6.3

### comment:12 Changed 7 years ago by vbraun_spam

• Milestone changed from sage-6.3 to sage-6.4

### comment:13 Changed 4 years ago by chapoton

• Milestone changed from sage-6.4 to sage-duplicate/invalid/wontfix
• Status changed from new to needs_review
• `dict_addition.pyx` was deprecated in #20680 in favor of blas.sum
• there one finds `src/sage/data_structures/blas_dict.pyx: for_removal = [key for key in result if not result[key]]`

So let us close this ticket now.

### comment:14 Changed 4 years ago by jdemeyer

• Resolution set to invalid
• Status changed from needs_review to closed
Note: See TracTickets for help on using tickets.