Opened 5 years ago

# Performance regression in span() for vectors over GF(2)

Reported by: Owned by: dominic.else major sage-8.1 performance span, Vector_mod2_dense, IntegerMod_int SimonKing, jdemeyer N/A u/dominic.else/performance_regression_in_span___for_vectors_over_gf_2_ 403ece1d71d97f4eaf2c5ae2ff948c3aa31d1d79

### Description

Calculating the span of a large number of vectors in a high-dimension vector space over GF(2) has gotten a lot slower in recent versions of sage. This can be demonstrated by the following code:

```import time

set_random_seed(0)
n = 1000
v = [ random_vector(GF(2), n) for i in xrange(n) ]

c1 = time.clock()
span(v, GF(2))
print time.clock()-c1
```

In Sage 7.4, this took around ~0.7s (on my laptop), whereas in Sage 8.0, it takes ~12s. This performance regression was introduced by #21746. The reason is that the new logic in `Vector_mod2_dense.__init__()` ends up calculating the modulus of an IntegerMod_int object rather than a native C int, which turns out to be a lot slower. Profiling the above code reveals that in Sage 8.0 it indeed spends most of its time in `IntegerMod_int.__mod__`.

### comment:1 Changed 5 years ago by dominic.else

• Branch set to u/dominic.else/performance_regression_in_span___for_vectors_over_gf_2_

### comment:2 Changed 5 years ago by dominic.else

• Commit set to 403ece1d71d97f4eaf2c5ae2ff948c3aa31d1d79

The fix I uploaded sidesteps the problem by converting an IntegerMod_int object to a Python int first before taking the modulus, restoring the original performance. A better solution might be to make `IntegerMod_int.__mod__()` faster.

New commits:

 ​403ece1 `Fix performance regression introduced by #21746`

### comment:5 Changed 5 years ago by SimonKing

```    def __mod__(self, modulus):
if not isinstance(self, IntegerMod_abstract):
# something % Mod(x,y) makes no sense
return NotImplemented
from .integer_mod_ring import IntegerModRing
R = IntegerModRing(modulus)
if (<Element>self)._parent._IntegerModRing_generic__order % R.order():
raise ArithmeticError(f"reduction modulo {modulus!r} not defined")
return R(self)
```

The import should not be done inside of the method. Is `R.order()` not the same as `modulus`? But in any case, I do think that using the modulus of a native C int should be used.

### comment:6 Changed 5 years ago by SimonKing

Note that the semantic is different:

```sage: x = GF(3)(2)
sage: type(x)
<type 'sage.rings.finite_rings.integer_mod.IntegerMod_int'>
sage: x
2
sage: x%2
---------------------------------------------------------------------------
ArithmeticError                           Traceback (most recent call last)
<ipython-input-4-d92b2f50ba44> in <module>()
----> 1 x%Integer(2)

/home/king/Sage/git/sage/src/sage/rings/finite_rings/integer_mod.pyx in sage.rings.finite_rings.integer_mod.IntegerMod_abstract.__mod__ (build/cythonized/sage/rings/finite_rings/integer_mod.c:6386)()
392         R = IntegerModRing(modulus)
393         if (<Element>self)._parent._IntegerModRing_generic__order % R.order():
--> 394             raise ArithmeticError(f"reduction modulo {modulus!r} not defined")
395         return R(self)
396

ArithmeticError: reduction modulo 2 not defined
sage: int(x)%2
0
```

Note that the error does not depend on the specific element x, but only depends on the parent of x. Is it guaranteed in the vector code that all elements belong to the same parent? If they do, then the check on the parent should of course be done only once per vector, not once per element.

### comment:7 Changed 5 years ago by tscrim

They should all belong to the same parent.

Note: See TracTickets for help on using tickets.