Opened 5 years ago

Last modified 5 years ago

#23746 new defect

Performance regression in span() for vectors over GF(2)

Reported by: dominic.else Owned by:
Priority: major Milestone: sage-8.1
Component: performance Keywords: span, Vector_mod2_dense, IntegerMod_int
Cc: SimonKing, jdemeyer Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: u/dominic.else/performance_regression_in_span___for_vectors_over_gf_2_ (Commits, GitHub, GitLab) Commit: 403ece1d71d97f4eaf2c5ae2ff948c3aa31d1d79
Dependencies: Stopgaps:

Status badges


Calculating the span of a large number of vectors in a high-dimension vector space over GF(2) has gotten a lot slower in recent versions of sage. This can be demonstrated by the following code:

import time

n = 1000
v = [ random_vector(GF(2), n) for i in xrange(n) ]

c1 = time.clock()
span(v, GF(2))
print time.clock()-c1

In Sage 7.4, this took around ~0.7s (on my laptop), whereas in Sage 8.0, it takes ~12s. This performance regression was introduced by #21746. The reason is that the new logic in Vector_mod2_dense.__init__() ends up calculating the modulus of an IntegerMod_int object rather than a native C int, which turns out to be a lot slower. Profiling the above code reveals that in Sage 8.0 it indeed spends most of its time in IntegerMod_int.__mod__.

Change History (7)

comment:1 Changed 5 years ago by dominic.else

  • Branch set to u/dominic.else/performance_regression_in_span___for_vectors_over_gf_2_

comment:2 Changed 5 years ago by dominic.else

  • Commit set to 403ece1d71d97f4eaf2c5ae2ff948c3aa31d1d79

The fix I uploaded sidesteps the problem by converting an IntegerMod_int object to a Python int first before taking the modulus, restoring the original performance. A better solution might be to make IntegerMod_int.__mod__() faster.

New commits:

403ece1Fix performance regression introduced by #21746

comment:3 Changed 5 years ago by dominic.else

  • Keywords IntegerMod_int added

comment:4 Changed 5 years ago by tscrim

  • Cc SimonKing jdemeyer added

comment:5 Changed 5 years ago by SimonKing

    def __mod__(self, modulus):
        if not isinstance(self, IntegerMod_abstract):
            # something % Mod(x,y) makes no sense
            return NotImplemented
        from .integer_mod_ring import IntegerModRing
        R = IntegerModRing(modulus)
        if (<Element>self)._parent._IntegerModRing_generic__order % R.order():
            raise ArithmeticError(f"reduction modulo {modulus!r} not defined")
        return R(self)

The import should not be done inside of the method. Is R.order() not the same as modulus? But in any case, I do think that using the modulus of a native C int should be used.

comment:6 Changed 5 years ago by SimonKing

Note that the semantic is different:

sage: x = GF(3)(2)
sage: type(x)
<type 'sage.rings.finite_rings.integer_mod.IntegerMod_int'>
sage: x
sage: x%2
ArithmeticError                           Traceback (most recent call last)
<ipython-input-4-d92b2f50ba44> in <module>()
----> 1 x%Integer(2)

/home/king/Sage/git/sage/src/sage/rings/finite_rings/integer_mod.pyx in sage.rings.finite_rings.integer_mod.IntegerMod_abstract.__mod__ (build/cythonized/sage/rings/finite_rings/integer_mod.c:6386)()
    392         R = IntegerModRing(modulus)
    393         if (<Element>self)._parent._IntegerModRing_generic__order % R.order():
--> 394             raise ArithmeticError(f"reduction modulo {modulus!r} not defined")
    395         return R(self)

ArithmeticError: reduction modulo 2 not defined
sage: int(x)%2

Note that the error does not depend on the specific element x, but only depends on the parent of x. Is it guaranteed in the vector code that all elements belong to the same parent? If they do, then the check on the parent should of course be done only once per vector, not once per element.

comment:7 Changed 5 years ago by tscrim

They should all belong to the same parent.

Note: See TracTickets for help on using tickets.