Opened 7 months ago

Closed 5 months ago

#29638 closed enhancement (fixed)

replace tuples by mpz vectors in free abelian monoids

Reported by: gh-mwageringel Owned by:
Priority: major Milestone: sage-9.2
Component: commutative algebra Keywords:
Cc: tscrim Merged in:
Authors: Markus Wageringel, Travis Scrimshaw Reviewers: Frédéric Chapoton, Markus Wageringel
Report Upstream: N/A Work issues:
Branch: ed94d9e (Commits) Commit: ed94d9edfabb97eb1a3fe46a06a447f26a8d8164
Dependencies: Stopgaps:

Description (last modified by gh-mwageringel)

This ticket changes the implementation of FreeAbelianMonoidElement to store its exponents as mpz_t* array (copied in part from vector_integer_dense). The file is cythonized for this. This makes multiplication much faster (from comment:10):

M = FreeAbelianMonoid('x', 5)
L = [M(list(v)) for d in (0..7) for v in IntegerVectors(d, 5)]
%timeit _ = [a * b for a in L for b in L]
1 loop, best of 5: 1.7 s per loop  -- (before)
1 loop, best of 5: 282 ms per loop  -- (after)

Another motivation for this change is to facilitate viewing free abelian monoids as the indexing set of polynomial rings.

Change History (28)

comment:1 Changed 7 months ago by gh-mwageringel

  • Authors set to Markus Wageringel
  • Branch set to u/gh-mwageringel/29638
  • Commit set to 10bd4b950367c55122be34dc2e24cf41de572990
  • Status changed from new to needs_review

New commits:

10bd4b929638: use ETuples in free abelian monoids

comment:2 Changed 6 months ago by git

  • Commit changed from 10bd4b950367c55122be34dc2e24cf41de572990 to efbbddcfcd8f91f6ba0715df46b72986ec00dd75

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

efbbddc29638: use ETuples in free abelian monoids

comment:3 Changed 6 months ago by gh-mwageringel

Rebased.

comment:4 Changed 6 months ago by chapoton

  • Cc tscrim added
  • Reviewers set to Frédéric Chapoton

Looks good, thanks for doing that. Maybe Travis will have a comment ?

comment:5 Changed 6 months ago by tscrim

+1 for moving away from Python code. Did you try using Zn elements for the backend structure?

comment:6 Changed 6 months ago by tscrim

Well, mpz vectors (maybe C arrays?) would perhaps be ideal, but that would need this file to be Cythonized.

comment:7 Changed 6 months ago by chapoton

I am tempted to give a positive review as it is, because it is already a large improvement. Travis, do you agree ? What do you mean by "using Zn" ?

comment:8 Changed 6 months ago by tscrim

I am also thinking of giving it a positive review, but I just wanted to see if Markus had tested it before a positive review.

As you know, any free abelian group has a natural isomorphism with Zn with the ith generator mapping to ei with * -> +, and the monoid naturally lies inside of that. So I was wondering if it has been tested using the corresponding free module code and how that compared. If it hasn't been done, I can write the code to test it.

comment:9 Changed 6 months ago by gh-mwageringel

No, I have not tried using integer vectors. Though, I imagine that cythonizing the whole file could already give another speedup. I only chose ETuples because it seemed natural, as I sometimes use FreeAbelianMonoid().algebra() as a replacement for polynomial rings, as it is in the ModulesWithBasis category (maybe this axiom could be added to polynomial rings as well, in the future?).

comment:10 Changed 5 months ago by tscrim

  • Authors changed from Markus Wageringel to Markus Wageringel, Travis Scrimshaw
  • Branch changed from u/gh-mwageringel/29638 to public/monoids/faster_free_abelian_monoid_elt-29638
  • Commit changed from efbbddcfcd8f91f6ba0715df46b72986ec00dd75 to 0183b87f8d8947df6c96a1137130fe88d80b0e08

Sorry for not being able to get to this sooner. Here is a version unwrapping the dense integer vectors, with a fair amount of overlap with that implementation. Ideally we would refactor that out, but the lack of multiple inheritance in Cython makes that impossible.

On my machine:

1 loop, best of 5: 1.7 s per loop  -- Old

1 loop, best of 5: 282 ms per loop  -- With my branch

New commits:

0183b87Implementing free abelian monoid elements as C arrays of mpz_ints.

comment:11 Changed 5 months ago by gh-mwageringel

This looks like a decent speedup. Would it be much slower to make _element_vector a Vector_integer_dense (or a sparse one)? This would avoid much of the duplication. I assume most of the speedup comes from the cythonization and assume the additional function call overhead would be small, but I might be wrong about this.

comment:12 Changed 5 months ago by tscrim

Sorry, I haven't had time yet to try it. It likely will slow it down a little bit because of the extra level of indirection and handling of an extra Element object, but it is worth seeing to reduce the duplication.

comment:13 Changed 5 months ago by tscrim

  • Branch changed from public/monoids/faster_free_abelian_monoid_elt-29638 to public/monoids/faster_free_abelian_monoid_elt_ZZ-29638
  • Commit changed from 0183b87f8d8947df6c96a1137130fe88d80b0e08 to fe46462ddb43b83766a762d510c39c5bb97075c1

There is about a 10% slowdown:

sage: %timeit _ = [a * b for a in L for b in L]
1 loop, best of 5: 304 ms per loop  -- New branch
sage: timeit("_ = [a * b for a in L for b in L]", repeat=5)
5 loops, best of 5: 427 ms per loop  -- New branch

5 loops, best of 5: 384 ms per loop  -- Previous branch

So it depends on how much you want the extra speed. IMO, the amount of duplication is not significant enough compared to the extra speed. It is just unfortunate that we cannot work around the class hierarchy (and don't have multiple inheritance) to have a common ABC.

Of course, the next step after this would be doing the same treatment for free Abelian groups.


New commits:

beca95dAdding pxd file.
fe46462Using ZZ module elements instead of rolling it directly.

comment:14 follow-up: Changed 5 months ago by gh-mwageringel

  • Status changed from needs_review to needs_work

Thank you for trying this. I get similar timings on my machine. We should go with the previous branch then, I guess, but there are some errors when running the tests:

sage -t --long src/sage/monoids/free_abelian_monoid.py  # Killed due to abort
sage -t --long src/sage/monoids/free_abelian_monoid_element.pyx  # Killed due to abort

Out of curiosity, is the slowdown with timeit(..., repeat=5) caused by deallocation?

By the way, I have also tried to use vector_integer_sparse.mpz_vector which is just a struct, but it is even slower (which is somewhat expected as the elements in the test case are not very sparse).

comment:15 in reply to: ↑ 14 Changed 5 months ago by tscrim

  • Branch changed from public/monoids/faster_free_abelian_monoid_elt_ZZ-29638 to public/monoids/faster_free_abelian_monoid_elt-29638
  • Commit changed from fe46462ddb43b83766a762d510c39c5bb97075c1 to beca95d1197c699eede026a1fae6405acc5a47e4

Replying to gh-mwageringel:

Thank you for trying this. I get similar timings on my machine. We should go with the previous branch then, I guess, but there are some errors when running the tests:

sage -t --long src/sage/monoids/free_abelian_monoid.py  # Killed due to abort
sage -t --long src/sage/monoids/free_abelian_monoid_element.pyx  # Killed due to abort

Strange, I wasn't getting those. I will try again with the latest beta version.

Out of curiosity, is the slowdown with timeit(..., repeat=5) caused by deallocation?

I think it is coming from the additional redirections and extra temporary objects (probably also the allocation as well).

By the way, I have also tried to use vector_integer_sparse.mpz_vector which is just a struct, but it is even slower (which is somewhat expected as the elements in the test case are not very sparse).

There probably is a benefit from having both a sparse version if someone wants a free abelian monoid (or group) with say 10000 generators and utilize that sparseness. Although I would be a bit surprised if there was a good use case.

comment:16 Changed 5 months ago by tscrim

Nope, I am still not getting those errors. Can you post the traceback of the errors and/or reproduce them locally on your computer?

comment:17 Changed 5 months ago by gh-mwageringel

The problem can be reproduced like this, for example:

sage: F = FreeAbelianMonoid(6,'b')
sage: F._test_elements_eq_symmetric()
free(): invalid pointer
------------------------------------------------------------------------
...
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------
/home/math/mwagerin/git/sage_compute/src/bin/sage-python: line 2: 756763 Aborted                 (core dumped) sage -python "$@"

It can also be obtained like this:

sage: F = FreeAbelianMonoid(6,'b')
sage: any(F.an_element() == 0 for _ in range(10))
free(): invalid pointer

It seems to come from the call to _Integer_from_mpz in _repr_, but I cannot see anything wrong with it.

comment:18 Changed 5 months ago by git

  • Commit changed from beca95d1197c699eede026a1fae6405acc5a47e4 to 70cb4257a4a2008ac1308fd55869b941244262bc

Branch pushed to git repo; I updated commit sha1. New commits:

044d28aMerge branch 'develop' into public/monoids/faster_free_abelian_monoid_elt-29638
70cb425Trying to fix _repr_ crash and added latex output.

comment:19 Changed 5 months ago by tscrim

I still cannot reproduce them, but based on your message, that gave me an idea of something to try. Please test this.

I also released there was no _latex_, so I quickly added that. There probably should be a common helper function for printing/latexing monomials with various options; I have definitely written a number of these methods in various places...

comment:20 Changed 5 months ago by tscrim

I am also a bit confused why the _repr_ would be called in those tests too; it is not like it is used in the hash...

comment:21 Changed 5 months ago by gh-mwageringel

I have rebuilt Sage from scratch, but the problem persists. This is on Ubuntu 20.04.

It can also be reproduced with the _repr_ directly (I have added sig_on/sig_off):

sage: F = FreeAbelianMonoid(6,'b')
sage: [F.an_element()._repr_() for _ in range(30)]
free(): invalid pointer
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-bc2f259b5b12> in <module>()
----> 1 [F.an_element()._repr_() for _ in range(Integer(30))]

<ipython-input-2-bc2f259b5b12> in <listcomp>(.0)
----> 1 [F.an_element()._repr_() for _ in range(Integer(30))]

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/monoids/free_abelian_monoid_element.pyx in sage.monoids.free_abelian_monoid_element.FreeAbelianMonoidElement._repr_ (build/cythonized/sage/monoids/free_abelian_monoid_element.c:3749)()
    204         cdef Integer val
    205         for i in range(self._n):
--> 206             sig_on()
    207             val = _Integer_from_mpz(v[i])
    208             sig_off()

RuntimeError: Aborted

Here is a stack trace for the equality test, which shows that the _repr_ is used in an error message:

sage: F = FreeAbelianMonoid(6,'b')
sage: any(F.an_element() == 0 for _ in range(10))
free(): invalid pointer
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-be4380b2bde6> in <module>()
----> 1 any(F.an_element() == Integer(0) for _ in range(Integer(10)))

<ipython-input-2-be4380b2bde6> in <genexpr>(.0)
----> 1 any(F.an_element() == Integer(0) for _ in range(Integer(10)))

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/structure/element.pyx in sage.structure.element.Element.__richcmp__ (build/cythonized/sage/structure/element.c:10160)()
   1115             return (<Element>self)._richcmp_(other, op)
   1116         else:
-> 1117             return coercion_model.richcmp(self, other, op)
   1118
   1119     cpdef _richcmp_(left, right, int op):

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/structure/coerce.pyx in sage.structure.coerce.CoercionModel.richcmp (build/cythonized/sage/structure/coerce.c:19864)()
   1971         # Coerce to a common parent
   1972         try:
-> 1973             x, y = self.canonical_coercion(x, y)
   1974         except (TypeError, NotImplementedError):
   1975             pass

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/structure/coerce.pyx in sage.structure.coerce.CoercionModel.canonical_coercion (build/cythonized/sage/structure/coerce.c:13208)()
   1391                 self._record_exception()
   1392
-> 1393         raise TypeError("no common canonical parent for objects with parents: '%s' and '%s'"%(xp, yp))
   1394
   1395

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/monoids/free_abelian_monoid.py in __repr__(self)
    203     def __repr__(self):
    204         n = self.__ngens
--> 205         return "Free abelian monoid on %s generators %s"%(n,self.gens())
    206
    207     def __call__(self, x):

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/structure/sage_object.pyx in sage.structure.sage_object.SageObject.__repr__ (build/cythonized/sage/structure/sage_object.c:2435)()
    192         except AttributeError:
    193             return super().__repr__()
--> 194         result = reprfunc()
    195         if isinstance(result, str):
    196             return result

/amd/compute/mwagerin/git/sage_compute/local/lib/python3.7/site-packages/sage/monoids/free_abelian_monoid_element.pyx in sage.monoids.free_abelian_monoid_element.FreeAbelianMonoidElement._repr_ (build/cythonized/sage/monoids/free_abelian_monoid_element.c:3749)()
    204         cdef Integer val
    205         for i in range(self._n):
--> 206             sig_on()
    207             val = _Integer_from_mpz(v[i])
    208             sig_off()

RuntimeError: Aborted

comment:22 Changed 5 months ago by git

  • Commit changed from 70cb4257a4a2008ac1308fd55869b941244262bc to cdebca73a7d4314994a96afa1bdd583b5bc80b96

Branch pushed to git repo; I updated commit sha1. New commits:

cdebca7Remove the redundant MonoidElement.__init__ call; some other small tweaks.

comment:23 Changed 5 months ago by tscrim

I am really perplexed by this. The only difference between this an dense integer vectors is the _repr_, which simply unrolls the call to list. I feel like this should also fail for you by replacing F with ZZ^5. I cannot see what the difference is. I made some other changes that I doubt will fix the problem, but are good do have done. Since I don't get the failure, it is really hard to figure this out. :/ (I am running Ubuntu 18.04 LTS.)

comment:24 Changed 5 months ago by gh-mwageringel

Yes, the problem indeed exists with ZZ-vectors as well. The compiler gives the warning

[sagelib-9.2.beta3] warning: sage/rings/integer.pxd:41:37: Declarations should not be declared inline.

so I tried to remove the inline from the pxd file (this is the same format as used for smallInteger in that file), but it does not change the underlying problem.

However, the following change seems to fix the issue.

  • src/sage/rings/integer.pxd

    diff --git a/src/sage/rings/integer.pxd b/src/sage/rings/integer.pxd
    index e90abd6c1c..778fd1cf7b 100644
    a b  
    11from sage.libs.gmp.types cimport __mpz_struct, mpz_t, mpz_ptr
     2from sage.libs.gmp.mpz cimport mpz_set
    23from sage.libs.ntl.types cimport ZZ_c
    34
    45from sage.structure.element cimport EuclideanDomainElement, RingElement
    cdef int mpz_set_str_python(mpz_ptr z, char* s, int base) except -1 
    3839
    3940cdef Integer smallInteger(long value)
    4041
    41 cdef inline Integer _Integer_from_mpz(mpz_t e)
     42cdef inline Integer _Integer_from_mpz(mpz_t e):
     43    cdef Integer z = Integer.__new__(Integer)
     44    mpz_set(z.value, e)
     45    return z
    4246
    4347cdef class int_to_Z(Morphism):
    4448    pass
  • src/sage/rings/integer.pyx

    diff --git a/src/sage/rings/integer.pyx b/src/sage/rings/integer.pyx
    index 6ef08a3a5c..d3bd868ec7 100644
    a b cdef integer(x): 
    74807480        return x
    74817481    return Integer(x)
    74827482
    7483 cdef inline Integer _Integer_from_mpz(mpz_t e):
    7484     cdef Integer z = Integer.__new__(Integer)
    7485     mpz_set(z.value, e)
    7486     return z
    7487 
    74887483def free_integer_pool():
    74897484    cdef int i
    74907485    cdef PyObject *o

This passes all the tests for me now, but I am still a bit confused about the issue – particularly because it occured with some randomness.

comment:25 Changed 5 months ago by git

  • Commit changed from cdebca73a7d4314994a96afa1bdd583b5bc80b96 to ed94d9edfabb97eb1a3fe46a06a447f26a8d8164

Branch pushed to git repo; I updated commit sha1. New commits:

ed94d9eMoving the function to the pxd file.

comment:26 follow-up: Changed 5 months ago by tscrim

  • Status changed from needs_work to needs_review

That is a little strange. Is that ZZ vector issue also on 9.2.beta3 or just using this branch?

Anyways, I have pushed the change that works for you. It still works for me as well.

comment:27 in reply to: ↑ 26 Changed 5 months ago by gh-mwageringel

  • Description modified (diff)
  • Reviewers changed from Frédéric Chapoton to Frédéric Chapoton, Markus Wageringel
  • Status changed from needs_review to positive_review
  • Summary changed from replace tuples by ETuples in free abelian monoids to replace tuples by mpz vectors in free abelian monoids

Replying to tscrim:

That is a little strange. Is that ZZ vector issue also on 9.2.beta3 or just using this branch?

No – just with that branch. Maybe it is a Cython bug, but I do no think I will be able to debug this much further.

Anyways, I have pushed the change that works for you. It still works for me as well.

Thank you. I think we can move forward with this now, so I am setting this to positive.

comment:28 Changed 5 months ago by vbraun

  • Branch changed from public/monoids/faster_free_abelian_monoid_elt-29638 to ed94d9edfabb97eb1a3fe46a06a447f26a8d8164
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.