#26243 closed enhancement (fixed)

Provide Hilbert series implementation beyond Singular's limitations

Reported by: SimonKing Owned by:
Priority: major Milestone: sage-8.4
Component: commutative algebra Keywords: Hilbert series
Cc: Winfried Merged in:
Authors: Simon King Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 5637e07 (Commits) Commit: 5637e07941cab4d3c8ea35b0b35b94150ed0a58c
Dependencies: Stopgaps:

Description (last modified by SimonKing)

In my work on group cohomology, I was bitten by the fact that Singular limits the coefficients of Hilbert series to 32-bit integers. As a consequence, examples such as the following from #20145 fail:

        sage: n=4;m=11;P = PolynomialRing(QQ,n*m,"x"); x = P.gens(); M = Matrix(n,x)
        sage: I = P.ideal(M.minors(2))
        sage: J = P*[m.lm() for m in I.groebner_basis()]
        sage: J.hilbert_numerator()
        Traceback (most recent call last):
        ...
        RuntimeError: error in Singular function call 'hilb':
        int overflow in hilb 1

It seems that upstream does not want to change Singular to use 64-bit integers by default and arbitrary-size integers when needed (that is what I would propose). Therefore I am providing an implementation of Hilbert series computation that does not suffer from this limitation.

        sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
        sage: hilbert_poincare_series(J).numerator()
        120*t^3 + 135*t^2 + 30*t + 1
        sage: hilbert_poincare_series(J).denominator().factor()
        (t - 1)^14

The above is the correct result according to #20145.

My implementation is a bit slower than libsingular's, which is why I do not propose it as a replacement for libsingular's hilb function. However, in big applications it is a good way out.

Attachments (1)

hilbert_example.sobj (48.3 KB) - added by SimonKing 19 months ago.
Some real world example for the computation of Hilbert series


Change History (83)

comment:1 Changed 19 months ago by SimonKing

  • Branch set to u/SimonKing/Hilbert_functions_unlimited

comment:2 Changed 19 months ago by SimonKing

  • Commit set to 827f579781c66fae075ce2cee83942d02cdd4f1e
  • Status changed from new to needs_review

New commits:

1a6c6dc  Provide an implementation of Hilbert series without Singular's limitations
827f579  Provide documentation for Hilbert series computation

comment:3 Changed 19 months ago by SimonKing

  • Description modified (diff)

comment:4 Changed 19 months ago by mkoeppe

  • Cc Winfried added

comment:5 follow-up: Changed 19 months ago by Winfried

The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula. See MR1213858.

Changed 19 months ago by SimonKing

Some real world example for the computation of Hilbert series

comment:6 in reply to: ↑ 5 Changed 19 months ago by SimonKing

Replying to Winfried:

The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula. See MR1213858.

On #25091, you said that normaliz can only compute the Hilbert series of the integral closure of a monomial ideal. That may not be what is required here.

Anyway: Clearly I am not interested in that particular example. It is just that: an example, borrowed from #20145. It is suitable as a doctest, as it can be set up quickly; and it is instructive as a documentation example, as it exposes limitations of Singular that make Singular unusable in "real world examples" for which there are no explicit formulae in the literature.

Or would you know a formula for the Hilbert series of the mod-2 cohomology ring of group number 299 of order 256 (numbering according to the small groups library)? That cohomology ring is the smallest potential counterexample to a conjecture of Dave Benson. A small part of the attempt to verify that conjecture is the computation of the Hilbert series of several quotients of that ring. Unfortunately, that "small part" is too big for Singular or Frobby.

The leading ideal of the relation ideal of the above-mentioned cohomology ring is in attachment:hilbert_example.sobj. The computation of its Hilbert series actually shouldn't take very long:

sage: from sage.rings.polynomial.hilbert import hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 270 ms, sys: 2.85 ms, total: 273 ms
Wall time: 324 ms
(-8*t^26 + 24*t^25 - 16*t^24 - 16*t^23 + 24*t^22 - 8*t^21 - t^7 - t^6 + t^5 - t^4 - t^3 - t^2 + t - 1)/(t^13 - 3*t^12 + 2*t^11 + 2*t^10 - 3*t^9 + t^8 - t^5 + 3*t^4 - 2*t^3 - 2*t^2 + 3*t - 1)

However, Singular immediately reports that the example is too big, and Frobby needs to be killed after several minutes. There is no interface to CoCoA, and I don't know how to utilise the interface to normaliz. I also don't know how to install Macaulay2, so I can't test that either.

comment:7 follow-up: Changed 19 months ago by tscrim

All of these comments are about sum_from_list:

Something that should give you better speed would be having it take an additional argument for the starting point of the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it gives a decent speedup.
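For illustration, a minimal sketch of what that could look like (sum_from_list and ETuple.eadd are names from the branch; the exact signature with a start index is an assumption):

from sage.rings.polynomial.polydict cimport ETuple

cdef ETuple sum_from_list(list L, size_t s, size_t l):
    # Sum the exponent tuples L[s], ..., L[s+l-1] by divide and conquer,
    # recursing on (start, length) pairs instead of slicing L.
    cdef size_t l2
    if l == 1:
        return <ETuple> L[s]
    if l == 2:
        return (<ETuple> L[s]).eadd(<ETuple> L[s + 1])
    l2 = l // 2
    return sum_from_list(L, s, l2).eadd(sum_from_list(L, s + l2, l - l2))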

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Some other thoughts:

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting. It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

It is faster to do if not Id: than if len(Id)==0:, even in Cython.

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

You are probably going to be better off, in terms of speed, doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

Should those ETuple functions be a part of the ETuple class?

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

You can avoid a few additions here:

-    for i from 0 <= i < 2*m._nonzero by 2:
-        degree += m._data[i+1]
+    for i in range(1, 2*m._nonzero, 2):
+        degree += m._data[i]

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

quotient(by_var) knows its return type is a list since interred returns a list.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).
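For illustration only (the field names are hypothetical, loosely mirroring the dict keys on the branch), the struct version could look roughly like this; every access is then a direct C member lookup:

cdef struct Node:
    Node* back
    Node* left
    Node* right
    void* LMult     # would point to the polynomial data (e.g. a flint fmpz_poly)
    void* LeftFHS

cdef inline Node* go_left(Node* nd):
    # plain C member access instead of a hashed dict lookup like D['Left']
    return nd.left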

comment:8 in reply to: ↑ 7 ; follow-up: Changed 19 months ago by SimonKing

Dear Travis,

first of all: Thank you for your thoughtful comments!

Replying to tscrim:

Something that should give you better speed would be having it take an additional argument for the starting point of the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it gives a decent speedup.

Right, I should try that.

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it. Anyway, here is the resulting C code:

    /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268
 *     if l==1:
 *         m1 = L[0]
 *         return L[0]             # <<<<<<<<<<<<<<
 *     if l==2:
 *         m1,m2=L
 */
    __Pyx_XDECREF(((PyObject *)__pyx_r));
    if (unlikely(__pyx_v_L == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      __PYX_ERR(0, 268, __pyx_L1_error)
    }
    __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_2);
    if (!(likely(((__pyx_t_2) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_2, __pyx_ptype_4sage_5rings_10polynomial_8polydict_ETuple))))) __PYX_ERR(0, 268, __pyx_L1_error)
    __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2);
    __pyx_t_2 = 0;
    goto __pyx_L0;

versus

    /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268
 *     if l==1:
 *         m1 = L[0]
 *         return <ETuple>L[0]             # <<<<<<<<<<<<<<
 *     if l==2:
 *         m1,m2=L
 */
    __Pyx_XDECREF(((PyObject *)__pyx_r));
    if (unlikely(__pyx_v_L == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      __PYX_ERR(0, 268, __pyx_L1_error)
    }
    __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_2);
    __Pyx_INCREF(((PyObject *)((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2)));
    __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2);
    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
    goto __pyx_L0;

Is the second one better?

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting.

Can you give me a pointer on how to use arrays of Cython types?

It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

I also use append and pop. How to do these with arrays?

It is faster to do if not Id: than if len(Id)==0:, even in Cython.

Cool! I thought that, again, the two were equivalent. But the code becomes a lot shorter:

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292
 *     cdef int e
 *     # First, the easiest cases:
 *     if len(Id)==0:             # <<<<<<<<<<<<<<
 *         return PR(1)
 *     cdef ETuple m = Id[-1]
 */
  if (unlikely(__pyx_v_Id == Py_None)) {
    PyErr_SetString(PyExc_TypeError, "object of type 'NoneType' has no len()");
    __PYX_ERR(0, 292, __pyx_L1_error)
  }
  __pyx_t_2 = PyList_GET_SIZE(__pyx_v_Id); if (unlikely(__pyx_t_2 == ((Py_ssize_t)-1))) __PYX_ERR(0, 292, __pyx_L1_error)
  __pyx_t_3 = ((__pyx_t_2 == 0) != 0);
  if (__pyx_t_3) {

versus

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292
 *     cdef int e
 *     # First, the easiest cases:
 *     if not Id:             # <<<<<<<<<<<<<<
 *         return PR(1)
 *     cdef ETuple m = Id[-1]
 */
  __pyx_t_2 = (__pyx_v_Id != Py_None)&&(PyList_GET_SIZE(__pyx_v_Id) != 0);
  __pyx_t_3 = ((!__pyx_t_2) != 0);
  if (__pyx_t_3) {

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?

You are probably going to be better off, in terms of speed, doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related to another point below.

Should those ETuple functions be a part of the ETuple class?

Probably. Since ETuple is designed for monomials anyway, it makes sense to have monomial divisibility checks there and not here.

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

Almost. I get

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110
 *     cdef int exp1
 *     cdef ETuple result
 *     for i from 0 <= i < 2*m1._nonzero by 2:             # <<<<<<<<<<<<<<
 *         if m1._data[i] == index:
 *             result = <ETuple>m1._new()
 */
  __pyx_t_1 = (2 * __pyx_v_m1->_nonzero);
  for (__pyx_v_i = 0; __pyx_v_i < __pyx_t_1; __pyx_v_i+=2) {
    ...
    }

versus

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110
 *     cdef int exp1
 *     cdef ETuple result
 *     for i in range(0,2*m1._nonzero,2):             # <<<<<<<<<<<<<<
 *         if m1._data[i] == index:
 *             result = <ETuple>m1._new()
 */
  __pyx_t_1 = (2 * __pyx_v_m1->_nonzero);
  __pyx_t_2 = __pyx_t_1;
  for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=2) {
    __pyx_v_i = __pyx_t_3;
    ...
    }

So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.

You can avoid a few additions here:

-    for i from 0 <= i < 2*m._nonzero by 2:
-        degree += m._data[i+1]
+    for i in range(1, 2*m._nonzero, 2):
+        degree += m._data[i]

Right, good point!

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.

One thing, though: at some point, one needs to check whether w is None or not. With massive code duplication, I could ensure that this check is only done once, namely in first_hilbert_series. Currently it is checked frequently, and the frequency wouldn't be reduced by your suggestion of a small code duplication.

quotient(by_var) knows its return type is a list since interred returns a list.

Right.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).

This is related to the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.

But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such a data structure is used and from which I can get inspiration?

comment:9 follow-up: Changed 19 months ago by SimonKing

Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdefined, a cpdef method wouldn't be less efficient than a cdef method, right?

comment:10 follow-up: Changed 19 months ago by Winfried

1) One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there, is the Hilbert series of R/I where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.

2) I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.

comment:11 in reply to: ↑ 10 Changed 19 months ago by SimonKing

Replying to Winfried:

1) One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there, is the Hilbert series of R/I where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.

Right. In most cases, I actually use the wording "Hilbert series of I" and "Poincaré series of R/I", which I guess isn't standard, but at least it makes the difference clear. But I think in my comments above I haven't been sufficiently careful with these wordings.

EDIT: Note that Bigatti uses the notation "<I>", which coincides with the Hilbert-Poincaré series of R/I. But that's just a notation, not a name. It seems to me that in practical applications, one most commonly has an ideal I and wants to compute the Hilbert-Poincaré series of R/I, not of "I as a module". Hence, from a practical aspect, it actually makes sense to use a function name such as "hilb" or "hilbert_numerator" and so on for a function that tells something about R/I but takes I as an input.

2) I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.

That's unfortunate. I really need the Hilbert/Poincaré series of R/I.

Last edited 19 months ago by SimonKing

comment:12 in reply to: ↑ 8 ; follow-up: Changed 19 months ago by tscrim

Replying to SimonKing:

first of all: Thank you for your thoughtful comments!

No problem. Sorry I am not always able to find the time to review your code.

Replying to tscrim:

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it. Anyway, here is the resulting C code: [snip] Is the second one better?

Yes, you can see that it does not do the type check (which could result in a segfault if the result is not the correct type).

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting.

Can you give me a pointer on how to use arrays of Cython types?

Right, you can't have an array of cdef classes. Forgot about that. There is a way around that by having an array of pointers to the objects:

https://stackoverflow.com/questions/33851333/cython-how-do-you-create-an-array-of-cdef-class

It is just really annoying to have to do all of the type-casting.

It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

I also use append and pop. How to do these with arrays?

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

You are probably going to be better off, in terms of speed, doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related with another point below.

Yes, if you want to squeeze out as much speed as possible. It is more work and makes the code a bit more obscure, but it should give you some more speed.

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

Almost. I get [snip] So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.

It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.

One thing, though: At some point, one needs to check whether w is None or not. By having a massive code duplication, I could achieve that this check is only done once, namely in first_hilbert_series. Currently, it is checked frequently, and the frequency wouldn't be reduced with your suggestion of a small code duplication.

I see; I didn't think it was going to be that bad overall. However, it's not a big issue for me the way things are now.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).

This is related with the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.

Yes, that is correct, but the struct could hold onto a Sage polynomial just as well. The reason for doing this is both cleaner and faster code by avoiding the dict.

But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such data structure is used and from where I can get inspiration?

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

comment:13 in reply to: ↑ 9 Changed 19 months ago by tscrim

Replying to SimonKing:

Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdefined, a cpdef method wouldn't be less efficient than a cdef method, right?

No, a cpdef called from a Cython file is just a C function call IIRC. So it should not be any less efficient from within Cython code (although I think it can introduce a bit of extra overhead when doing a Python function call compared to a plain def).

sage: %%cython
....: def foo():
....:     pass
....: cpdef cpfoo():
....:     pass
....: cdef cfoo():
....:     pass
....: def test():
....:     foo()
....: def ctest():
....:     cfoo()
....: def cptest():
....:     cpfoo()
....: cpdef cptestcp():
....:     cpfoo()
....:
sage: %timeit test()
10000000 loops, best of 3: 34.9 ns per loop
sage: %timeit ctest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptestcp()
10000000 loops, best of 3: 23.3 ns per loop

(These timings are relatively consistent across multiple runs for me.)

comment:14 in reply to: ↑ 12 ; follow-ups: Changed 19 months ago by SimonKing

Just some questions to make sure I understand your suggestions:

Replying to tscrim:

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bound check is not needed?

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

In that case I should change it.

It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.

I do require tuple. Would list be better?

Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

Maybe take an easy middle way, and have something like

cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult   # or some flint boilerplate instead of object
    cdef object LeftFHS # or some flint boilerplate instead of object

The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?

Last edited 19 months ago by SimonKing

comment:15 follow-up: Changed 19 months ago by SimonKing

PS, concerning "pool":

I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?

It may actually be an idea to equip ETuple with a pool/kill list. In that way, MPolynomial_polydict would probably suck slightly less, and the code here might also be a little faster.

comment:16 in reply to: ↑ 15 Changed 19 months ago by SimonKing

Replying to SimonKing:

PS, concerning "pool":

I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?

Found it: sage.misc.allocator

comment:17 in reply to: ↑ 14 Changed 19 months ago by SimonKing

Replying to SimonKing:

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

Since I am sure about bounds being correct, I guess I can use <ETuple>PyList_GET_ITEM(...).
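For illustration, a minimal sketch of that pattern (assuming, as in sum_from_list, that the index is already known to be valid):

from cpython.list cimport PyList_GET_ITEM
from sage.rings.polynomial.polydict cimport ETuple

cdef inline ETuple entry(list L, size_t i):
    # PyList_GET_ITEM does neither a bounds check nor a type check;
    # guaranteeing both is the caller's responsibility here.
    return <ETuple> PyList_GET_ITEM(L, i)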

comment:18 in reply to: ↑ 14 ; follow-up: Changed 19 months ago by tscrim

Replying to SimonKing:

Just some questions to make sure I understand your suggestions:

Replying to tscrim:

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?

Yes, that is correct. If you want to do an explicit check in there that can raise an error, you can do <ETuple?>. (I believe this is supposed to be different from letting Cython raise an error when not given the correct type, but I'm not sure about that, Jeroen?)

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bound check is not needed?

You have to also tell Cython not to do those checks. You can do this at the level of the file by adding # cython: boundscheck=False, wraparound=False as the first line(s) of the file (see, e.g., matrix/args.pyx) or a decorator @cython.boundscheck(False) and @cython.wraparound(False) on the appropriate functions (see, e.g., combinat/combinat_cython.pyx).
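A minimal sketch of the decorator form (the helper and its body are hypothetical):

cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
cdef last_entry(list L):
    # With both checks disabled, this index compiles to a plain pointer
    # offset, with no negative-index or out-of-bounds handling.
    return L[len(L) - 1]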

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

See above.

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

In that case I should change it.

I bet that will make Jeroen happy. :)

It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.

I do require tuple. Would list be better?

I couldn't remember and was guessing. tuple is perfectly fine.

Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?

I don't see why that one has to be split. It is only degree and quotient_degree that should be split IMO. For example, in HilbertBaseCase you would have

-        for m2 in Id:
-            if m is not m2:
-                Factor *= (1-t**quotient_degree(m2,m,w))
+        if w is None:
+            for m2 in Id:
+                if m is not m2:
+                    Factor *= (1-t**quotient_degree(m2,m))
+        else:
+            for m2 in Id:
+                if m is not m2:
+                    Factor *= (1-t**quotient_degree_graded(m2,m,w))

In some cases, this will result in fewer if checks (unless the compiler is optimizing those out). In others, this will be no worse than before.

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

Maybe go an easy middle way, and have some

cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult   # or some flint boilerplate instead of object
    cdef object LeftFHS # or some flint boilerplate instead of object

The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?

Yes, it should be much faster (at that scale) and require less memory than a dict (hash tables are generally at most 80% full IIRC, but you have to store both the key (which would be a string) and the value). I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

Yes, you can use PyList_GET_ITEM, which I think is even a bit faster.

comment:19 in reply to: ↑ 18 ; follow-up: Changed 19 months ago by SimonKing

Replying to tscrim:

I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.

comment:20 in reply to: ↑ 19 Changed 19 months ago by tscrim

Replying to SimonKing:

Replying to tscrim:

I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.

Ah, I see. Yea, that might be good to do. Although the use of the pool could end up having a fair bit of overhead when not having a lot of ETuple's that would be GC'ed. I haven't actually looked though.

comment:21 Changed 19 months ago by SimonKing

It seems to me that for i in range(s,e,b) really is a bit slower than for i from s<=i<e by b. My main example becomes about 2% slower. See also this little benchmark:

sage: cython("""
....: def test1(size_t m):
....:     cdef size_t i,j
....:     j = 0
....:     for i in range(0,m,2):
....:         j += i
....:     return j
....: def test2(size_t m):
....:     cdef size_t i,j
....:     j = 0
....:     for i from 0<=i<m by 2:
....:         j += 1
....:     return j
....: """)
sage: %timeit test1(10000)
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.18 µs per loop
sage: %timeit test1(10000)
100000 loops, best of 3: 3.1 µs per loop
sage: %timeit test2(10000)
The slowest run took 4.26 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.85 µs per loop
sage: %timeit test2(10000)
1000000 loops, best of 3: 1.86 µs per loop

So, if the loop itself is timed, one sees that the for ... from ... by is double the speed of for ... in range(...).

I guess I should use the deprecated syntax and post the above benchmark on cython-users, shouldn't I?

comment:22 Changed 19 months ago by SimonKing

Ouch. I had a typo in my benchmark (j+=i versus j+=1). Without the typo, the timing of the test functions is basically equal. So, probably you are right that the additional assignments are optimized out by the compiler.

I don't know if the 2% difference in the real world example is significant.

comment:23 Changed 19 months ago by git

  • Commit changed from 827f579781c66fae075ce2cee83942d02cdd4f1e to 7698bb6b6f99d19466f40723597d71de105856cd

Branch pushed to git repo; I updated commit sha1. New commits:

7698bb6  Some code optimizations in Hilbert series computation

comment:24 Changed 19 months ago by SimonKing

The last commit did some optimizations. What I want to try next:

  • Use flint boilerplate
  • Count the number of allocations and deallocations of ETuple, and perhaps play with a pool.

comment:25 follow-up: Changed 19 months ago by SimonKing

I wonder about

FLINT_DLL void fmpz_poly_mul(fmpz_poly_t res, 
                          const fmpz_poly_t poly1, const fmpz_poly_t poly2);

Is it legal that res coincides with poly1 or with poly2?

comment:26 in reply to: ↑ 25 Changed 19 months ago by SimonKing

Replying to SimonKing:

I wonder about

FLINT_DLL void fmpz_poly_mul(fmpz_poly_t res, 
                          const fmpz_poly_t poly1, const fmpz_poly_t poly2);

Is it legal that res coincides with poly1 or with poly2?

Probably yes. In the sources, I found checks like

if (res == poly1 || res == poly2)...

which indicate that res==poly1 is allowed.
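For illustration, a tiny sketch relying on that aliasing (the cimport locations for the raw FLINT declarations are assumptions):

from sage.libs.flint.types cimport fmpz_poly_t
from sage.libs.flint.fmpz_poly cimport fmpz_poly_init, fmpz_poly_set_coeff_ui, fmpz_poly_mul, fmpz_poly_clear

cdef void square_in_place():
    cdef fmpz_poly_t p
    fmpz_poly_init(p)
    fmpz_poly_set_coeff_ui(p, 1, 1)   # p = t
    fmpz_poly_mul(p, p, p)            # res aliases both inputs: now p = t^2
    fmpz_poly_clear(p)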

Last edited 19 months ago by SimonKing

comment:27 Changed 19 months ago by git

  • Commit changed from 7698bb6b6f99d19466f40723597d71de105856cd to 8ae070bc006c6f4953b631f6605d7dd15fff74b0

Branch pushed to git repo; I updated commit sha1. New commits:

8ae070b  Use flint polynomial boilerplate for Hilbert series computation

comment:28 Changed 19 months ago by SimonKing

In the latest commit, I am using boilerplate for flint polynomials. It brings a drastic improvement, apparently a factor of two compared with the previous commit, and there seems to be no memory leak:

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: get_memory_usage()
7366.8203125
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 145 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875

It might be interesting to compare with Singular in examples that Singular can manage:

sage: R = PolynomialRing(QQ,'x',25)
sage: M = matrix(R,5,R.gens())
sage: I = R*[m.lm() for m in (R*(M^2).list()).groebner_basis()]
sage: I.hilbert_numerator()
-t^30 + 24*t^29 - 275*t^28 + 2000*t^27 - 10350*t^26 + 40480*t^25 - 123971*t^24 + 303600*t^23 - 600774*t^22 + 960600*t^21 - 1222811*t^20 + 1182944*t^19 - 740749*t^18 + 18472*t^17 + 687275*t^16 - 1105688*t^15 + 1161279*t^14 - 963728*t^13 + 671803*t^12 - 393000*t^11 + 179220*t^10 - 51176*t^9 + 503*t^8 + 6024*t^7 - 1400*t^6 - 576*t^5 + 275*t^4 + 24*t^3 - 25*t^2 + 1
sage: first_hilbert_series(I)
-t^30 + 24*t^29 - 275*t^28 + 2000*t^27 - 10350*t^26 + 40480*t^25 - 123971*t^24 + 303600*t^23 - 600774*t^22 + 960600*t^21 - 1222811*t^20 + 1182944*t^19 - 740749*t^18 + 18472*t^17 + 687275*t^16 - 1105688*t^15 + 1161279*t^14 - 963728*t^13 + 671803*t^12 - 393000*t^11 + 179220*t^10 - 51176*t^9 + 503*t^8 + 6024*t^7 - 1400*t^6 - 576*t^5 + 275*t^4 + 24*t^3 - 25*t^2 + 1

so, the result is still correct. However, comparing with Singular-via-pexpect, I get

sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 120 ms per loop

versus

sage: Is = singular(I)
sage: singular.eval('attrib({},"isSB",1)'.format(Is.name()))
''
sage: %timeit singular.eval(hv.name()+'=hilb({},1)'.format(Is.name()))
10 loops, best of 3: 163 ms per loop

Conclusion: The new implementation not only avoids the 32 bit limitation of Singular but seems to be faster than Singular. So, meanwhile the implementation is competitive.

Next, I will see if a pool for ETuple makes sense, and in any case I will move some cdef functions to cpdef methods of ETuple.

Last edited 19 months ago by SimonKing

comment:29 Changed 19 months ago by SimonKing

I just tested: In the main example (the one in the attachment to this ticket), there are 93382 allocations and the same number of deallocations of ETuples. That sounds like a lot; however, it doesn't contribute much time:

sage: cython("""
....: from sage.rings.polynomial.polydict cimport ETuple
....: def test():
....:     cdef ETuple m
....:     cdef int i
....:     for i in range(93000):
....:         m = ETuple.__new__(ETuple)
....: """
....: )
sage: %timeit test()
100 loops, best of 3: 3.03 ms per loop

So, the prospect of saving less than 3ms in an example that takes 145 ms isn't enough to justify the implementation of a pool. Do you agree?

comment:30 Changed 19 months ago by tscrim

You probably are not doing enough creations and deletions with the same amount of memory. Probably the bigger time cost is in allocating the data array in the ETuple, which cannot reasonably [1] be put in a pool since those lengths almost certainly change with each (re)allocation. So you are not really gaining much from that. Perhaps it is worthwhile to try a bigger example, say something that takes 1-2 seconds, and see if you still get a 2% savings or if there is just little change.

[1] Of course, it is possible, but I am pretty sure you have to write your own memory manager.

comment:31 follow-up: Changed 19 months ago by tscrim

One other general comment: IMO it is still useful to add a little bit of documentation via a docstring explaining what functions do even when they are cdef (you do not need to doctest them however).

comment:32 follow-up: Changed 19 months ago by tscrim

Just a general question: do you think it would be easy and worthwhile to pull the main Hilbert series computation code out into a standalone Cython package (one that would depend only on flint)?

One last comment: Please remove this global definition:

# Global definition
PR = PolynomialRing(ZZ,'t')
t = PR('t')

I see no reason to nail it in memory.

comment:33 Changed 19 months ago by git

  • Commit changed from 8ae070bc006c6f4953b631f6605d7dd15fff74b0 to d997abe64d3526cc3741f49bce5592158044c7c8

Branch pushed to git repo; I updated commit sha1. New commits:

d997abe  Fix a corner case in Hilbert series computation

comment:34 in reply to: ↑ 31 Changed 19 months ago by SimonKing

Replying to tscrim:

One other general comment: IMO it is still useful to add a little bit of documentation via a docstring explaining what functions do even when they are cdef (you do not need to doctest them however).

Yes, documentation should be added. And another test, for the corner case that is fixed in the previous commit.

comment:35 in reply to: ↑ 32 Changed 19 months ago by SimonKing

Replying to tscrim:

Just a general question: do you think it would be easy and worthwhile to pull the main Hilbert series computation code out into a standalone Cython package (one that would depend only on flint)?

Not totally sure. What I need is ETuple, which is defined in Sage. So, unless ETuple is pulled out as well, it makes no sense. Apart from ETuple though, I think it would make sense: If I see that correctly, the only place where I use more than flint is in hilbert_poincare_series, in which I currently just do a division of two flint polynomials in Sage.

One last comment: Please remove this global definition:

# Global definition
PR = PolynomialRing(ZZ,'t')
t = PR('t')

I see no reason to nail it in memory.

Done in the previous commit.

comment:36 Changed 19 months ago by SimonKing

Ouch. The previous commit was supposed to fix the corner case of the zero ideal, but I made a mistake, so that now the corner case of the one ideal became wrong. Will be fixed next, together with more documentation.

comment:37 Changed 19 months ago by git

  • Commit changed from d997abe64d3526cc3741f49bce5592158044c7c8 to 5af3439fea2d996e84ceb1190c863b5a58d82509

Branch pushed to git repo; I updated commit sha1. New commits:

5af3439  Fix the erroneous fix of some corner cases

comment:38 Changed 19 months ago by git

  • Commit changed from 5af3439fea2d996e84ceb1190c863b5a58d82509 to 138f85b3c9ad61aab1116b5acac2062f175fd298

Branch pushed to git repo; I updated commit sha1. New commits:

138f85b  Turn some functions of polynomial.hilbert to methods of ETuple

comment:39 Changed 19 months ago by SimonKing

The latest commit turns some cdef functions into cpdef methods.

The effect is not good: the computation time went up from about 145ms to about 190ms:

sage: %time hilbert_poincare_series(I,grading)
CPU times: user 192 ms, sys: 0 ns, total: 192 ms
Wall time: 191 ms
(-8*t^26 + 24*t^25 - 16*t^24 - 16*t^23 + 24*t^22 - 8*t^21 - t^7 - t^6 + t^5 - t^4 - t^3 - t^2 + t - 1)/(t^13 - 3*t^12 + 2*t^11 + 2*t^10 - 3*t^9 + t^8 - t^5 + 3*t^4 - 2*t^3 - 2*t^2 + 3*t - 1)
sage: %crun hilbert_poincare_series(I,grading)
PROFILE: interrupts/evictions/bytes = 20/0/8288
Using local file /home/king/Sage/git/sage/local/bin/python2.
Using local file /home/king/.sage/temp/klap/5581/tmp_fZKweM.perf.
Total: 20 samples
       0   0.0%   0.0%       19  95.0% PyEval_EvalCode
       0   0.0%   0.0%       19  95.0% PyEval_EvalCodeEx
       0   0.0%   0.0%       19  95.0% PyEval_EvalFrameEx
       0   0.0%   0.0%       19  95.0% PyObject_Call
       0   0.0%   0.0%       19  95.0% PyRun_FileExFlags
       0   0.0%   0.0%       19  95.0% PyRun_SimpleFileExFlags
       0   0.0%   0.0%       19  95.0% PyRun_StringFlags
       0   0.0%   0.0%       19  95.0% Py_Main
       0   0.0%   0.0%       19  95.0% __Pyx_PyObject_Call
       0   0.0%   0.0%       19  95.0% __libc_start_main
       0   0.0%   0.0%       19  95.0% __pyx_pf_4sage_5rings_10polynomial_7hilbert_2hilbert_poincare_series (inline)
       0   0.0%   0.0%       19  95.0% __pyx_pf_4sage_5rings_10polynomial_7hilbert_first_hilbert_series (inline)
       0   0.0%   0.0%       19  95.0% __pyx_pw_4sage_5rings_10polynomial_7hilbert_1first_hilbert_series
       0   0.0%   0.0%       19  95.0% __pyx_pw_4sage_5rings_10polynomial_7hilbert_3hilbert_poincare_series
       0   0.0%   0.0%       19  95.0% _start
       0   0.0%   0.0%       19  95.0% call_function (inline)
       0   0.0%   0.0%       19  95.0% exec_statement (inline)
       0   0.0%   0.0%       19  95.0% ext_do_call (inline)
       0   0.0%   0.0%       19  95.0% fast_function (inline)
       0   0.0%   0.0%       19  95.0% function_call
       0   0.0%   0.0%       19  95.0% run_mod (inline)
       0   0.0%   0.0%       11  55.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_make_children (inline)
       0   0.0%   0.0%        9  45.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_interred
       0   0.0%   0.0%        8  40.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_indivisible_in_list (inline)
       8  40.0%  40.0%        8  40.0% __pyx_f_4sage_5rings_10polynomial_8polydict_6ETuple_divides
       0   0.0%  40.0%        5  25.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_quotient_by_var (inline)
       0   0.0%  40.0%        4  20.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_HilbertBaseCase.isra.41
       0   0.0%  40.0%        3  15.0% PyObject_RichCompare
       1   5.0%  45.0%        3  15.0% __pyx_pw_4sage_5rings_10polynomial_8polydict_6ETuple_11__getitem__
       0   0.0%  45.0%        3  15.0% __pyx_sq_item_4sage_5rings_10polynomial_8polydict_ETuple
       0   0.0%  45.0%        3  15.0% fmpz_poly_mul
       0   0.0%  45.0%        2  10.0% __Pyx_PyObject_Call (inline)
       0   0.0%  45.0%        2  10.0% __Pyx_PyObject_CallOneArg
       0   0.0%  45.0%        2  10.0% __pyx_pf_4sage_5rings_10polynomial_8polydict_6ETuple_10__getitem__ (inline)
       1   5.0%  50.0%        2  10.0% _fmpz_poly_mul_tiny1
       2  10.0%  60.0%        2  10.0% convert_3way_to_object (inline)
       0   0.0%  60.0%        2  10.0% do_richcmp (inline)
       0   0.0%  60.0%        2  10.0% try_3way_to_rich_compare (inline)
       0   0.0%  60.0%        1   5.0% 0x00007fe83c01f837
       1   5.0%  65.0%        1   5.0% PyInt_FromLong
       0   0.0%  65.0%        1   5.0% PyNumber_Remainder
       0   0.0%  65.0%        1   5.0% PyObject_RichCompareBool
       1   5.0%  70.0%        1   5.0% _PyObject_New
       0   0.0%  70.0%        1   5.0% __Pyx_PyFunction_FastCallDict.constprop.73
       0   0.0%  70.0%        1   5.0% __Pyx_PyFunction_FastCallNoKw (inline)
       0   0.0%  70.0%        1   5.0% __Pyx_PyInt_As_size_t.part.41
       0   0.0%  70.0%        1   5.0% __Pyx_PyInt_From_int (inline)
       0   0.0%  70.0%        1   5.0% __Pyx_PyNumber_IntOrLong (inline)
       1   5.0%  75.0%        1   5.0% __Pyx_TypeTest (inline)
       0   0.0%  75.0%        1   5.0% __Pyx__PyObject_CallOneArg
       0   0.0%  75.0%        1   5.0% __libc_calloc
       0   0.0%  75.0%        1   5.0% __pyx_f_4sage_5rings_10polynomial_8polydict_6ETuple_weighted_degree
       0   0.0%  75.0%        1   5.0% __pyx_pf_4sage_5rings_10polynomial_8polydict_6ETuple_18__richcmp__ (inline)
       1   5.0%  80.0%        1   5.0% __pyx_pf_4sage_5rings_7integer_7Integer_122__int__ (inline)
       0   0.0%  80.0%        1   5.0% __pyx_pw_4sage_5rings_10polynomial_8polydict_6ETuple_19__richcmp__
       0   0.0%  80.0%        1   5.0% __pyx_pw_4sage_5rings_7integer_7Integer_123__int__
       1   5.0%  85.0%        1   5.0% _fmpz_vec_zero
       1   5.0%  90.0%        1   5.0% _init
       1   5.0%  95.0%        1   5.0% _int_malloc
       0   0.0%  95.0%        1   5.0% binary_op (inline)
       1   5.0% 100.0%        1   5.0% binary_op1 (inline)
       0   0.0% 100.0%        1   5.0% build_sortwrapper (inline)
       0   0.0% 100.0%        1   5.0% flint_calloc
       0   0.0% 100.0%        1   5.0% fmpz_poly_init2
       0   0.0% 100.0%        1   5.0% listindex
       0   0.0% 100.0%        1   5.0% listsort

What is convert_3way_to_object?

Do you see why the computation became A LOT slower? Some of the functions that I moved from hilbert to ETuple methods used to be "cdef inline", but are now "cpdef". Is it possible that the removal of "inline" was bad?

Last edited 19 months ago by SimonKing

comment:40 Changed 19 months ago by SimonKing

Also note that there is a compiler warning. If I understand correctly, it comes from the line

L.sort(key=ETuple.unweighted_degree)

How else should I force sorting by unweighted degree?
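For illustration, one possible alternative is a plain module-level def as the key, so the sort never touches the unbound cpdef method object itself (a hedged sketch, assuming the cpdef method unweighted_degree from this branch):

from sage.rings.polynomial.polydict cimport ETuple

def _degree_key(ETuple m):
    # thin Python-callable wrapper around the cpdef method
    return m.unweighted_degree()

cdef void sort_by_degree(list L):
    # hypothetical replacement for L.sort(key=ETuple.unweighted_degree)
    L.sort(key=_degree_key)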

comment:41 Changed 19 months ago by SimonKing

PS: Of course I should add documentation. But first I want to know why the code became so much slower.

comment:42 Changed 19 months ago by SimonKing

I was wondering if perhaps the slowness came from NOT defining the parent of the first Hilbert series globally:

sage: cython("""
....: from sage.all import ZZ
....: P = ZZ['t']
....: t = P.gen()
....: def test1(i):
....:     return t**i
....: """)
sage: cython("""
....: def test2(i):
....:     from sage.all import ZZ
....:     P = ZZ['t']
....:     t = P.gen()
....:     return t**i
....: """)
sage: %timeit test1(50)
The slowest run took 29.39 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.83 µs per loop
sage: %timeit test2(50)
The slowest run took 4.76 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 42.2 µs per loop

So, that's a lot. But the slow-down can be reduced:

sage: cython("""
....: def test3(i):
....:     from sage.all import ZZ,PolynomialRing
....:     P = PolynomialRing(ZZ,'t')
....:     t = P.gen()
....:     return t**i
....: """)
sage: %timeit test3(50)
The slowest run took 4.88 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 24.6 µs per loop

Perhaps I should change my code accordingly.

comment:43 Changed 19 months ago by SimonKing

Without a change:

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %timeit first_hilbert_series(I,grading)
1 loop, best of 3: 177 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 176 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 176 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 179 ms per loop

With PolynomialRing(ZZ,'t') instead of ZZ['t']:

sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 171 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
10 loops, best of 3: 173 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 172 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 172 ms per loop

So, it really doesn't help much. And globally defining the polynomial ring doesn't help either.

Hence, I believe the slowness comes from the cpdef methods or the not-inlining.

comment:44 Changed 19 months ago by SimonKing

I know that if the compiler knows that we are dealing with an ETuple, a cpdef method should have the same speed as a cdef function. But in fact it does not. With cdef methods:

sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 168 ms, sys: 43 µs, total: 168 ms
Wall time: 168 ms
(8*t^26 - 24*t^25 + 16*t^24 + 16*t^23 - 24*t^22 + 8*t^21 + t^7 + t^6 - t^5 + t^4 + t^3 + t^2 - t + 1)/(-t^13 + 3*t^12 - 2*t^11 - 2*t^10 + 3*t^9 - t^8 + t^5 - 3*t^4 + 2*t^3 + 2*t^2 - 3*t + 1)
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 111 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 158 ms per loop

With cpdef methods

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 178 ms, sys: 7 µs, total: 178 ms
Wall time: 177 ms
(8*t^26 - 24*t^25 + 16*t^24 + 16*t^23 - 24*t^22 + 8*t^21 + t^7 + t^6 - t^5 + t^4 + t^3 + t^2 - t + 1)/(-t^13 + 3*t^12 - 2*t^11 - 2*t^10 + 3*t^9 - t^8 + t^5 - 3*t^4 + 2*t^3 + 2*t^2 - 3*t + 1)
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 128 ms per loop
sage: %timeit first_hilbert_series(I,grading)
1 loop, best of 3: 171 ms per loop

So, we have a slow-down of about 11-15%. Therefore I'd prefer to use cdef methods, where possible (i.e., the only exception is unweighted_degree, which I use for sorting).

Next plan: Try to test w is None less often, by a moderate code duplication.

comment:45 Changed 19 months ago by git

  • Commit changed from 138f85b3c9ad61aab1116b5acac2062f175fd298 to e900222fb9b43aa1470790f1e9d903bb0270dcb3

Branch pushed to git repo; I updated commit sha1. New commits:

e900222  Hilbert series: Remove unnecessary bound checks, simplify code branches

comment:46 Changed 19 months ago by SimonKing

In the new commit, I introduce separate methods for weighted and unweighted degrees, and use them so that the checks appear less often. Also, I turn off bounds checks and wrap-arounds in the weighted versions of the methods (in fact I do check the correct length of the tuple of weights, which means that inside the loops the bounds are guaranteed to be correct).

It made the code faster again:

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 109 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 132 ms per loop

From my perspective, I am almost done: There remains the documentation. Or do you see further issues?

comment:47 Changed 19 months ago by git

  • Commit changed from e900222fb9b43aa1470790f1e9d903bb0270dcb3 to f0e5c8585489cdf3a80e3ee161561184bb648408

Branch pushed to git repo; I updated commit sha1. New commits:

f0e5c85  Add documentation to new ETuple methods

comment:48 Changed 19 months ago by SimonKing

I have added some documentation for the cdef methods.

From my perspective, the code is good to go, and I think also good to be used in #20145.

comment:49 follow-up: Changed 19 months ago by tscrim

  • Branch changed from u/SimonKing/Hilbert_functions_unlimited to u/tscrim/hilbert_functions-26243
  • Commit changed from f0e5c8585489cdf3a80e3ee161561184bb648408 to 2d29b9b164fd1e429466c882ace499a238ecf02c
  • Reviewers set to Travis Scrimshaw

Somewhat surprising to me:

sage: %%cython
....: def test1(list L):
....:     list(L)
....: def test2(list L):
....:     L[:len(L)]
sage: %timeit test1(L)
10000000 loops, best of 3: 101 ns per loop
sage: %timeit test2(L)
10000000 loops, best of 3: 88.2 ns per loop

So I changed all of the list(L) to L[:len(L)].

I implemented a custom version of median since the other implementation was in Python and not as optimized.
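For reference, a generic sort-based median of a list of integers could look roughly like this (a hedged sketch, not the actual code on the branch):

cdef long median(list values):
    # Plain sort-based median; for an even count this truncates the
    # average of the two middle entries to an integer.
    cdef list s = sorted(values)
    cdef Py_ssize_t n = len(s)
    if n % 2:
        return s[n // 2]
    return (s[n // 2 - 1] + s[n // 2]) // 2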

All of my other changes are PEP8 and formatting. I used your #~ convention to denote lines that were expanded out using fmpz_* functions.

If my changes are good (I got another ~15% speed gain; see below), then positive review. We can fix/improve things on followup tickets if necessary.

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/uqtscrim/Downloads/hilbert_example.sobj")
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 60.9 ms per loop

vs your branch

10 loops, best of 3: 71.7 ms per loop

New commits:

2d29b9b  Using slices, custom median, and formatting/PEP8 tweaks to Hilbert code.

comment:50 follow-up: Changed 19 months ago by tscrim

Actually, one more little thing:

(-HP) / (-((1-PR.gen())**I.ring().ngens()))

Don't those minus signs cancel out?

comment:51 in reply to: ↑ 49 Changed 19 months ago by SimonKing

  • Status changed from needs_review to positive_review

Replying to tscrim:

Somewhat surprising to me:

sage: %%cython
....: def test1(list L):
....:     list(L)
....: def test2(list L):
....:     L[:len(L)]
sage: %timeit test1(L)
10000000 loops, best of 3: 101 ns per loop
sage: %timeit test2(L)
10000000 loops, best of 3: 88.2 ns per loop

So I changed all of the list(L) to L[:len(L)].

That's interesting. I thought that was something Cython would detect (it is known that L is a list at that point).

I implemented a custom version of median since the other implementation was in Python and not as optimized.

Good idea.

All of my other changes are PEP8 and formatting.

You forgot one change: you added get_exp and used it to replace the former item access of ETuple. Why is that faster?

If my changes are good (I got another ~15% speed gain; see below),

That is really amazing. I wouldn't have believed that so much additional speed could be obtained so easily (customised median, list copying, and item access).

then positive review. We can fix/improve things on followup tickets if necessary.

The changes look good to me.

comment:52 in reply to: ↑ 50 ; follow-up: Changed 19 months ago by SimonKing

Replying to tscrim:

Actually, one more little thing:

(-HP) / (-((1-PR.gen())**I.ring().ngens()))

Don't those minus signs cancel out?

That's tricky:

sage: P.<t> = ZZ[]
sage: (t-1)/(t^2+1)
(t - 1)/(t^2 + 1)
sage: (-t+1)/(-t^2-1)
(-t + 1)/(-t^2 - 1)

versus

sage: P.<t> = QQ[]
sage: (t-1)/(t^2+1)
(t - 1)/(t^2 + 1)
sage: (-t+1)/(-t^2-1)
(t - 1)/(t^2 + 1)

I think the first Hilbert series, having integer coefficients, should live in ZZ[t], not QQ[t]. And I also think the output should be normalised.

comment:53 follow-up: Changed 19 months ago by SimonKing

For fun, I was playing with a pool for ETuple. In fact, one gets quite a substantial additional speed-up, but currently it suffers from a memory leak.

Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?

comment:54 Changed 19 months ago by SimonKing

Question: Is Py_TPFLAGS_HAVE_GC set on ETuple? I thought it is only set if a class has attributes that are Python objects. But if I see correctly, all attributes of ETuple are either int* or size_t.
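
One way to check this from the Python level (Py_TPFLAGS_HAVE_GC is bit 14 of tp_flags in CPython; this snippet is only for inspection, not part of the branch):

sage: from sage.rings.polynomial.polydict import ETuple
sage: has_gc = bool(ETuple.__flags__ & (1 << 14))  # Py_TPFLAGS_HAVE_GC

Printing has_gc answers the question; I would expect it to be False, given that there are no object-valued attributes.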

comment:55 in reply to: ↑ 52 Changed 19 months ago by tscrim

Replying to SimonKing:

Replying to tscrim:

Actually, one more little thing:

(-HP) / (-((1-PR.gen())**I.ring().ngens()))

Don't those minus signs cancel out?

I think the first Hilbert series, having integer coefficients, should live in ZZ[t], not QQ[t]. And I also think the output should be normalised.

I am assuming you mean Frac(ZZ[t]) and QQ(t). Generally, Frac(ZZ[t]) has better normalization because not all coefficients are units (i.e., it uses the gcd). I agree that it would be better to normalize the leading coefficients to positive values over ZZ, and I don't know why that is not done. However, these two expressions are considered by Sage to be equal:

sage: P.<t> = ZZ[]
sage: f = (t-1)/(t^2+1)
sage: g = (-t+1)/(-t^2-1)
sage: f == g
True
sage: hash(f) == hash(g)
True

The issue stems from the fact that special reductions are done for fraction fields of polynomial rings over a field, but not for Frac(ZZ[t]). It is perhaps a bit unfortunate that a similar process is not attempted at least for negative signs.
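
For illustration, the sign can be normalised by hand while staying in Frac(ZZ[t]) (just a sketch, not something done on this branch):

sage: P.<t> = ZZ[]
sage: f = (-t + 1) / (-t^2 - 1)
sage: num, den = f.numerator(), f.denominator()
sage: if den.leading_coefficient() < 0:
....:     num, den = -num, -den
sage: num / den
(t - 1)/(t^2 + 1)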

comment:56 in reply to: ↑ 53 ; follow-up: Changed 19 months ago by tscrim

Replying to SimonKing:

For fun, I was playing with a pool for ETuple. And in fact one gets a quite substantial additional speed-up. But currently, it is suffering from a memory leak.

Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?

Another ticket.

comment:57 in reply to: ↑ 56 ; follow-up: Changed 19 months ago by SimonKing

Replying to tscrim:

Replying to SimonKing:

For fun, I was playing with a pool for ETuple. And in fact one gets a quite substantial additional speed-up. But currently, it is suffering from a memory leak.

Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?

Another ticket.

See #26291

comment:58 in reply to: ↑ 57 Changed 19 months ago by SimonKing

Replying to SimonKing:

Replying to tscrim:

Another ticket.

See #26291

Sorry, for some reason my timings were wrong. I think I was comparing the timing WITH GRADING and without the pool to the timing WITHOUT GRADING and with the pool. After redoing the timings, I found that the pool made things slower (although that might be caused by the memory leak).

Anyway, I think one can forget about #26291.

comment:59 Changed 19 months ago by tscrim

I forgot to answer why I separated out get_exp: it is because I wanted explicit input/output types and wanted to avoid the extra check for whether the argument is a slice.
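
Roughly, the point is a dedicated lookup like this (schematic only, assuming ETuple's sparse (index, exponent) pair layout; the exact C types on the branch may differ):

cdef int get_exp(self, int i):
    # Return the exponent at position i by scanning the stored
    # (index, exponent) pairs; return 0 if i does not occur.
    cdef size_t ind
    for ind in range(0, 2*self._nonzero, 2):
        if self._data[ind] == i:
            return self._data[ind + 1]
        elif self._data[ind] > i:
            break
    return 0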

comment:60 follow-ups: Changed 19 months ago by vbraun

  • Status changed from positive_review to needs_work

There is quite a lot of test failures with negative exponents like

File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient
Failed example:
    P.monomial_quotient(x, y) # Note the wrong result
Expected:
    x*y^-1
Got:
    x*y^18446744073709551615

comment:61 in reply to: ↑ 60 Changed 19 months ago by SimonKing

Replying to vbraun:

There is quite a lot of test failures with negative exponents like

File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient
Failed example:
    P.monomial_quotient(x, y) # Note the wrong result
Expected:
    x*y^-1
Got:
    x*y^18446744073709551615

Oops. I think in some places I was assuming that exponents are non-negative.

comment:62 Changed 19 months ago by SimonKing

Without the changes to ETuple, one gets

sage: P.<x,y> = QQ[]
sage: P.monomial_quotient(x,y)
x*y^65535

So, is it perhaps a bug that it used to work? After all, we are talking about polynomial rings here.
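
For the record, both values are just -1 wrapped around in an unsigned exponent slot (apparently 16 bit in the example above, 64 bit in the polydict-based ring from the failing doctest):

sage: (-1) % 2^16
65535
sage: (-1) % 2^64
18446744073709551615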

comment:63 in reply to: ↑ 60 Changed 19 months ago by SimonKing

Replying to vbraun:

There is quite a lot of test failures with negative exponents like

File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient
Failed example:
    P.monomial_quotient(x, y) # Note the wrong result
Expected:
    x*y^-1
Got:
    x*y^18446744073709551615

Note the comment: "Note the wrong result". In other words, x*y^-1 is wrong, and P.monomial_quotient(x,y) is wrong usage.

Last edited 19 months ago by SimonKing (previous) (diff)

comment:64 Changed 19 months ago by tscrim

I don't see how any of the changes here could change that doctest. I am investigating.

Note that you need to create the correct class:

sage: from sage.rings.polynomial.multi_polynomial_ring import MPolynomialRing_polydict_domain
sage: P.<x,y,z> = MPolynomialRing_polydict_domain(QQ,3, order='degrevlex')
sage: R.<x,y,z> = QQ[]
sage: type(P)
<class 'sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict_domain_with_category'>
sage: type(R)
<type 'sage.rings.polynomial.multi_polynomial_libsingular.MPolynomialRing_libsingular'>

comment:65 Changed 19 months ago by tscrim

Ah, it is probably the creation of get_exp that caused this. My guess is that the input i is not a size_t but something else that indexing works with. At least, that is the only thing I see in my commit that could have led to the patchbot failures.

comment:66 Changed 19 months ago by git

  • Commit changed from 2d29b9b164fd1e429466c882ace499a238ecf02c to 07ff2e1627ea16a9921dde8ffea053e597630043

Branch pushed to git repo; I updated commit sha1. New commits:

07ff2e1Reverting use of get_exp for __getitem__.

comment:67 Changed 19 months ago by tscrim

Okay, I just reverted the use of get_exp in __getitem__. So there is some code duplication, but using get_exp will be faster when you know your input is an int. Thus, I think it is useful to keep, but I wanted it to match the data structure (which is why I changed it away from a size_t input). I also fixed a few doc/PEP8 things that I had missed on my previous pass. The failing tests from the patchbot now pass for me.

comment:68 Changed 19 months ago by tscrim

  • Status changed from needs_work to needs_review

comment:69 follow-up: Changed 19 months ago by SimonKing

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials? This, the comment in the failing test ("Note the wrong result"), and the behaviour of the ETuple-free code discussed in comment:62 and comment:63 all indicate that negative exponents are wrong usage.

So, what could we do about it?

An obvious change (on a new ticket, of course) would be to change ETuple._data from int* to size_t*.

comment:70 in reply to: ↑ 69 ; follow-up: Changed 19 months ago by SimonKing

Replying to SimonKing:

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?

Too bad. ETuple *is* used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are non-negative, though. The assumption is explicitly documented in some places.

comment:71 in reply to: ↑ 70 Changed 19 months ago by tscrim

Replying to SimonKing:

Replying to SimonKing:

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?

Too bad. ETuple *is* used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are non-negative, though. The assumption is explicitly documented in some places.

I think the assumption is documented enough where it is used, so it is fine. If someone needs the negatives, then they can change it to, e.g., Py_ssize_t.

comment:72 Changed 19 months ago by SimonKing

Are the tests passing? Can this go back to positive review?

comment:73 follow-up: Changed 19 months ago by tscrim

Tests are passing for me. From #20145, don't you need another commit?

comment:74 in reply to: ↑ 73 ; follow-up: Changed 19 months ago by tscrim

Replying to tscrim:

Tests are passing for me. From #20145, don't you need another commit?

Let me rephrase, do you want the last commit on #20145 here?

comment:75 in reply to: ↑ 74 Changed 19 months ago by SimonKing

Replying to tscrim:

Replying to tscrim:

Tests are passing for me. From #20145, don't you need another commit?

Let me rephrase, do you want the last commit on #20145 here?

No, why? I have occasionally been told that one should keep tickets small. In this case, there is one ticket for the new implementation of Hilbert series, and one ticket (#20145) for using the new implementation in existing methods (which also adds an implementation of the Hilbert polynomial).

Additionally, since you ask whether I need the two commits from #20145: for my own applications, I only need the commits from here.

comment:76 follow-up: Changed 19 months ago by tscrim

Okay, I wasn't sure whether the last commit from #20145 was considered to fix a bug originating here or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.

comment:77 in reply to: ↑ 76 Changed 19 months ago by SimonKing

Replying to tscrim:

Okay, I wasn't sure if the last commit from #20145 would be considered a bug from this part or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.

No, it wasn't a bug, at least not a bug here. I thought it was necessary to amend the sign in #20145, which I did in the first commit there, but in fact it is not, and thus I reverted the sign amendment in the second commit.

comment:78 Changed 19 months ago by tscrim

I see. Thanks for explaining. This is ready to be set back to positive review.

comment:79 Changed 19 months ago by SimonKing

  • Status changed from needs_review to positive_review

Thanks!

comment:80 Changed 19 months ago by git

  • Commit changed from 07ff2e1627ea16a9921dde8ffea053e597630043 to 5637e07941cab4d3c8ea35b0b35b94150ed0a58c
  • Status changed from positive_review to needs_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

7b2ed50Merge branch 'u/tscrim/hilbert_functions-26243' of git://trac.sagemath.org/sage into u/tscrim/hilbert_functions-26243
5637e07Fixing INPUT:: to INPUT: in hilbert.pyx.

comment:81 Changed 19 months ago by tscrim

  • Status changed from needs_review to positive_review

A little stupid trivial fix.

comment:82 Changed 19 months ago by vbraun

  • Branch changed from u/tscrim/hilbert_functions-26243 to 5637e07941cab4d3c8ea35b0b35b94150ed0a58c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.