Opened 19 months ago
Closed 19 months ago
#26243 closed enhancement (fixed)
Provide Hilbert series implementation beyond Singular's limitations
Reported by:  SimonKing  Owned by:  

Priority:  major  Milestone:  sage-8.4
Component:  commutative algebra  Keywords:  Hilbert series 
Cc:  Winfried  Merged in:  
Authors:  Simon King  Reviewers:  Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  5637e07 (Commits)  Commit:  5637e07941cab4d3c8ea35b0b35b94150ed0a58c 
Dependencies:  Stopgaps: 
Description (last modified by )
In my work on group cohomology, I was bitten by the fact that Singular limits integers to 32 bit in the coefficients of Hilbert series. As a consequence, examples such as the following from #20145 fail:
sage: n=4; m=11; P = PolynomialRing(QQ,n*m,"x"); x = P.gens(); M = Matrix(n,x)
sage: I = P.ideal(M.minors(2))
sage: J = P*[m.lm() for m in I.groebner_basis()]
sage: J.hilbert_numerator()
Traceback (most recent call last):
...
RuntimeError: error in Singular function call 'hilb':
int overflow in hilb 1
It seems that upstream does not want to change Singular to use 64-bit integers by default and arbitrary size integers when needed (that's what I would propose). Therefore I am providing an implementation of Hilbert series computation that does not suffer from such a limitation.
sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: hilbert_poincare_series(J).numerator()
120*t^3 + 135*t^2 + 30*t + 1
sage: hilbert_poincare_series(J).denominator().factor()
(t - 1)^14
The above is the correct result according to #20145.
My implementation is a bit slower than libsingular's, which is why I do not propose it as a replacement for libsingular's hilb function. However, in big applications it is a good way out.
Attachments (1)
Change History (83)
comment:1 Changed 19 months ago by
 Branch set to u/SimonKing/Hilbert_functions_unlimited
comment:2 Changed 19 months ago by
 Commit set to 827f579781c66fae075ce2cee83942d02cdd4f1e
 Status changed from new to needs_review
comment:3 Changed 19 months ago by
 Description modified (diff)
comment:4 Changed 19 months ago by
 Cc Winfried added
comment:5 followup: ↓ 6 Changed 19 months ago by
The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula; see MR1213858.
comment:6 in reply to: ↑ 5 Changed 19 months ago by
Replying to Winfried:
The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula; see MR1213858.
On #25091, you said that normaliz can only compute the Hilbert series of the integral closure of a monomial ideal. That may not be what is required here.
Anyway: Clearly I am not interested in that particular example. The example is just that: an example that I borrowed from #20145. It is suitable as a doc test, as it can be set up in a short time; and it is instructive as a doc example, as it exposes limitations of Singular that make Singular unusable in "real world examples" for which there are no explicit formulae in the literature.
Or would you know a formula for the Hilbert series of the mod-2 cohomology ring of group number 299 of order 256 (numbering according to the small groups library)? That cohomology ring is the smallest potential counterexample to a conjecture of Dave Benson. A small part of the attempt to verify that conjecture is the computation of the Hilbert series of several quotients of that ring. Unfortunately, that "small part" is too big for Singular or Frobby.
The leading ideal of the relation ideal of the above-mentioned cohomology ring is in attachment:hilbert_example.sobj. The computation of its Hilbert series actually shouldn't take very long:
sage: from sage.rings.polynomial.hilbert import hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 270 ms, sys: 2.85 ms, total: 273 ms
Wall time: 324 ms
(8*t^26 + 24*t^25 - 16*t^24 - 16*t^23 + 24*t^22 - 8*t^21 - t^7 - t^6 + t^5 - t^4 - t^3 - t^2 + t - 1)/(t^13 - 3*t^12 + 2*t^11 + 2*t^10 - 3*t^9 + t^8 - t^5 + 3*t^4 - 2*t^3 - 2*t^2 + 3*t - 1)
However, Singular immediately tells that the example is too big, and Frobby needs to be killed after several minutes. There is no interface to CoCoA, and I don't know how to utilise the interface to normaliz. I also don't know how to install Macaulay2, so, I can't test that either.
comment:7 followup: ↓ 8 Changed 19 months ago by
All of these comments are about sum_from_list:
Something that should give you better speed would be having it take another argument of the starting point to do the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it would give a decent speedup.
I think this might also give you better C code (and hence, be faster):
if l == 1:
    return <ETuple> L[0]
if l == 2:
    return (<ETuple> L[0]).eadd(<ETuple> L[1])
Some other thoughts:
Does L need to be a list (i.e., not an array of ETuple's)? This would mean less typecasting. It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).
It is faster to do if not Id: than if len(Id)==0:, even in Cython.
If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wraparound from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.
You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.
Should those ETuple functions be a part of the ETuple class?
The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.
You can avoid a few additions here:
- for i from 0 <= i < 2*m._nonzero by 2:
-     degree += m._data[i+1]
+ for i in range(1, 2*m._nonzero, 2):
+     degree += m._data[i]
You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.
quotient(by_var) knows its return type is a list since interred returns a list.
I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).
comment:8 in reply to: ↑ 7 ; followup: ↓ 12 Changed 19 months ago by
Dear Travis,
first of all: Thank you for your thoughtful comments!
Replying to tscrim:
Something that should give you better speed would be having it take another argument of the starting point to do the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it would give a decent speedup.
Right, I should try that.
I think this might also give you better C code (and hence, be faster):
if l == 1:
    return <ETuple> L[0]
if l == 2:
    return (<ETuple> L[0]).eadd(<ETuple> L[1])
Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it. Anyway, here is the resulting C code:
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268 * if l==1: * m1 = L[0] * return L[0] # <<<<<<<<<<<<<< * if l==2: * m1,m2=L */ __Pyx_XDECREF(((PyObject *)__pyx_r)); if (unlikely(__pyx_v_L == Py_None)) { PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable"); __PYX_ERR(0, 268, __pyx_L1_error) } __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error) __Pyx_GOTREF(__pyx_t_2); if (!(likely(((__pyx_t_2) == Py_None)  likely(__Pyx_TypeTest(__pyx_t_2, __pyx_ptype_4sage_5rings_10polynomial_8polydict_ETuple))))) __PYX_ERR(0, 268, __pyx_L1_error) __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2); __pyx_t_2 = 0; goto __pyx_L0;
versus
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268 * if l==1: * m1 = L[0] * return <ETuple>L[0] # <<<<<<<<<<<<<< * if l==2: * m1,m2=L */ __Pyx_XDECREF(((PyObject *)__pyx_r)); if (unlikely(__pyx_v_L == Py_None)) { PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable"); __PYX_ERR(0, 268, __pyx_L1_error) } __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error) __Pyx_GOTREF(__pyx_t_2); __Pyx_INCREF(((PyObject *)((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2))); __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2); __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0; goto __pyx_L0;
Is the second one better?
Does L need to be a list (i.e., not an array of ETuple's)? This would mean less typecasting.
Can you give me a pointer on how to use arrays of Cython types?
It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).
I also use append and pop. How to do these with arrays?
It is faster to do if not Id: than if len(Id)==0:, even in Cython.
Cool! I thought that, again, they were equivalent. But the code becomes a lot shorter:
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292 * cdef int e * # First, the easiest cases: * if len(Id)==0: # <<<<<<<<<<<<<< * return PR(1) * cdef ETuple m = Id[1] */ if (unlikely(__pyx_v_Id == Py_None)) { PyErr_SetString(PyExc_TypeError, "object of type 'NoneType' has no len()"); __PYX_ERR(0, 292, __pyx_L1_error) } __pyx_t_2 = PyList_GET_SIZE(__pyx_v_Id); if (unlikely(__pyx_t_2 == ((Py_ssize_t)1))) __PYX_ERR(0, 292, __pyx_L1_error) __pyx_t_3 = ((__pyx_t_2 == 0) != 0); if (__pyx_t_3) {
versus
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292 * cdef int e * # First, the easiest cases: * if not Id: # <<<<<<<<<<<<<< * return PR(1) * cdef ETuple m = Id[1] */ __pyx_t_2 = (__pyx_v_Id != Py_None)&&(PyList_GET_SIZE(__pyx_v_Id) != 0); __pyx_t_3 = ((!__pyx_t_2) != 0); if (__pyx_t_3) {
If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wraparound from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.
I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?
You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.
So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related to another point below.
Should those ETuple functions be a part of the ETuple class?
Probably. Since ETuple is designed for monomials anyway, it makes sense to have monomial divisibility checks there and not here.
The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.
Almost. I get
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110 * cdef int exp1 * cdef ETuple result * for i from 0 <= i < 2*m1._nonzero by 2: # <<<<<<<<<<<<<< * if m1._data[i] == index: * result = <ETuple>m1._new() */ __pyx_t_1 = (2 * __pyx_v_m1>_nonzero); for (__pyx_v_i = 0; __pyx_v_i < __pyx_t_1; __pyx_v_i+=2) { ... }
versus
/* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110 * cdef int exp1 * cdef ETuple result * for i in range(0,2*m1._nonzero,2): # <<<<<<<<<<<<<< * if m1._data[i] == index: * result = <ETuple>m1._new() */ __pyx_t_1 = (2 * __pyx_v_m1>_nonzero); __pyx_t_2 = __pyx_t_1; for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=2) { __pyx_v_i = __pyx_t_3; ... }
So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.
You can avoid a few additions here:
- for i from 0 <= i < 2*m._nonzero by 2:
-     degree += m._data[i+1]
+ for i in range(1, 2*m._nonzero, 2):
+     degree += m._data[i]
Right, good point!
You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.
Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.
One thing, though: At some point, one needs to check whether w is None or not. By having a massive code duplication, I could achieve that this check is only done once, namely in first_hilbert_series. Currently, it is checked frequently, and the frequency wouldn't be reduced with your suggestion of a small code duplication.
quotient(by_var) knows its return type is a list since interred returns a list.
Right.
I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).
This is related to the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.

But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such a data structure is used and from where I can get inspiration?
comment:9 followup: ↓ 13 Changed 19 months ago by
Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdef-typed, a cpdef method wouldn't be less efficient than a cdef method, right?
comment:10 followup: ↓ 11 Changed 19 months ago by
1) One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there is the Hilbert series of R/I, where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.
2) I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.
comment:11 in reply to: ↑ 10 Changed 19 months ago by
Replying to Winfried:
1) One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there is the Hilbert series of R/I, where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.
Right. In most cases, I actually use the wording "Hilbert series of I" and "Poincaré series of R/I", which I guess isn't standard, but at least it makes the difference clear. But I think in my comments above I haven't been sufficiently careful with these wordings.
EDIT: Note that Bigatti uses the notation "<I>", which coincides with the Hilbert-Poincaré series of R/I. But that's just a notation, not a name. It seems to me that in practical applications, one most commonly has an ideal I and wants to compute the Hilbert-Poincaré series of R/I, not of "I as a module". Hence, from a practical aspect, it actually makes sense to use a function name such as "hilb" or "hilbert_numerator" and so on for a function that tells something about R/I but takes I as an input.
2) I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.
That's unfortunate. I really need the Hilbert/Poincaré series of R/I.
comment:12 in reply to: ↑ 8 ; followup: ↓ 14 Changed 19 months ago by
Replying to SimonKing:
first of all: Thank you for your thoughtful comments!
No problem. Sorry I am not always able to find the time to review your code.
Replying to tscrim:
I think this might also give you better C code (and hence, be faster):
if l == 1:
    return <ETuple> L[0]
if l == 2:
    return (<ETuple> L[0]).eadd(<ETuple> L[1])

Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it. Anyway, here is the resulting C code: [snip] Is the second one better?
Yes, you can see that it does not do the type check (which could result in a segfault if the result is not the correct type).
Does L need to be a list (i.e., not an array of ETuple's)? This would mean less typecasting.

Can you give me a pointer on how to use arrays of Cython types?
Right, you can't have an array of cdef classes. Forgot about that. There is a way around that by having an array of pointers to the objects:
https://stackoverflow.com/questions/33851333/cython-how-do-you-create-an-array-of-cdef-class
It is just really annoying to have to do all of the typecasting.
It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

I also use append and pop. How to do these with arrays?
I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.
If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wraparound from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?
Id[-1] is a wraparound: normal C would just do a pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and not check if the result is negative (wraparound check) or outside the list (bounds check).
You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.
So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related with another point below.
Yes, if you want to squeeze out as much speed as possible. It is more work and makes the code a bit more obscure, but it should give you some more speed.
The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

Almost. I get [snip] So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.
Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.
You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.
It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.
One thing, though: At some point, one needs to check whether w is None or not. By having a massive code duplication, I could achieve that this check is only done once, namely in first_hilbert_series. Currently, it is checked frequently, and the frequency wouldn't be reduced with your suggestion of a small code duplication.
I see; I didn't think it was going to be that bad overall. However, it's not a big issue for me the way things are now.
I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).

This is related to the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.
Yes, that is correct, but the struct could hold onto a Sage polynomial just as well. The reason for doing this is both cleaner and faster code by avoiding the dict.
But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such data structure is used and from where I can get inspiration?
Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.
comment:13 in reply to: ↑ 9 Changed 19 months ago by
Replying to SimonKing:
Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdef-typed, a cpdef method wouldn't be less efficient than a cdef method, right?
No, a cpdef called from a Cython file is just a C function call IIRC. So it should not be any less efficient from within Cython code (although I think it can introduce a bit of extra overhead when doing a Python function call compared to a plain def).
sage: %%cython
....: def foo():
....:     pass
....: cpdef cpfoo():
....:     pass
....: cdef cfoo():
....:     pass
....: def test():
....:     foo()
....: def ctest():
....:     cfoo()
....: def cptest():
....:     cpfoo()
....: cpdef cptestcp():
....:     cpfoo()
....:
sage: %timeit test()
10000000 loops, best of 3: 34.9 ns per loop
sage: %timeit ctest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptestcp()
10000000 loops, best of 3: 23.3 ns per loop
(These timings are relatively consistent across multiple runs for me.)
comment:14 in reply to: ↑ 12 ; followups: ↓ 17 ↓ 18 Changed 19 months ago by
Just some questions to make sure I understand your suggestions:
Replying to tscrim:
I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?
Id[-1] is a wraparound: normal C would just do a pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and not check if the result is negative (wraparound check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bounds check is not needed?
How to avoid a bounds check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?
Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.
In that case I should change it.
It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.
I do require tuple. Would list be better?
Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?
Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.
Maybe go an easy middle way, and have some
cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult    # or some flint boilerplate instead of object
    cdef object LeftFHS  # or some flint boilerplate instead of object
The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?
comment:15 followup: ↓ 16 Changed 19 months ago by
PS, concerning "pool":
I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?
It may actually be an idea to equip ETuple with a pool/kill list. In that way, MPolynomial_polydict would probably suck slightly less, and the code here might also be a little faster.
comment:16 in reply to: ↑ 15 Changed 19 months ago by
Replying to SimonKing:
PS, concerning "pool":
I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?
Found it: sage.misc.allocator
comment:17 in reply to: ↑ 14 Changed 19 months ago by
Replying to SimonKing:
How to avoid a bounds check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?
Since I am sure about bounds being correct, I guess I can use <ETuple>PyList_GET_ITEM(...).
comment:18 in reply to: ↑ 14 ; followup: ↓ 19 Changed 19 months ago by
Replying to SimonKing:
Just some questions to make sure I understand your suggestions:
Replying to tscrim:
I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?
Yes, that is correct. If you want to do an explicit check in there that can raise an error, you can do <ETuple?>. (I believe this is supposed to be different than letting Cython raise an error when not given the correct type, but I'm not sure about that, Jeroen?)
Id[-1] is a wraparound: normal C would just do a pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and not check if the result is negative (wraparound check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bounds check is not needed?
You have to also tell Cython not to do those checks. You can do this at the level of the file by adding # cython: boundscheck=False, wraparound=False as the first line(s) of the file (see, e.g., matrix/args.pyx) or the decorators @cython.boundscheck(False) and @cython.wraparound(False) on the appropriate functions (see, e.g., combinat/combinat_cython.pyx).
How to avoid a bounds check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?
See above.
Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.
In that case I should change it.
I bet that will make Jeroen happy. :)
It seems like they want to be separate functions to me. Plus you could then require w to be a list for the signature.

I do require tuple. Would list be better?
I couldn't remember and was guessing. tuple is perfectly fine.
Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?
I don't see why that one has to be split. It is only degree and quotient_degree that should be split IMO. For example, in HilbertBaseCase you would have
- for m2 in Id:
-     if m is not m2:
-         Factor *= (1 - t**quotient_degree(m2,m,w))
+ if w is None:
+     for m2 in Id:
+         if m is not m2:
+             Factor *= (1 - t**quotient_degree(m2,m))
+ else:
+     for m2 in Id:
+         if m is not m2:
+             Factor *= (1 - t**quotient_degree_graded(m2,m,w))
In some cases, this will result in fewer if checks (unless the compiler is optimizing those out). In others, this will be no worse than before.
Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.
Maybe go an easy middle way, and have some
cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult    # or some flint boilerplate instead of object
    cdef object LeftFHS  # or some flint boilerplate instead of object

The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?
Yes, it should be much faster (at that scale) and require less memory than a dict (hash tables are generally at most 80% full IIRC, but you have to store both the key (which would be a string) and the value). I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.
Yes, you can use PyList_GET_ITEM, which I think is even a bit faster.
comment:19 in reply to: ↑ 18 ; followup: ↓ 20 Changed 19 months ago by
Replying to tscrim:
I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.
I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.
comment:20 in reply to: ↑ 19 Changed 19 months ago by
Replying to SimonKing:
Replying to tscrim:
I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.
I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.
Ah, I see. Yea, that might be good to do. Although the use of the pool could end up having a fair bit of overhead when not having a lot of ETuple's that would be GC'ed. I haven't actually looked though.
comment:21 Changed 19 months ago by
It seems to me that really for i in range(s,e,b) is a bit slower than for i from s<=i<e by b. My main example becomes about 2% slower. See also this little benchmark:
sage: cython(""" ....: def test1(size_t m): ....: cdef size_t i,j ....: j = 0 ....: for i in range(0,m,2): ....: j += i ....: return j ....: def test2(size_t m): ....: cdef size_t i,j ....: j = 0 ....: for i from 0<=i<m by 2: ....: j += 1 ....: return j ....: """) sage: %timeit test1(10000) The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.18 µs per loop sage: %timeit test1(10000) 100000 loops, best of 3: 3.1 µs per loop sage: %timeit test2(10000) The slowest run took 4.26 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 1.85 µs per loop sage: %timeit test2(10000) 1000000 loops, best of 3: 1.86 µs per loop
So, if the loop itself is timed, one sees that the for ... from ... by is double the speed of for ... in range(...).
I guess I should use the deprecated syntax and post the above benchmark on cython-users, shouldn't I?
comment:22 Changed 19 months ago by
Ouch. I had a typo in my benchmark (j+=i versus j+=1). Without the typo, the timing of the test functions is basically equal. So, probably you are right that the additional assignments are optimized out by the compiler.
I don't know if the 2% difference in the real world example is significant.
comment:23 Changed 19 months ago by
 Commit changed from 827f579781c66fae075ce2cee83942d02cdd4f1e to 7698bb6b6f99d19466f40723597d71de105856cd
Branch pushed to git repo; I updated commit sha1. New commits:
7698bb6  Some code optimizations in Hilbert series computation

comment:24 Changed 19 months ago by
The last commit did some optimizations. What I want to try next:
- Use flint boilerplate
- Count the number of allocations and deallocations of ETuple, and perhaps play with a pool.
comment:25 followup: ↓ 26 Changed 19 months ago by
I wonder about
FLINT_DLL void fmpz_poly_mul(fmpz_poly_t res, const fmpz_poly_t poly1, const fmpz_poly_t poly2);
Is it legal that res coincides with poly1 or with poly2?
comment:26 in reply to: ↑ 25 Changed 19 months ago by
Replying to SimonKing:
I wonder about
FLINT_DLL void fmpz_poly_mul(fmpz_poly_t res, const fmpz_poly_t poly1, const fmpz_poly_t poly2);Is it legal that res coincides with poly1 or with poly2?
Probably yes. In the sources, I found checks like
if (res == poly1 || res == poly2)...
which indicate that res==poly1 is allowed.
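Assuming aliasing is indeed allowed, an in-place accumulation would look like this sketch (the cimport paths are the usual Sage flint declarations at the time; treat their exact locations as assumptions):

from sage.libs.flint.types cimport fmpz_poly_t
from sage.libs.flint.fmpz_poly cimport (fmpz_poly_init, fmpz_poly_clear,
                                        fmpz_poly_set_coeff_si, fmpz_poly_mul)

cdef fmpz_poly_t acc, f
fmpz_poly_init(acc)
fmpz_poly_init(f)
fmpz_poly_set_coeff_si(acc, 0, 1)  # acc = 1
fmpz_poly_set_coeff_si(f, 1, 1)    # f = t
fmpz_poly_mul(acc, acc, f)         # acc *= f, with res aliasing poly1
fmpz_poly_clear(f)
fmpz_poly_clear(acc)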
comment:27 Changed 19 months ago by
 Commit changed from 7698bb6b6f99d19466f40723597d71de105856cd to 8ae070bc006c6f4953b631f6605d7dd15fff74b0
Branch pushed to git repo; I updated commit sha1. New commits:
8ae070b  Use flint polynomial boilerplate for Hilbert series computation

comment:28 Changed 19 months ago by
In the latest commit, I am using boilerplate for flint polynomials. It brings a drastic improvement, apparently a factor of two compared with the previous commit, and there seems to be no memory leak:
sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: get_memory_usage()
7366.8203125
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 145 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875
sage: %timeit res = hilbert_poincare_series(I,grading)
10 loops, best of 3: 144 ms per loop
sage: get_memory_usage()
7438.82421875
It might be interesting to compare with Singular in examples that Singular can manage:
sage: R = PolynomialRing(QQ,'x',25)
sage: M = matrix(R,5,R.gens())
sage: I = R*[m.lm() for m in (R*(M^2).list()).groebner_basis()]
sage: I.hilbert_numerator()
t^30 + 24*t^29 - 275*t^28 + 2000*t^27 - 10350*t^26 + 40480*t^25 - 123971*t^24 + 303600*t^23 - 600774*t^22 + 960600*t^21 - 1222811*t^20 + 1182944*t^19 - 740749*t^18 + 18472*t^17 + 687275*t^16 - 1105688*t^15 + 1161279*t^14 - 963728*t^13 + 671803*t^12 - 393000*t^11 + 179220*t^10 - 51176*t^9 + 503*t^8 + 6024*t^7 - 1400*t^6 - 576*t^5 + 275*t^4 + 24*t^3 - 25*t^2 + 1
sage: first_hilbert_series(I)
t^30 + 24*t^29 - 275*t^28 + 2000*t^27 - 10350*t^26 + 40480*t^25 - 123971*t^24 + 303600*t^23 - 600774*t^22 + 960600*t^21 - 1222811*t^20 + 1182944*t^19 - 740749*t^18 + 18472*t^17 + 687275*t^16 - 1105688*t^15 + 1161279*t^14 - 963728*t^13 + 671803*t^12 - 393000*t^11 + 179220*t^10 - 51176*t^9 + 503*t^8 + 6024*t^7 - 1400*t^6 - 576*t^5 + 275*t^4 + 24*t^3 - 25*t^2 + 1
so, the result is still correct. However, comparing with Singular via pexpect, I get
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 120 ms per loop
versus
sage: Is = singular(I)
sage: singular.eval('attrib({},"isSB",1)'.format(Is.name()))
''
sage: %timeit singular.eval(hv.name()+'=hilb({},1)'.format(Is.name()))
10 loops, best of 3: 163 ms per loop
Conclusion: The new implementation not only avoids the 32-bit limitation of Singular but seems to be faster than Singular. So, meanwhile the implementation is competitive.
Next, I will see if a pool for ETuple makes sense, and in any case I will move some cdef functions to cpdef methods of ETuple.
comment:29 Changed 19 months ago by
I just tested: In the main example (the one in the attachment to this ticket), there are 93382 allocations and the same number of deallocations of ETuples. That sounds like a lot; however, it doesn't contribute much time:
sage: cython(""" ....: from sage.rings.polynomial.polydict cimport ETuple ....: def test(): ....: cdef ETuple m ....: cdef int i ....: for i in range(93000): ....: m = ETuple.__new__(ETuple) ....: """ ....: ) sage: %timeit test() 100 loops, best of 3: 3.03 ms per loop
So, the prospect of saving less than 3ms in an example that takes 145 ms isn't enough to justify the implementation of a pool. Do you agree?
comment:30 Changed 19 months ago by
You probably are not doing enough creations and deletions with the same amount in memory. Probably the bigger time is spent in allocating the data list in the ETuple, which cannot reasonably^{1} be put in a pool since those lengths almost certainly change with each (re)allocation. So you are not really gaining much from that. Perhaps it is worthwhile to try a bigger example, say something that takes 1-2 seconds, and see if you still get a 2% savings or if there is just little change.
^{1} Of course, it is possible, but I am pretty sure you have to write your own memory manager.
comment:31 followup: ↓ 34 Changed 19 months ago by
One other general comment: IMO it is still useful to add a little bit of documentation via a docstring explaining what functions do even when they are cdef (you do not need to doctest them, however).
comment:32 followup: ↓ 35 Changed 19 months ago by
Just a general question: do you think it would be easy and worthwhile to pull the main Hilbert series computation code out into a standalone Cython package library (that would depend only on flint)?
One last comment: Please remove this global definition:
# Global definition
PR = PolynomialRing(ZZ,'t')
t = PR('t')
I see no reason to nail it in memory.
comment:33 Changed 19 months ago by
 Commit changed from 8ae070bc006c6f4953b631f6605d7dd15fff74b0 to d997abe64d3526cc3741f49bce5592158044c7c8
Branch pushed to git repo; I updated commit sha1. New commits:
d997abe  Fix a corner case in Hilbert series computation

comment:34 in reply to: ↑ 31 Changed 19 months ago by
Replying to tscrim:
One other general comment: IMO it is still useful to add a little bit of documentation via a docstring explaining what functions do even when they are cdef (you do not need to doctest them, however).
Yes, documentation should be added. And another test, for the corner case that is fixed in the previous commit.
comment:35 in reply to: ↑ 32 Changed 19 months ago by
Replying to tscrim:
Just a general question: do you think it would be easy and worthwhile to pull the main Hilbert series computation code out into a standalone Cython package library (that would depend only on flint)?

Not totally sure. What I need is ETuple, which is defined in Sage. So, unless ETuple is pulled out as well, it makes no sense. Apart from ETuple though, I think it would make sense: If I see that correctly, the only place where I use more than flint is in hilbert_poincare_series, in which I currently just do a division of two flint polynomials in Sage.
One last comment: Please remove this global definition:
# Global definition
PR = PolynomialRing(ZZ,'t')
t = PR('t')

I see no reason to nail it in memory.
Done in the previous commit.
comment:36 Changed 19 months ago by
Ouch. The previous commit was supposed to fix the corner case of the zero ideal, but I made a mistake, so that now the corner case of the one ideal became wrong. Will be fixed next, together with more documentation.
comment:37 Changed 19 months ago by
 Commit changed from d997abe64d3526cc3741f49bce5592158044c7c8 to 5af3439fea2d996e84ceb1190c863b5a58d82509
Branch pushed to git repo; I updated commit sha1. New commits:
5af3439  Fix the erroneous fix of some corner cases

comment:38 Changed 19 months ago by
 Commit changed from 5af3439fea2d996e84ceb1190c863b5a58d82509 to 138f85b3c9ad61aab1116b5acac2062f175fd298
Branch pushed to git repo; I updated commit sha1. New commits:
138f85b  Turn some functions of polynomial.hilbert to methods of ETuple

comment:39 Changed 19 months ago by
The latest commit turns some cdef functions into cpdef methods.
The effect is not good: the computation time went up from about 145 ms to about 190 ms:
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 192 ms, sys: 0 ns, total: 192 ms
Wall time: 191 ms
(8*t^26 + 24*t^25 - 16*t^24 - 16*t^23 + 24*t^22 - 8*t^21 - t^7 - t^6 + t^5 - t^4 - t^3 - t^2 + t - 1)/(t^13 - 3*t^12 + 2*t^11 + 2*t^10 - 3*t^9 + t^8 - t^5 + 3*t^4 - 2*t^3 - 2*t^2 + 3*t - 1)
sage: %crun hilbert_poincare_series(I,grading)
PROFILE: interrupts/evictions/bytes = 20/0/8288
Using local file /home/king/Sage/git/sage/local/bin/python2.
Using local file /home/king/.sage/temp/klap/5581/tmp_fZKweM.perf.
Total: 20 samples
0 0.0% 0.0% 19 95.0% PyEval_EvalCode
0 0.0% 0.0% 19 95.0% PyEval_EvalCodeEx
0 0.0% 0.0% 19 95.0% PyEval_EvalFrameEx
0 0.0% 0.0% 19 95.0% PyObject_Call
0 0.0% 0.0% 19 95.0% PyRun_FileExFlags
0 0.0% 0.0% 19 95.0% PyRun_SimpleFileExFlags
0 0.0% 0.0% 19 95.0% PyRun_StringFlags
0 0.0% 0.0% 19 95.0% Py_Main
0 0.0% 0.0% 19 95.0% __Pyx_PyObject_Call
0 0.0% 0.0% 19 95.0% __libc_start_main
0 0.0% 0.0% 19 95.0% __pyx_pf_4sage_5rings_10polynomial_7hilbert_2hilbert_poincare_series (inline)
0 0.0% 0.0% 19 95.0% __pyx_pf_4sage_5rings_10polynomial_7hilbert_first_hilbert_series (inline)
0 0.0% 0.0% 19 95.0% __pyx_pw_4sage_5rings_10polynomial_7hilbert_1first_hilbert_series
0 0.0% 0.0% 19 95.0% __pyx_pw_4sage_5rings_10polynomial_7hilbert_3hilbert_poincare_series
0 0.0% 0.0% 19 95.0% _start
0 0.0% 0.0% 19 95.0% call_function (inline)
0 0.0% 0.0% 19 95.0% exec_statement (inline)
0 0.0% 0.0% 19 95.0% ext_do_call (inline)
0 0.0% 0.0% 19 95.0% fast_function (inline)
0 0.0% 0.0% 19 95.0% function_call
0 0.0% 0.0% 19 95.0% run_mod (inline)
0 0.0% 0.0% 11 55.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_make_children (inline)
0 0.0% 0.0% 9 45.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_interred
0 0.0% 0.0% 8 40.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_indivisible_in_list (inline)
8 40.0% 40.0% 8 40.0% __pyx_f_4sage_5rings_10polynomial_8polydict_6ETuple_divides
0 0.0% 40.0% 5 25.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_quotient_by_var (inline)
0 0.0% 40.0% 4 20.0% __pyx_f_4sage_5rings_10polynomial_7hilbert_HilbertBaseCase.isra.41
0 0.0% 40.0% 3 15.0% PyObject_RichCompare
1 5.0% 45.0% 3 15.0% __pyx_pw_4sage_5rings_10polynomial_8polydict_6ETuple_11__getitem__
0 0.0% 45.0% 3 15.0% __pyx_sq_item_4sage_5rings_10polynomial_8polydict_ETuple
0 0.0% 45.0% 3 15.0% fmpz_poly_mul
0 0.0% 45.0% 2 10.0% __Pyx_PyObject_Call (inline)
0 0.0% 45.0% 2 10.0% __Pyx_PyObject_CallOneArg
0 0.0% 45.0% 2 10.0% __pyx_pf_4sage_5rings_10polynomial_8polydict_6ETuple_10__getitem__ (inline)
1 5.0% 50.0% 2 10.0% _fmpz_poly_mul_tiny1
2 10.0% 60.0% 2 10.0% convert_3way_to_object (inline)
0 0.0% 60.0% 2 10.0% do_richcmp (inline)
0 0.0% 60.0% 2 10.0% try_3way_to_rich_compare (inline)
0 0.0% 60.0% 1 5.0% 0x00007fe83c01f837
1 5.0% 65.0% 1 5.0% PyInt_FromLong
0 0.0% 65.0% 1 5.0% PyNumber_Remainder
0 0.0% 65.0% 1 5.0% PyObject_RichCompareBool
1 5.0% 70.0% 1 5.0% _PyObject_New
0 0.0% 70.0% 1 5.0% __Pyx_PyFunction_FastCallDict.constprop.73
0 0.0% 70.0% 1 5.0% __Pyx_PyFunction_FastCallNoKw (inline)
0 0.0% 70.0% 1 5.0% __Pyx_PyInt_As_size_t.part.41
0 0.0% 70.0% 1 5.0% __Pyx_PyInt_From_int (inline)
0 0.0% 70.0% 1 5.0% __Pyx_PyNumber_IntOrLong (inline)
1 5.0% 75.0% 1 5.0% __Pyx_TypeTest (inline)
0 0.0% 75.0% 1 5.0% __Pyx__PyObject_CallOneArg
0 0.0% 75.0% 1 5.0% __libc_calloc
0 0.0% 75.0% 1 5.0% __pyx_f_4sage_5rings_10polynomial_8polydict_6ETuple_weighted_degree
0 0.0% 75.0% 1 5.0% __pyx_pf_4sage_5rings_10polynomial_8polydict_6ETuple_18__richcmp__ (inline)
1 5.0% 80.0% 1 5.0% __pyx_pf_4sage_5rings_7integer_7Integer_122__int__ (inline)
0 0.0% 80.0% 1 5.0% __pyx_pw_4sage_5rings_10polynomial_8polydict_6ETuple_19__richcmp__
0 0.0% 80.0% 1 5.0% __pyx_pw_4sage_5rings_7integer_7Integer_123__int__
1 5.0% 85.0% 1 5.0% _fmpz_vec_zero
1 5.0% 90.0% 1 5.0% _init
1 5.0% 95.0% 1 5.0% _int_malloc
0 0.0% 95.0% 1 5.0% binary_op (inline)
1 5.0% 100.0% 1 5.0% binary_op1 (inline)
0 0.0% 100.0% 1 5.0% build_sortwrapper (inline)
0 0.0% 100.0% 1 5.0% flint_calloc
0 0.0% 100.0% 1 5.0% fmpz_poly_init2
0 0.0% 100.0% 1 5.0% listindex
0 0.0% 100.0% 1 5.0% listsort
What is convert_3way_to_object?
Do you see why the computation became A LOT slower? Some of the functions that I moved from hilbert to ETuple methods used to be "cdef inline", but are now "cpdef". Is it possible that the removal of "inline" was bad?
comment:40 Changed 19 months ago by
Also note that there is a compiler warning. If I understand correctly, it comes from the line
L.sort(key=ETuple.unweighted_degree)
How else should I force sorting by unweighted degree?
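One possible workaround (a sketch, assuming unweighted_degree becomes a cpdef method as discussed above): hand sort() an ordinary Python-level function instead of the unbound cpdef method.

def _unweighted_degree_key(ETuple m):
    # thin def wrapper so that sort() receives a plain function object
    return m.unweighted_degree()

L.sort(key=_unweighted_degree_key)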
comment:41 Changed 19 months ago by
PS: Of course I should add documentation. But first I want to know why the code became so much slower.
comment:42 Changed 19 months ago by
I was wondering if perhaps the slowness came from NOT defining the parent of the first Hilbert series globally:
sage: cython(""" ....: from sage.all import ZZ ....: P = ZZ['t'] ....: t = P.gen() ....: def test1(i): ....: return t**i ....: """) sage: cython(""" ....: def test2(i): ....: from sage.all import ZZ ....: P = ZZ['t'] ....: t = P.gen() ....: return t**i ....: """) sage: %timeit test1(50) The slowest run took 29.39 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 1.83 µs per loop sage: %timeit test2(50) The slowest run took 4.76 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 42.2 µs per loop
So, that's a lot. But the slowdown can be reduced:
sage: cython(""" ....: def test3(i): ....: from sage.all import ZZ,PolynomialRing ....: P = PolynomialRing(ZZ,'t') ....: t = P.gen() ....: return t**i ....: """) sage: %timeit test3(50) The slowest run took 4.88 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 24.6 µs per loop
Perhaps I should change my code accordingly.
comment:43 Changed 19 months ago by
Without a change:
sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %timeit first_hilbert_series(I,grading)
1 loop, best of 3: 177 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 176 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 176 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 179 ms per loop
With PolynomialRing(ZZ,'t') instead of ZZ['t']:
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 171 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
10 loops, best of 3: 173 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 172 ms per loop
sage: %timeit hilbert_poincare_series(I,grading)
1 loop, best of 3: 172 ms per loop
So, it really doesn't help much. And globally defining the polynomial ring doesn't help either.
Hence, I believe the slowness comes from the cpdef methods or the notinlining.
comment:44 Changed 19 months ago by
I know that if the compiler knows that we are dealing with an ETuple, a cpdef method should have the same speed as a cdef function. But in fact it does not. With cdef methods:
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 168 ms, sys: 43 µs, total: 168 ms
Wall time: 168 ms
(-8*t^26 - 24*t^25 + 16*t^24 + 16*t^23 - 24*t^22 + 8*t^21 + t^7 + t^6 - t^5 + t^4 + t^3 + t^2 - t + 1)/(-t^13 + 3*t^12 - 2*t^11 - 2*t^10 + 3*t^9 - t^8 + t^5 - 3*t^4 + 2*t^3 + 2*t^2 - 3*t + 1)
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 111 ms per loop
sage: %timeit first_hilbert_series(I,grading)
10 loops, best of 3: 158 ms per loop
With cpdef methods:

sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 178 ms, sys: 7 µs, total: 178 ms
Wall time: 177 ms
(-8*t^26 - 24*t^25 + 16*t^24 + 16*t^23 - 24*t^22 + 8*t^21 + t^7 + t^6 - t^5 + t^4 + t^3 + t^2 - t + 1)/(-t^13 + 3*t^12 - 2*t^11 - 2*t^10 + 3*t^9 - t^8 + t^5 - 3*t^4 + 2*t^3 + 2*t^2 - 3*t + 1)
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 128 ms per loop
sage: %timeit first_hilbert_series(I,grading)
1 loop, best of 3: 171 ms per loop
So, we have a slowdown of about 11-15%. Therefore I'd prefer to use cdef methods where possible (i.e., the only exception is unweighted_degree, which I use for sorting).
Next plan: Try to test w is None less often, by a moderate code duplication.
comment:45 Changed 19 months ago by
 Commit changed from 138f85b3c9ad61aab1116b5acac2062f175fd298 to e900222fb9b43aa1470790f1e9d903bb0270dcb3
Branch pushed to git repo; I updated commit sha1. New commits:
e900222  Hilbert series: Remove unnecessary bound checks, simplify code branches

comment:46 Changed 19 months ago by
In the new commit, I introduce separate methods for weighted and unweighted degrees, and use it so that checks appear less often. Also, I turn off bounds checks and wraparounds in the weighted versions of the methods (in fact I do check the correct length of the tuple of weights, which means that inside of the loops the bounds are guaranteed to be correct).
It made the code faster again:
sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj") sage: %timeit first_hilbert_series(I) 10 loops, best of 3: 109 ms per loop sage: %timeit first_hilbert_series(I,grading) 10 loops, best of 3: 132 ms per loop
From my perspective, I am almost done: There remains the documentation. Or do you see further issues?
comment:47 Changed 19 months ago by
 Commit changed from e900222fb9b43aa1470790f1e9d903bb0270dcb3 to f0e5c8585489cdf3a80e3ee161561184bb648408
Branch pushed to git repo; I updated commit sha1. New commits:
f0e5c85  Add documentation to new ETuple methods

comment:48 Changed 19 months ago by
I have added some documentation for the cdef methods.
From my perspective, the code is good to go, and I think also good to be used in #20145.
comment:49 followup: ↓ 51 Changed 19 months ago by
 Branch changed from u/SimonKing/Hilbert_functions_unlimited to u/tscrim/hilbert_functions26243
 Commit changed from f0e5c8585489cdf3a80e3ee161561184bb648408 to 2d29b9b164fd1e429466c882ace499a238ecf02c
 Reviewers set to Travis Scrimshaw
Somewhat surprising to me:
sage: %%cython
....: def test1(list L):
....:     list(L)
....: def test2(list L):
....:     L[:len(L)]
sage: %timeit test1(L)
10000000 loops, best of 3: 101 ns per loop
sage: %timeit test2(L)
10000000 loops, best of 3: 88.2 ns per loop
So I changed all of the list(L) to L[:len(L)].
I implemented a custom version of median since the other implementation was in Python and not as optimized.
All of my other changes are PEP8 and formatting. I used your #~ to denote lines that were expanded out using fmpz_*.
If my changes are good (I got another ~15% speed gain; see below), then positive review. We can fix/improve things on followup tickets if necessary.
sage: from sage.rings.polynomial.hilbert import first_hilbert_series, hilbert_poincare_series
sage: I,grading = load("/home/uqtscrim/Downloads/hilbert_example.sobj")
sage: %timeit first_hilbert_series(I)
10 loops, best of 3: 60.9 ms per loop
vs your branch
10 loops, best of 3: 71.7 ms per loop
New commits:
2d29b9b  Using slices, custom median, and formatting/PEP8 tweaks to Hilbert code.

comment:50 followup: ↓ 52 Changed 19 months ago by
Actually, one more little thing:
(-HP) / (-((1-PR.gen())**I.ring().ngens()))
Don't those minus signs cancel out?
comment:51 in reply to: ↑ 49 Changed 19 months ago by
 Status changed from needs_review to positive_review
Replying to tscrim:
Somewhat surprising to me:

sage: %%cython
....: def test1(list L):
....:     list(L)
....: def test2(list L):
....:     L[:len(L)]
sage: %timeit test1(L)
10000000 loops, best of 3: 101 ns per loop
sage: %timeit test2(L)
10000000 loops, best of 3: 88.2 ns per loop

So I changed all of the list(L) to L[:len(L)].
That's interesting. I thought it was something that Cython would detect (it is known that L is a list at that point).
I implemented a custom version of median since the other implementation was in Python and not as optimized.
Good idea.
All of my other changes are PEP8 and formatting.
You forgot one change: You added get_exp and used it to replace the former item access of ETuple. Why is that faster?
If my changes are good (I got another ~15% speed gain; see below),
That is really amazing. I wouldn't have believed that so much additional speed can be obtained so easily (customised median, list copying, and item access).
then positive review. We can fix/improve things on followup tickets if necessary.
The changes look good to me.
comment:52 in reply to: ↑ 50 ; followup: ↓ 55 Changed 19 months ago by
Replying to tscrim:
Actually, one more little thing:
(-HP) / (-((1-PR.gen())**I.ring().ngens()))

Don't those minus signs cancel out?
That's tricky:
sage: P.<t> = ZZ[]
sage: (t-1)/(t^2+1)
(t - 1)/(t^2 + 1)
sage: (-t+1)/(-t^2-1)
(-t + 1)/(-t^2 - 1)
versus
sage: P.<t> = QQ[]
sage: (t-1)/(t^2+1)
(t - 1)/(t^2 + 1)
sage: (-t+1)/(-t^2-1)
(t - 1)/(t^2 + 1)
I think the first Hilbert series, having integer coefficients, should live in ZZ[t], not QQ[t]. And I also think the output should be normalised.
comment:53 followup: ↓ 56 Changed 19 months ago by
For fun, I was playing with a pool for ETuple. And in fact one gets a quite substantial additional speedup. But currently, it is suffering from a memory leak.
Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?
comment:54 Changed 19 months ago by
Question: Is Py_TPFLAGS_HAVE_GC set on ETuple? I thought it is only set if something has attributes that are objects. But if I see that correctly, all attributes of ETuple are either int* or size_t.
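One way to check this from a running session (Py_TPFLAGS_HAVE_GC is bit 14 of tp_flags in CPython, so this just reads the type's flags):

sage: from sage.rings.polynomial.polydict import ETuple
sage: bool(ETuple.__flags__ & (1 << 14))  # True iff Py_TPFLAGS_HAVE_GC is set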
comment:55 in reply to: ↑ 52 Changed 19 months ago by
Replying to SimonKing:
Replying to tscrim:
Actually, one more little thing:
(-HP) / (-((1-PR.gen())**I.ring().ngens()))

Don't those minus signs cancel out?
I think the first Hilbert series, having integer coefficients, should live in ZZ[t], not QQ[t]. And I also think the output should be normalised.
I am assuming you mean Frac(ZZ[t]) and QQ(t). Generally, Frac(ZZ[t]) has better normalization because not all coefficients are units (i.e., it uses the gcd). I agree that the leading coefficients would be better normalized to positive values for ZZ, and I don't know why they are not. However, these two expressions are considered by Sage to be equal:
sage: P.<t> = ZZ[]
sage: f = (t-1)/(t^2+1)
sage: g = (-t+1)/(-t^2-1)
sage: f == g
True
sage: hash(f) == hash(g)
True
The issue stems from the fact that there are special reductions done for fraction fields of polynomial rings over a field, but not for Frac(ZZ[t]). It is a bit unfortunate perhaps that a similar process is not attempted for at least negative signs.
comment:56 in reply to: ↑ 53 ; followup: ↓ 57 Changed 19 months ago by
Replying to SimonKing:
For fun, I was playing with a pool for ETuple. And in fact one gets a quite substantial additional speedup. But currently, it is suffering from a memory leak.
Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?
Another ticket.
comment:57 in reply to: ↑ 56 ; followup: ↓ 58 Changed 19 months ago by
Replying to tscrim:
Replying to SimonKing:
For fun, I was playing with a pool for ETuple. And in fact one gets a quite substantial additional speedup. But currently, it is suffering from a memory leak.
Question: Should we discuss this HERE (as the ticket isn't closed yet)? Or should I create a new ticket for it (as the ticket has a positive review)?
Another ticket.
See #26291
comment:58 in reply to: ↑ 57 Changed 19 months ago by
Replying to SimonKing:
Replying to tscrim:
Another ticket.
See #26291
Sorry, for some reason I got the timings wrong. I think I was comparing the timing WITH GRADING without the pool to the timing WITHOUT GRADING with the pool. After trying the timings again, I found that the pool made it slower (although that might come from the memory leak).
Anyway, I think one can forget about #26291.
comment:59 Changed 19 months ago by
I forgot to answer why I separated out get_exp: it is because I wanted explicit input/output types and to avoid the extra check for whether the argument is a slice.
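A toy illustration of that design choice (hypothetical code, not ETuple's actual data layout): the typed cpdef method avoids both the slice check and the boxing that the generic __getitem__ has to do.
sage: %%cython
....: cdef class SparseExponents:
....:     cdef dict _d                        # toy storage: index -> exponent
....:     def __init__(self, d):
....:         self._d = dict(d)
....:     def __getitem__(self, i):
....:         # Generic access: must cope with slices as well.
....:         if isinstance(i, slice):
....:             raise TypeError("slices not supported in this toy")
....:         return self._d.get(i, 0)
....:     cpdef int get_exp(self, int i):
....:         # Typed input and output; no slice check on the fast path.
....:         return self._d.get(i, 0)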
comment:60 followups: ↓ 61 ↓ 63 Changed 19 months ago by
 Status changed from positive_review to needs_work
There are quite a lot of test failures with negative exponents, like
File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient Failed example: P.monomial_quotient(x, y) # Note the wrong result Expected: x*y^1 Got: x*y^18446744073709551615
comment:61 in reply to: ↑ 60 Changed 19 months ago by
Replying to vbraun:
There are quite a lot of test failures with negative exponents, like
File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient Failed example: P.monomial_quotient(x, y) # Note the wrong result Expected: x*y^1 Got: x*y^18446744073709551615
Oops. I think in some places I was assuming that exponents are nonnegative.
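Indeed, the printed exponent is just -1 wrapped around as an unsigned 64-bit integer:
sage: (-1) % 2^64
18446744073709551615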
comment:62 Changed 19 months ago by
Without the changes to ETuple, one gets
sage: P.<x,y> = QQ[]
sage: P.monomial_quotient(x, y)
x*y^65535
So, is it perhaps a bug that it used to work? After all, we are talking about polynomial rings here.
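(For comparison, 65535 is -1 wrapped to 16 bits, presumably reflecting the exponent packing used in that implementation:)
sage: (-1) % 2^16
65535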
comment:63 in reply to: ↑ 60 Changed 19 months ago by
Replying to vbraun:
There are quite a lot of test failures with negative exponents, like
File "src/sage/rings/polynomial/multi_polynomial_ring.py", line 632, in sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict.monomial_quotient Failed example: P.monomial_quotient(x, y) # Note the wrong result Expected: x*y^1 Got: x*y^18446744073709551615
Note the comment: "Note the wrong result". In other words, x*y^-1 is wrong, and P.monomial_quotient(x, y) is wrong usage.
comment:64 Changed 19 months ago by
I don't see how any of the changes here could change that doctest. I am investigating.
Note that you need to create the correct class:
sage: from sage.rings.polynomial.multi_polynomial_ring import MPolynomialRing_polydict_domain
sage: P.<x,y,z> = MPolynomialRing_polydict_domain(QQ, 3, order='degrevlex')
sage: R.<x,y,z> = QQ[]
sage: type(P)
<class 'sage.rings.polynomial.multi_polynomial_ring.MPolynomialRing_polydict_domain_with_category'>
sage: type(R)
<type 'sage.rings.polynomial.multi_polynomial_libsingular.MPolynomialRing_libsingular'>
comment:65 Changed 19 months ago by
Ah, it is probably the creation of get_exp that caused this. My guess is that the input i is not a size_t but something else that indexing works with. At least, that is the only thing I see in my commit that could have led to the patchbot failures.
comment:66 Changed 19 months ago by
 Commit changed from 2d29b9b164fd1e429466c882ace499a238ecf02c to 07ff2e1627ea16a9921dde8ffea053e597630043
Branch pushed to git repo; I updated commit sha1. New commits:
07ff2e1  Reverting use of get_exp for __getitem__.

comment:67 Changed 19 months ago by
Okay, I just reverted using get_exp in __getitem__. So there is some code duplication, but get_exp will be faster when you know your input is an int. Thus I think it is useful to keep, but I wanted it to match the data structure (which is why I changed it from a size_t input). I also saw a few doc/PEP8 things I missed on my previous pass and added those in too. The failing tests from the patchbot now pass for me.
comment:68 Changed 19 months ago by
 Status changed from needs_work to needs_review
comment:69 followup: ↓ 70 Changed 19 months ago by
Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials? This, the comment on the failing test, and the "ETuple-free" code in comment:63 all indicate that negative exponents are wrong usage.
So, what could we do about it?
An obvious change (on a new ticket, of course) would be to change ETuple._data from int* to size_t*.
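For concreteness, the kind of declaration change meant here, sketched from memory of the attributes in polydict.pxd (a hypothetical, untested patch):
cdef class ETuple:
    cdef size_t _length       # number of variables
    cdef size_t _nonzero      # number of nonzero exponents
    cdef size_t* _data        # was: cdef int* _data; (index, exponent) pairs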
comment:70 in reply to: ↑ 69 ; followup: ↓ 71 Changed 19 months ago by
Replying to SimonKing:
Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?
Too bad. ETuple *is* used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are nonnegative, though. The assumption is explicitly documented in some places.
comment:71 in reply to: ↑ 70 Changed 19 months ago by
Replying to SimonKing:
Replying to SimonKing:
Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?
Too bad. ETuple *is* used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are nonnegative, though. The assumption is explicitly documented in some places.
I think the assumption is documented well enough where it is used, so it is fine. If someone needs the negatives, then they can change it to, e.g., Py_ssize_t.
comment:72 Changed 19 months ago by
Are the tests passing? Can this go back to positive review?
comment:73 followup: ↓ 74 Changed 19 months ago by
Tests are passing for me. From #20145, don't you need another commit?
comment:74 in reply to: ↑ 73 ; followup: ↓ 75 Changed 19 months ago by
Replying to tscrim:
Tests are passing for me. From #20145, don't you need another commit?
Let me rephrase, do you want the last commit on #20145 here?
comment:75 in reply to: ↑ 74 Changed 19 months ago by
Replying to tscrim:
Replying to tscrim:
Tests are passing for me. From #20145, don't you need another commit?
Let me rephrase, do you want the last commit on #20145 here?
No, why? I have occasionally been told that one should keep tickets small. In this case, there is one ticket for a new implementation of Hilbert series, and one ticket (#20145) for using the new implementation (and also adding an implementation of the Hilbert polynomial) in existing methods.
Additionally, since you ask whether I need the two commits from #20145: for my own applications, I just need the commits from here.
comment:76 followup: ↓ 77 Changed 19 months ago by
Okay, I wasn't sure whether the last commit from #20145 would be considered a bugfix belonging to this ticket or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.
comment:77 in reply to: ↑ 76 Changed 19 months ago by
Replying to tscrim:
Okay, I wasn't sure whether the last commit from #20145 would be considered a bugfix belonging to this ticket or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.
No, it wasn't a bug, at least not a bug here. I thought it was needed to amend the sign in #20145, which I did in the first commit; but in fact it is not needed, and thus I reverted the sign amendment in the second commit.
comment:78 Changed 19 months ago by
I see. Thanks for explaining. This is ready to be set back to positive review.
comment:80 Changed 19 months ago by
 Commit changed from 07ff2e1627ea16a9921dde8ffea053e597630043 to 5637e07941cab4d3c8ea35b0b35b94150ed0a58c
 Status changed from positive_review to needs_review
comment:81 Changed 19 months ago by
 Status changed from needs_review to positive_review
A little stupid trivial fix.
comment:82 Changed 19 months ago by
 Branch changed from u/tscrim/hilbert_functions26243 to 5637e07941cab4d3c8ea35b0b35b94150ed0a58c
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
Provide an implementation of Hilbert series without Singular's limitations
Provide documentation for Hilbert series computation