Opened 7 years ago

Closed 7 years ago

#18585 closed defect (fixed)

Comparison of sparse polynomials

Reported by: bruno Owned by:
Priority: major Milestone: sage-6.9
Component: commutative algebra Keywords: polynomials
Cc: Merged in:
Authors: Bruno Grenet Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: c88f1b0 (Commits, GitHub, GitLab) Commit: c88f1b06b429f179520c9694a62924c9889577b0
Dependencies: Stopgaps:

Status badges

Description

Currently, the default _cmp_ is written with dense polynomials in mind since it creates an xrange of size the degree of the polynomial. For sparse polynomials which can have very high degree, this is a problem:

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: x^2^100 == x^2^100
Traceback (most recent call last)
...
OverflowError: Python int too large to convert to C long

The patch introduces a _cmp_ method in the class Polynomial_generic_sparse to get rid of this problem:

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: x^2^100 == x^2^100
True

Note, as a side comment, that the comparison is faster for polynomials that are indeed sparse with the new code, while the timings are equivalent for quite dense polynomials.

Without the patch:

sage: %timeit x^100 == x^100
1000 loops, best of 3: 224 µs per loop

With the patch:

sage: %timeit x^100 == x^100
The slowest run took 7.61 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 24.6 µs per loop

Change History (26)

comment:1 Changed 7 years ago by git

  • Commit set to 421bd041f4084e8da89ee27f87be0e54e79706d1

Branch pushed to git repo; I updated commit sha1. New commits:

421bd04Add _cmp_ in class Polynomial_generic_sparse

comment:2 Changed 7 years ago by bruno

  • Status changed from new to needs_review

comment:3 Changed 7 years ago by bruno

  • Type changed from PLEASE CHANGE to defect

comment:4 Changed 7 years ago by bruno

  • Status changed from needs_review to needs_work

comment:5 Changed 7 years ago by git

  • Commit changed from 421bd041f4084e8da89ee27f87be0e54e79706d1 to 7c72ce25d8b5a0709abbed60c7526ba63ebde8b5

Branch pushed to git repo; I updated commit sha1. New commits:

7c72ce2Rewritten using lists to correct a bug

comment:6 Changed 7 years ago by bruno

  • Status changed from needs_work to needs_review

There was a bug in my first commit, for some case I didn't think to test. I hope there is no more bug, and I've tested all possibilities.

comment:7 follow-up: Changed 7 years ago by vdelecroix

Hello,

  1. You can replace
    keys1 = list(reversed(sorted(d1.keys())))
    
    with
    keys1 = d1.keys()
    keys1.reverse()
    
  1. Instead of looking at the length, you can do
    if not keys1 and not keys2: return 0
    if not keys1: return -1
    if not keys2: return 1
    
  1. Why the first key is treated differently?

Vincent

comment:8 Changed 7 years ago by git

  • Commit changed from 7c72ce25d8b5a0709abbed60c7526ba63ebde8b5 to 1726058568171c77880ac9d00684ffc00c0b1492

Branch pushed to git repo; I updated commit sha1. New commits:

1726058list(reversed(sorted(...))) replaced by .sort() and .reverse()

comment:9 in reply to: ↑ 7 ; follow-up: Changed 7 years ago by bruno

Replying to vdelecroix:

  1. You can replace
    keys1 = list(reversed(sorted(d1.keys())))
    
    with
    keys1 = d1.keys()
    keys1.reverse()
    

Done, with the added keys1.sort() before the reverse().

  1. Instead of looking at the length, you can do
    if not keys1 and not keys2: return 0
    if not keys1: return -1
    if not keys2: return 1
    

Also done.

  1. Why the first key is treated differently?

This is to comply with the default _cmp_ method for polynomials: for instance -2x^2 > x but x^3 - 2x^2 < x^3 + x. A larger degree polynomial is always greater than a lower degree polynomial, but then when the coefficients are compared, 0 is greater than a negative coefficient.

I add a question: I noticed after your remark than the method exponents() could be improved. This is now implemented as

        return [c[0] for c in sorted(self.__coeffs.iteritems())] 

while it would be faster (from my testing) to replace it by

        keys = self.__coeffs.keys()
        keys.sort()
        return keys

Is that right to make the change in this ticket (since it is a rather minor change) or is it more appropriate to open a new one?

Last edited 7 years ago by bruno (previous) (diff)

comment:10 in reply to: ↑ 9 Changed 7 years ago by vdelecroix

  • Milestone changed from sage-6.8 to sage-6.9
  • Reviewers set to Vincent Delecroix
  • Status changed from needs_review to needs_work

Replying to bruno:

Replying to vdelecroix: I add a question: I noticed after your remark than the method exponents() could be improved. This is now implemented as

        return [c[0] for c in sorted(self.__coeffs.iteritems())] 

while it would be faster (from my testing) to replace it by

        keys = self.__coeffs.keys()
        keys.sort()
        return keys

Is that right to make the change in this ticket (since it is a rather minor change) or is it more appropriate to open a new one?

That's fine with me.

And could you add a doctest like

sage: Rd = PolynomialRing(ZZ, 'x', sparse=False)
sage: Rs = PolynomialRing(ZZ, 'x', sparse=True)
sage: for _ in range(100):
....:     pd = Rd.random_element()
....:     qd = Rd.random_element()
....:     assert cmp(pd,qd) == cmp(Rs(pd), Rs(qd))

comment:11 Changed 7 years ago by git

  • Commit changed from 1726058568171c77880ac9d00684ffc00c0b1492 to c9b2091f17937e9300b4f09c9f0f20aee78d65a3

Branch pushed to git repo; I updated commit sha1. New commits:

efa67ddFasten Polynomial_generic_sparse.exponents()
c9b2091Add doctest

comment:12 Changed 7 years ago by bruno

  • Status changed from needs_work to needs_review

comment:13 Changed 7 years ago by vdelecroix

keys1.sort(); keys1.reverse() -> keys1.sort(reverse=True)

comment:14 follow-up: Changed 7 years ago by vdelecroix

In

            if c > 0:
                c1 = cmp(d1[k1], 0)
                if c1: return c1

Isn't c1 automatically nonzero? Could there be some zero coefficients in the dictionary?

comment:15 Changed 7 years ago by vdelecroix

Instead of comparing with 0 (the Python int zero) it would be better to compare with self.base_ring().zero().

comment:16 Changed 7 years ago by git

  • Commit changed from c9b2091f17937e9300b4f09c9f0f20aee78d65a3 to 3121374eed7b683bd135e9ec5086aac03c4c8eb5

Branch pushed to git repo; I updated commit sha1. New commits:

1bb00ffsort + reverse in a single instruction
3121374Compare with self.base_ring().zero() instead of Python 0

comment:17 in reply to: ↑ 14 ; follow-up: Changed 7 years ago by bruno

Replying to vdelecroix:

In

            if c > 0:
                c1 = cmp(d1[k1], 0)
                if c1: return c1

Isn't c1 automatically nonzero? Could there be some zero coefficients in the dictionary?

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: p = R({0:1, 1:0, 2:1}, check=False)
sage: p
x^2 + 1
sage: p.dict()
{0: 1, 1: 0, 2: 1}

I doubt this happens often in practice, but since this is possible, I am quite reluctant to remove the test.

comment:18 in reply to: ↑ 17 Changed 7 years ago by vdelecroix

Replying to bruno:

Replying to vdelecroix:

In

            if c > 0:
                c1 = cmp(d1[k1], 0)
                if c1: return c1

Isn't c1 automatically nonzero? Could there be some zero coefficients in the dictionary?

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: p = R({0:1, 1:0, 2:1}, check=False)
sage: p
x^2 + 1
sage: p.dict()
{0: 1, 1: 0, 2: 1}

I doubt this happens often in practice, but since this is possible, I am quite reluctant to remove the test.

Dough! I was asking because for sparse matrices it is explicitely mentioned in the specification that no value can be zero. And this is very useful!

comment:19 follow-up: Changed 7 years ago by vdelecroix

Though, multiplication is nicer

sage: (p*p).dict()
{0: 1, 2: 2, 4: 1}

What do you think about opening a ticket about enforcing non-zero values in the dictionary?

Vincent

comment:20 in reply to: ↑ 19 ; follow-up: Changed 7 years ago by bruno

Replying to vdelecroix:

Dough! I was asking because for sparse matrices it is explicitely mentioned in the specification that no value can be zero. And this is very useful!

There is no such specification as for matrices. There is a method __normalize() to remove the zero coefficients, used by default in __init__, and used for instance in the multiplication (since there may be some cancellations).

Replying to vdelecroix:

Though, multiplication is nicer

sage: (p*p).dict()
{0: 1, 2: 2, 4: 1}

What do you think about opening a ticket about enforcing non-zero values in the dictionary?

I am in favor of opening a ticket to add this specification, though I am not in favor of removing the check=False option that can be useful at some places to get (slightly) faster code.

comment:21 in reply to: ↑ 20 ; follow-up: Changed 7 years ago by vdelecroix

Replying to bruno:

Replying to vdelecroix:

Dough! I was asking because for sparse matrices it is explicitely mentioned in the specification that no value can be zero. And this is very useful!

There is no such specification as for matrices. There is a method __normalize() to remove the zero coefficients, used by default in __init__, and used for instance in the multiplication (since there may be some cancellations).

matrix_generic_sparse.pyx, main docstring line 69-71

    A generic sparse matrix is
    represented using a dictionary whose keys are pairs of integers `(i,j)` and
    values in the base ring. The values of the dictionary must never be zero.

Replying to vdelecroix:

Though, multiplication is nicer

sage: (p*p).dict()
{0: 1, 2: 2, 4: 1}

What do you think about opening a ticket about enforcing non-zero values in the dictionary?

I am in favor of opening a ticket to add this specification, though I am not in favor of removing the check=False option that can be useful at some places to get (slightly) faster code.

I now understand that it only due to check=False. In the case check=True (which is the default) zeros are removed. So I would assume that there never are zero value in the dictionary. One only needs to specify it in the documentation.

You could have a look at the implementation of degree. It does not check if the values are zero or not. It just assumes that values are nonzero.

Vincent

comment:22 in reply to: ↑ 21 Changed 7 years ago by bruno

Replying to vdelecroix:

Replying to bruno:

Replying to vdelecroix:

Dough! I was asking because for sparse matrices it is explicitely mentioned in the specification that no value can be zero. And this is very useful!

There is no such specification as for matrices. There is a method __normalize() to remove the zero coefficients, used by default in __init__, and used for instance in the multiplication (since there may be some cancellations).

matrix_generic_sparse.pyx, main docstring line 69-71

    A generic sparse matrix is
    represented using a dictionary whose keys are pairs of integers `(i,j)` and
    values in the base ring. The values of the dictionary must never be zero.

Replying to vdelecroix:

Though, multiplication is nicer

sage: (p*p).dict()
{0: 1, 2: 2, 4: 1}

What do you think about opening a ticket about enforcing non-zero values in the dictionary?

I am in favor of opening a ticket to add this specification, though I am not in favor of removing the check=False option that can be useful at some places to get (slightly) faster code.

I now understand that it only due to check=False. In the case check=True (which is the default) zeros are removed. So I would assume that there never are zero value in the dictionary. One only needs to specify it in the documentation.

Right. I guess we actually agree (though my previous comment was maybe unclear).

You could have a look at the implementation of degree. It does not check if the values are zero or not. It just assumes that values are nonzero.

True. And this is also the case of _repr:

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: R({0:1,1:0}, check=False)
 + 1

So we actually already assume that coefficients in the dictionary are nonzero. I'll add it in the specifications.

Last edited 7 years ago by bruno (previous) (diff)

comment:23 Changed 7 years ago by git

  • Commit changed from 3121374eed7b683bd135e9ec5086aac03c4c8eb5 to c88f1b06b429f179520c9694a62924c9889577b0

Branch pushed to git repo; I updated commit sha1. New commits:

536a7e5Remove useless test
c88f1b0Add (tentative) specification

comment:24 Changed 7 years ago by vdelecroix

  • Status changed from needs_review to positive_review

Looks good to me.

comment:25 Changed 7 years ago by bruno

Thanks for your comments!

comment:26 Changed 7 years ago by vbraun

  • Branch changed from u/bruno/compare_sparse_polynomials to c88f1b06b429f179520c9694a62924c9889577b0
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.