Opened 3 years ago

Closed 3 years ago

#27146 closed enhancement (fixed)

Speed up initialization code for partitions

Reported by: mantepse Owned by:
Priority: major Milestone: sage-8.7
Component: combinatorics Keywords:
Cc: Merged in:
Authors: Martin Rubey Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 1940f8d (Commits, GitHub, GitLab) Commit: 1940f8d7d7d47a8f02acada9034eeef40e25c5e6
Dependencies: Stopgaps:

Status badges

Description

I'm afraid I am in need of a faster way to create an integer partition. It seems to me that there are two things to improve:

  • the (relatively expensive) check whether the argument to _Partitions is actually a partition is done twice: once in Partition.__init__ and, before that, in Partitions._element_constructor_.
  • it appears that x in NN is expensive.

I do not yet know how to proceed. I think it would be best to have a check keyword argument to allow bypassing checks, but also to speed up the checks themselves.

Change History (55)

comment:1 Changed 3 years ago by mantepse

One bug we might want to fix on the way:

sage: Partition([2,1.0])
[2, 1.00000000000000]

comment:2 Changed 3 years ago by mantepse

Another one:

sage: Partition([2,True])
[2, True]

comment:3 Changed 3 years ago by tscrim

Two things: If you really need that level of speed, you should bypass the _element_constructor_ (as a general rule). Second, fixing those "bugs" (I am not sure I would call them that because of ducktyping) will slow things down more and/or be more restrictive on the input (since True == 1.0 == 1, which is in NN).

However, I agree that we should make the check in one place (at least for DRY). However, for subsets of partitions, there are 2 different checks that do go on. So that needs to be considered carefully. My thought right now would be to remove the check in the Partition.__init__ and assume it is valid by that point. Then we find out if there are any errors and go from there.

comment:4 Changed 3 years ago by mantepse

I think we should have

sage: Partition([2,1.0])
[2, 1]

and

sage: Partition([2,True])
[2, 1]

comment:5 follow-up: Changed 3 years ago by tscrim

It probably is better, but I do not think they count as bugs outright. I was mainly pointing out that enforcing that will slow things down by going through ZZ.

comment:6 Changed 3 years ago by mantepse

  • Branch set to u/mantepse/speed_up_initialization_code_for_partitions

comment:7 Changed 3 years ago by mantepse

  • Authors set to Martin Rubey
  • Commit set to 9d363de5dc95430b43bc7e0344e112f8340f4e95
  • Status changed from new to needs_review

New commits:

9d363despeed up and simplify initialisation code

comment:8 Changed 3 years ago by mantepse

I did not change the behaviour concerning conversion to ZZ, but I followed your suggestion to skip the check in __init__. Here is a comparison:

sage: l = [list(Partitions(10).random_element()) for _ in range(10000)]
sage: %timeit map(Partition, l)
10 loops, best of 3: 95.9 ms per loop
sage: l = [list(Partitions(10).random_element())+[0]*50 for _ in range(1000)]
sage: %timeit map(Partition, l)
10 loops, best of 3: 56.6 ms per loop
sage: l = [list(Partitions(50).random_element()) for _ in range(1000)]
sage: %timeit map(Partition, l)
100 loops, best of 3: 17.4 ms per loop
sage: l = [list(Partitions(50).random_element())+[0]*50 for _ in range(1000)]
sage: %timeit map(Partition, l)
10 loops, best of 3: 67.1 ms per loop
sage: l = [list(Partitions(10).random_element()) for _ in range(10000)]
sage: %timeit map(Partition, l)
10 loops, best of 3: 75.1 ms per loop
sage: l = [list(Partitions(10).random_element())+[0]*50 for _ in range(1000)]
sage: %timeit map(Partition, l)
100 loops, best of 3: 18.2 ms per loop
sage: l = [list(Partitions(50).random_element()) for _ in range(1000)]
sage: %timeit map(Partition, l)
100 loops, best of 3: 9.76 ms per loop
sage: l = [list(Partitions(50).random_element())+[0]*50 for _ in range(1000)]
sage: %timeit map(Partition, l)
10 loops, best of 3: 20.7 ms per loop

Question: to bypass the _element_constructor_ , is

def _make_partition(l):
    return _Partitions.element_class(_Partitions, l)

correct?

comment:9 Changed 3 years ago by mantepse

There is a failing test, which is failing because KBoundedQuotientBases.ParentMethods.__getitem__ contains

c = self._kbounded_partitions.element_class(self._kbounded_partitions, list(c))

without checking that c is actually a partition.

I think that this is a bug, do you agree?

comment:10 Changed 3 years ago by mantepse

I am currently testing the whole library for more calls to Partition.__init__ which rely on checks.

comment:11 Changed 3 years ago by git

  • Commit changed from 9d363de5dc95430b43bc7e0344e112f8340f4e95 to 6699343107b0be33bffa7e6f64c11976d1399edc

Branch pushed to git repo; I updated commit sha1. New commits:

6699343fix non-checking code in __getitem__

comment:12 Changed 3 years ago by git

  • Commit changed from 6699343107b0be33bffa7e6f64c11976d1399edc to 42ee585d4a7c66a297a080c1333559fff84b4ecf

Branch pushed to git repo; I updated commit sha1. New commits:

42ee585further simplification

comment:13 Changed 3 years ago by mantepse

What I did was to raise an Exception in Partition.__init__ whenever mu would not be a weakly decreasing list of nonnegative integers. It was only triggered by the code in k_dual.py. Since this code should not be performance critical (for performance, one would use monomial, not __getitem__), I think it is correct to create the element with checking.

comment:14 Changed 3 years ago by mantepse

Question: I think that __getitem__ only takes one argument. Thus, I think it should actually be as simple as this:

        def __getitem__(self, c):
            if isinstance(c, (int, Integer)):
                c = self._kbounded_partitions([c])
            else:
                c = self._kbounded_partitions(list(c))
            return self.monomial(c)

Do you agree?

comment:15 Changed 3 years ago by tscrim

Your timings in comment:8 do not (necessarily) represent a fair comparison because you have generated random elements (length matters a lot here). You have to use the same elements.

The answer is yes to the question in comment:8.

comment:9 is not a bug because the current code does check it. There is (currently) no hard semantic rule for where the checks should be done.

I agree with comment:14 except I would not wrap c in a list for the second case (e.g., why redo the computation if c is an element of self._kbounded_partitions).

For this change:

-            return len(x) == 0 or (x[-1] in NN and
-                                   all(x[i] in NN and x[i] >= x[i+1] for i in range(len(x)-1)))
+            z = Integer(0)
+            try:
+                return (all(e == Integer(e) >= z for e in x) and
+                        all(e >= x[i+1] for i, e in enumerate(x[:-1])))
+            except TypeError:
+                return False

I disagree with using try-except for the length 0 test. IMO, it is confusing as it does not make it clear what it is testing for and it can hide bugs by catching more than it should. Also e == Integer(e), a priori, is redundant with the coercion framework (although it is now more general because it allows conversions). You are also iterating over the list twice, which I believe is generally slower because of possible short-circuiting. Performance critical code should really use self.element_class to avoid as much indirection as possible.

I think in general we should assume that there is no 0 in the partition, so mu.index(0) will be a much slower check than mu[-1] == 0. In particular, the former has the run over the entire list and deal with the exception.

comment:16 follow-up: Changed 3 years ago by mantepse

Many thanks for your comments! Indeed, using the same list changes the timings a little bit, but not very much.

However, the following is a fair bit faster - it's the fastest of several combinations I tried - it's a bit unfortunate that it is cheaper to test if x than if not x. I'm not sure whether it's better to test the last element for non-negativity first, or weakly decreasing first. Is this correct and OK for you?

    def __contains__(self, x):
        if isinstance(x, Partition):
            return True
        if isinstance(x, (list, tuple)):
            if x:
                try:
                    return (all(Integer(a) >= b for a, b in zip(x, x[1:]))
                            and Integer(x[-1]) >= 0)
                except TypeError:
                    return False
            return True

Tiny question: shouldn't __contains__ return False in case, or is None indeed sufficient? I find returning None a bit unclear, but that's just me.

comment:17 in reply to: ↑ 16 Changed 3 years ago by tscrim

Replying to mantepse:

Many thanks for your comments! Indeed, using the same list changes the timings a little bit, but not very much.

It still should be done (also for memory usage).

However, the following is a fair bit faster - it's the fastest of several combinations I tried - it's a bit unfortunate that it is cheaper to test if x than if not x. I'm not sure whether it's better to test the last element for non-negativity first, or weakly decreasing first. Is this correct and OK for you?

It is faster to test a in ZZ than Integer(a) when the input is an Integer:

sage: %timeit 5 in ZZ
10000000 loops, best of 3: 87.4 ns per loop
sage: %timeit Integer(5)
10000000 loops, best of 3: 121 ns per loop

but that is not the case for non-Integer objects:

sage: %timeit 5.0 in ZZ
100000 loops, best of 3: 2.41 µs per loop
sage: %timeit Integer(5.0)
1000000 loops, best of 3: 1.36 µs per loop
sage: %timeit 5/1 in ZZ
1000000 loops, best of 3: 705 ns per loop
sage: %timeit Integer(5/1)
1000000 loops, best of 3: 479 ns per loop

Moreover, false results are much faster than handling the TypeError:

sage: def test1(x):
....:     return x in ZZ
....: def test2(x):
....:     try:
....:         return Integer(x)
....:     except TypeError:
....:         return False
....:     
sage: %timeit test1(1/2)
1000000 loops, best of 3: 1.22 µs per loop
sage: %timeit test2(1/2)
1000000 loops, best of 3: 1.56 µs per loop

In addition, the same statements about the try-except block still hold here. So I would not want to use that code. Also, be wary of a tautology: you create a partition, check to see if it is in the parent, which then is true because it is a partition.

Tiny question: shouldn't __contains__ return False in case, or is None indeed sufficient? I find returning None a bit unclear, but that's just me.

It should return False.

comment:18 Changed 3 years ago by mantepse

Hi Travis!

I just tried

            return not x or (all(a in ZZ and a >= b for a, b in zip(x, x[1:]))
                             and x[-1] in ZZ and x[-1] >= 0)

but that is significantly slower (by a factor of 1.6, that is) when x is a weakly decreasing list of ints.

On the other hand, if x is a weakly decreasing list of Integers, it is not (significantly) faster (only by a factor of 0.96).

I will check my timings later today, but maybe it's obvious to you why the new version is not better. I do think that weakly decreasing lists of integer-like objects (mostly int and Integer) is the most important input, do you agree?

Aside: in case it is better and I made a mistake, NonNegativeIntegers.__contains__ is not ideal:

    def __contains__(self, elt):
        try:
            i = Integer(elt)
            return  i >= Integer(0) and i == elt
        except TypeError:
            return False

comment:19 Changed 3 years ago by git

  • Commit changed from 42ee585d4a7c66a297a080c1333559fff84b4ecf to 5f7d09de404ff48052b9cd07af4c781b765510dd

Branch pushed to git repo; I updated commit sha1. New commits:

5f7d09dfaster __contains__

comment:20 Changed 3 years ago by mantepse

As promised, a check of my timings. Code:

import six
from six.moves import zip
from sage.rings.integer import Integer
from sage.rings.all import ZZ

def test1(x):
    if x:
        try:
            return (all(Integer(a) >= b for a, b in zip(x, x[1:]))
                    and Integer(x[-1]) >= 0)
        except TypeError:
            return False
    return True

def test2(x):
    return not x or (all((a in ZZ) and (a >= b) for a, b in zip(x, x[1:]))
                     and (x[-1] in ZZ) and (x[-1] >= 0))

Timings:

sage: l = [list(Partitions(10).random_element()) for _ in range(100000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 117 ms per loop
sage: %timeit map(test2, l)
1 loop, best of 3: 227 ms per loop
sage: l = [list(Partitions(10).random_element())+[int(0)]*50 for _ in range(10000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 78.4 ms per loop
sage: %timeit map(test2, l)
1 loop, best of 3: 211 ms per loop
sage: l = [map(Integer, Partitions(10).random_element()+[0]*50) for _ in range(10000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 77 ms per loop
sage: %timeit map(test2, l)
10 loops, best of 3: 63.3 ms per loop
sage: l = [list(Partitions(50).random_element()) for _ in range(10000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 23.9 ms per loop
sage: %timeit map(test2, l)
10 loops, best of 3: 57.1 ms per loop
sage: l = [list(Partitions(50).random_element())+[int(0)]*50 for _ in range(10000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 91.4 ms per loop
sage: %timeit map(test2, l)
1 loop, best of 3: 246 ms per loop
sage: l = [map(Integer, Partitions(50).random_element()+[0]*50) for _ in range(10000)]
sage: %timeit map(test1, l)
10 loops, best of 3: 89.8 ms per loop
sage: %timeit map(test2, l)
10 loops, best of 3: 71.5 ms per loop

Conclusion:

For weakly decreasing lists of integers, test2 is slightly faster if the input is a list of Integers. However, test1 is much faster if the input is a list of ints.

comment:21 in reply to: ↑ 5 Changed 3 years ago by mantepse

Replying to tscrim:

It probably is better, but I do not think they count as bugs outright. I was mainly pointing out that enforcing that will slow things down by going through ZZ.

See also #20147, where the bug is hard to detect visually. With floats it's much easier to see:

sage: p = Partition([1.0])
sage: p.to_exp()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-02f7e7463ddd> in <module>()
----> 1 p.to_exp()

/home/martin/sage-develop/local/lib/python2.7/site-packages/sage/combinat/partition.pyc in to_exp(self, k)
   3608         if len(p) > 0:
   3609             k = max(k, p[0])
-> 3610         a = [ZZ.zero()] * k
   3611         for i in p:
   3612             a[i-1] += 1

TypeError: can't multiply sequence by non-int of type 'sage.rings.real_mpfr.RealLiteral'

comment:22 Changed 3 years ago by tscrim

This is still not really a bug IMO: garbage-in, garbage-out. However, +1 to cast the input to ZZ. It just is likely to incur a (likely very minor) speed penalty.

Anyways, for the testing I think we should assume ZZ inputs, and since that is faster (and more natural to read in the code, including less likely to hide an unrelated error), we should go with that. If you are passing int's in, then you should either be casting them to ZZ yourself, skipping the containment check, or doing your own optimized checks.

comment:23 Changed 3 years ago by mantepse

I think I incorporated all your suggestions.

I noticed one alternative:

Maybe trailing zeros should be removed in Partitions._element_constructor_ rather than in Partition.__init__?

Advantages of the former:

  • all data manipulation happens in one place
  • we avoid copying the partition another time

Advantages of the latter:

  • safer code

The way I have now implemented the removal of trailing zeros seems to be quite efficient if there are hardly any, which I would expect to be the most common case.

comment:24 Changed 3 years ago by git

  • Commit changed from 5f7d09de404ff48052b9cd07af4c781b765510dd to 1b0dd95196c3e756f59d2e40dde607d4d8be4b22

Branch pushed to git repo; I updated commit sha1. New commits:

fea5fbbMerge branch 'u/mantepse/speed_up_initialization_code_for_partitions' of git://trac.sagemath.org/sage into HEAD
1b0dd95efficient Partition.__init__, make parts ZZs, simplify Partitions.__contains__, fix __getitem__ in sage.combinat.sf

comment:25 Changed 3 years ago by git

  • Commit changed from 1b0dd95196c3e756f59d2e40dde607d4d8be4b22 to 2a32a989f19662e83d21c0ac4365119512279e38

Branch pushed to git repo; I updated commit sha1. New commits:

2a32a98fix an error and adapt two doctests

comment:26 Changed 3 years ago by tscrim

Can you do a rebase? I will finish the review afterwards.

comment:27 Changed 3 years ago by git

  • Commit changed from 2a32a989f19662e83d21c0ac4365119512279e38 to 427c5a8ae93743ed8e4ac39cc22463c23fc80514

Branch pushed to git repo; I updated commit sha1. New commits:

427c5a8Merge branch 'u/mantepse/speed_up_initialization_code_for_partitions' of git://trac.sagemath.org/sage into HEAD

comment:28 follow-ups: Changed 3 years ago by tscrim

Thanks. This should be the last little bits.

Did you check this on Python3? The map no longer returns a list for that, so I think there will be a problem with mu[-1]. You might need to wrap that with a list as well.

I think a better error message for the map(ZZ, lst) would be something like

all parts must be nonnegative integers

to be a bit more specific about what is failing (since it is separated out).

Did you try also timing it doing the check

pos = len(mu) - 1
while pos >= 0 and not mu[pos]:
    pos += 1
CombinatorialElement.__init__(self, parent, mu[:pos])

I am just wondering about all of those little temporary lists being created.

For the __getitem__, I think we should allow "fake" integers (e.g., 2/1 in QQ) by checking if c in ZZ. Although I don't think we were entirely sure this was doing the correct things before:

sage: s = SymmetricFunctions(QQ).s()
sage: s[2/1]
s[2]
sage: _.leading_support()[0]
2
sage: _.parent()
Rational Field
sage: list(2/1)
[2]

I believe your current approach will fix this, but we should add a test.

comment:29 Changed 3 years ago by git

  • Commit changed from 427c5a8ae93743ed8e4ac39cc22463c23fc80514 to 9d41f423c8e6063fdcd2ffdc56e360e29542ba14

Branch pushed to git repo; I updated commit sha1. New commits:

9d41f42fix python3 blunder

comment:30 in reply to: ↑ 28 ; follow-up: Changed 3 years ago by mantepse

Replying to tscrim:

Did you check this on Python3? The map no longer returns a list for that, so I think there will be a problem with mu[-1]. You might need to wrap that with a list as well.

fixed! Thank you! (slightly embarassed :-)

I think a better error message for the map(ZZ, lst) would be something like

all parts must be nonnegative integers

to be a bit more specific about what is failing (since it is separated out).

Hm, what we are actually testing at

        try:
            lst = list(map(ZZ, lst))
        except TypeError:
            raise ValueError('%s is not an element of %s'%(repr(lst), self))

is of course that we have an iterable of integers. Nonnegativity is only checked in the individual __contains__.

Is there any good reason that list(map(ZZ, l)) is three times faster than list(map(NN, l))? If not so, is there any good reason to have the entries of partitions in ZZ rather than NN?

comment:31 in reply to: ↑ 28 ; follow-up: Changed 3 years ago by mantepse

Second question:

Replying to tscrim:

For the __getitem__, I think we should allow "fake" integers (e.g., 2/1 in QQ) by checking if c in ZZ. Although I don't think we were entirely sure this was doing the correct things before:

sage: s = SymmetricFunctions(QQ).s()
sage: s[2/1]
s[2]
sage: _.leading_support()[0]
2
sage: _.parent()
Rational Field
sage: list(2/1)
[2]

I believe your current approach will fix this, but we should add a test.

Yes it does. Do you want a test in each of the __getitem_ methods or only in sfa.py for SymmetricFunctionAlgebra_generic?

comment:32 in reply to: ↑ 30 Changed 3 years ago by tscrim

Replying to mantepse:

Replying to tscrim:

I think a better error message for the map(ZZ, lst) would be something like

all parts must be nonnegative integers

to be a bit more specific about what is failing (since it is separated out).

Hm, what we are actually testing at

        try:
            lst = list(map(ZZ, lst))
        except TypeError:
            raise ValueError('%s is not an element of %s'%(repr(lst), self))

is of course that we have an iterable of integers. Nonnegativity is only checked in the individual __contains__.

If it makes you feel better, you can say "must be integers", but that is misleading to the user who sees the error. IMO, it is fine for a weaker test to assert a stronger statement.

Is there any good reason that list(map(ZZ, l)) is three times faster than list(map(NN, l))? If not so, is there any good reason to have the entries of partitions in ZZ rather than NN?

Cython and a lot of optimization. So until NN gets a similar sort of push (which I highly doubt), I would stick with ZZ.

comment:33 in reply to: ↑ 31 ; follow-up: Changed 3 years ago by tscrim

Replying to mantepse:

Yes it does. Do you want a test in each of the __getitem_ methods or only in sfa.py for SymmetricFunctionAlgebra_generic?

Just in sfa.py is fine. It is the generic case and mostly a slight warning for people in the future.

comment:34 follow-up: Changed 3 years ago by vdelecroix

I don't understand part of your change

-            return len(x) == 0 or (x[-1] in NN and
-                                   all(x[i] in NN and x[i] >= x[i+1] for i in range(len(x)-1)))
+            return not x or (all((a in ZZ) and (a >= b) for a, b in zip(x, x[1:]))
+                             and (x[-1] in ZZ) and (x[-1] >= 0))

Calling x[1:] creates a copy of the list and would better be avoided. Use x[i] and x[i+1] as in the former version.

comment:35 follow-up: Changed 3 years ago by vdelecroix

Similarly

+            while mu and not mu[-1]:
+                mu = mu[:-1]
+            CombinatorialElement.__init__(self, parent, mu)

You are creating as many lists as there are trailing zeros. Either you first compute where to cut, or you can replace mu = mu[:-1] with del mu[:-1].

comment:36 in reply to: ↑ 34 ; follow-up: Changed 3 years ago by mantepse

Replying to vdelecroix:

I don't understand part of your change

-            return len(x) == 0 or (x[-1] in NN and
-                                   all(x[i] in NN and x[i] >= x[i+1] for i in range(len(x)-1)))
+            return not x or (all((a in ZZ) and (a >= b) for a, b in zip(x, x[1:]))
+                             and (x[-1] in ZZ) and (x[-1] >= 0))

Calling x[1:] creates a copy of the list and would better be avoided. Use x[i] and x[i+1] as in the former version.

Yes, I know, but unless I made a mistake (quite possible) testing showed that the new version is faster. I will check again.

comment:37 in reply to: ↑ 35 ; follow-up: Changed 3 years ago by vdelecroix

Replying to vdelecroix:

Similarly

+            while mu and not mu[-1]:
+                mu = mu[:-1]
+            CombinatorialElement.__init__(self, parent, mu)

You are creating as many lists as there are trailing zeros. Either you first compute where to cut, or you can replace mu = mu[:-1] with del mu[:-1].

Or even better:

    while mu and not mu[-1]:
        mu.pop()

comment:38 in reply to: ↑ 37 ; follow-up: Changed 3 years ago by mantepse

Replying to vdelecroix:

Or even better:

    while mu and not mu[-1]:
        mu.pop()

That's what I once had, but I think I am not allowed to modify mu at this point.

comment:39 in reply to: ↑ 38 ; follow-up: Changed 3 years ago by vdelecroix

Replying to mantepse:

Replying to vdelecroix:

Or even better:

    while mu and not mu[-1]:
        mu.pop()

That's what I once had, but I think I am not allowed to modify mu at this point.

If that is true (why?) then take a copy before change anything

if mu and not mu[-1]:
    mu = mu[:-1]
    while mu and not mu[-1]:
        mu.pop()

comment:40 in reply to: ↑ 39 Changed 3 years ago by mantepse

If that is true (why?) then take a copy before change anything

if mu and not mu[-1]:
    mu = mu[:-1]
    while mu and not mu[-1]:
        mu.pop()

That's a nice idea! I won't be able to finish this before the weekend, I certainly want to redo the timings.

comment:41 in reply to: ↑ 36 ; follow-up: Changed 3 years ago by mantepse

Replying to mantepse:

Replying to vdelecroix:

I don't understand part of your change ...

I now checked

def test_contains1(x):
    return not x or (all((a in ZZ) and (a >= b) for a, b in zip(x, x[1:]))
                     and (x[-1] in ZZ) and (x[-1] >= 0))


def test_contains2(x):
    return not x or (all((x[i] in ZZ) and (x[i] >= x[i+1]) for i in range(len(x)-1))
                     and (x[-1] in ZZ) and (x[-1] >= 0))

and the second version is slightly slower (roughly by a factor 1.17), even for very long lists x (having more than 1000 elements). I suspect that copying is faster than adding.

comment:42 in reply to: ↑ 33 Changed 3 years ago by mantepse

Yet another question, sorry about that:

I just noticed that map(ZZ, 4/2) returns [2], whereas map(ZZ, 2) and map(ZZ, 4.0) raise errors. This implies that, with this ticket applied, Partition(4/2) returns [2] whereas Partition(2) and Partition(4.0) raise an error.

Do we want that? Otherwise, I assume that we want to make Partition(4/2) raise an error too, but I don't know how to do that. It seems bad to have a special case for QQ.

comment:43 follow-up: Changed 3 years ago by tscrim

It's not our job to protect against every possible bad input. Garbage in, garbage out.

comment:44 follow-up: Changed 3 years ago by tscrim

I also suspect Python does some stuff lazily with list copying, so x[1:] doesn't actually do any copying until you modify the x[1:] list.

comment:45 in reply to: ↑ 43 Changed 3 years ago by mantepse

Replying to tscrim:

It's not our job to protect against every possible bad input. Garbage in, garbage out.

OK, I agree. But then I think I should do

        C = self.basis().keys()
        if not isinstance(c, C.element_class):
            try:
                c = Integer(c)
            except TypeError:
                c = C(c)
            else:
                c = C([c])
        return self.monomial(c)

in all the __getitem__ methods. (Because otherwise s[2/1] succeeds by sheer luck (because elements of QQ are iterable), but s[QQbar(2)] or s[2.0] don't. Do you agree?

comment:46 Changed 3 years ago by tscrim

That is why I said you should check x in ZZ in comment:28.

comment:47 in reply to: ↑ 41 Changed 3 years ago by vdelecroix

Replying to mantepse:

Replying to mantepse:

Replying to vdelecroix:

I don't understand part of your change ...

I now checked

def test_contains1(x):
    return not x or (all((a in ZZ) and (a >= b) for a, b in zip(x, x[1:]))
                     and (x[-1] in ZZ) and (x[-1] >= 0))


def test_contains2(x):
    return not x or (all((x[i] in ZZ) and (x[i] >= x[i+1]) for i in range(len(x)-1))
                     and (x[-1] in ZZ) and (x[-1] >= 0))

and the second version is slightly slower (roughly by a factor 1.17), even for very long lists x (having more than 1000 elements). I suspect that copying is faster than adding.

Also: this is one copy against 1000 additions.

comment:48 in reply to: ↑ 44 ; follow-up: Changed 3 years ago by vdelecroix

Replying to tscrim:

I also suspect Python does some stuff lazily with list copying, so x[1:] doesn't actually do any copying until you modify the x[1:] list.

This is completely wrong. The getitem calls static PyObject * list_subscript(PyListObject* self, PyObject* item) which does the copy. Though it is fast since everything happen at C level (there is no copy of the actual objects in the list, just their pointers). In other words, copying a list of 1000 integers should be the exact same speed as copying a list of 1000 complicated graphs.

comment:49 Changed 3 years ago by git

  • Commit changed from 9d41f423c8e6063fdcd2ffdc56e360e29542ba14 to 1d24606b28b36412fec50c69938c6ab29ee39770

Branch pushed to git repo; I updated commit sha1. New commits:

1d24606fix getitem, improve error message, improve removal of trailing zeros

comment:50 Changed 3 years ago by git

  • Commit changed from 1d24606b28b36412fec50c69938c6ab29ee39770 to 1940f8d7d7d47a8f02acada9034eeef40e25c5e6

Branch pushed to git repo; I updated commit sha1. New commits:

1940f8dfix doctest

comment:51 in reply to: ↑ 48 Changed 3 years ago by mantepse

Replying to vdelecroix:

Replying to tscrim:

I also suspect Python does some stuff lazily with list copying, so x[1:] doesn't actually do any copying until you modify the x[1:] list.

This is completely wrong. The getitem calls static PyObject * list_subscript(PyListObject* self, PyObject* item) which does the copy. Though it is fast since everything happen at C level (there is no copy of the actual objects in the list, just their pointers). In other words, copying a list of 1000 integers should be the exact same speed as copying a list of 1000 complicated graphs.

Thanks for clarifying - then the timings are even more surprising!

comment:52 follow-up: Changed 3 years ago by mantepse

Off topic: doesn't it take longer to copy a long list than a short list?

Actually, copying seems to be roughly linear:

sage: for i in range(1,6):
....:     l = [randint(1,100) for _ in range(10^i)]
....:     %timeit l[1:]
....:     
The slowest run took 12.66 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 791 ns per loop
The slowest run took 9.37 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.37 µs per loop
100000 loops, best of 3: 7.21 µs per loop
10000 loops, best of 3: 70.1 µs per loop
1000 loops, best of 3: 838 µs per loop
Last edited 3 years ago by mantepse (previous) (diff)

comment:53 in reply to: ↑ 52 Changed 3 years ago by vdelecroix

Replying to mantepse:

Off topic: doesn't it take longer to copy a long list than a short list?

Actually, copying seems to be roughly linear:

sage: for i in range(1,6):
....:     l = [randint(1,100) for _ in range(10^i)]
....:     %timeit l[1:]
....:     
The slowest run took 12.66 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 791 ns per loop
The slowest run took 9.37 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.37 µs per loop
100000 loops, best of 3: 7.21 µs per loop
10000 loops, best of 3: 70.1 µs per loop
1000 loops, best of 3: 838 µs per loop

linear is what is expected.

comment:54 Changed 3 years ago by tscrim

  • Reviewers set to Travis Scrimshaw
  • Status changed from needs_review to positive_review

Okay, this is good enough for me. Thanks Martin for your work on this. Thanks Vincent for the explanations.

comment:55 Changed 3 years ago by vbraun

  • Branch changed from u/mantepse/speed_up_initialization_code_for_partitions to 1940f8d7d7d47a8f02acada9034eeef40e25c5e6
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.