Opened 5 years ago

Closed 5 years ago

#23819 closed enhancement (fixed)

Speed up AlternatingSignMatrix.from_corner_sum

Reported by: mantepse Owned by:
Priority: major Milestone: sage-8.1
Component: combinatorics Keywords:
Cc: jessicapalencia, kdilks Merged in:
Authors: Martin Rubey Reviewers: Travis Scrimshaw, Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: 5d4806b (Commits, GitHub, GitLab) Commit: 5d4806beace6cd865cfd3897d51ccdf0447a0076
Dependencies: Stopgaps:

Status badges

Description (last modified by mantepse)

old:

sage: A = AlternatingSignMatrices(6)
sage: lC = [a.corner_sum_matrix() for a in A]
sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 31.6 s per loop

new:

sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1) 
1 loops, best of 1: 2.33 s per loop

Change History (37)

comment:1 Changed 5 years ago by mantepse

  • Branch set to u/mantepse/speed_up_alternatingsignmatrix_from_corner_sum

comment:2 Changed 5 years ago by mantepse

  • Authors set to Martin Rubey
  • Commit set to f34394d84ed07c4ed492479136fed334abd14b19
  • Component changed from PLEASE CHANGE to combinatorics
  • Description modified (diff)
  • Status changed from new to needs_review
  • Type changed from PLEASE CHANGE to enhancement

comment:3 Changed 5 years ago by mantepse

  • Cc jessicapalencia added

comment:4 Changed 5 years ago by git

  • Commit changed from f34394d84ed07c4ed492479136fed334abd14b19 to eab4f573aa46c0a46828d1424cdc88e2199d3f15

Branch pushed to git repo; I updated commit sha1. New commits:

eab4f57speed up from_corner_sum

comment:5 Changed 5 years ago by mantepse

Florian Aigner just pointed out to me that a_i,j=c_{i,j}+c_{i-1,j-1}-c_{i-1,j}-c_{i,j-1}. I would expect that

        n = len(list(corner))-1
        asm = [[corner[i+1][j+1]+corner[i][j]-corner[i][j+1]-corner[i+1][j]
                for j in range(n)]
               for i in range(n)]
        return AlternatingSignMatrix(asm)

is the fastest version possible. Curiously, this is not the case, it is even slower than the original version. Why?

comment:6 Changed 5 years ago by git

  • Commit changed from eab4f573aa46c0a46828d1424cdc88e2199d3f15 to cf5643fb93a85d8f6bca44cadd6ca6c841d5016e

Branch pushed to git repo; I updated commit sha1. New commits:

cf5643fsimplify from_corner_sum, use matrices

comment:7 Changed 5 years ago by mantepse

The new version is *slightly* (:-) faster:

sage: A = AlternatingSignMatrices(6)
sage: lC = [a.corner_sum_matrix() for a in A]
sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 3.14159265359...s per loop

8.1.beta4:

sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 31.4159265359...s per loop

comment:8 Changed 5 years ago by jhpalmieri

Those can't be real timings.

comment:9 Changed 5 years ago by mantepse

I admit that they are imaginary - but correct to one or two decimal positions.

comment:10 Changed 5 years ago by git

  • Commit changed from cf5643fb93a85d8f6bca44cadd6ca6c841d5016e to 450e6c4d4323cd08234e44cbfa8fe20c4e8488dc

Branch pushed to git repo; I updated commit sha1. New commits:

bca9448speed up initialization code, allow to bypass checking of arguments
450e6c4trivial simplification in height_function

comment:11 Changed 5 years ago by git

  • Commit changed from 450e6c4d4323cd08234e44cbfa8fe20c4e8488dc to c73fd4354c350e3c75a6d80773298a3223b1b07e

Branch pushed to git repo; I updated commit sha1. New commits:

c73fd43trivial simplification in to_monotone_triangle

comment:12 Changed 5 years ago by mantepse

There is one further, very important optimization, but I feel it belongs into a separate ticket. It consists of the following patch, introducing the possibility to bypass checking in SemistandardTableau:

diff --git a/src/sage/combinat/alternating_sign_matrix.py b/src/sage/combinat/alternating_sign_matrix.py
index 30b5d49..fe07455 100644
--- a/src/sage/combinat/alternating_sign_matrix.py
+++ b/src/sage/combinat/alternating_sign_matrix.py
@@ -938,7 +938,7 @@ class AlternatingSignMatrix(Element):
         for i in range(len(mt)):
             for j in range(len(mt[i])):
                 ssyt[i][j] = mt[j][-(i+1)]
-        return SemistandardTableau(ssyt)
+        return SemistandardTableau(ssyt, check=False)
 
     def left_key(self):
         r"""
diff --git a/src/sage/combinat/tableau.py b/src/sage/combinat/tableau.py
index 377ad9e..418627b 100644
--- a/src/sage/combinat/tableau.py
+++ b/src/sage/combinat/tableau.py
@@ -4137,7 +4137,7 @@ class SemistandardTableau(Tableau):
         ValueError: entries must be positive integers
     """
     @staticmethod
-    def __classcall_private__(self, t):
+    def __classcall_private__(self, t, check=True):
         r"""
         This ensures that a SemistandardTableau is only ever constructed as an
         element_class call of an appropriate parent.
@@ -4156,8 +4156,8 @@ class SemistandardTableau(Tableau):
         """
         if isinstance(t, SemistandardTableau):
             return t
-        elif t in SemistandardTableaux():
-            return SemistandardTableaux_all().element_class(SemistandardTableaux_all(), t)
+        elif not check or t in SemistandardTableaux():
+            return SemistandardTableaux_all().element_class(SemistandardTableaux_all(), t, check=check)
 
         # t is not a semistandard tableau so we give an appropriate error message
         if t not in Tableaux():
@@ -4173,7 +4173,7 @@ class SemistandardTableau(Tableau):
         raise ValueError('%s is not a column strict tableau' % t)
 
 
-    def __init__(self, parent, t):
+    def __init__(self, parent, t, check=True):
         r"""
         Initialize a semistandard tableau.
 
@@ -4197,6 +4197,8 @@ class SemistandardTableau(Tableau):
         """
         super(SemistandardTableau, self).__init__(parent, t)
 
+        if not check:
+            return
         # Tableau() has checked that t is tableau, so it remains to check that
         # the entries of t are positive integers which are weakly increasing
         # along rows

comment:13 follow-up: Changed 5 years ago by tscrim

  • Cc kdilks added
  • Status changed from needs_review to needs_work

I do not like general exception catching: it can hide real errors. Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

You also do not need the else in that try-except block.

comment:14 in reply to: ↑ 13 ; follow-ups: Changed 5 years ago by mantepse

Replying to tscrim:

I do not like general exception catching: it can hide real errors.

Me neither, but in the case at hand it is necessary. However, I agree that StandardError should be replaced with (TypeError, ValueError) in line 1196. I don't know what would be better to catch in line 1199 though.

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

comment:15 in reply to: ↑ 14 Changed 5 years ago by mantepse

I don't know what would be better to catch in line 1199 though.

Just checked, it's also (TypeError, ValueError).

Thanks!

comment:16 Changed 5 years ago by git

  • Commit changed from c73fd4354c350e3c75a6d80773298a3223b1b07e to 2d5a7447c1a5323f9a5ddad1efcf4b17d1623152

Branch pushed to git repo; I updated commit sha1. New commits:

2d5a744replace StandardError with two more specific errors

comment:17 Changed 5 years ago by mantepse

  • Status changed from needs_work to needs_review

comment:18 in reply to: ↑ 14 ; follow-up: Changed 5 years ago by tscrim

Replying to mantepse:

Replying to tscrim:

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Jessica, Kevin, do you have any comments?

comment:19 Changed 5 years ago by jessicapalencia

Hi all, I don't have any comments on the specifics, except to say that when I wrote this method originally, I was aiming to write something that worked rather than something that was fast. So I appreciate your efforts to speed it up!

comment:20 Changed 5 years ago by kdilks

There's some kind of connection issue keeping me and the Patchbots from pulling the code and testing it right now, but I'm fine with everything that I see in the diff file.

comment:21 Changed 5 years ago by git

  • Commit changed from 2d5a7447c1a5323f9a5ddad1efcf4b17d1623152 to 11484958ccb747af349db836d672e19bdc042221

Branch pushed to git repo; I updated commit sha1. New commits:

1148495remove an else by reviewer's request

comment:22 in reply to: ↑ 18 ; follow-up: Changed 5 years ago by mantepse

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

The hierarchy of classes is here: https://docs.python.org/2/library/exceptions.html#exception-hierarchy

Exception catches a strict superset of StandardError. For example, you can try:

def testit1():
    try:
        l = iter([1,2,3])
        while True:
            print l.next()
    except Exception as e:
        print e
def testit2():
    try:
        l = iter([1,2,3])
        while True:
            print l.next()
    except StandardError as e:
        print e

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

well, the m = self._matrix_space(asm) is also indented. But anyway, I removed the else.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Yes, unless there is an easy way to check, python prefers to try.

comment:23 in reply to: ↑ 22 Changed 5 years ago by tscrim

Replying to mantepse:

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

The hierarchy of classes is here: https://docs.python.org/2/library/exceptions.html#exception-hierarchy

Thank you for the reference.

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

well, the m = self._matrix_space(asm) is also indented. But anyway, I removed the else.

Thank you.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Yes, unless there is an easy way to check, python prefers to try.

That depends on how you define "easy". If you mean in terms of readability and robustness, then the containment check is better IMO. However, the speed-up is clear.

Also, I don't think we also have to worry about 1x1 matrices versus size 1 arrays. However, it is an internal behavior change because it now considers it as a matrix instead of an array.

comment:24 Changed 5 years ago by git

  • Commit changed from 11484958ccb747af349db836d672e19bdc042221 to 23433b173f615c7315799458e4a9d5a26efd1b7b

Branch pushed to git repo; I updated commit sha1. New commits:

23433b1add a test for 1x1 matrices

comment:25 Changed 5 years ago by mantepse

You are very attentive! I actually checked the 1x1 matrix, but was too lazy to write a test. Fixed.


New commits:

23433b1add a test for 1x1 matrices

New commits:

23433b1add a test for 1x1 matrices

comment:26 Changed 5 years ago by git

  • Commit changed from 23433b173f615c7315799458e4a9d5a26efd1b7b to 572930cae7aa079cbd28313d0fc83b468f74e957

Branch pushed to git repo; I updated commit sha1. New commits:

572930cbetter construction of matrices, fix a typo

comment:27 Changed 5 years ago by vdelecroix

Instead of

corner = MatrixSpace(ZZ, n+1)(corner)
asm = [[corner[i+1,j+1]+corner[i,j]-corner[i,j+1]-corner[i+1,j]
           for j in range(n)]
            for i in range(n)]
self._matrix_space(asm)

You should use matrix operations

corner = MatrixSpace(ZZ, n+1)(corner)
asm = corner[1:,1:] + corner[:n,:n] - corner[:n,1:] - corner[1:,:n]

As I mentioned in #23819, a Python loop is a bad idea if you want speed. Though matrix slices are not seriously optimized in Sage (we are not using memory views), but still there is a win

sage: n = 5
sage: corner = random_matrix(ZZ, n+1)
sage: %timeit asm = matrix([[corner[i+1,j+1]+corner[i,j]-corner[i,j+1]-corner[i+1,j] for j in range(n)] for i in range(n)])
10000 loops, best of 3: 108 µs per loop
sage: %timeit asm = corner[1:,1:] + corner[:n,:n] - corner[:n,1:] - corner[1:,:n]
10000 loops, best of 3: 34.5 µs per loop

(In numpy it would have been tremendous!)

comment:28 Changed 5 years ago by mantepse

That's brilliant!

comment:29 Changed 5 years ago by git

  • Commit changed from 572930cae7aa079cbd28313d0fc83b468f74e957 to 1e5fd41c9520b0298b2c70790c0b220b5b31d004

Branch pushed to git repo; I updated commit sha1. New commits:

1e5fd41use matrix operations in from_corner_sum

comment:30 Changed 5 years ago by vdelecroix

Could you update the timings in the ticket description? (we were at 25.5s with your version)

comment:31 Changed 5 years ago by vdelecroix

This conversion is not needed anymore

self._matrix_space(asm)

If you really feel there should be something, just add the line

assert asm.parent() is self._matrix_space

comment:32 Changed 5 years ago by git

  • Commit changed from 1e5fd41c9520b0298b2c70790c0b220b5b31d004 to 5d4806beace6cd865cfd3897d51ccdf0447a0076

Branch pushed to git repo; I updated commit sha1. New commits:

5d4806bremove unnecessary conversion to matrix

comment:33 Changed 5 years ago by mantepse

  • Description modified (diff)

comment:34 follow-up: Changed 5 years ago by vdelecroix

Haaa. Looks better now ;-)

comment:35 Changed 5 years ago by tscrim

  • Reviewers set to Travis Scrimshaw, Vincent Delecroix
  • Status changed from needs_review to positive_review

Given the speed we've gained, I am setting this to positive review (unless anyone has any objections).

comment:36 in reply to: ↑ 34 Changed 5 years ago by mantepse

Replying to vdelecroix:

Haaa. Looks better now ;-)

I agree! The new version is much faster and much easier to understand! Thank you (and Florian)!

comment:37 Changed 5 years ago by vbraun

  • Branch changed from u/mantepse/speed_up_alternatingsignmatrix_from_corner_sum to 5d4806beace6cd865cfd3897d51ccdf0447a0076
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.