Opened 5 years ago

Last modified 5 years ago

#23819 closed enhancement

Speed up AlternatingSignMatrix.from_corner_sum — at Version 33

Reported by: Martin Rubey Owned by:
Priority: major Milestone: sage-8.1
Component: combinatorics Keywords:
Cc: Jessica Striker, Kevin Dilks Merged in:
Authors: Martin Rubey Reviewers:
Report Upstream: N/A Work issues:
Branch: u/mantepse/speed_up_alternatingsignmatrix_from_corner_sum (Commits, GitHub, GitLab) Commit: 5d4806beace6cd865cfd3897d51ccdf0447a0076
Dependencies: Stopgaps:

Status badges

Description (last modified by Martin Rubey)

old:

sage: A = AlternatingSignMatrices(6)
sage: lC = [a.corner_sum_matrix() for a in A]
sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 31.6 s per loop

new:

sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1) 
1 loops, best of 1: 2.33 s per loop

Change History (33)

comment:1 Changed 5 years ago by Martin Rubey

Branch: u/mantepse/speed_up_alternatingsignmatrix_from_corner_sum

comment:2 Changed 5 years ago by Martin Rubey

Authors: Martin Rubey
Commit: f34394d84ed07c4ed492479136fed334abd14b19
Component: PLEASE CHANGEcombinatorics
Description: modified (diff)
Status: newneeds_review
Type: PLEASE CHANGEenhancement

comment:3 Changed 5 years ago by Martin Rubey

Cc: Jessica Striker added

comment:4 Changed 5 years ago by git

Commit: f34394d84ed07c4ed492479136fed334abd14b19eab4f573aa46c0a46828d1424cdc88e2199d3f15

Branch pushed to git repo; I updated commit sha1. New commits:

eab4f57speed up from_corner_sum

comment:5 Changed 5 years ago by Martin Rubey

Florian Aigner just pointed out to me that a_i,j=c_{i,j}+c_{i-1,j-1}-c_{i-1,j}-c_{i,j-1}. I would expect that

        n = len(list(corner))-1
        asm = [[corner[i+1][j+1]+corner[i][j]-corner[i][j+1]-corner[i+1][j]
                for j in range(n)]
               for i in range(n)]
        return AlternatingSignMatrix(asm)

is the fastest version possible. Curiously, this is not the case, it is even slower than the original version. Why?

comment:6 Changed 5 years ago by git

Commit: eab4f573aa46c0a46828d1424cdc88e2199d3f15cf5643fb93a85d8f6bca44cadd6ca6c841d5016e

Branch pushed to git repo; I updated commit sha1. New commits:

cf5643fsimplify from_corner_sum, use matrices

comment:7 Changed 5 years ago by Martin Rubey

The new version is *slightly* (:-) faster:

sage: A = AlternatingSignMatrices(6)
sage: lC = [a.corner_sum_matrix() for a in A]
sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 3.14159265359...s per loop

8.1.beta4:

sage: timeit("[A.from_corner_sum(corner) for corner in lC]", number=1, repeat=1)
1 loops, best of 1: 31.4159265359...s per loop

comment:8 Changed 5 years ago by John Palmieri

Those can't be real timings.

comment:9 Changed 5 years ago by Martin Rubey

I admit that they are imaginary - but correct to one or two decimal positions.

comment:10 Changed 5 years ago by git

Commit: cf5643fb93a85d8f6bca44cadd6ca6c841d5016e450e6c4d4323cd08234e44cbfa8fe20c4e8488dc

Branch pushed to git repo; I updated commit sha1. New commits:

bca9448speed up initialization code, allow to bypass checking of arguments
450e6c4trivial simplification in height_function

comment:11 Changed 5 years ago by git

Commit: 450e6c4d4323cd08234e44cbfa8fe20c4e8488dcc73fd4354c350e3c75a6d80773298a3223b1b07e

Branch pushed to git repo; I updated commit sha1. New commits:

c73fd43trivial simplification in to_monotone_triangle

comment:12 Changed 5 years ago by Martin Rubey

There is one further, very important optimization, but I feel it belongs into a separate ticket. It consists of the following patch, introducing the possibility to bypass checking in SemistandardTableau:

diff --git a/src/sage/combinat/alternating_sign_matrix.py b/src/sage/combinat/alternating_sign_matrix.py
index 30b5d49..fe07455 100644
--- a/src/sage/combinat/alternating_sign_matrix.py
+++ b/src/sage/combinat/alternating_sign_matrix.py
@@ -938,7 +938,7 @@ class AlternatingSignMatrix(Element):
         for i in range(len(mt)):
             for j in range(len(mt[i])):
                 ssyt[i][j] = mt[j][-(i+1)]
-        return SemistandardTableau(ssyt)
+        return SemistandardTableau(ssyt, check=False)
 
     def left_key(self):
         r"""
diff --git a/src/sage/combinat/tableau.py b/src/sage/combinat/tableau.py
index 377ad9e..418627b 100644
--- a/src/sage/combinat/tableau.py
+++ b/src/sage/combinat/tableau.py
@@ -4137,7 +4137,7 @@ class SemistandardTableau(Tableau):
         ValueError: entries must be positive integers
     """
     @staticmethod
-    def __classcall_private__(self, t):
+    def __classcall_private__(self, t, check=True):
         r"""
         This ensures that a SemistandardTableau is only ever constructed as an
         element_class call of an appropriate parent.
@@ -4156,8 +4156,8 @@ class SemistandardTableau(Tableau):
         """
         if isinstance(t, SemistandardTableau):
             return t
-        elif t in SemistandardTableaux():
-            return SemistandardTableaux_all().element_class(SemistandardTableaux_all(), t)
+        elif not check or t in SemistandardTableaux():
+            return SemistandardTableaux_all().element_class(SemistandardTableaux_all(), t, check=check)
 
         # t is not a semistandard tableau so we give an appropriate error message
         if t not in Tableaux():
@@ -4173,7 +4173,7 @@ class SemistandardTableau(Tableau):
         raise ValueError('%s is not a column strict tableau' % t)
 
 
-    def __init__(self, parent, t):
+    def __init__(self, parent, t, check=True):
         r"""
         Initialize a semistandard tableau.
 
@@ -4197,6 +4197,8 @@ class SemistandardTableau(Tableau):
         """
         super(SemistandardTableau, self).__init__(parent, t)
 
+        if not check:
+            return
         # Tableau() has checked that t is tableau, so it remains to check that
         # the entries of t are positive integers which are weakly increasing
         # along rows

comment:13 Changed 5 years ago by Travis Scrimshaw

Cc: Kevin Dilks added
Status: needs_reviewneeds_work

I do not like general exception catching: it can hide real errors. Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

You also do not need the else in that try-except block.

comment:14 in reply to:  13 ; Changed 5 years ago by Martin Rubey

Replying to tscrim:

I do not like general exception catching: it can hide real errors.

Me neither, but in the case at hand it is necessary. However, I agree that StandardError should be replaced with (TypeError, ValueError) in line 1196. I don't know what would be better to catch in line 1199 though.

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

comment:15 in reply to:  14 Changed 5 years ago by Martin Rubey

I don't know what would be better to catch in line 1199 though.

Just checked, it's also (TypeError, ValueError).

Thanks!

comment:16 Changed 5 years ago by git

Commit: c73fd4354c350e3c75a6d80773298a3223b1b07e2d5a7447c1a5323f9a5ddad1efcf4b17d1623152

Branch pushed to git repo; I updated commit sha1. New commits:

2d5a744replace StandardError with two more specific errors

comment:17 Changed 5 years ago by Martin Rubey

Status: needs_workneeds_review

comment:18 in reply to:  14 ; Changed 5 years ago by Travis Scrimshaw

Replying to mantepse:

Replying to tscrim:

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Jessica, Kevin, do you have any comments?

comment:19 Changed 5 years ago by Jessica Striker

Hi all, I don't have any comments on the specifics, except to say that when I wrote this method originally, I was aiming to write something that worked rather than something that was fast. So I appreciate your efforts to speed it up!

comment:20 Changed 5 years ago by Kevin Dilks

There's some kind of connection issue keeping me and the Patchbots from pulling the code and testing it right now, but I'm fine with everything that I see in the diff file.

comment:21 Changed 5 years ago by git

Commit: 2d5a7447c1a5323f9a5ddad1efcf4b17d162315211484958ccb747af349db836d672e19bdc042221

Branch pushed to git repo; I updated commit sha1. New commits:

1148495remove an else by reviewer's request

comment:22 in reply to:  18 ; Changed 5 years ago by Martin Rubey

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

The hierarchy of classes is here: https://docs.python.org/2/library/exceptions.html#exception-hierarchy

Exception catches a strict superset of StandardError. For example, you can try:

def testit1():
    try:
        l = iter([1,2,3])
        while True:
            print l.next()
    except Exception as e:
        print e
def testit2():
    try:
        l = iter([1,2,3])
        while True:
            print l.next()
    except StandardError as e:
        print e

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

well, the m = self._matrix_space(asm) is also indented. But anyway, I removed the else.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Yes, unless there is an easy way to check, python prefers to try.

comment:23 in reply to:  22 Changed 5 years ago by Travis Scrimshaw

Replying to mantepse:

Also, you should catch Exception as I think StandardError would also catch more serious errors and interrupts (e.g., ctrl-C) that should rise up.

Actually, it is the other way round.

I remember being told that if you want a general exception catching, you should do Exception. Maybe that does everything except interrupts? Maybe I am thinking of a different error class that is bad?

The hierarchy of classes is here: https://docs.python.org/2/library/exceptions.html#exception-hierarchy

Thank you for the reference.

You also do not need the else in that try-except block.

I know, but I think it makes the code much clearer - it is easy to overlook the return just before.

I disagree. It suggests extra logic as opposed to the fact that it should just fall through and the indented except block is the "error handling" portion. Although I would not hold up this ticket because of this.

well, the m = self._matrix_space(asm) is also indented. But anyway, I removed the else.

Thank you.

In a way, I am not usually satisfied using errors to control program flow, but I think this is a necessary evil to differentiate the input in a fast way.

Yes, unless there is an easy way to check, python prefers to try.

That depends on how you define "easy". If you mean in terms of readability and robustness, then the containment check is better IMO. However, the speed-up is clear.

Also, I don't think we also have to worry about 1x1 matrices versus size 1 arrays. However, it is an internal behavior change because it now considers it as a matrix instead of an array.

comment:24 Changed 5 years ago by git

Commit: 11484958ccb747af349db836d672e19bdc04222123433b173f615c7315799458e4a9d5a26efd1b7b

Branch pushed to git repo; I updated commit sha1. New commits:

23433b1add a test for 1x1 matrices

comment:25 Changed 5 years ago by Martin Rubey

You are very attentive! I actually checked the 1x1 matrix, but was too lazy to write a test. Fixed.


New commits:

23433b1add a test for 1x1 matrices

New commits:

23433b1add a test for 1x1 matrices

comment:26 Changed 5 years ago by git

Commit: 23433b173f615c7315799458e4a9d5a26efd1b7b572930cae7aa079cbd28313d0fc83b468f74e957

Branch pushed to git repo; I updated commit sha1. New commits:

572930cbetter construction of matrices, fix a typo

comment:27 Changed 5 years ago by Vincent Delecroix

Instead of

corner = MatrixSpace(ZZ, n+1)(corner)
asm = [[corner[i+1,j+1]+corner[i,j]-corner[i,j+1]-corner[i+1,j]
           for j in range(n)]
            for i in range(n)]
self._matrix_space(asm)

You should use matrix operations

corner = MatrixSpace(ZZ, n+1)(corner)
asm = corner[1:,1:] + corner[:n,:n] - corner[:n,1:] - corner[1:,:n]

As I mentioned in #23819, a Python loop is a bad idea if you want speed. Though matrix slices are not seriously optimized in Sage (we are not using memory views), but still there is a win

sage: n = 5
sage: corner = random_matrix(ZZ, n+1)
sage: %timeit asm = matrix([[corner[i+1,j+1]+corner[i,j]-corner[i,j+1]-corner[i+1,j] for j in range(n)] for i in range(n)])
10000 loops, best of 3: 108 µs per loop
sage: %timeit asm = corner[1:,1:] + corner[:n,:n] - corner[:n,1:] - corner[1:,:n]
10000 loops, best of 3: 34.5 µs per loop

(In numpy it would have been tremendous!)

comment:28 Changed 5 years ago by Martin Rubey

That's brilliant!

comment:29 Changed 5 years ago by git

Commit: 572930cae7aa079cbd28313d0fc83b468f74e9571e5fd41c9520b0298b2c70790c0b220b5b31d004

Branch pushed to git repo; I updated commit sha1. New commits:

1e5fd41use matrix operations in from_corner_sum

comment:30 Changed 5 years ago by Vincent Delecroix

Could you update the timings in the ticket description? (we were at 25.5s with your version)

comment:31 Changed 5 years ago by Vincent Delecroix

This conversion is not needed anymore

self._matrix_space(asm)

If you really feel there should be something, just add the line

assert asm.parent() is self._matrix_space

comment:32 Changed 5 years ago by git

Commit: 1e5fd41c9520b0298b2c70790c0b220b5b31d0045d4806beace6cd865cfd3897d51ccdf0447a0076

Branch pushed to git repo; I updated commit sha1. New commits:

5d4806bremove unnecessary conversion to matrix

comment:33 Changed 5 years ago by Martin Rubey

Description: modified (diff)
Note: See TracTickets for help on using tickets.