Optimize no broken circuit sets

Reported by: Owned by: tkarn tkarn major sage-9.5 matroid theory gsoc2021 optimization matroid no-broken-circuit-sets tscrim, tkarn Trevor K. Karn Travis Scrimshaw N/A fe6d6d6 fe6d6d6f92688ed0811827e88367aec03f91a256

Description

Algorithms in the literature (https://arxiv.org/pdf/1110.0120.pdf) exist to speed up the computation of no broken circuit sets of a matroid. We implement them in this ticket.

Changed 13 months ago by tkarn

Timing comparison of n.b.c. set computation for the braid matroid

comment:1 Changed 13 months ago by tkarn

• Branch set to u/tkarn/optimize-nbc-set-computation-32339
• Commit set to 5b751919903c8c1c3fd73cfd43b1c113de1f561f
• Owner changed from (none) to tkarn

New commits:

 ​5b75191 `Implement faster n.b.c. set computation`

comment:2 Changed 13 months ago by tkarn

• Status changed from new to needs_review

comment:3 Changed 13 months ago by tscrim

I think it would be easier to follow (and also faster) to modify the algorithm to work with Sage's convention. Also, here are some tweaks:

```-        order_dict = {Tmax-i-1:e for i,e in enumerate(ordering)}
-        reverse_dict = {value:key for key,value in order_dict.items()}
+        order_dict = {Tmax-i-1: e for i, e in enumerate(ordering)}
+        reverse_dict = {value: key for key, value in order_dict.items()}

B = list()
Q = deque(([],))
i = 0
while Q:
H = Q.popleft()
-            tp = reverse_dict[H[-1]] if len(H)>0 else -1
-            Ht = [H+[order_dict[i]] for i in range(tp+1,Tmax)]
-            if all(map(self.is_independent, Ht)):
+            tp = reverse_dict[H[-1]] if len(H) > 0 else -1
+            Ht = [H + [order_dict[i]] for i in range(tp+1, Tmax)]
+            if all(self.is_independent(x) for x in Ht):
B.append(frozenset(H))
Q.extend(Ht)
return B
```

The only non-trivial/PEP8 tweak is removing the `map` as I find generator expressions easier to read (although perhaps it is slower and so we shouldn't do it).

Can you also just post the complete tests and timings as a comment in the ticket. Just the result of `%time`/`%timeit` should be sufficient. It makes it easier to see at which stage the timings were conducted and less to navigate.

comment:4 Changed 13 months ago by tkarn

Tests with the original:

```sage: M = matroids.CompleteGraphic(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
25 loops, best of 3: 15.1 ms per loop
sage: M = matroids.Uniform(4,4)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 14.4 μs per loop
sage: M = matroids.Wheel(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 1.1 ms per loop
sage: M = matroids.Uniform(5,5)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 20.2 μs per loop
sage: M = matroids.Uniform(5,7)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 303 μs per loop
sage: M = matroids.CompleteGraphic(6)
sage: timeit.eval('M.no_broken_circuits_sets()')
5 loops, best of 3: 306 ms per loop
sage: M = matroids.PG(2,3)
sage: timeit.eval('M.no_broken_circuits_sets()')
125 loops, best of 3: 6.39 ms per loop
sage: M = matroids.PG(2,5)
sage: timeit.eval('M.no_broken_circuits_sets()')
5 loops, best of 3: 1.98 s per loop
```

Tests using commit 5b75191:

```sage: M = matroids.CompleteGraphic(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
125 loops, best of 3: 2.72 ms per loop
sage: M = matroids.Uniform(4,4)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 56.4 μs per loop
sage: M = matroids.Wheel(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 1.16 ms per loop
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 102 μs per loop
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 286 μs per loop
sage: M = matroids.CompleteGraphic(6)
sage: timeit.eval('M.no_broken_circuits_sets()')
5 loops, best of 3: 37.9 ms per loop
sage: M = matroids.PG(2,3)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 618 μs per loop
sage: M = matroids.PG(2,5)
sage: timeit.eval('M.no_broken_circuits_sets()')
5 loops, best of 3: 42.6 ms per loop
```

So it does seem to vary a bit. In particular, it is not faster for small matroids with a relatively simple structure and I think the speed up for other classes of matroids makes this change worth it.

comment:5 Changed 13 months ago by tkarn

Replacing the `map` with a comprehension seems to break the code unless it is in a list. I don't see why this should be the case (maybe a Cython problem?), but it is. Testing including the list yields:

```sage: M = matroids.CompleteGraphic(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
125 loops, best of 3: 3.72 ms per loop
sage: M = matroids.Uniform(4,4)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 24.5 μs per loop
sage: M = matroids.Wheel(5)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 1.36 ms per loop
sage: M = matroids.Uniform(5,5)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 50.2 μs per loop
sage: M = matroids.Uniform(5,7)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 251 μs per loop
sage: M = matroids.CompleteGraphic(6)
sage: timeit.eval('M.no_broken_circuits_sets()')
25 loops, best of 3: 32 ms per loop
sage: M = matroids.PG(2,3)
sage: timeit.eval('M.no_broken_circuits_sets()')
625 loops, best of 3: 798 μs per loop
sage: M = matroids.PG(2,5)
sage: timeit.eval('M.no_broken_circuits_sets()')
5 loops, best of 3: 67.9 ms per loop
```

This appears noticably slower than with `map` (although still faster than the original), so if its ok, I'd like to keep it as a `map`.

comment:6 Changed 13 months ago by tscrim

• Branch changed from u/tkarn/optimize-nbc-set-computation-32339 to public/matroids/optimize_nbc_set-32339
• Commit changed from 5b751919903c8c1c3fd73cfd43b1c113de1f561f to 16061d2beb5dff3b4b3ef6ddb81c435189fdab60
• Reviewers set to Travis Scrimshaw

Ah, that's right, its a cpdef method so we can't have generator objects.

I made some Cython specific stuff for speed. I also tweaked the algorithm slightly to avoid some of the checks you were doing. Granted, my improvement pales in comparison to your original one, but I still am squeezing a bit more out. Here are the timings in the same order as what you have above:

```1.59 ms ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
12.5 µs ± 138 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
608 µs ± 5.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
27.7 µs ± 322 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
150 µs ± 5.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
12.6 ms ± 35.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
320 µs ± 3.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
15.7 ms ± 156 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

```sage: %timeit M.no_broken_circuits_sets()
1.78 ms ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
23.1 µs ± 256 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
789 µs ± 12.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
46.6 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
209 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
13.8 ms ± 125 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
406 µs ± 2.42 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
16.2 ms ± 42 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

If my changes are good, then positive review.

New commits:

 ​16061d2 `Further Cython and algorithm speedups.`

comment:7 Changed 13 months ago by tscrim

Here is a version that skips some checks and short-circuits the independence check. This gets me another 25%-50% speed reduction:

```1.41 ms ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10.1 µs ± 119 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
429 µs ± 3.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
19.2 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
100 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
11.2 ms ± 60.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
209 µs ± 1.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
13.9 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

comment:8 Changed 13 months ago by git

• Commit changed from 16061d2beb5dff3b4b3ef6ddb81c435189fdab60 to de750fe9634238c2a052150603ce83581daf13e2

Branch pushed to git repo; I updated commit sha1. New commits:

 ​a0a2014 `Avoiding some useless checks in NBC sets.` ​de750fe `Use short-circuiting for the independence check.`

comment:9 Changed 13 months ago by tscrim

Now I feel I have gotten everything out that is possible without significant modifications. Ready for review.

comment:10 follow-up: ↓ 12 Changed 12 months ago by tkarn

I'm slightly worried about disabling bounds checking because of possible crashing and security issues.

On my machine commit de750fe has the following timings:

```625 loops, best of 3: 1.36 ms per loop
625 loops, best of 3: 10.4 μs per loop
625 loops, best of 3: 401 μs per loop
625 loops, best of 3: 17.8 μs per loop
625 loops, best of 3: 87.4 μs per loop
25 loops, best of 3: 9.82 ms per loop
625 loops, best of 3: 183 μs per loop
25 loops, best of 3: 12.8 ms per loop
```

and with bounds checking enabled:

```625 loops, best of 3: 1.38 ms per loop
625 loops, best of 3: 9.85 μs per loop
625 loops, best of 3: 404 μs per loop
625 loops, best of 3: 17.1 μs per loop
625 loops, best of 3: 83.1 μs per loop
25 loops, best of 3: 9.87 ms per loop
625 loops, best of 3: 179 μs per loop
25 loops, best of 3: 13.1 ms per loop
```

I don't see an appreciable difference between bounds-checking enabled and disabled. So if my changes are good, positive review.

Last edited 12 months ago by tkarn (previous) (diff)

comment:11 Changed 12 months ago by git

• Commit changed from de750fe9634238c2a052150603ce83581daf13e2 to 446512e2e3efe74e0218867c2b60092a052ac7a2

Branch pushed to git repo; I updated commit sha1. New commits:

 ​446512e `Re-enable bounds checking`

comment:12 in reply to: ↑ 10 ; follow-up: ↓ 13 Changed 12 months ago by tscrim

• Status changed from needs_review to positive_review

I'm slightly worried about disabling bounds checking because of possible crashing and security issues.

You should not be worried about crashing. We have doctests for the places in the code where it would fail. I don't understand why there would be any security issue (unless you mean something different than system security here).

I don't see an appreciable difference between bounds-checking enabled and disabled. So if my changes are good, positive review.

There are a lot of variations in your timings, and enabling the bounds-checking should not result in a speedup. I am bit suspicious of the reliability, but this is a minor point and we don't need this level of micro-optimization.

Thank you for improving this.

comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 12 months ago by tkarn

I'm slightly worried about disabling bounds checking because of possible crashing and security issues.

You should not be worried about crashing. We have doctests for the places in the code where it would fail. I don't understand why there would be any security issue (unless you mean something different than system security here).

I was just worried about providing a vector for a buffer overflow attack.

I don't see an appreciable difference between bounds-checking enabled and disabled. So if my changes are good, positive review.

There are a lot of variations in your timings, and enabling the bounds-checking should not result in a speedup.

Sorry, I meant that enabling bounds-checking did not appear to slow it down. If there was a big slow down I would have looked more into the security side of things to see if it was ok to disable the bounds checking, but since there was not any apparent slow down, it seemed ok to me to just disable it.

Thanks for the review!

comment:14 in reply to: ↑ 13 ; follow-up: ↓ 16 Changed 12 months ago by tscrim

I'm slightly worried about disabling bounds checking because of possible crashing and security issues.

You should not be worried about crashing. We have doctests for the places in the code where it would fail. I don't understand why there would be any security issue (unless you mean something different than system security here).

I was just worried about providing a vector for a buffer overflow attack.

One thing I missed if you could do really quick. Please remove the `import cython` as it is no longer needed. You can immediately reset back to a positive review once pushed.

comment:15 Changed 12 months ago by git

• Commit changed from 446512e2e3efe74e0218867c2b60092a052ac7a2 to 1f28d9c8da03a5efe66ab29aa50229c1c8d351e2
• Status changed from positive_review to needs_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

 ​1f28d9c `Remove cython import`

comment:16 in reply to: ↑ 14 Changed 12 months ago by tkarn

I'm slightly worried about disabling bounds checking because of possible crashing and security issues.

You should not be worried about crashing. We have doctests for the places in the code where it would fail. I don't understand why there would be any security issue (unless you mean something different than system security here).

I was just worried about providing a vector for a buffer overflow attack.

Fair enough!

One thing I missed if you could do really quick. Please remove the `import cython` as it is no longer needed. You can immediately reset back to a positive review once pushed.

Done.

comment:17 Changed 12 months ago by tscrim

• Status changed from needs_review to positive_review

Thank you.

comment:18 Changed 12 months ago by vbraun

• Status changed from positive_review to needs_work
```**********************************************************************
File "src/sage/algebras/orlik_solomon.py", line 535, in sage.algebras.orlik_solomon.OrlikSolomonInvariantAlgebra
Failed example:
[OSH.lift(b) for b in OSH.basis()]
Expected:
[OS{}, OS{1} + OS{2} + OS{3}, OS{4} + OS{5} + OS{6}]
Got:
[OS{}, OS{4} + OS{5} + OS{6}, OS{1} + OS{2} + OS{3}]
**********************************************************************
```

comment:19 Changed 12 months ago by git

• Commit changed from 1f28d9c8da03a5efe66ab29aa50229c1c8d351e2 to 7b0be1603b92104ecb68fefc740c6551f1e301af

Branch pushed to git repo; I updated commit sha1. New commits:

 ​203d706 `Merge branch 'public/matroids/optimize_nbc_set-32339' of git://trac.sagemath.org/sage into public/matroids/optimize_nbc_set-32339` ​7b0be16 `Updating doctests due to NBC order output changed.`

comment:20 Changed 12 months ago by git

• Commit changed from 7b0be1603b92104ecb68fefc740c6551f1e301af to fe6d6d6f92688ed0811827e88367aec03f91a256

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​fe6d6d6 `Updating doctests due to NBC order output changed.`

comment:21 Changed 12 months ago by tscrim

• Status changed from needs_work to positive_review

Trivial updating the output of the doctests.

comment:22 Changed 11 months ago by vbraun

• Branch changed from public/matroids/optimize_nbc_set-32339 to fe6d6d6f92688ed0811827e88367aec03f91a256
• Resolution set to fixed
• Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.