#31197 closed enhancement (fixed)

Use bitsets/binary matrix for edges of dense graphs

Reported by: gh-kliem Owned by:
Priority: major Milestone: sage-9.3
Component: graph theory Keywords: dense graphs, bitsets
Cc: dcoudert Merged in:
Authors: Jonathan Kliem Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: 3361fc6 (Commits, GitHub, GitLab) Commit: 3361fc6bf57c87af9bda1ad6da950e6ca36df124
Dependencies: #31200, #31207 Stopgaps:

Status badges

Description (last modified by dcoudert)

Currently, the dense graphs backend redefines bitsets. We change the type to binary_matrix_t.

This mainly simplifies the code and lets dense graphs use future improvements of bitsets, but it also speeds up direct functions with the backend.

Realloc is a bit slower though.

We benefit from the improvements of the binary matrix data structure done in #31200 and #31207, including the addition of reallocation method.

Before:

sage: set_random_seed(0)                                                                                                              
sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)]                                                             
sage: %timeit G = Graph(edges, sparse=False, loops=True)                                                                              
19.4 ms ± 752 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
sage: G = Graph(edges, sparse=False, loops=True)                                                                                      
sage: %timeit _ = G.copy(sparse=True)                                                                                                 
49.2 ms ± 312 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
sage: %timeit _ = G.copy(sparse=False)                                                                                                
6.18 ms ± 5.74 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

After:

sage: set_random_seed(0)                                                                                                                                                            
sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)]                                                                                                           
sage: %timeit G = Graph(edges, sparse=False, loops=True)                                                                                                                            
19.2 ms ± 55.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
sage: G = Graph(edges, sparse=False, loops=True)                                                                                                                                    
sage: %timeit _ = G.copy(sparse=True)                                                                                                                                               
47.3 ms ± 396 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
sage: %timeit _ = G.copy(sparse=False)                                                                                                                                              
3.64 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Change History (22)

comment:1 Changed 11 months ago by dcoudert

I fully agree with these changes. However, may be a first step is to do what has been done for bitsets: use .pxd and .pyx files and no longer .pxi. So split this ticket in 2, one for modifications of the binary matrix data structure (and addition of realloc method), and the other for the modification of the dense graph backend.

comment:2 Changed 11 months ago by gh-kliem

Sure, I was almost expecting this request (maybe not for splitting into pxd and pyx, but this one should be done as well).

comment:3 Changed 11 months ago by dcoudert

We already have binary_matrix.pxd but not binary_matrix.pyx.

comment:4 Changed 11 months ago by gh-kliem

  • Dependencies set to #31200

comment:5 Changed 11 months ago by git

  • Commit changed from 4cb41984954561dcdce29fbe2d38a3481cc8ac59 to 5f125c9889b9731e65c51f1a91bf5ea7995557f3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

09a88f7realloc for binary matrix
60efd17used already imported realloc
b556008do not enforce another typecast
49b21bbmove binary_matrix.pxi to binary_matrix.pxd
e749434be more verbose about return value
8c95a04use mp_bitcnt
1291f04use binary matrix as data structure for edges of dense graph
5f125c9pxi -> pxd

comment:6 Changed 11 months ago by gh-kliem

  • Status changed from new to needs_review

comment:7 Changed 11 months ago by dcoudert

  • Authors set to Jonathan Kliem
  • Reviewers set to David Coudert

If I'm correct, it should be a bit faster to get out-neighbors (unsafe iterator) now. For in-neighbors, I'm not sure if it is faster or slower.

Concerning the realloc, the slowdown might not be a critical issue.

I'm waiting for the patchbot. My tests are OK, but I may have missed something.

comment:8 Changed 11 months ago by dcoudert

  • Status changed from needs_review to needs_work

some failing doctests in src/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx due to the removal of num_longs.

comment:9 Changed 11 months ago by gh-kliem

copy_dense_graph in src/sage/groups/perm_gps/part_ref/refinement_graphs.pyx?

That is the complete wrong place for such a function.

comment:10 Changed 11 months ago by gh-kliem

  • Dependencies changed from #31200 to #31200, #31207

comment:11 Changed 11 months ago by git

  • Commit changed from 5f125c9889b9731e65c51f1a91bf5ea7995557f3 to 704dae744cd3825c103395510c07b7ecf0c1b6a3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e008d42remove redundant import
a7666b1implement copy for binary matrix
c495087use binary matrix as data structure for edges of dense graph
bc8b072pxi -> pxd
704dae7move `copy_dense_graph` to a more appropriate place and fix it

comment:12 Changed 11 months ago by gh-kliem

  • Status changed from needs_work to needs_review

comment:13 follow-up: Changed 11 months ago by dcoudert

I know method copy_dense_graph is unsafe, but it might be better to check whether both matrix have the same size, no ?

comment:14 Changed 11 months ago by git

  • Commit changed from 704dae744cd3825c103395510c07b7ecf0c1b6a3 to e5de60f9fd4ecb09fdab0ed758a5da564c313ed6

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

5d37387declare type of i
88e73aeuse binary matrix as data structure for edges of dense graph
e4bf61fpxi -> pxd
1d0946fmove `copy_dense_graph` to a more appropriate place and fix it
e5de60fprevent segementation fault when using `copy_dense_graph` incorrectly

comment:15 in reply to: ↑ 13 Changed 11 months ago by gh-kliem

Replying to dcoudert:

I know method copy_dense_graph is unsafe, but it might be better to check whether both matrix have the same size, no ?

Sure.

comment:16 Changed 11 months ago by dcoudert

You check active_vertices but not edges. Since the copy method of binary_matrix assumes that both matrix have the same size, we must have a test somewhere to prevent segfaults.

comment:17 Changed 11 months ago by gh-kliem

__init__ and dealloc guarantee this. But if you want, I can add this test, just to make sure no one messes around with the matrix or the active vertices without using the proper methods (or the methods are changed without preserving this property).

comment:18 Changed 11 months ago by dcoudert

Yes, I think it's better to have these tests and the extra cost is small compared to the actual cost of the copy of the binary matrix.

comment:19 Changed 11 months ago by git

  • Commit changed from e5de60f9fd4ecb09fdab0ed758a5da564c313ed6 to 3361fc6bf57c87af9bda1ad6da950e6ca36df124

Branch pushed to git repo; I updated commit sha1. New commits:

3361fc6add edges dimensions check

comment:20 Changed 11 months ago by dcoudert

  • Description modified (diff)
  • Status changed from needs_review to positive_review

Perfect. Thank you.

comment:21 Changed 11 months ago by gh-kliem

Thank you for reviewing.

comment:22 Changed 10 months ago by vbraun

  • Branch changed from u/gh-kliem/use_bitsets_for_dense_graphs to 3361fc6bf57c87af9bda1ad6da950e6ca36df124
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.