Opened 11 months ago
Closed 10 months ago
#31197 closed enhancement (fixed)
Use bitsets/binary matrix for edges of dense graphs
Reported by:  ghkliem  Owned by:  

Priority:  major  Milestone:  sage9.3 
Component:  graph theory  Keywords:  dense graphs, bitsets 
Cc:  dcoudert  Merged in:  
Authors:  Jonathan Kliem  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  3361fc6 (Commits, GitHub, GitLab)  Commit:  3361fc6bf57c87af9bda1ad6da950e6ca36df124 
Dependencies:  #31200, #31207  Stopgaps: 
Description (last modified by )
Currently, the dense graphs backend redefines bitsets. We change the type to binary_matrix_t
.
This mainly simplifies the code and lets dense graphs use future improvements of bitsets, but it also speeds up direct functions with the backend.
Realloc is a bit slower though.
We benefit from the improvements of the binary matrix data structure done in #31200 and #31207, including the addition of reallocation method.
Before:
sage: set_random_seed(0) sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)] sage: %timeit G = Graph(edges, sparse=False, loops=True) 19.4 ms ± 752 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) sage: G = Graph(edges, sparse=False, loops=True) sage: %timeit _ = G.copy(sparse=True) 49.2 ms ± 312 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) sage: %timeit _ = G.copy(sparse=False) 6.18 ms ± 5.74 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
After:
sage: set_random_seed(0) sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)] sage: %timeit G = Graph(edges, sparse=False, loops=True) 19.2 ms ± 55.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) sage: G = Graph(edges, sparse=False, loops=True) sage: %timeit _ = G.copy(sparse=True) 47.3 ms ± 396 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) sage: %timeit _ = G.copy(sparse=False) 3.64 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Change History (22)
comment:1 Changed 11 months ago by
comment:2 Changed 11 months ago by
Sure, I was almost expecting this request (maybe not for splitting into pxd and pyx, but this one should be done as well).
comment:3 Changed 11 months ago by
We already have binary_matrix.pxd
but not binary_matrix.pyx
.
comment:4 Changed 11 months ago by
 Dependencies set to #31200
comment:5 Changed 11 months ago by
 Commit changed from 4cb41984954561dcdce29fbe2d38a3481cc8ac59 to 5f125c9889b9731e65c51f1a91bf5ea7995557f3
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
09a88f7  realloc for binary matrix

60efd17  used already imported realloc

b556008  do not enforce another typecast

49b21bb  move binary_matrix.pxi to binary_matrix.pxd

e749434  be more verbose about return value

8c95a04  use mp_bitcnt

1291f04  use binary matrix as data structure for edges of dense graph

5f125c9  pxi > pxd

comment:6 Changed 11 months ago by
 Status changed from new to needs_review
comment:7 Changed 11 months ago by
 Reviewers set to David Coudert
If I'm correct, it should be a bit faster to get outneighbors (unsafe iterator) now. For inneighbors, I'm not sure if it is faster or slower.
Concerning the realloc, the slowdown might not be a critical issue.
I'm waiting for the patchbot. My tests are OK, but I may have missed something.
comment:8 Changed 11 months ago by
 Status changed from needs_review to needs_work
some failing doctests in src/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx
due to the removal of num_longs
.
comment:9 Changed 11 months ago by
copy_dense_graph
in src/sage/groups/perm_gps/part_ref/refinement_graphs.pyx
?
That is the complete wrong place for such a function.
comment:10 Changed 11 months ago by
 Dependencies changed from #31200 to #31200, #31207
comment:11 Changed 11 months ago by
 Commit changed from 5f125c9889b9731e65c51f1a91bf5ea7995557f3 to 704dae744cd3825c103395510c07b7ecf0c1b6a3
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e008d42  remove redundant import

a7666b1  implement copy for binary matrix

c495087  use binary matrix as data structure for edges of dense graph

bc8b072  pxi > pxd

704dae7  move `copy_dense_graph` to a more appropriate place and fix it

comment:12 Changed 11 months ago by
 Status changed from needs_work to needs_review
comment:13 followup: ↓ 15 Changed 11 months ago by
I know method copy_dense_graph
is unsafe
, but it might be better to check whether both matrix have the same size, no ?
comment:14 Changed 11 months ago by
 Commit changed from 704dae744cd3825c103395510c07b7ecf0c1b6a3 to e5de60f9fd4ecb09fdab0ed758a5da564c313ed6
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
5d37387  declare type of i

88e73ae  use binary matrix as data structure for edges of dense graph

e4bf61f  pxi > pxd

1d0946f  move `copy_dense_graph` to a more appropriate place and fix it

e5de60f  prevent segementation fault when using `copy_dense_graph` incorrectly

comment:15 in reply to: ↑ 13 Changed 11 months ago by
Replying to dcoudert:
I know method
copy_dense_graph
isunsafe
, but it might be better to check whether both matrix have the same size, no ?
Sure.
comment:16 Changed 11 months ago by
You check active_vertices
but not edges
. Since the copy
method of binary_matrix
assumes that both matrix have the same size, we must have a test somewhere to prevent segfaults.
comment:17 Changed 11 months ago by
__init__
and dealloc
guarantee this. But if you want, I can add this test, just to make sure no one messes around with the matrix or the active vertices without using the proper methods (or the methods are changed without preserving this property).
comment:18 Changed 11 months ago by
Yes, I think it's better to have these tests and the extra cost is small compared to the actual cost of the copy of the binary matrix.
comment:19 Changed 11 months ago by
 Commit changed from e5de60f9fd4ecb09fdab0ed758a5da564c313ed6 to 3361fc6bf57c87af9bda1ad6da950e6ca36df124
Branch pushed to git repo; I updated commit sha1. New commits:
3361fc6  add edges dimensions check

comment:20 Changed 11 months ago by
 Description modified (diff)
 Status changed from needs_review to positive_review
Perfect. Thank you.
comment:21 Changed 11 months ago by
Thank you for reviewing.
comment:22 Changed 10 months ago by
 Branch changed from u/ghkliem/use_bitsets_for_dense_graphs to 3361fc6bf57c87af9bda1ad6da950e6ca36df124
 Resolution set to fixed
 Status changed from positive_review to closed
I fully agree with these changes. However, may be a first step is to do what has been done for bitsets: use .pxd and .pyx files and no longer .pxi. So split this ticket in 2, one for modifications of the binary matrix data structure (and addition of realloc method), and the other for the modification of the dense graph backend.