Opened 23 months ago
Closed 22 months ago
#31197 closed enhancement (fixed)
Use bitsets/binary matrix for edges of dense graphs
Reported by:  ghkliem  Owned by:  

Priority:  major  Milestone:  sage9.3 
Component:  graph theory  Keywords:  dense graphs, bitsets 
Cc:  David Coudert  Merged in:  
Authors:  Jonathan Kliem  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  3361fc6 (Commits, GitHub, GitLab)  Commit:  3361fc6bf57c87af9bda1ad6da950e6ca36df124 
Dependencies:  #31200, #31207  Stopgaps: 
Description (last modified by )
Currently, the dense graphs backend redefines bitsets. We change the type to binary_matrix_t
.
This mainly simplifies the code and lets dense graphs use future improvements of bitsets, but it also speeds up direct functions with the backend.
Realloc is a bit slower though.
We benefit from the improvements of the binary matrix data structure done in #31200 and #31207, including the addition of reallocation method.
Before:
sage: set_random_seed(0) sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)] sage: %timeit G = Graph(edges, sparse=False, loops=True) 19.4 ms ± 752 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) sage: G = Graph(edges, sparse=False, loops=True) sage: %timeit _ = G.copy(sparse=True) 49.2 ms ± 312 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) sage: %timeit _ = G.copy(sparse=False) 6.18 ms ± 5.74 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
After:
sage: set_random_seed(0) sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)] sage: %timeit G = Graph(edges, sparse=False, loops=True) 19.2 ms ± 55.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) sage: G = Graph(edges, sparse=False, loops=True) sage: %timeit _ = G.copy(sparse=True) 47.3 ms ± 396 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) sage: %timeit _ = G.copy(sparse=False) 3.64 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Change History (22)
comment:1 Changed 23 months ago by
comment:2 Changed 23 months ago by
Sure, I was almost expecting this request (maybe not for splitting into pxd and pyx, but this one should be done as well).
comment:4 Changed 23 months ago by
Dependencies:  → #31200 

comment:5 Changed 23 months ago by
Commit:  4cb41984954561dcdce29fbe2d38a3481cc8ac59 → 5f125c9889b9731e65c51f1a91bf5ea7995557f3 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
09a88f7  realloc for binary matrix

60efd17  used already imported realloc

b556008  do not enforce another typecast

49b21bb  move binary_matrix.pxi to binary_matrix.pxd

e749434  be more verbose about return value

8c95a04  use mp_bitcnt

1291f04  use binary matrix as data structure for edges of dense graph

5f125c9  pxi > pxd

comment:6 Changed 23 months ago by
Status:  new → needs_review 

comment:7 Changed 23 months ago by
Authors:  → Jonathan Kliem 

Reviewers:  → David Coudert 
If I'm correct, it should be a bit faster to get outneighbors (unsafe iterator) now. For inneighbors, I'm not sure if it is faster or slower.
Concerning the realloc, the slowdown might not be a critical issue.
I'm waiting for the patchbot. My tests are OK, but I may have missed something.
comment:8 Changed 23 months ago by
Status:  needs_review → needs_work 

some failing doctests in src/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx
due to the removal of num_longs
.
comment:9 Changed 23 months ago by
copy_dense_graph
in src/sage/groups/perm_gps/part_ref/refinement_graphs.pyx
?
That is the complete wrong place for such a function.
comment:10 Changed 23 months ago by
Dependencies:  #31200 → #31200, #31207 

comment:11 Changed 23 months ago by
Commit:  5f125c9889b9731e65c51f1a91bf5ea7995557f3 → 704dae744cd3825c103395510c07b7ecf0c1b6a3 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e008d42  remove redundant import

a7666b1  implement copy for binary matrix

c495087  use binary matrix as data structure for edges of dense graph

bc8b072  pxi > pxd

704dae7  move `copy_dense_graph` to a more appropriate place and fix it

comment:12 Changed 23 months ago by
Status:  needs_work → needs_review 

comment:13 followup: 15 Changed 23 months ago by
I know method copy_dense_graph
is unsafe
, but it might be better to check whether both matrix have the same size, no ?
comment:14 Changed 23 months ago by
Commit:  704dae744cd3825c103395510c07b7ecf0c1b6a3 → e5de60f9fd4ecb09fdab0ed758a5da564c313ed6 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
5d37387  declare type of i

88e73ae  use binary matrix as data structure for edges of dense graph

e4bf61f  pxi > pxd

1d0946f  move `copy_dense_graph` to a more appropriate place and fix it

e5de60f  prevent segementation fault when using `copy_dense_graph` incorrectly

comment:15 Changed 23 months ago by
Replying to dcoudert:
I know method
copy_dense_graph
isunsafe
, but it might be better to check whether both matrix have the same size, no ?
Sure.
comment:16 Changed 23 months ago by
You check active_vertices
but not edges
. Since the copy
method of binary_matrix
assumes that both matrix have the same size, we must have a test somewhere to prevent segfaults.
comment:17 Changed 23 months ago by
__init__
and dealloc
guarantee this. But if you want, I can add this test, just to make sure no one messes around with the matrix or the active vertices without using the proper methods (or the methods are changed without preserving this property).
comment:18 Changed 23 months ago by
Yes, I think it's better to have these tests and the extra cost is small compared to the actual cost of the copy of the binary matrix.
comment:19 Changed 23 months ago by
Commit:  e5de60f9fd4ecb09fdab0ed758a5da564c313ed6 → 3361fc6bf57c87af9bda1ad6da950e6ca36df124 

Branch pushed to git repo; I updated commit sha1. New commits:
3361fc6  add edges dimensions check

comment:20 Changed 23 months ago by
Description:  modified (diff) 

Status:  needs_review → positive_review 
Perfect. Thank you.
comment:22 Changed 22 months ago by
Branch:  u/ghkliem/use_bitsets_for_dense_graphs → 3361fc6bf57c87af9bda1ad6da950e6ca36df124 

Resolution:  → fixed 
Status:  positive_review → closed 
I fully agree with these changes. However, may be a first step is to do what has been done for bitsets: use .pxd and .pyx files and no longer .pxi. So split this ticket in 2, one for modifications of the binary matrix data structure (and addition of realloc method), and the other for the modification of the dense graph backend.