#31117 closed enhancement (fixed)

Improve Breadth First Search in c_graph.pyx

Reported by: dcoudert Owned by:
Priority: major Milestone: sage-9.3
Component: graph theory Keywords:
Cc: gh-kliem, vdelecroix, chaperon, dimpase Merged in:
Authors: David Coudert, Jonathan Kliem Reviewers: Jonathan Kliem, David Coudert
Report Upstream: N/A Work issues:
Branch: 6159f31 (Commits, GitHub, GitLab) Commit: 6159f317c636fa7bd9d14bbc60ea393549afc0e9
Dependencies: Stopgaps:

Status badges

Description (last modified by dcoudert)

Currently, method breadth_first_search in c_graph.pyx implements a queue using a list and removes element at position 0. This is ok on small graphs, but for large graphs, This is really slow.

I'm using the following function to benchmark. When report_distance is False, it calls the code from c_graph.pyx, and when it is True, is uses a Python implementation from generic_graph.py.

def comp(): 
    for n in [5, 10, 50, 100, 500, 1000]: 
        G = graphs.Grid2dGraph(n, n) 
        print(G) 
        %timeit _ = list(G.breadth_first_search(start=(0, 0), report_distance=False))
        %timeit _ = list(G.breadth_first_search(start=(0, 0), report_distance=True))

Before:

sage: comp()                                                                                                                                        
2D Grid Graph for [5, 5]
17.9 µs ± 470 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
72.8 µs ± 6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2D Grid Graph for [10, 10]
58.5 µs ± 1.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
269 µs ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2D Grid Graph for [50, 50]
1.92 ms ± 165 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
8.12 ms ± 263 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2D Grid Graph for [100, 100]
9.63 ms ± 232 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
36.2 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
2D Grid Graph for [500, 500]
397 ms ± 18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.12 s ± 27.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2D Grid Graph for [1000, 1000]
32.3 s ± 1.1 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
5.64 s ± 414 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

sage: comp()                                                                                                                                        
2D Grid Graph for [5, 5]
7.43 µs ± 269 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
56.4 µs ± 1.91 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2D Grid Graph for [10, 10]
21.8 µs ± 1.62 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
300 µs ± 19.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2D Grid Graph for [50, 50]
716 µs ± 40.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8.65 ms ± 457 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2D Grid Graph for [100, 100]
2.99 ms ± 141 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
34.8 ms ± 1.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
2D Grid Graph for [500, 500]
105 ms ± 4.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
1.18 s ± 14.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2D Grid Graph for [1000, 1000]
596 ms ± 68.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
5.83 s ± 1.11 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

Another improvement is to avoid using in_neighbors and out_neighbors since these methods return lists of vertices and behind we have some mallocs of arrays. So a lot of operations that can be avoided.

Change History (34)

comment:1 Changed 11 months ago by dcoudert

  • Branch set to public/graphs/31117_BFS
  • Commit set to a19e27645d4c621f223a93245347a2472ee21a56
  • Status changed from new to needs_review

New commits:

a19e276faster breadth first search in c_graph.pyx

comment:2 Changed 11 months ago by dcoudert

  • Description modified (diff)

comment:3 Changed 11 months ago by git

  • Commit changed from a19e27645d4c621f223a93245347a2472ee21a56 to 156d2229332c6d7e4f0e74fd49a144225e3ccc26

Branch pushed to git repo; I updated commit sha1. New commits:

156d222further improvement

comment:4 Changed 11 months ago by dcoudert

This last commit is not very nice, but it's now way faster.

comment:5 Changed 11 months ago by git

  • Commit changed from 156d2229332c6d7e4f0e74fd49a144225e3ccc26 to e56743ba1d0274bfca69e91a01cbaad6af8973ba

Branch pushed to git repo; I updated commit sha1. New commits:

e56743brevert last commit because some methods are not in all backends

comment:6 Changed 11 months ago by dcoudert

  • Description modified (diff)

I had to remove last commit because method next_out_neighbor_unsafe is not implemented in static_sparse_backend.pyx, and I don't know yet how to add such method to this backend.

comment:7 Changed 11 months ago by gh-kliem

First of all:

I can confirm an enourmous improvement (something like a factor 2 or 3), but I can't reproduce the 32 seconds for your before 2D Grid Graph for [1000, 1000]. Looks to me like you went into swap or similar.

comment:8 Changed 11 months ago by dcoudert

I tested on a desktop with more memory than my laptop (which has already 16G Ram), and I don't get such huge slowdown anymore. So it's certainly a swap problem.

Actually, the way BFS is implementented in c_graph, we start pushing vertices to the queue, and then we say that a vertex is seen if it has already been extracted from the queue. An alternative, which is what is done in generic_graph is to say that a vertex is seen if already added to the queue. In the first case, the queue can have up to O(m) vertices, while in the second case, it's at most O(n). I will try the second option in c_graph.

comment:9 Changed 11 months ago by gh-kliem

A swap problem is certainly also an implementation problem. If we can use less memory with better implementation that is certainly an improvement.

comment:10 Changed 11 months ago by gh-kliem

CGraph.out_neighbors etc are still lists. If you really want it to be fast, you should use CGraph.next_out_neighbor etc as:

  • src/sage/graphs/base/c_graph.pyx

    diff --git a/src/sage/graphs/base/c_graph.pyx b/src/sage/graphs/base/c_graph.pyx
    index 47fe5b2b90..92b16ab283 100644
    a b cdef class Search_iterator: 
    48374837        """
    48384838        cdef int v_int
    48394839        cdef int w_int
     4840        cdef int l
     4841        cdef CGraph cg = self.graph.cg()
    48404842
    48414843        while not self.fifo.empty():
    48424844            v_int = self.fifo.front()
    cdef class Search_iterator: 
    48474849                bitset_add(self.seen, v_int)
    48484850
    48494851                if self.test_out:
    4850                     for w_int in self.graph.cg().out_neighbors(v_int):
     4852                    w_int = cg.next_out_neighbor_unsafe(v_int, -1, &l)
     4853                    while w_int != -1:
    48514854                        self.fifo.push(w_int)
     4855                        w_int = cg.next_out_neighbor_unsafe(v_int, w_int, &l)
    48524856                if self.test_in:
    4853                     for w_int in self.in_neighbors(v_int):
     4857                    w_int = cg.next_in_neighbor_unsafe(v_int, -1, &l)
     4858                    while w_int != -1:
    48544859                        self.fifo.push(w_int)
     4860                        w_int = cg.next_in_neighbor_unsafe(v_int, w_int, &l)

comment:11 follow-up: Changed 11 months ago by dcoudert

I tried that but it's not compatible with static_sparse_backend.pyx (don't have next_out_neighbor_unsafe). An option is to implement a specific BFS for static_sparse_backend.pyx.

comment:12 in reply to: ↑ 11 Changed 11 months ago by gh-kliem

Replying to dcoudert:

I tried that but it's not compatible with static_sparse_backend.pyx (don't have next_out_neighbor_unsafe). An option is to implement a specific BFS for static_sparse_backend.pyx.

Ok. I'll think about how this can be fixed.

comment:13 Changed 11 months ago by gh-kliem

I don't see an improvement in factoring out next_depth....

The compiler optimzes things fine for me, if I distinct everywhere with self.direction == 0 and self.direction != 0. Then one can right away change the data type for lifo to https://en.cppreference.com/w/cpp/container/stack as well.

This is how my overall diff looks to your branch:

  • src/sage/graphs/base/c_graph.pyx

    diff --git a/src/sage/graphs/base/c_graph.pyx b/src/sage/graphs/base/c_graph.pyx
    index 47fe5b2b90..daf9bc8af5 100644
    a b from sage.rings.integer cimport Integer 
    4848from sage.arith.long cimport pyobject_to_long
    4949from libcpp.queue cimport priority_queue
    5050from libcpp.queue cimport queue
     51from libcpp.stack cimport stack
    5152from libcpp.pair cimport pair
    5253from sage.rings.integer_ring import ZZ
    5354from cysignals.memory cimport check_allocarray, sig_free
    cdef class Search_iterator: 
    47164717
    47174718    cdef CGraphBackend graph
    47184719    cdef int direction
    4719     cdef list stack
     4720    cdef stack[int] lifo
    47204721    cdef queue[int] fifo
    47214722    cdef bitset_t seen
    47224723    cdef bint test_out
    cdef class Search_iterator: 
    47944795        if direction == 0:
    47954796            self.fifo.push(v_id)
    47964797        else:
    4797             self.stack = [v_id]
     4798            self.lifo.push(v_id)
    47984799
    47994800        if not self.graph._directed:
    48004801            ignore_direction = False
    cdef class Search_iterator: 
    48234824        """
    48244825        return self
    48254826
    4826     def next_breadth_first_search(self):
     4827    def __next__(self):
    48274828        r"""
    48284829        Return the next vertex in a breadth first search traversal of a graph.
    48294830
    cdef class Search_iterator: 
    48374838        """
    48384839        cdef int v_int
    48394840        cdef int w_int
     4841        cdef int l
     4842        cdef CGraph cg = self.graph.cg()
    48404843
    4841         while not self.fifo.empty():
    4842             v_int = self.fifo.front()
    4843             self.fifo.pop()
    4844 
    4845             if bitset_not_in(self.seen, v_int):
    4846                 value = self.graph.vertex_label(v_int)
    4847                 bitset_add(self.seen, v_int)
    4848 
    4849                 if self.test_out:
    4850                     for w_int in self.graph.cg().out_neighbors(v_int):
    4851                         self.fifo.push(w_int)
    4852                 if self.test_in:
    4853                     for w_int in self.in_neighbors(v_int):
    4854                         self.fifo.push(w_int)
    4855 
    4856                 break
    4857         else:
    4858             raise StopIteration
    4859 
    4860         return value
    4861 
    4862     def next_depth_first_search(self):
    4863         r"""
    4864         Return the next vertex in a depth first search traversal of a graph.
    4865 
    4866         EXAMPLES::
    4867 
    4868             sage: g = graphs.PetersenGraph()
    4869             sage: g.depth_first_search(0)
    4870             <generator object ...depth_first_search at ...
    4871             sage: next(g.depth_first_search(0))
    4872             0
    4873         """
    4874         cdef int v_int
    4875 
    4876         while self.stack:
    4877             v_int = self.stack.pop()
     4844        while ((self.direction == 0 and not self.fifo.empty())
     4845                or self.direction != 0 and not self.lifo.empty()):
     4846            if self.direction == 0:
     4847                v_int = self.fifo.front()
     4848                self.fifo.pop()
     4849            else:
     4850                v_int = self.lifo.top()
     4851                self.lifo.pop()
    48784852
    48794853            if bitset_not_in(self.seen, v_int):
    48804854                value = self.graph.vertex_label(v_int)
    48814855                bitset_add(self.seen, v_int)
    48824856
    48834857                if self.test_out:
    4884                     self.stack.extend(self.graph.cg().out_neighbors(v_int))
     4858                    w_int = cg.next_out_neighbor_unsafe(v_int, -1, &l)
     4859                    while w_int != -1:
     4860                        if self.direction == 0:
     4861                            self.fifo.push(w_int)
     4862                        else:
     4863                            self.lifo.push(w_int)
     4864                        w_int = cg.next_out_neighbor_unsafe(v_int, w_int, &l)
    48854865                if self.test_in:
    4886                     self.stack.extend(self.in_neighbors(v_int))
     4866                    w_int = cg.next_in_neighbor_unsafe(v_int, -1, &l)
     4867                    while w_int != -1:
     4868                        if self.direction == 0:
     4869                            self.fifo.push(w_int)
     4870                        else:
     4871                            self.lifo.push(w_int)
     4872                        w_int = cg.next_in_neighbor_unsafe(v_int, w_int, &l)
    48874873
    48884874                break
    48894875        else:
    cdef class Search_iterator: 
    48914877
    48924878        return value
    48934879
    4894     def __next__(self):
    4895         r"""
    4896         Return the next vertex in a breadth first search traversal of a graph.
    4897 
    4898         EXAMPLES::
    4899 
    4900             sage: g = graphs.PetersenGraph()
    4901             sage: g.breadth_first_search(0)
    4902             <generator object ...breadth_first_search at ...
    4903             sage: next(g.breadth_first_search(0))
    4904             0
    4905         """
    4906         if self.direction == 0:
    4907             return self.next_breadth_first_search()
    4908         return self.next_depth_first_search()
    4909 
    49104880##############################
    49114881# Functions to simplify edge iterator.
    49124882##############################

However, as mentioned above, I need to figure out something yet for sparse graphs.

And there seems to exist no depth-first doctest for static sparse, at least not in sage/src/sage/graphs.

comment:14 Changed 11 months ago by gh-kliem

Never mind my last comment. I think it is a good first step to seperate next_breadth... and next_depth... for the following reason:

I don't think queue is a good implementation in the long run. The thing is, we know exactly which values things in queue can take and we don't want anything twice, so we might as well implement the queue as int* of length number of active vertices (one needs to realloc to segmentation fault, in case someone adds vertices while we perform the depth first search, but I assume non-optimal runtime is okay in this case).

  • And each vertex points to the next in the queue (initial value -1).
  • We need also keep track of first and last of the queue.
  • We only add a vertex to the queue, if it doesn't have a next value yet (and is not the last one).

Later on, we can also report the distance easily.

comment:15 Changed 11 months ago by git

  • Commit changed from e56743ba1d0274bfca69e91a01cbaad6af8973ba to c342fe67d46988020093a39be21ef9675ba1813c

Branch pushed to git repo; I updated commit sha1. New commits:

e4415f9trac #31117: merged with 9.3.beta5
c342fe6trac #31117: better bfs

comment:16 Changed 11 months ago by dcoudert

I rebased the branch on 9.3.beta5 and use int* to implement the queue. I have let the in_neighbors and out_neighbors to have a functional branch.

Ideally, we should implement (un)safe iterators over the in/out neighbors of all backends. Currently, some backends have neighbor iterators that first build the list of neighbors and then iterate over that list...

comment:17 Changed 11 months ago by gh-kliem

Ok, I implemented next_out_neighbor_unsafe for static sparse graph. It doesn't accept labels, because the labels are only known to the backend.

I pushed it to u/gh-kliem/next_out_neighbor_for_static_sparse, not yet a ticket, but with this thing you can apply my suggestion from comment:10.

Does this appear reasonable to you?

Before applying comment comment:10

sage: sage: def comp():  
....:     ....:     for n in [5, 10, 50, 100, 500, 1000]:  
....:         ....:         G = graphs.Grid2dGraph(n, n)   
....:         ....:         print(G)   
....:         ....:         %timeit _ = list(G.breadth_first_search(start=(0, 0), report_distance=False))  
....:         ....:         G = G.copy(sparse=True, immutable=True) 
....:         ....:         %timeit _ = list(G.breadth_first_search(start=(0, 0), report_distance=False))   
....:          
....:                                                                                                                                 
sage: comp()                                                                                                                          
2D Grid Graph for [5, 5]
8.35 µs ± 36.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
6.46 µs ± 17 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
2D Grid Graph for [10, 10]
25.8 µs ± 39 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
18.3 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
2D Grid Graph for [50, 50]
796 µs ± 3.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
605 µs ± 344 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2D Grid Graph for [100, 100]
3.27 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.46 ms ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2D Grid Graph for [500, 500]
104 ms ± 45 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
78.4 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
2D Grid Graph for [1000, 1000]
445 ms ± 328 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
319 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

sage: comp()                                                                                                                                                                                                                                                                                                                                                               
2D Grid Graph for [5, 5]
6.26 µs ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.58 µs ± 8.64 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
2D Grid Graph for [10, 10]
17.2 µs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
14.1 µs ± 36.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
2D Grid Graph for [50, 50]
484 µs ± 3.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
388 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2D Grid Graph for [100, 100]
2.05 ms ± 4.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.59 ms ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2D Grid Graph for [500, 500]
72.9 ms ± 30.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
55.4 ms ± 22.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
2D Grid Graph for [1000, 1000]
289 ms ± 71.7 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
228 ms ± 120 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

As mentioned, we can also easily add a method that reports the distance of the current vertex as to replace the python method. Do you want to do this here or in an extra ticket.

comment:10 can of course also be an extra ticket and then I can quickly just review this here.

comment:18 Changed 11 months ago by gh-kliem

Btw, my idea for the queue imitation does not seem to make a difference, but for limiting the maximum memory usage. What might be simpler (and actually use less memory in some cases) is to use the standard queue implementation with the new setup for seen: Mark a vertex as seen as soon as we add it to the queue.

Its appears to be all the same speed but this would avoid reinventing the wheel while still reducing the memory usage.

When we also keep track of the current distance, we will need to change the base type for queue, but that wouldn't be a problem.

And on some follow up ticket, we can also allow a list of vertices for the cython implementation.

comment:19 follow-up: Changed 11 months ago by dcoudert

Feel free to modify this ticket and merge your proposals (including using queue to not reinvent the wheel).

For reporting distances, my only concern is whether we should do all in a single method or if we should design specific methods to save some tests.

Observe that we can use the data structure for distances to report edges.

I agree that in another ticket we can have list of vertices as input.

I still think that it would be nice to have low level neighbor iterators in backends, but this is another story.

comment:20 in reply to: ↑ 19 Changed 11 months ago by gh-kliem

Replying to dcoudert:

[...] I still think that it would be nice to have low level neighbor iterators in backends, but this is another story.

I implemented next_out_neighbor_unsafe for static sparse. What do you mean by low level neighbor iterators? Python iterators?

comment:21 follow-up: Changed 11 months ago by dcoudert

currently, we use out_neighbors. it returns a list, and to build it, it calls out_neighbors_unsafe with a local array, which itself calls next_out_neighbor_unsafe. When we just want to iterate over the out neighbors once, may be we can combine several operations done in next_out_neighbor_unsafe in a method called out_neighbor_iterator_unsafe that yields out neighbors.

We are going to call directly next_out_neighbor_unsafe, but again, inside this method, several operations could be done only once for yielding all the out neighbors of a vertex.

comment:22 Changed 11 months ago by git

  • Commit changed from c342fe67d46988020093a39be21ef9675ba1813c to cc0cfd1c53e04b66c71713dc86790432552857b5

Branch pushed to git repo; I updated commit sha1. New commits:

65e677eimplement next_out_neighbor for static sparse
b7e8a20Merge branch 'u/gh-kliem/next_out_neighbor_for_static_sparse' of git://trac.sagemath.org/sage into public/graphs/31117_BFS
eaf982buse next_out_neighbor_unsafe
4fd001ereuse queue
cc0cfd1account for that the loop is an if-clause now

comment:23 Changed 11 months ago by gh-kliem

  • Authors changed from David Coudert to David Coudert, Jonathan Kliem

I only noticed now that what I suggested in comment:10 is what you have tried before on this ticket and then have fallen back.

comment:24 in reply to: ↑ 21 Changed 11 months ago by gh-kliem

Replying to dcoudert:

currently, we use out_neighbors. it returns a list, and to build it, it calls out_neighbors_unsafe with a local array, which itself calls next_out_neighbor_unsafe. When we just want to iterate over the out neighbors once, may be we can combine several operations done in next_out_neighbor_unsafe in a method called out_neighbor_iterator_unsafe that yields out neighbors.

We are going to call directly next_out_neighbor_unsafe, but again, inside this method, several operations could be done only once for yielding all the out neighbors of a vertex.

I think next_out_neighbor_unsafe is a pretty good solution to make it work for all backends with one function name.

I tried implementing an iterator in sparse graph and this takes twice as much (I don't know how to make it significantly faster):

  • src/sage/graphs/base/c_graph.pyx

    diff --git a/src/sage/graphs/base/c_graph.pyx b/src/sage/graphs/base/c_graph.pyx
    index 63a024e37a..15b070ed7c 100644
    a b cdef class CGraph: 
    11221122    cdef int next_in_neighbor_unsafe(self, int v, int u, int* l) except -2:
    11231123        raise NotImplementedError()
    11241124
     1125    def out_neighbor_iterator_unsafe(self, int u):
     1126        raise NotImplementedError
     1127
    11251128    cdef adjacency_sequence_out(self, int n, int *vertices, int v, int* sequence):
    11261129        r"""
    11271130        Return the adjacency sequence corresponding to a list of vertices and a
    cdef class Search_iterator: 
    48484851            value = self.graph.vertex_label(v_int)
    48494852
    48504853            if self.test_out:
    4851                 w_int = cg.next_out_neighbor_unsafe(v_int, -1, &l)
    4852                 while w_int != -1:
     4854                for w_int in cg.out_neighbor_iterator_unsafe(v_int):
     4855                #w_int = cg.next_out_neighbor_unsafe(v_int, -1, &l)
     4856                #while w_int != -1:
    48534857                    if bitset_not_in(self.seen, w_int):
    48544858                        bitset_add(self.seen, w_int)
    48554859                        self.fifo.push(w_int)
    4856                     w_int = cg.next_out_neighbor_unsafe(v_int, w_int, &l)
     4860                    #w_int = cg.next_out_neighbor_unsafe(v_int, w_int, &l)
    48574861            if self.test_in:
    48584862                w_int = cg.next_in_neighbor_unsafe(v_int, -1, &l)
    48594863                while w_int != -1:
  • src/sage/graphs/base/sparse_graph.pyx

    diff --git a/src/sage/graphs/base/sparse_graph.pyx b/src/sage/graphs/base/sparse_graph.pyx
    index 19006ceb80..220fcab123 100644
    a b for both of these uses. 
    191191
    192192
    193193from libc.string cimport memset
    194 from cysignals.memory cimport check_malloc, check_allocarray, sig_free
     194from cysignals.memory cimport check_malloc, check_realloc, check_allocarray, sig_free
    195195
    196196from sage.data_structures.bitset_base cimport *
    197197from sage.data_structures.bitset cimport *
    198198
     199cdef extern from "Python.h":
     200    int unlikely(int) nogil  # Defined by Cython
     201
    199202cdef enum:
    200203    BT_REORDERING_CONSTANT = 145533211
    201204    # Since the binary tree will often see vertices coming in already sorted,
    cdef class SparseGraph(CGraph): 
    706709            return temp
    707710        return NULL
    708711
     712    def out_neighbor_iterator_unsafe(self, int u):
     713        cdef int i
     714        cdef SparseGraphBTNode* v
     715        cdef SparseGraphBTNode** lifo = <SparseGraphBTNode**> check_malloc(2*self.hash_length*sizeof(SparseGraphBTNode*))
     716        cdef int lifo_index = 0
     717        cdef size_t lifo_size = self.hash_length*2
     718
     719        for i in range(u * self.hash_length, (u+1) * self.hash_length):
     720            if not self.vertices[i]:
     721                continue
     722            lifo[lifo_index] = self.vertices[i]
     723            lifo_index += 1
     724
     725        while lifo_index:
     726            lifo_index -= 1
     727            v = lifo[lifo_index]
     728            if unlikely(lifo_size < lifo_index + 2):
     729                lifo = <SparseGraphBTNode**> check_realloc(lifo, 2*lifo_size*sizeof(SparseGraphBTNode*))
     730                lifo_size *= 2
     731
     732            if v.left:
     733                lifo[lifo_index] = v.left
     734                lifo_index += 1
     735            if v.right:
     736                lifo[lifo_index] = v.right
     737                lifo_index += 1
     738            yield v.vertex
     739
     740
     741        sig_free(lifo)

It is a python function after all.

The advantage of creating the list of all neighbors is that we need less often to traverse the tree of neighbors. However, for a sparse graph, this should usually be a very fast operation. This seems to be the only thing that can be improved in next_out_neighbor for sparse graphs (and I don't see anything for dense graphs at all).

comment:25 Changed 11 months ago by dcoudert

Then we should let this for another ticket. Further cleaning / improvements / implications in the backends may lead to other / faster solutions.

comment:26 Changed 11 months ago by gh-kliem

  • Reviewers set to Jonathan Kliem

I'm happy with the ticket.

comment:27 Changed 11 months ago by git

  • Commit changed from cc0cfd1c53e04b66c71713dc86790432552857b5 to 07eaaa59da76507299e6a249c8b3710d6c0173e4

Branch pushed to git repo; I updated commit sha1. New commits:

07eaaa5inlining makes a difference

comment:28 Changed 11 months ago by gh-kliem

Inlining makes a difference. More than 10 percent for any of those test instances.

Please update the timings in the ticket description.

comment:29 Changed 11 months ago by dcoudert

  • Description modified (diff)

comment:30 Changed 11 months ago by git

  • Commit changed from 07eaaa59da76507299e6a249c8b3710d6c0173e4 to 6159f317c636fa7bd9d14bbc60ea393549afc0e9

Branch pushed to git repo; I updated commit sha1. New commits:

6159f31trac #31117: fix next_in_neighbor_unsafe in static_sparse_backend

comment:31 Changed 11 months ago by dcoudert

There was an error in method next_in_neighbor_unsafe of static_sparse_backend.pyx:

-        cdef int degree = self.g_rev.neighbors[u+1] - self.g.neighbors[u]
+        cdef int degree = self.g_rev.neighbors[u+1] - self.g_rev.neighbors[u]

The segfaults reported by the patchbot for #31129 are all due to this error.

I changed to the safer / cleaner version

-        cdef int degree = self.g_rev.neighbors[u+1] - self.g.neighbors[u]
+        cdef int degree = out_degree(self.g_rev, u)

comment:32 Changed 11 months ago by dcoudert

We now have green bot. If you agree, we can set this ticket to positive review. We have significant improvements !

comment:33 Changed 11 months ago by gh-kliem

  • Reviewers changed from Jonathan Kliem to Jonathan Kliem, David Coudert
  • Status changed from needs_review to positive_review

Yes, I agree.

comment:34 Changed 11 months ago by vbraun

  • Branch changed from public/graphs/31117_BFS to 6159f317c636fa7bd9d14bbc60ea393549afc0e9
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.