#23290 closed defect (fixed)
Graph.merge_vertices() destroys loops
Reported by:  zgershkoff  Owned by:  

Priority:  minor  Milestone:  sage8.0 
Component:  graph theory  Keywords:  
Cc:  dcoudert, tscrim, Stefan  Merged in:  
Authors:  Zach Gershkoff  Reviewers:  Stefan van Zwam, David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  ff571d6 (Commits)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
merge_vertices() accepts a list of vertices as input, but loops are lost unless they're on the first vertex.
Change History (17)
comment:1 Changed 2 years ago by
 Cc dcoudert tscrim Stefan added
 Component changed from PLEASE CHANGE to graph theory
 Description modified (diff)
 Priority changed from major to minor
 Type changed from PLEASE CHANGE to defect
comment:2 Changed 2 years ago by
comment:3 Changed 2 years ago by
In that case, I'll fix (1) and leave (2) for one of the other tickets.
comment:4 Changed 2 years ago by
 Branch set to u/zgershkoff/graph_merge_vertices___destroys_loops
comment:5 Changed 2 years ago by
 Commit set to 412dc9082deb4ed0c7f14cdbee7a37950e0e51a7
 Status changed from new to needs_review
It now checks if any of the (di)graph's loops are on the vertices to be deleted before deleting them, and it only adds a few steps if loops are disabled.
New commits:
412dc90  moved loops from vertices before the vertices get deleted

comment:6 Changed 2 years ago by
Before this gets closed: There's code at the beginning of merge_vertices()
to check if the first vertex label is None
, but I don't think this will ever happen because of #9362. Should I take those lines out?
Nevermind. A user may wish to input None
as the first item in the list because this will give the merged vertex a new name.
comment:7 Changed 2 years ago by
 Reviewers set to Stefan van Zwam
 Status changed from needs_review to positive_review
Works as advertised!
comment:8 followup: ↓ 10 Changed 2 years ago by
 Reviewers changed from Stefan van Zwam to Stefan van Zwam, David Coudert
 Status changed from positive_review to needs_work
This could be more efficient
u = vertices[0] for v in vertices[1:]: if self.has_edge(v, v): if self.allows_multiple_edges(): for l in self.edge_label(v, v): self.add_edge(u, u, l) else: self.add_edge(u, u, self.edge_label(v, v))
Could you add a test to do nothing when len(vertices) < 2
.
sage: G = Graph() sage: G.merge_vertices([None]) sage: G Graph on 1 vertex sage: G.merge_vertices([None]) sage: G Graph on 2 vertices
Other issue: TESTS::
> TESTS:
comment:9 Changed 2 years ago by
Also, when multiedges are not allowed, the behavior must be documented.
sage: edgelist = [(0,0,'a'),(0,1,'b'),(1,1,'c')] sage: G = Graph(edgelist, loops=True, multiedges=False) sage: G.merge_vertices([0,1]); G.edges() [(0, 0, 'c')] sage: edgelist = [(0,0,'a'),(0,1,'b'),(1,1,'c'), (1, 2, 'd'), (2, 2, 'e')] sage: G = Graph(edgelist, loops=True, multiedges=False) sage: G.merge_vertices([0,1,2]); G.edges() [(0, 0, 'e')] sage: G = Graph(edgelist, loops=True, multiedges=False) sage: G.merge_vertices([0,2,1]); G.edges() [(0, 0, 'e')]
comment:10 in reply to: ↑ 8 ; followup: ↓ 11 Changed 2 years ago by
Replying to dcoudert:
This could be more efficient
u = vertices[0] for v in vertices[1:]: if self.has_edge(v, v): if self.allows_multiple_edges(): for l in self.edge_label(v, v): self.add_edge(u, u, l) else: self.add_edge(u, u, self.edge_label(v, v))
Is this faster because it specifies the vertices, rather than iterating over a list of edges to get the loops?
Could you add a test to do nothing when
len(vertices) < 2
.sage: G = Graph() sage: G.merge_vertices([None]) sage: G Graph on 1 vertex sage: G.merge_vertices([None]) sage: G Graph on 2 vertices
It's an undocumented feature that None
as the first entry will give the vertex a new name, and a bug that it will create a new vertex if there's nothing to merge. I'll addressed both.
Other issue:
TESTS::
>TESTS:
Is there a guide where I can see how to correctly mark up documentation? The developer's guide doesn't seem to cover it all, and many of the other methods I've looked at use TESTS::
and not TESTS:
. This doesn't seem to be a standard thing for Python docstrings so I'm not sure where to look.
comment:11 in reply to: ↑ 10 ; followup: ↓ 13 Changed 2 years ago by
Replying to zgershkoff:
Replying to dcoudert:
This could be more efficient
u = vertices[0] for v in vertices[1:]: if self.has_edge(v, v): if self.allows_multiple_edges(): for l in self.edge_label(v, v): self.add_edge(u, u, l) else: self.add_edge(u, u, self.edge_label(v, v))Is this faster because it specifies the vertices, rather than iterating over a list of edges to get the loops?
The loop_edges
method calls the loop_vertices
method which returns [v for v in self if self.has_edge(v,v)]
if loops are allowed.
Since vertices
is a subset of the vertices of the graph, and we can expect this set to be small most of the time (e.g., 2), what I propose should be faster than your first proposal.
Could you add a test to do nothing when
len(vertices) < 2
.
sage: G = Graph() sage: G.merge_vertices([None]) sage: G Graph on 1 vertex sage: G.merge_vertices([None]) sage: G Graph on 2 verticesIt's an undocumented feature that
None
as the first entry will give the vertex a new name, and a bug that it will create a new vertex if there's nothing to merge. I'll addressed both.
It's documented in the method, 3rd paragraph: The new vertex is named after the first vertex in the list given in argument. If this first name is None, a new vertex is created.
Other issue:
TESTS::
>TESTS:
Is there a guide where I can see how to correctly mark up documentation? The developer's guide doesn't seem to cover it all, and many of the other methods I've looked at use
TESTS::
and notTESTS:
. This doesn't seem to be a standard thing for Python docstrings so I'm not sure where to look.
To be honest, I don't know. I checked on recent ticket, and #23255 suggests that TESTS::
is the good form. So keep it as is.
comment:12 Changed 2 years ago by
 Commit changed from 412dc9082deb4ed0c7f14cdbee7a37950e0e51a7 to ff571d6699ab233eedaf3b9d4a417ef6a1f6c56f
Branch pushed to git repo; I updated commit sha1. New commits:
ff571d6  efficiency improvements; bug fix when input is [None]

comment:13 in reply to: ↑ 11 Changed 2 years ago by
Replying to dcoudert:
To be honest, I don't know. I checked on recent ticket, and #23255 suggests that
TESTS::
is the good form. So keep it as is.
Judging from that ticket, the convention is to use ::
when sage:
follows and :
when text follows. So in this instance, I should indeed use TESTS:
. The usage isn't consistent throughout the file but I'm not going to try and fix it within this ticket.
I made the changes you suggested, but I had to change one of the tests:
sage: edgelist = [(0,0,'a'),(0,1,'b'),(1,1,'c'), (1, 2, 'd'), (2, 2, 'e')] sage: G = Graph(edgelist, loops=True, multiedges=False) sage: G.merge_vertices([0,1,2]); G.edges() [(0, 0, 'e')] sage: G = Graph(edgelist, loops=True, multiedges=False) sage: G.merge_vertices([0,2,1]); G.edges() [(0, 0, 'e')]
I think the last output should be [(0, 0, 'c')]
, and that's what the code currently returns, since it evaluates vertex 1
last and finds the label 'c'
last.
comment:14 Changed 2 years ago by
 Status changed from needs_work to needs_review
comment:15 Changed 2 years ago by
 Status changed from needs_review to positive_review
That's much better now. Thanks.
comment:16 Changed 2 years ago by
 Branch changed from u/zgershkoff/graph_merge_vertices___destroys_loops to ff571d6699ab233eedaf3b9d4a417ef6a1f6c56f
 Resolution set to fixed
 Status changed from positive_review to closed
comment:17 Changed 2 years ago by
 Commit ff571d6699ab233eedaf3b9d4a417ef6a1f6c56f deleted
It looks like merge_vertices() works by computing the edge boundary between the specified set of vertices and the rest of the graph, then deleting all the vertices except the first one, then putting the edges back in.
So there are actually two issues here. (1) Loops not on the first vertex will be lost (because they're not in the edge boundary and their vertices get deleted), and (2) we'll lose other edges that are not in the edge boundary so they should become loops.
(1) is certainly a defect. I'm not so certain about (2) because I don't know if it's appropriate for this method to create loops. In fact, it's probably not.