Opened 5 years ago
Closed 5 years ago
#15706 closed enhancement (fixed)
Graph built from their edges are simple by default
Reported by:  ncohen  Owned by:  

Priority:  major  Milestone:  sage6.4 
Component:  graph theory  Keywords:  
Cc:  azi, vdelecroix, rbeezer, jason, dimpase, jmantysalo  Merged in:  
Authors:  Nathann Cohen  Reviewers:  Dima Pasechnik 
Report Upstream:  N/A  Work issues:  
Branch:  7c69f25 (Commits)  Commit:  7c69f250922b0373de62e16b16baaab7f8b2fd32 
Dependencies:  #15704  Stopgaps: 
Description (last modified by )
The Graph/DiGraph
constructor are scary to look at, and waste a lot of computations in order to guess whether the user wants loops or multiple edges. In order to reduce this number of tests and make those functions more efficient, we can change the default and force the user to specify explicitly what they want to obtain.
Note that this is not a bad idea on its own, for graphs which allow multiple edges/loops behave differently with respect to the add_edge
function, and by specifying this explicitly the user is forced to be aware of what is going to happen next.
After many small patches like that, it may become possible to simplify a lot of things in those huge constructors. But first, deprecation for one year T_T
Nathann
(example of why this could be useful https://groups.google.com/d/topic/sagesupport/4ffCepWrjDc/discussion)
Change History (47)
comment:1 Changed 5 years ago by
 Branch set to u/ncohen/15706
 Description modified (diff)
 Status changed from new to needs_review
comment:2 Changed 5 years ago by
 Commit set to fdbf92b364b0fb0a69307abaf792be21d8403ef1
comment:3 Changed 5 years ago by
 Dependencies set to 15704
comment:4 Changed 5 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:5 Changed 5 years ago by
 Dependencies changed from 15704 to #15704
comment:6 Changed 5 years ago by
 Cc vdelecrois added
 Description modified (diff)
comment:7 Changed 5 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:8 Changed 5 years ago by
 Work issues set to merge conflict
comment:9 Changed 5 years ago by
 Commit changed from fdbf92b364b0fb0a69307abaf792be21d8403ef1 to a049551c0d1c8d95cc4301d82f70f4f50bd8fc51
Branch pushed to git repo; I updated commit sha1. New commits:
a049551  trac #15706: Merged with 6.3.beta1

comment:10 Changed 5 years ago by
Fixed.
Nathann
comment:11 Changed 5 years ago by
 Work issues merge conflict deleted
comment:12 Changed 5 years ago by
 Status changed from needs_review to needs_work
sage t long src/sage/combinat/finite_state_machine.py # 2 doctests failed sage t long src/sage/quivers/path_semigroup.py # 1 doctest failed
comment:13 Changed 5 years ago by
 Commit changed from a049551c0d1c8d95cc4301d82f70f4f50bd8fc51 to f6087ffcf2370bdfe45a1162c7d7106c9829ad5c
Branch pushed to git repo; I updated commit sha1. New commits:
f6087ff  trac #15706: Broken doctests

comment:14 Changed 5 years ago by
 Status changed from needs_work to needs_review
comment:15 Changed 5 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:16 Changed 5 years ago by
Interesting idea  that graphs aren't simple by default bit me recently, had to use to_simple()
everywhere... but I have no particular opinion on what the best default is.
comment:17 Changed 5 years ago by
What I would like to do is reach a state where people have to say explicitly what they expect. Right now it is automatically detected and this had a real cost in runtime, which is actually pure waste when you only create simple graphs.
What this branch does is prepare deprecations so that, one year from now, I am able to rewrite the code correctly without this loss of time. So if you have some time to review it ?... (and discuss whatever is needed of course)
Nathann
comment:18 followup: ↓ 19 Changed 5 years ago by
Well, like I said, I'm not sure I want to change that behavior or not. It seems like the kind of thing that should be discussed at some length. I do agree there should be a flag to avoid the runtime cost, that seems clear  but again, I am agnostic on the default.
As a constructive comment, maybe one ticket could be to have a flag (how that is named, I don't know) and then this one for making defaults, assuming there is something approaching consensus.
comment:19 in reply to: ↑ 18 Changed 5 years ago by
Yo !
Well, like I said, I'm not sure I want to change that behavior or not. It seems like the kind of thing that should be discussed at some length.
Well... I would be glad to discuss it at length with anybody but this ticket has been here for 10 months already....
I do agree there should be a flag to avoid the runtime cost, that seems clear  but again, I am agnostic on the default.
Well... Again, all graph books usually begin by saying that all graphs considered are simple, i.e. no loops nor multiple edges. There is such a thing as "multigraph", too. And I really don't like this behaviour:
sage: G=Graph([(1,2),(1,2),(1,1)]); G.edges() [(1, 1, None), (1, 2, None), (1, 2, None)] sage: G=Graph(); G.add_edge(1,2); G.add_edge(1,2); G.add_edge(1,1); G.edges() [(1, 2, None)]
Somehow the current behaviour of add_edge
depends on nonexplicit information, and considers simple graphs as the default. I would feel better if everybody got used to state clearly what they want to work with and dealt with the consequences. A guy who wants to build multigraphs and initializes them from a list of edges will experience seriously inconsistent behaviour depending on whether the initial information contains multiple edges and/or loops !
As a constructive comment, maybe one ticket could be to have a flag (how that is named, I don't know) and then this one for making defaults, assuming there is something approaching consensus.
Hmmmmm.... Well, what do you expect consensus to look like when it seems after 10 months that only one person cares ? :P
Nathann
comment:20 followup: ↓ 21 Changed 5 years ago by
 Cc vdelecroix rbeezer jason added; vdelecrois removed
Hmmmmm.... Well, what do you expect consensus to look like when it seems after 10 months that only one person cares ? :P
Well, I don't consider no one cares when it's only on a ticket. Only the seriously obsessed (like me?) just look at random tickets without invitation ;) And stuff that is actually important often languishes for years.
I think the problem is that while some people like me are lazy and will put in double edges but don't mean it, the most likely thing when people put in extra edges is that they want them. I don't feel competent to resolve this question  and definitely not all graph things assume all are simple.
comment:21 in reply to: ↑ 20 Changed 5 years ago by
Hello !
Well, I don't consider no one cares when it's only on a ticket. Only the seriously obsessed (like me?) just look at random tickets without invitation ;) And stuff that is actually important often languishes for years.
Hmmm... :P
I think the problem is that while some people like me are lazy and will put in double edges but don't mean it, the most likely thing when people put in extra edges is that they want them.
True, but I really worry of those who create many graphs, most of which have multiple edges, and all of a sudden create one from data which (by chance) corresponds to a simple graph. There is a danger of wrong results later on.
I don't feel competent to resolve this question  and definitely not all graph things assume all are simple.
?
Well no, we have stuff that works for both, and stuff that works for simple graphs.. "recently" I added a function called "scream if not simple" that is used in several places in the graph code, as really some functions that should not be called on nonsimple graphs were not checked at all. It is a bit better, but this constructor thing remains a problem, especially when the behaviour of the constructor and the behaviour of add_edge
is not the same.
Nathann
comment:22 followup: ↓ 23 Changed 5 years ago by
There is a danger of wrong results later on.
Really? So some of our algorithms only work for multigraphs? That would be interesting/scary. I really thought the bigger danger was in the other direction, but that it was user error (such as mine) in that event.
?
Sorry, by 'things' I meant 'books'.
comment:23 in reply to: ↑ 22 Changed 5 years ago by
Hello !
There is a danger of wrong results later on.
Really? So some of our algorithms only work for multigraphs?
Nonono, not that I know. But I only deal with simple graphs, remember ? :P
I was thinking of algebraists, or people doing automata theory. I expect them to create some graphs and to modify them later, and add_edge
will only create a multiple edge if the graph "allows" multiple edge (this is decided at creation time).
If the first graph does not have multiple edge (by luck) this could make the later parts of their algorithms meaningless. I do not think that such a problem would exist in Sage, though. Some functions have no meaning for multigraphs (like .density()) but not the other way around.
Nathann
comment:24 followup: ↓ 25 Changed 5 years ago by
I see, I didn't realize that it was impossible to convert a simple graph into a multigraph. Is there at least an error thrown?
comment:25 in reply to: ↑ 24 Changed 5 years ago by
I see, I didn't realize that it was impossible to convert a simple graph into a multigraph.
Nonononono, it is possible ! You can do G.allow_multiple_edges(True)
, but then you have to *know* that you need to do that.
Is there at least an error thrown?
I don't understand what you are talking about, so I will answer the two possibilities:
1) For .density() yes an error is thrown:
sage: Graph([(1,2),(1,2)]).density() ... TypeError: Density is not welldefined for multigraphs.
2) For functions that use scream_if_not_simple
too:
sage: Graph([(1,2),(1,2)]).is_split() ... ValueError: This method is not known to work on graphs with multiedges. Perhaps this method can be updated to handle them, but in the meantime if you want to use it please disallow multiedges using allow_multiple_edges().
3) No error is thrown when you want to add an edge to a (simple) graph that already has it:
sage: g = Graph([(0,1)]) sage: g.edges() [(0, 1, None)] sage: g.add_edge(0,1) sage: g.edges() [(0, 1, None)]
I believe that this is what should happen, but of course it can be a problem if you have created a simple graph by mistake, which can happen with this automatic detection.
Nathann
comment:26 Changed 5 years ago by
 Branch changed from u/ncohen/15706 to public/ticket/15706
 Commit changed from f6087ffcf2370bdfe45a1162c7d7106c9829ad5c to 321f51f489b3dd53f8cfe780e710d2d2f0cb4df3
comment:27 Changed 5 years ago by
Thanks... It would be cool if somebody was willing to review it, though.
Nathann
comment:28 followup: ↓ 29 Changed 5 years ago by
 Cc dimpase added
Is there at least an error thrown?
I don't understand what you are talking about,
I said that when I thought it was impossible to convert to multigraphs.
That said, this current behavior you mention is troubling.
sage: g = Graph([(0,1)]) sage: g.edges() [(0, 1, None)] sage: g.add_edge(0,1) sage: g.edges() [(0, 1, None)]
Ideally, this should raise a warning (once per session, like a deprecation) that you should change to a multigraph, and after the deprecation period it would still raise the warning but now force it to be a multigraph.
As I mentioned before, I'm not a heavy graph theory user (though I do use it) so I think would be better to at least see whether any opinions existed on sagedevel  cc:ing one person who may have an opinion (?).
comment:29 in reply to: ↑ 28 Changed 5 years ago by
Ideally, this should raise a warning (once per session, like a deprecation) that you should change to a multigraph
Not necessarily. I work on simple graphs only, and I use this from time to time. When I add an edge which is already there <do nothing> is what I expect.
Currently, users may not even know if their graph is simple or may contain multiedges, that's what I think is the problem.
As I mentioned before, I'm not a heavy graph theory user (though I do use it) so I think would be better to at least see whether any opinions existed on sagedevel  cc:ing one person who may have an opinion (?).
Okay, I'll post there....
Nathann
comment:31 Changed 5 years ago by
 Cc jmantysalo added
comment:32 followup: ↓ 35 Changed 5 years ago by
 Status changed from needs_review to needs_work
Graph([(1, 2, 0), (1,2,1)])
gives NameError: global name 'deprecation' is not defined
.
comment:33 Changed 5 years ago by
To be honest, I think that I do not know enought of graphs code to really review this. Sorry.
In one place there is missing space after ,
before multiedges=True
.
comment:34 Changed 5 years ago by
 Commit changed from 321f51f489b3dd53f8cfe780e710d2d2f0cb4df3 to 7c69f250922b0373de62e16b16baaab7f8b2fd32
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
7c69f25  trac #15706: Graphs built from their edges will be simple by default

comment:35 in reply to: ↑ 32 Changed 5 years ago by
Yo !
Graph([(1, 2, 0), (1,2,1)])
givesNameError: global name 'deprecation' is not defined
.
The modulelevel import of 'deprecation' was probably removed since the branch was written. I just rebased it (and removed the merge commits on the way).
To be honest, I think that I do not know enought of graphs code to really review this. Sorry.
I see. Well, I do not think that there is much to know about graph code in this case. It deals with the constructor allright, but it is just an user interface decision.
Anybodyyyyyyyyyyyy ?...
Nathann
comment:36 Changed 5 years ago by
Is it ready for review?
comment:37 Changed 5 years ago by
It is.
Nathann
comment:38 Changed 5 years ago by
 Status changed from needs_work to needs_review
comment:39 Changed 5 years ago by
Oops, sorry.
comment:40 Changed 5 years ago by
 Reviewers set to Dima Pasechnik
 Status changed from needs_review to positive_review
LGTM
comment:41 Changed 5 years ago by
Wowwwwwwww !! Super cool, thanks :)
Nathann
comment:42 followup: ↓ 43 Changed 5 years ago by
Okay, I'll let Dima take the fall if users complain ;) Seriously, I hope this leads to less confusion in the longrun for people who like both multigraphs and simple ones; I just didn't want to make that UI decision.
I do have one question that probably has an answer of "yes"; are there still enough examples of graphs not in the extremely cryptic/hermetic dictionary of dictionary notation? I noticed you changed quite a few of the dict of lists ones, and I had never seen that other constructor before.
comment:43 in reply to: ↑ 42 Changed 5 years ago by
Okay, I'll let Dima take the fall if users complain ;) Seriously, I hope this leads to less confusion in the longrun for people who like both multigraphs and simple ones; I just didn't want to make that UI decision.
Well. Regardless of changing the default to "always simple", it is already good that add_edge
will become consistent with the constructor.
I do have one question that probably has an answer of "yes"; are there still enough examples of graphs not in the extremely cryptic/hermetic dictionary of dictionary notation? I noticed you changed quite a few of the dict of lists ones, and I had never seen that other constructor before.
I changed dicts to lists ? Can you tell me where ?
The dictionary notation is only cryptic when you want "edgelabelled multigraphs". When you want simple graphs, it is rather simple: associate to each vertex the list of its neighbors.
To answer your question a bit more completely: I hope that Sage (the code, not the doctests) contains no occurrence of Graph(list_of_edges)
, for this is slower than g=Graph();g.add_edges(list_of_edges)
. This is also the point of this ticket: making both equally fast.
Nathann
comment:44 Changed 5 years ago by
Besides, just for your own information:
sage: g = Graph() sage: edges = list(combinations(range(1000),2)) sage: %time g.add_edges(edges) CPU times: user 992 ms, sys: 44 ms, total: 1.04 s Wall time: 989 ms sage: g = Graph() sage: %time for e in edges: g.add_edge(e) CPU times: user 1.8 s, sys: 44 ms, total: 1.84 s Wall time: 1.79 s
This, because add_edge(u,v)
, add_edge(u,v,l)
, add_edge((u,v))
, add_edge((u,v,l))
are all valid, and that deciding which should be used takes time, at each add_edge
call.
Nathann
comment:45 followup: ↓ 46 Changed 5 years ago by
I changed dicts to lists ? Can you tell me where ?
No, you changed lists to dicts, e.g.
 sage: G=DiGraph([[0,0],[0,0],[0,0],[1,1],[1,1],[1,1]])  sage: H=DiGraph([[0,0],[0,0],[0,0],[0,0],[1,1],[1,1]]) + sage: G=DiGraph({0:[0,0,0],1:[1,1,1]}) + sage: H=DiGraph({0:[0,0,0,0],1:[1,1]})
and some similar, which is okay, but also
 sage: DiGraph([(1, 2, 0), (1,2,1)]) + sage: DiGraph({1:{2:[0,1]}})
which I didn't even know was possible, and certainly don't know how to interpret (though I barely know how to interpret the other one, I guess the last element of each triple is for the direction?). Edit: after thinking about it for a little, I now get it, but still find it pretty terse.
comment:46 in reply to: ↑ 45 Changed 5 years ago by
No, you changed lists to dicts, e.g.
Oh sorrysorry, I thought that you were speaking of Sage's code.
 sage: DiGraph([(1, 2, 0), (1,2,1)]) + sage: DiGraph({1:{2:[0,1]}})which I didn't even know was possible, and certainly don't know how to interpret (though I barely know how to interpret the other one, I guess the last element of each triple is for the direction?). Edit: after thinking about it for a little, I now get it, but still find it pretty terse.
Ahaha. Well, indeed, "multigraph with labelled edges". It is the most awful thing we handle I believe :P
This being said, it is "the right data structure" in terms of elementary operations. Though hard to read. By the way, it can give you an idea of why I hate multiple edges and loops. It makes everything more complicated, and none of the additional complexities is very interesting from the theoretical point of view. It is just more work and more cornercases :P
Nathann
comment:47 Changed 5 years ago by
 Branch changed from public/ticket/15706 to 7c69f250922b0373de62e16b16baaab7f8b2fd32
 Resolution set to fixed
 Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. New commits:
trac 15704: Stupid waste of time in graphs 1
trac #15704: Adding a doctest
trac #15704: Changing add_edge to _backend.add_edge in the (Di)Graph constructor
trac #15706: Graphs built from their edges will be simple by default