Opened 8 years ago
Last modified 7 years ago
#14535 closed enhancement
Mutability of Graphs — at Version 26
Reported by: | SimonKing | Owned by: | jason, ncohen, rlm |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | graph theory | Keywords: | mutability graph |
Cc: | jernej | Merged in: | |
Authors: | Simon King, Volker Braun | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #14524 | Stopgaps: |
Description (last modified by )
This patch allows to create immutable graphs, so that they can be used as keys in dictionaries.
Previously, calling hash on a graph did work after assigning the attribute _immutable
to the graph. That's a hack, and it would in fact not prevent the graph from being mutated.
With this patch, the attempt to change an immutable graph be means of methods such as add_vertex or add_edge or delete_vertex, will result in an error. If one really wants to play nasty, one could make an immutable graph mutable and change it, or use the backend of the graph for changing the underlying data.
Change History (28)
Changed 8 years ago by
comment:1 Changed 8 years ago by
- Status changed from new to needs_review
comment:2 follow-up: ↓ 4 Changed 8 years ago by
- Cc jernej added
comment:3 Changed 8 years ago by
- Status changed from needs_review to needs_info
comment:4 in reply to: ↑ 2 ; follow-up: ↓ 8 Changed 8 years ago by
Hi!
Replying to ncohen:
Hello !
- Could you show that there is no slowdown in the graph functions because of that please ?
There certainly is a slowdown in all methods that are protected in this way. And I do not use the decorator in cdef classes, in an attempt to prevent a regression.
Does it make sense to time methods like add_vertex separately? In my applications, one would build a graph, and then it will never again be changed. But I am sure you have better use cases than I. Do you have examples in which add/delete_vertex/edge will typically be called zillions of times?
- Besides, why do you need decorators in the Python classes if you forbid modifications directly in the backend ?
I do not forbid to change the graphs in the backend. Or at least, I did not intend to modify the backends. If I did, then it was by mistake. If G is an immutable graph, then G._backend.add_vertex(...)
should not complain.
The reason is, again, that I don't want a slow-down in the "private" methods. If one has an immutable graph used as key in a dictionary, and then decides to work around the "protected" methods add_vertex
, add_edge
etc by changing the graph G via G._backend, then this is a conscious decision for which the user takes full responsibility.
So, protection of the user will be restricted to the "official" methods G.add_vertex etc.
- You create 4 functions in a class that already has a LOT of them. We have several functions that work like that already :
sage: g.mutable() Tells you if it is mutable sage: g.mutable(True) Sets it to be mutable
I did not find any support for mutability in graphs, except for the hack that makes hash
work.
- There is a lot of things that graphs store and which are not vertices, nor edges. For example the vertices' layout, or its name... You can see the list of these awful things in
GenericGraph.__eq__
. If you just want to take edges and vertices in consideration could you say it explicitely in the documentation of the*_mutable
methods ?
Not good. Yes, I should have looked up GenericGraph.__eq__
and take into account all data used there.
And.. Well.. Could you add me in Cc when you write things like that ?
Actually, when I created the ticket, I first looked at a preview, and saw that you are among the (default) owners of this ticket. Hence, I concluded that there is no need in adding you in Cc, because you already were on the list of recipients. Sorry if I misinterpreted the meaning of the list of "owners".
comment:5 Changed 8 years ago by
Here is a little timing.
Without patch:
sage: K12 = graphs.CompleteGraph(12) sage: K4 = graphs.CompleteGraph(4) sage: def test(g,h): ....: g.delete_edges(h.edge_iterator()) ....: g.add_edges(h.edge_iterator()) ....: sage: %timeit test(K12,K4) 10000 loops, best of 3: 118 us per loop
With patch:
sage: K12 = graphs.CompleteGraph(12) sage: K4 = graphs.CompleteGraph(4) sage: def test(g,h): ....: g.delete_edges(h.edge_iterator()) ....: g.add_edges(h.edge_iterator()) ....: sage: %timeit test(K12,K4) 10000 loops, best of 3: 131 us per loop sage: (131-118)/118.*100 11.0169491525424
So, there is a slow-down of 11% in this example. I don't know if in "real" examples, the backend would be used (without slow-down) internally.
comment:6 Changed 8 years ago by
Here are two more timings, this time using DenseGraph
(hence, not using the decorator) and hash:
Without patch:
sage: from sage.graphs.base.dense_graph import DenseGraph sage: D = DenseGraph(nverts=10, extra_vertices=10) sage: def test(g): ....: for i in xrange(100): ....: g.add_vertex(i) ....: for i in xrange(100): ....: for j in xrange(100): ....: g.add_arc(i,j) ....: for i in xrange(100): ....: g.del_vertex(i) ....: sage: %timeit test(D) 100 loops, best of 3: 2.59 ms per loop sage: K12 = graphs.CompleteGraph(12) sage: K12._immutable = True sage: hash(K12) -1604610596753853799 sage: %timeit hash(K12) 1000 loops, best of 3: 212 us per loop
With the patch:
... sage: %timeit test(D) 100 loops, best of 3: 2.72 ms per loop sage: (2.72-2.59)/2.59*100 # 5% slowdown 5.01930501930503 sage: K12 = graphs.CompleteGraph(12) sage: K12.set_immutable() sage: %timeit hash(K12) 1000 loops, best of 3: 216 us per loop
Anyway, I presume that in real use cases, faster unsafe methods would/could/should be used internally.
comment:7 Changed 8 years ago by
- Status changed from needs_info to needs_review
I think I gave the info, so, back to "needs review"...
comment:8 in reply to: ↑ 4 ; follow-ups: ↓ 9 ↓ 16 Changed 8 years ago by
Hellooooooooooooooo !!!
There certainly is a slowdown in all methods that are protected in this way. And I do not use the decorator in cdef classes, in an attempt to prevent a regression.
Hmmmm. And why should we accept a 10% slowdown in order to have immutable graphs ? :-P
Does it make sense to time methods like add_vertex separately? In my applications, one would build a graph, and then it will never again be changed. But I am sure you have better use cases than I. Do you have examples in which add/delete_vertex/edge will typically be called zillions of times?
Not really. Like you, in my examples I spend most of my time reading graphs. And wasting my time translating labels into integers :-P
- Besides, why do you need decorators in the Python classes if you forbid modifications directly in the backend ?
I do not forbid to change the graphs in the backend. Or at least, I did not intend to modify the backends. If I did, then it was by mistake. If G is an immutable graph, then
G._backend.add_vertex(...)
should not complain.
? But you make changes to CGraph
, SparseGraph
and DenseGraph
, don't you ? O_o
And why wouldn't it be better to only deal with immutability directly in the backends ? No need for decorators if it is done there, and if it is only a "if" with a niiice C variable which never changes. Would reduce the slowdown, wouldn't it ?
The reason is, again, that I don't want a slow-down in the "private" methods. If one has an immutable graph used as key in a dictionary, and then decides to work around the "protected" methods
add_vertex
,add_edge
etc by changing the graph G via G._backend, then this is a conscious decision for which the user takes full responsibility.
Hmmmm... But the developpers could have to do that to make some methods more efficient, and then we would have to deal with immutability by hand. But really I am interested in your answer : why don't you prefer to deal with immutability in the backends directly ? In the graphs/base/
files ?
- You create 4 functions in a class that already has a LOT of them. We have several functions that work like that already :
sage: g.mutable() Tells you if it is mutable sage: g.mutable(True) Sets it to be mutableI did not find any support for mutability in graphs, except for the hack that makes
hash
work.
Nonono, we don't have such a method at all ! But we have two methods to allow loops and test if they are allowed :
sage: g = graphs.PetersenGraph() sage: g.allows_loops() False sage: g.allow_loops(True) sage: g.allows_multiple_edges() False sage: g.allow_multiple_edges(True)
So it's already half of what you take, and we could even merge two functions like allows_loops
and allow_loops
into one, as in my previous example. That's one of the joys of coding in Python.
Actually, when I created the ticket, I first looked at a preview, and saw that you are among the (default) owners of this ticket. Hence, I concluded that there is no need in adding you in Cc, because you already were on the list of recipients. Sorry if I misinterpreted the meaning of the list of "owners".
Oh. I see. Then it's the Trac developper's fault. The owners of a section are expected to receive an email indeed, except when there are ... several owners.
In that case, no mails are sent :-P
http://trac.edgewall.org/ticket/7793
Nathann
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 10 Changed 8 years ago by
Replying to ncohen:
There certainly is a slowdown in all methods that are protected in this way. And I do not use the decorator in cdef classes, in an attempt to prevent a regression.
Hmmmm. And why should we accept a 10% slowdown in order to have immutable graphs ?
:-P
It would be nice to have a better solution, but, as for matrices, I do think it would be good to be able to use graphs as keys in dictionaries without a hack.
I do not forbid to change the graphs in the backend. Or at least, I did not intend to modify the backends. If I did, then it was by mistake. If G is an immutable graph, then
G._backend.add_vertex(...)
should not complain.? But you make changes to
CGraph
,SparseGraph
andDenseGraph
, don't you ?O_o
Then we may have different notions of "backend". By "backend", I understood the stuff that is assigned to G._backend
and is defined in sage.graphs.base.graph_backends`.
You think it might be better to move all mutability checks into the backend? I thought so as well, initially. But then I found that a couple of add_* methods first change some data outside of the backend, and call the backend method only later. So, I thought it is safer to check mutability not in the backend, but in the graph.
Hmmmm... But the developpers could have to do that to make some methods more efficient, and then we would have to deal with immutability by hand. But really I am interested in your answer : why don't you prefer to deal with immutability in the backends directly ? In the
graphs/base/
files ?
I would prefer to deal with it in the backends.
comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 8 years ago by
It would be nice to have a better solution, but, as for matrices, I do think it would be good to be able to use graphs as keys in dictionaries without a hack.
And you would trade 10%
of speed for that ? O_o
Really, the only way I see to do this properly is to write an immutable backend...
Then we may have different notions of "backend". By "backend", I understood the stuff that is assigned to
G._backend
and is defined in sage.graphs.base.graph_backends`.
Oh. Sorry, then. I call backend whatever is in the base/
folder, but perhaps that isn't right. In any case :
sage: type(Graph()._backend) <class 'sage.graphs.base.sparse_graph.SparseGraphBackend'>
You think it might be better to move all mutability checks into the backend? I thought so as well, initially.
Yep. Looks much cleaner !
But then I found that a couple of add_* methods first change some data outside of the backend, and call the backend method only later.
Oh ? Like which ones ? O_o
So, I thought it is safer to check mutability not in the backend, but in the graph.
I would prefer to deal with it in the backends.
+1.
Nathann
comment:11 in reply to: ↑ 10 ; follow-up: ↓ 12 Changed 8 years ago by
Replying to ncohen:
Oh. Sorry, then. I call backend whatever is in the
base/
folder, but perhaps that isn't right. In any case :sage: type(Graph()._backend) <class 'sage.graphs.base.sparse_graph.SparseGraphBackend'>
OK, then we have the same notions...
But then I found that a couple of add_* methods first change some data outside of the backend, and call the backend method only later.
Oh ? Like which ones ?
O_o
sage.graphs.generic_graph.GenericGraph.delete_vertex
, which is
for attr in attributes_to_update: if hasattr(self, attr) and getattr(self, attr) is not None: getattr(self, attr).pop(vertex, None) self._boundary = [v for v in self._boundary if v != vertex]
and only afterwards calls the backend. If the order of dealing with attributes and dealing with the backend could be reversed, then making the backend immutable would be enough. But in this order, calling the method on an immutable graph backend would destroy some data before raising the error.
Perhaps I should have tried first whether it can be reversed...
Cheers,
Simon
comment:12 in reply to: ↑ 11 ; follow-up: ↓ 13 Changed 8 years ago by
sage.graphs.generic_graph.GenericGraph.delete_vertex
, which isfor attr in attributes_to_update: if hasattr(self, attr) and getattr(self, attr) is not None: getattr(self, attr).pop(vertex, None) self._boundary = [v for v in self._boundary if v != vertex]and only afterwards calls the backend.
Oh, yeah.. The boundary is stored by graphs, too >_<
If the order of dealing with attributes and dealing with the backend could be reversed, then making the backend immutable would be enough. But in this order, calling the method on an immutable graph backend would destroy some data before raising the error.
Perhaps I should have tried first whether it can be reversed...
Hmmmmmm. First I would like to know something : do you really want everything in the graph to be immutable ? Or only edges and vertices ?
Nathann
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 8 years ago by
Replying to ncohen:
Oh, yeah.. The boundary is stored by graphs, too
>_<
Right. Actually I can't recall if my patch protects changing the boundary.
Hmmmmmm. First I would like to know something : do you really want everything in the graph to be immutable ? Or only edges and vertices ?
I want that they are immutable in this sense: "It is impossible to change the hash or to change the ==
-equivalence class of an immutable object by calling methods that do not start with an underscore.
comment:14 in reply to: ↑ 13 Changed 8 years ago by
I want that they are immutable in this sense: "It is impossible to change the hash or to change the
==
-equivalence class of an immutable object by calling methods that do not start with an underscore.
Then there's a nightmare ahead of you :-P
Nathann
comment:15 follow-up: ↓ 21 Changed 8 years ago by
I haven't looked at the patch yet, but I want to keep an eye on it, and I want to advocate uniformity: try to make the mutability behavior and interface the same as that of Sage's matrices.
I'd be happy if we got immutable graphs. I'd be very happy if we could add/delete/contract and get a copy of the graph without explicitly saying "inplace=False" all the time, but one can dream...
And what about the following?
sage: G = graphs.PetersenGraph() sage: G Petersen graph: Graph on 10 vertices sage: G.add_vertex(11) sage: G Petersen graph: Graph on 11 vertices
Should named graphs be immutable?
comment:16 in reply to: ↑ 8 ; follow-up: ↓ 17 Changed 8 years ago by
sage: g.mutable() Tells you if it is mutable sage: g.mutable(True) Sets it to be mutable
I hope that last line remains fantasy: Changing the mutability of an object is a mutation, so should be forbidden on an immutable object (the only case where g.mutable(True)
should succeed is when g
is already mutable).
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 20 Changed 8 years ago by
Replying to nbruin:
I hope that last line remains fantasy: Changing the mutability of an object is a mutation, so should be forbidden on an immutable object (the only case where
g.mutable(True)
should succeed is wheng
is already mutable).
Sure it is a mutation. However, I was told a couple of times that the "pythonic way" is to pretend that programmers are adult people who know what they do and who are prepared to take responsibility for their actions. Hence, while there should be some protection against accidental changes to an immutable object, it should still be possible to work around the protection.
I conclude that it would be the anti-pythonic way to
- allow to make a mutable object immutable, but not the other way (this is by making
_is_immutable
a cdef attribute, and restrict the interface to three methodsis_immutable
,is_mutable
andset_immutable
) - make it virtually impossible that a sub-class overrides a mutation-protected method by a method that does not check for mutation (this is possible with metaclasses)
- turn an object into an immutable object as soon as its hash is computed, so that its hash bucket will henceforth never change.
We need to find a reasonable middle way between the overly permissive approach ("all graphs are mutable and hashable, the programmer really has to be careful") and the overly protective approach (as described in the bulleted list).
comment:18 follow-up: ↓ 19 Changed 8 years ago by
Again, Matrices have (in my opinion) a decent balance:
sage: A = matrix([[1,2],[3,4]]) sage: A.set_immutable() sage: A[1,1] = 4 Traceback (most recent call last): ... ValueError: matrix is immutable; please change a copy instead (i.e., use copy(M) to change a copy of M). sage: B = copy(A) sage: B[1,1] = 1
Follow that lead, and be consistent with other parts of Sage.
comment:19 in reply to: ↑ 18 Changed 8 years ago by
Replying to Stefan:
Again, Matrices have (in my opinion) a decent balance:
sage: A = matrix([[1,2],[3,4]]) sage: A.set_immutable() sage: A[1,1] = 4 Traceback (most recent call last): ... ValueError: matrix is immutable; please change a copy instead (i.e., use copy(M) to change a copy of M). sage: B = copy(A) sage: B[1,1] = 1
Actually I have not been aware that there is no A.set_mutable
. So, thank you for the pointer!
comment:20 in reply to: ↑ 17 Changed 8 years ago by
Replying to SimonKing:
Sure it is a mutation. However, I was told a couple of times that the "pythonic way" is to pretend that programmers are adult people who know what they do and who are prepared to take responsibility for their actions. Hence, while there should be some protection against accidental changes to an immutable object, it should still be possible to work around the protection.
This goes a step further, though: Once an object is immutable, people can use it as a key in a dictionary (and as you know, caching methods such as UniqueRepresentation
constructors) can do so in quite unobvious ways.
If you then later allow the object to become mutable again, you have now made the graph illegal as dictionary key again! (in fact, if you now mutate it and set it back to mutable, you've made your key unfindable in the dictionaries. That would create VERY hard to debug bugs (as we've recently seen on an object that was just inadvertently changing its hash).
There's already precedent: marking a matrix immutable is a one-way street. Of course, if you really try (and reach in) you can still mutate the matrix. That's where the "consenting adults" comes in. But changing an object from immutable to mutable should never be "official" API.
I don't think you'll lose anything. Making an immutable structure mutable is always a bug. Changing an immutable structure in a way that doesn't "essentially" change it might not be, but in that case it's reasonable to expect you can make that change even without the structure being mutable (e.g., by reaching into the backend to make the change)
comment:21 in reply to: ↑ 15 Changed 8 years ago by
Should named graphs be immutable?
Hmmmmmm :-/
I see the points (we could cache nice information inside of them which is hard to compute), but I don't like it much :-/
We already had some problems with the class BipartiteGraph
which constrained some operations (you were not able to add an edge between two vertices of the same class, for instance). The CompleteBipartiteGraph
and RandomBipartiteGraph
were automatically BipartiteGraph
instances, and it really was a mess. Because you could not expect a function thought for regular graphs to work well on such instances, and the only way out was to make copies of these input again and again into normal graphs, then work on them... :-/
Then again, perhaps you are right and I just complain because I am used to the old way... :-/
Nathann
comment:22 Changed 8 years ago by
Patch needs to be rebased. Also:
set_mutable
should be removed. If you know what you are doing, you can change the_is_immutable
attribute. But we shouldn't write and document a method where you are likely to shoot yourself in the foot.- Override
__copy__
and make the copy an immutable object mutable. Its a common pattern tocopy(const_obj)
to get a mutable copy. - Please no
g.mutable(x=None)
. Method names should be crystal-clear about whether they mutate or not. But this is intentionally straddling the boundary between const and mutable, making code intentional hard to read. And there are lots of fun things that can go wrong now, for startersx=None
will make it look like you are changing it while actually getting the value.
This is also in accord with what matrices are doing.
comment:23 Changed 8 years ago by
I'm still totally against a performance loss in exchange for a feature.
To me the only way out is an immutable backend.
Nathann
comment:24 follow-up: ↓ 25 Changed 8 years ago by
You mean performance loss because we are checking one cdef boolean? This is not really a measurable amount of time, I think. It might even be free if the compiler correctly identifies the unlikely branch and speculative execution works its way. On the plus side, you can then use graphs as hash keys which allows you to implement more efficient algorithms. From that point of view, we are talking about a net performance gain.
comment:25 in reply to: ↑ 24 Changed 8 years ago by
Replying to vbraun:
You mean performance loss because we are checking one cdef boolean?
If I recall correctly, the current decorator is indeed not very quick.
By the way, what changes did you do?
comment:26 Changed 8 years ago by
- Description modified (diff)
I've updated the patch.
I also removed the mutability checks from some of the backends. The central user-facing point to implement them is in GenericGraph
/ GenericGraph_pyx
. This is similar to the matrix code where we, for example, don't patch linbox to support (im)mutability.
It seems there is quite a bit of overhead in graph construction since the GenericGraph._backend
is a Python class. This means various dictionary lookups to locate add_vertex
and friends. You could shave off a lot of overhead by
- Move the construction methods (
add_vertex
etc) fromGenericGraph
toGenericGraph_pyx
- Switch
GenericGraphBackend
and its subclasses to Cython and make its methods cdef - Equip
GenericGraph_pyx
with a Cython attributecdef GenericGraphBackend _backend
See also http://docs.cython.org/src/userguide/early_binding_for_speed.html But thats material for another ticket.
Once all the constructions methods are in GenericGraph_pyx
it'll also be easy to switch the mutability check to a very fast inline function.
Hello !
GenericGraph.__eq__
. If you just want to take edges and vertices in consideration could you say it explicitely in the documentation of the*_mutable
methods ?And.. Well.. Could you add me in Cc when you write things like that ?
Nathann