Opened 3 years ago
Last modified 8 months ago
#25506 needs_work enhancement
Nauty interface for isomorphism checking and automorphism group computing
Reported by:  Vaush  Owned by:  

Priority:  major  Milestone:  sage8.4 
Component:  graph theory  Keywords:  graphs nauty pynauty automorphism isomorphism, gsoc2018 
Cc:  dimpase, stumpc5, dcoudert  Merged in:  
Authors:  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  u/Vaush/nauty_interface_automorphism_isomorphism (Commits, GitHub, GitLab)  Commit:  a907f609128b22ea3d0666aa6fad0a63baf408b2 
Dependencies:  Stopgaps: 
Description
I used the pynauty library (https://web.cs.dal.ca/~peter/software/pynauty/html/intro.html#), which has GPLv3+ license by the way, to interface with our pynauty package and use it to check for isomorphism and compute automorphisms through the standard sage methods. In particular, I am actually copying the package from upstream, patching it and recompiling it, since pynauty wrappers require to be compiled with nauty's own source code. A lot of modifications where required to make the transition from sage's own graph representation to one that nauty was able to understand. At the moment this interface doesn't support arbitrary edge labels, but it supports multiedge graphs through an helper function that converts them in equivalent singleedge graphs (http://pallini.di.uniroma1.it/Guide.html section 14) that I put in sage.graphs.multiedge for everyone to use. The efficiency is not exactly spectacular, since the graph needs to be relabeled with labels going from 0 to n1 (where n is the number of vertices), and thus a copy needs to be made and the relabel function needs to be called. If, by chance, the graph is already labelled in that way, there is an option to tell the interface not to relabel the graph. If the relabeling is found to be too inefficient, it could be implemented in the C++ part of the interface, but then a lot of the conversions done to return sensible results (i.e. return automorphism group generators listing the actual nodes' labels, and not arbitrary labels in {0..n1}) would have to be moved too. Modifications were made to also output results in the exact same format of the standard sage methods.
Last, I noted that a version of the same algorithm I used to convert multiedged graphs to their single edge version was implemented in the Bliss module (but only for Bliss graphs), in #24924; it might be worth it to change it to use this more general version, if needs allow it.
Change History (37)
comment:1 Changed 3 years ago by
 Branch set to u/Vaush/nauty_interface_automorphism_isomorphism
comment:2 Changed 3 years ago by
 Commit set to 80a9212cbd20df059ecda7226bccf7f300196727
comment:3 Changed 3 years ago by
 Cc dcoudert added
 Keywords gsoc2018 added
comment:4 Changed 3 years ago by
Thanks for the contribution.
I'm not expert in nauty or creating interfaces, but instead of creating a new file multiedge_singledge_conversion.pyx
(which is a weird name), you should create a file nauty.pyx
for the interface with nauty, or directly put this method into sage_graph.patch
.
Furthermore, why are you adding from .multiedge import multiedge_to_singledge
to src/sage/graphs/all.py
? It's a very specific method, so I don't see why it should be added to global namespace.
Concerning the method itself, you don't need to convert the graph to a (slow and space consuming) dictionary. You can access the list of multiple edges of the graph with G.multiple_edges()
. Furthermore, if the graph has multiple edges, you can get the list of distinct labels using G.edge_label(0, 1)
sage: G = Graph(multiedges=True) sage: G.add_edges([(0, 1, 0), (0, 1, 2), (1, 2, 1), (2, 3, 1), (2, 3, 2)]) sage: G.multiple_edges() [(0, 1, 0), (0, 1, 2), (2, 3, 1), (2, 3, 2)] sage: G.edge_label(0, 1) [2, 0] sage: G.edge_label(1, 2) [1]
Another issue is that you assume edge labels to be integers, but we can have floats, rationals, list, dict, etc.
It's clearly easier to use pynauty, but it's a petty we cannot convert the graph directly to the data structure used by nauty.
comment:5 followup: ↓ 6 Changed 3 years ago by
Thanks for your input.
Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.
Also, thanks for the suggestion on the method. I don't recall why I chose to use a dictionary, probably because I didn't think to use the multiple_edges > edge_label combo, and I wanted to be able to quickly find the multiplicity of any edge in the graph. I'll rework that part to avoid transforming the graph into a dictionary.
About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.
I don't think I ever assume edge labels to be integers, could you show me where I did that? It's probably an error or bug if I did. Actually, that's the reason why I said I can't use edge_labels at the moment, because I'm trying to see if substituting edge labels with integers (to use the technique suggested in section 14 of the nauty guide) would cause any issues.
Lastly, while I started with converting the graph in pynauty format and then letting pynauty do its thing, in this version the sage graph is actually converted directly into nauty format (but this is done for every call to automorphism_group or is_isomorphic, so there's that). Nonetheless, the main skeleton of the module, all the code that manages calling nauty and setting up data structures, etc. is still pynauty's, I just switched out its graph format for sage's and set the output format accordingly, so that's why I still call it pynauty, and it's still based on its source code.
If you have any other suggestions or issues, please tell me, I'll be working on the problems you highlighted in the meantime.
comment:6 in reply to: ↑ 5 ; followup: ↓ 7 Changed 3 years ago by
Replying to Vaush:
Thanks for your input.
Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.
Yes, do it.
About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.
I don't think I ever assume edge labels to be integers, could you show me where I did that? I
You call get_bits_list(label)
whic itself calls bin(x)
. So if you give anything else than an integer, you get an error.
comment:7 in reply to: ↑ 6 Changed 3 years ago by
Replying to dcoudert:
Replying to Vaush:
Thanks for your input.
Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.
Yes, do it.
About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.
I don't think I ever assume edge labels to be integers, could you show me where I did that? I
You call
get_bits_list(label)
whic itself callsbin(x)
. So if you give anything else than an integer, you get an error.
I see how that might be misleading, but I think I only call that function on a dictionary that I created, that associates each edge in the graph with its multiplicity, so I am sure that "label" is an int because that's the multiplicity of the edge
comment:8 Changed 3 years ago by
 Commit changed from 80a9212cbd20df059ecda7226bccf7f300196727 to 5f6a20a250595cd565acc4a7a873d2cb6b3a27bf
Branch pushed to git repo; I updated commit sha1. New commits:
5f6a20a  Moved function multiedge_to_singledge to the GenericGraph class, changing its name in to_singledge and making it so it doesn't convert the graph into a dictionary anymore

comment:9 followup: ↓ 10 Changed 3 years ago by
Could you explain what's the goal of this method and what it does. You will have to add a description of the method anyway. Also, shouldn't original edge labels be added somewhere ?
comment:10 in reply to: ↑ 9 Changed 3 years ago by
Replying to dcoudert:
Could you explain what's the goal of this method and what it does. You will have to add a description of the method anyway. Also, shouldn't original edge labels be added somewhere ?
So the entire goal of this method is to account for the impossibility of managing multiple edges (and in the future, edge labels) in a graph with nauty. What it does is construct an auxiliary graph with n log k vertices, where k is the maximum multiplicity of the edges in the original graph, that has no multiple edges and with two peculiar characteristics: There is a subset of its vertices (returned by the method) such that the auxiliary graph's automorphism group's action on them is the same as the one on the original graph (that is, if one keeps every vertex not in the subset in a separate partition, and then removes every mention of these additional vertices from the automorphism group of the auxiliary graph, the resulting automorphism group is equal to the original graph's automorphism group Given two (multi)graphs, they are isomorphic if and only if their auxiliary graphs are isomorphic
So right now it works with multiplicities, I am trying to find a way to include arbitrary edge labels, but since it's an auxiliary graph I didn't see the point in adding the edge labels. I can do it, but they would just become a nice ornament, not really used for anything (at least not in nauty).
comment:11 Changed 3 years ago by
As far as I understand, it's more or less the code written by stumpc5  although for bliss rather than for nauty, and the idea is to make it more general, so that it can be used for nauty as well as for bliss as "backends" to compute automorphisms etc.
comment:12 Changed 3 years ago by
Right, but with bliss we are at Cython/C level so it might be faster. We'll see if the performances are OK.
comment:13 Changed 3 years ago by
I don't really know how pynauty, the interface used, works (Dario, the ticket's author, knows better). My limited understanding that it's also an interface on the level of C, thus it should be fast.
comment:14 Changed 3 years ago by
Actually, while the conversion to a nauty graph and the computation is in c, the relabeling and eventual conversion of the multigraph are done in python. I could easily move the relabeling to C++, the multigraph part would be a little more difficult, but still doable. I don't know if this would be that much of an improvement in performance though, that's why I didn't do it in the first place.
comment:15 Changed 3 years ago by
 Commit changed from 5f6a20a250595cd565acc4a7a873d2cb6b3a27bf to a05864fe0e0d6524bdc221ac4504179d0c3b1556
comment:16 Changed 3 years ago by
So, I commented and added doctests. I run into a quite nasty bug though, which I couldn't really pinpoint or fix, I could just find a workaround.
I submitted two commits, the first adds the docs, the second "fixes" the bug, so you can check for yourself.
What happens is that, when running tests only, after any test checking for exception raising (such as the test in automorphism_group()
that checks a KeyError
is returned if a wrong partition is passed as a parameter), any call to to_singledge()
ends in a segfault, caused by a call to PyObject_CallMethodObjArgs(...)
. Again, the segfault is raised only in this specific circumstances.
I couldn't find a way to explain the error, but I could fix it by substituting that call with a call to PyObject_CallMethod(...)
.
It's not necessary to make the code work, but if anyone has any insight on the cause (or on how to run gdb interactively on the c code of a third party package during a sage testing session), I'd be happy to hear it.
Let me know if there's anything wrong with the docs and doctests, since this is the first time I write them.
comment:17 Changed 3 years ago by
 Commit changed from a05864fe0e0d6524bdc221ac4504179d0c3b1556 to 602172366246c788e1aa88efedaddfb96e7dd8a5
Branch pushed to git repo; I updated commit sha1. New commits:
f5567ac  Removed the need to modify the upstream tarball

beb64e9  PyNauty tests should be optional

a1c6572  The nauty interface automorphism (and isomorphism, but it's not tested yet) now allows arbitrary edge labels AND multiple edges, even at the same time, though differing edge labels on one multi edge gives flaky results

6648f64  Partial changes

6021723  Fixed several bugs, added doctests and modified documentation

comment:18 Changed 3 years ago by
So, dimpase made me notice that the code I wrote had a few issues for people who don't actually have pynauty installed or even downloaded. I modified the code so that there's no need to modify the upstream tarball anymore, thus it can simply be downloaded from https://web.cs.dal.ca/~peter/software/pynauty/pynauty0.6.0.tar.gz and put in the upstream folder; still, could this tarball be added in the repositories, so that it can be downloaded automatically by sage? After this, I also made the tests about pynauty optional, since I forgot the first time around. Finally, now this interface supports both multiple edges and labelled edges, even at the same time (but having a multiple edge with different labels can produce weird results). Right now there should not be any more issues with this branch, so I'll move on to another contribution; I'll leave it like this for a couple of days, so if anyone spots any errors I can fix them, then I'll ask for review, I hope.
comment:19 Changed 3 years ago by
the tarball will be mirrored once the ticket id merged.
comment:20 followup: ↓ 21 Changed 3 years ago by
 Dependencies set to #25391
comment:21 in reply to: ↑ 20 Changed 3 years ago by
comment:22 followup: ↓ 23 Changed 3 years ago by
I see, I added it because on my system I needed #25391 to compile sage, so the branch is initially merged with the one from that ticket.
Since the branches are merged, and since I can't have them not merged because otherwise sage won't compile at all for me, I thought I had to list it as a dependency. Is there another way to go about it?
comment:23 in reply to: ↑ 22 Changed 3 years ago by
Replying to Vaush:
I see, I added it because on my system I needed #25391 to compile sage, so the branch is initially merged with the one from that ticket.
Since the branches are merged, and since I can't have them not merged because otherwise sage won't compile at all for me, I thought I had to list it as a dependency. Is there another way to go about it?
Sure, just don't merge these branches on trac, only merge them on your machine.
comment:24 Changed 3 years ago by
 Commit changed from 602172366246c788e1aa88efedaddfb96e7dd8a5 to bf5148b849737cb00740f17c1d3d170c7df59eb5
Branch pushed to git repo; I updated commit sha1. New commits:
79488e5  pyNauty interface with sagemath completed. TODO: actually use pynauty.isomorphic in is_isomorphic, and document everything

dcb29f0  Interface integrated with the standard sage methods for automorphism groups and isomorphism checking

f2cd759  Moved function multiedge_to_singledge to the GenericGraph class, changing its name in to_singledge and making it so it doesn't convert the graph into a dictionary anymore

6d18761  Added documentation and doctests

052f668  Fixing exception test bug

f825522  Removed the need to modify the upstream tarball

7c40f4a  PyNauty tests should be optional

56defeb  The nauty interface automorphism (and isomorphism, but it's not tested yet) now allows arbitrary edge labels AND multiple edges, even at the same time, though differing edge labels on one multi edge gives flaky results

76fb654  Partial changes

bf5148b  Fixed several bugs, added doctests and modified documentation

comment:25 Changed 3 years ago by
Ok, I just deleted the old branch, and created a new one that doesn't merge with ticket #25391.
If anyone has checked out the branch previously, could you please delete your copy and check it out again.
comment:26 followup: ↓ 28 Changed 3 years ago by
What exactly the need for that megapatch of build/pkgs/pynauty/patches/sage_graph.patch
? What does "recompile with nauty source code" mean?
comment:27 Changed 3 years ago by
 Dependencies #25391 deleted
comment:28 in reply to: ↑ 26 Changed 3 years ago by
Replying to dimpase:
What exactly the need for that megapatch of
build/pkgs/pynauty/patches/sage_graph.patch
? What does "recompile with nauty source code" mean?
The mega patch is my work, basically all the changes I made to pynauty so that it could interact with sage graphs directly, use multigraphs and labelled graphs, output automorphism groups in a decent format, etc. Without it the package would be exactly like the one downloaded from upstream, which is good, but not really useful for Sage. About the nauty source code part, basically the way pynauty was developed by its original creator is that it compiles its own version of nauty (and probably uses it internally, i'm not really sure), so it needs nauty's source code instead of already compiled executables; to do this, I take the source code out of Sage's nauty and use that to compile pynauty.
comment:29 followup: ↓ 31 Changed 3 years ago by
I have the following comments on method remove_labels
:
 remove import of
floor
. Not used.  remove import of
version_info
and methodgetiterator
that you don't use.
 in the for loop to fill
edge_labels_dict
, I suggest to change it withfor el in edge_labels: if not el in edge_labels_dict: edge_labels_dict[el] = bin(counter)[2:][::1] counter += 1
 then remove method
get_bits_list
 rewrite
for u,v,label in G.edge_iterator(): for idx, bit in enumerate(edge_labels_dict[label]): if bit == '1': newG.add_edge((u, idx), (v, idx))
In fact we use:
sage: list(enumerate("string")) [(0, 's'), (1, 't'), (2, 'r'), (3, 'i'), (4, 'n'), (5, 'g')]
 Do you really need to set
loops=True
? or can't you just doloops=self.allows_loops()
to allow loops only if self allows loops ?
comment:30 Changed 3 years ago by
 Commit changed from bf5148b849737cb00740f17c1d3d170c7df59eb5 to 472d31997795e06678c323f10bfb87e3db018fe1
Branch pushed to git repo; I updated commit sha1. New commits:
472d319  Small code refactoring

comment:31 in reply to: ↑ 29 Changed 3 years ago by
Replying to dcoudert:
I have the following comments on method
remove_labels
:
 remove import of
floor
. Not used. remove import of
version_info
and methodgetiterator
that you don't use.
 in the for loop to fill
edge_labels_dict
, I suggest to change it withfor el in edge_labels: if not el in edge_labels_dict: edge_labels_dict[el] = bin(counter)[2:][::1] counter += 1 then remove method
get_bits_list
 rewrite
for u,v,label in G.edge_iterator(): for idx, bit in enumerate(edge_labels_dict[label]): if bit == '1': newG.add_edge((u, idx), (v, idx))In fact we use:
sage: list(enumerate("string")) [(0, 's'), (1, 't'), (2, 'r'), (3, 'i'), (4, 'n'), (5, 'g')]
 Do you really need to set
loops=True
? or can't you just doloops=self.allows_loops()
to allow loops only if self allows loops ?
Sure, I don't see any particular upside to making this changes, but if you think they're for the better, I don't see any downsides either, so I made them. The only one I kept out for the moment is the loops part, since I don't remember why precisely I set loops always to true, there might be some other reason not immediately obvious that I'm forgetting. I'll come back to that too in a few days.
comment:32 Changed 3 years ago by
 Status changed from new to needs_review
comment:33 Changed 3 years ago by
 Milestone changed from sage8.3 to sage8.4
 Priority changed from minor to major
comment:34 Changed 3 years ago by
 Status changed from needs_review to needs_work
log()
is not in the global namespace. One needs to add
 a/src/sage/graphs/generic_graph.py +++ b/src/sage/graphs/generic_graph.py @@ 1828,6 +1828,7 @@ class GenericGraph(GenericGraph_pyx): Algorithm as described in section 14 of Nauty's guide (version 2.6), http://pallini.di.uniroma1.it/Guide.html """ + from sage.functions.log import log self_edge_labels = self.edge_labels() if not [el for el in self_edge_labels if el != None]: return self, []
(without it, tests fail in generic_graph.py
)
comment:35 Changed 3 years ago by
 Commit changed from 472d31997795e06678c323f10bfb87e3db018fe1 to a907f609128b22ea3d0666aa6fad0a63baf408b2
Branch pushed to git repo; I updated commit sha1. New commits:
a907f60  Added log import

comment:36 Changed 3 years ago by
 Status changed from needs_work to needs_review
comment:37 Changed 8 months ago by
 Status changed from needs_review to needs_work
red branch => needs work
Notice that it still lacks properly formatted comments, so it's still not ready for review
New commits:
Fixed python 3.6.1 crypt issue on Fedora 28
Merge remotetracking branch 'trac/u/cpernet/fflas_and_linbox_broken_with_gcc_8_1_0' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
Merge remotetracking branch 'trac/u/jdemeyer/upgrade_to_python_2_7_15' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
Merge branch 'master' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
pyNauty interface with sagemath completed. TODO: actually use pynauty.isomorphic in is_isomorphic, and document everything
Interface integrated with the standard sage methods for automorphism groups and isomorphism checking