Opened 4 years ago

Closed 4 years ago

#18828 closed enhancement (fixed)

Export graph to file

Reported by: ncohen
Owned by:
Priority: major
Milestone: sage-6.8
Component: graph theory
Keywords:
Cc: dimpase, dcoudert, borassi
Merged in:
Authors: Nathann Cohen
Reviewers: David Coudert
Report Upstream: N/A
Work issues:
Branch: 254b36c
Commit: 254b36c7132562b01b8802a0f3da49c90f6487d5
Dependencies:
Stopgaps:

Description

I feel rather guilty, as all that this branch does is steal networkx functions. With it we can export graphs to several file formats, using the networkx.write_* functions.

Nathann

Change History (22)

comment:1 Changed 4 years ago by ncohen

  • Branch set to u/ncohen/18828
  • Commit set to 1bd29e566d28ac731e920d8aa6c70913ede27cb0
  • Status changed from new to needs_review

New commits:

1bd29e5 trac #18828: Export graph to file

comment:2 Changed 4 years ago by ncohen

  • Cc dcoudert borassi added

comment:3 follow-up: Changed 4 years ago by dcoudert

  • Status changed from needs_review to needs_work

Hello,

The patchbot reports compilation errors associated with ticket #18746. I don't know why... Actually I have the same issue when I try to compile my develop branch, and so I'm currently unable to test this patch.

Concerning this ticket: don't feel guilty about stealing from networkx. It is very useful, and we will certainly have to do the same for reading graphs from files.

I suggest changing the way you guess the file format to something like:

for ext in formats:
    if filename.endswith('.'+ext):
        break
else:
    # no known extension matched
    raise ...

With networkx it is possible to give an extension like .edgelist.gz, in which case the file should be compressed (at least with version 1.9.1, but we only have 1.8.1). See https://networkx.github.io/documentation/latest/reference/generated/networkx.readwrite.edgelist.write_edgelist.html
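
As a rough sketch of this suggestion (not the code in the branch): the loop below uses Python's for/else to guess the format from the file name, after stripping one trailing compression suffix so that names like graph.edgelist.gz are recognised too. The names guess_format, FORMATS and COMPRESSION are hypothetical, and the two tuples are illustrative assumptions that only list formats mentioned in this ticket.

```python
# Hypothetical helper; the names and both tuples are illustrative assumptions.
FORMATS = ("adjlist", "edgelist", "pajek")
COMPRESSION = (".gz", ".bz2")


def guess_format(filename):
    """Return the graph format suggested by the extension of ``filename``."""
    # Drop one trailing compression suffix, e.g. "g.edgelist.gz" -> "g.edgelist".
    for suffix in COMPRESSION:
        if filename.endswith(suffix):
            filename = filename[:-len(suffix)]
            break
    # for/else: the else branch runs only if the loop ends without a break.
    for ext in FORMATS:
        if filename.endswith("." + ext):
            break
    else:
        raise ValueError("cannot guess a format from %r" % filename)
    return ext
```

With this sketch, guess_format("g.edgelist") and guess_format("g.edgelist.gz") both give "edgelist", without listing every combination of format and compression suffix.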

David.

comment:4 follow-up: Changed 4 years ago by borassi

Hellooooo!

I have a little trouble understanding the goal of this ticket: you would like to have a single standard function to save graphs, instead of calling different NetworkX functions, right? Because, for instance, if I want to save a file in adjlist format I can simply type:

import networkx
networkx.write_adjlist(G, path)

instead of using this new function.

In any case, I have tried to compile this code, both with make and with make distclean && make, and it works! Some small suggestions:

  • could you make a test that checks if the output is correct?
  • "the format is ‘guessed’ from the extension ..." maybe it is better to say "the format is the extension", because you do not guess, you simply use the extension.
  • I would add a link to the NetworkX manual (http://networkx.lanl.gov/reference/readwrite.html), where the different file formats are detailed.

comment:5 in reply to: ↑ 3 Changed 4 years ago by ncohen

Hello,

The patchbot reports compilation errors associated with ticket #18746. I don't know why... Actually I have the same issue when I try to compile my develop branch, and so I'm currently unable to test this patch.

I had the same problem. Can be solved by removing all the cython cached files that you can find:

./src/build/temp.linux-x86_64-2.7/sage/graphs/graph_decompositions
./src/build/temp.linux-x86_64-2.7/home/ncohen/.Sage/src/build/cythonized/sage/graphs/graph_decompositions
./src/build/cythonized/sage/graphs/graph_decompositions
./src/build/cython_debug/cython_debug_info_sage.graphs.graph_decompositions*
./src/build/lib.linux-x86_64-2.7/sage/graphs/graph_decompositions
./local/lib/python2.7/site-packages/sage/graphs/graph_decompositions

And then it works. That's trouble for the patchbots, though.

Concerning this ticket: don't feel guilty about stealing from networkx. It is very useful, and we will certainly have to do the same for reading graphs from files.

Yep.

I suggest changing the way you guess the file format to something like:

for ext in formats:
    if filename.endswith('.'+ext):
        break
else:
    # no known extension matched
    raise ...

Why? It is longer and does the same thing. None of the extensions contains a dot.

With networkx it is possible to give an extension like .edgelist.gz, in which case the file should be compressed (at least with version 1.9.1, but we only have 1.8.1). See https://networkx.github.io/documentation/latest/reference/generated/networkx.readwrite.edgelist.write_edgelist.html

So you want to add .edgelist.gz, .edgelist.tgz, ... for every possible combination to the dictionary of extensions? O_o

Nathann

comment:6 in reply to: ↑ 4 ; follow-up: Changed 4 years ago by ncohen

Hello,

I have a little trouble understanding the goal of this ticket: you would like to have a single standard function to save graphs, instead of calling different NetworkX functions, right?

Yes

Because, for instance, if I want to save a file in adjlist format I can simply type:

That's what the branch does, too. Only you may not know that those functions can be found in networkx.

In any case, I have tried to compile this code, both with make and with make distclean && make, and it works! Some small suggestions:

  • could you make a test that checks if the output is correct?

I added one check, but it is unpleasant in many ways. First, the (integer) vertices become strings, and then each edge gets a label encoding a weight. Well, that's networkx...

  • "the format is ‘guessed’ from the extension ..." maybe it is better to say "the format is the extension", because you do not guess, you simply use the extension.

How is that not a guess? Anyway, updated.

  • I would add a link to the NetworkX manual (http://networkx.lanl.gov/reference/readwrite.html), where the different file formats are detailed.

Done.

Nathann

comment:7 Changed 4 years ago by git

  • Commit changed from 1bd29e566d28ac731e920d8aa6c70913ede27cb0 to f7c4a18508cf66a958ffe3110754916d406c0ee3

Branch pushed to git repo; I updated commit sha1. New commits:

f7c4a18 trac #18828: Reviewer's remarks

comment:8 Changed 4 years ago by ncohen

  • Status changed from needs_work to needs_review

comment:9 in reply to: ↑ 6 ; follow-up: Changed 4 years ago by dcoudert

This method is clearly useful. It is too boring to import networkx each time you want to read/write a graph from/to a file.

  • could you make a test that checks if the output is correct?

I added one check, but it is unpleasant in many ways. First, the (integer) vertices become strings, and then each edge gets a label encoding a weight. Well, that's networkx...

By default, the method networkx.write_edgelist sets the parameter data=True. So your method produces a weighted edgelist instead of a plain edgelist:

0 1 {}

instead of

0 1
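
To make the difference concrete, here is a plain-Python imitation of the two line formats. It is an illustration only, not networkx's implementation, and the function name edgelist_lines is made up for this example.

```python
def edgelist_lines(edges, data=True):
    """Yield one text line per edge given as (u, v, attribute_dict)."""
    for u, v, attrs in edges:
        if data:
            # The attribute dictionary is appended, even when it is empty.
            yield "%s %s %s" % (u, v, attrs)
        else:
            yield "%s %s" % (u, v)


edges = [(0, 1, {}), (1, 2, {})]
print(list(edgelist_lines(edges)))              # ['0 1 {}', '1 2 {}']
print(list(edgelist_lines(edges, data=False)))  # ['0 1', '1 2']
```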

I understand that you prefer short and fast code, but since writing to a file is slow anyway, we could spend some computation time to refine the behavior of the method.

David

comment:10 in reply to: ↑ 9 ; follow-up: Changed 4 years ago by ncohen

I understand that you prefer short and fast code, but since writing to a file is slow anyway, we could spend some computation time to refine the behavior of the method.

Would it do the trick for you if we replaced 'networkx.write_edgelist' with 'lambda g, path: networkx.write_edgelist(g, path, data=False)'?

Nathann

comment:11 in reply to: ↑ 10 ; follow-up: Changed 4 years ago by dcoudert

Replying to ncohen:

I understand that you prefer short and fast code, but since writing to a file is slow anyway, we could spend some computation time to refine the behavior of the method.

Would it do the trick for you if we replaced 'networkx.write_edgelist' with 'lambda g, path: networkx.write_edgelist(g, path, data=False)'?

Nice trick. Furthermore, in case we want to write the labels, we can have another parameter to set data=True.

comment:12 Changed 4 years ago by git

  • Commit changed from f7c4a18508cf66a958ffe3110754916d406c0ee3 to c019634f9a18798786f80c2c3254b2263f58f7d8

Branch pushed to git repo; I updated commit sha1. New commits:

c019634 trac #18828: Expose all options from networkx

comment:13 in reply to: ↑ 11 Changed 4 years ago by ncohen

Nice trick. Furthermore, in case we want to write the labels, we can have another parameter to set data=True.

Some functions have this 'data' flag, others do not. To make everything simpler I updated the code to make it possible to forward any other flag to networkx. This way, we can pick whatever we want.
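
A minimal sketch of that forwarding idea, under assumptions: the function name export_graph and its writers argument are hypothetical, and the method in the branch may well be organised differently. Extra keyword arguments are passed through untouched, so data=False reaches a writer that understands it.

```python
import os


def export_graph(graph, filename, writers, format=None, **kwds):
    """Write ``graph`` to ``filename`` with the writer chosen by ``format``.

    ``writers`` maps a format name to a callable ``writer(graph, path, **kwds)``.
    """
    if format is None:
        # Use the extension without its leading dot, e.g. "edgelist".
        format = os.path.splitext(filename)[1][1:]
    if format not in writers:
        raise ValueError("unknown format %r" % format)
    # Any extra option is forwarded as-is to the selected writer.
    writers[format](graph, filename, **kwds)


# Possible use with networkx (assumes networkx is installed and that
# ``nx_graph`` is a networkx graph):
#     import networkx
#     writers = {"adjlist": networkx.write_adjlist,
#                "edgelist": networkx.write_edgelist,
#                "pajek": networkx.write_pajek}
#     export_graph(nx_graph, "petersen.edgelist", writers, data=False)
```

In this sketch, a writer that has no data option would presumably reject the keyword with a TypeError, which leaves the choice of flags to the caller.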

Nathann

comment:14 Changed 4 years ago by dcoudert

I propose to extend the tests in the following way. Let me know if you agree before I push a commit.

sage: g = graphs.PetersenGraph()
sage: filename = tmp_filename(ext=".pajek")
sage: g.export_to_file(filename)
sage: import networkx
sage: h = Graph(networkx.read_pajek(filename))
sage: g.is_isomorphic(h)
True
sage: filename = tmp_filename(ext=".edgelist")
sage: g.export_to_file(filename, data=False)
sage: h = Graph(networkx.read_edgelist(filename))
sage: g.is_isomorphic(h)
True

Relying on vertex names is unfortunately not possible yet, since the read method turns a vertex id like 13 into u'13'.

comment:15 Changed 4 years ago by ncohen

No problem no problem. Or perhaps we could relabel the graph with the function 'int' ? This should turn the u'13' into a proper 13.
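
A small sketch of that relabelling, assuming the labels read back from the file are strings of digits. The helper name int_relabeling is hypothetical; it only builds the dictionary, and it leaves every label unchanged when any of them is not an integer.

```python
def int_relabeling(vertices):
    """Map each vertex label to int(label), or to itself if that fails."""
    vertices = list(vertices)
    try:
        return {v: int(v) for v in vertices}
    except (TypeError, ValueError):
        # At least one label is not an integer: keep all labels as they are.
        return {v: v for v in vertices}


print(int_relabeling(["13", "0", "7"]))  # {'13': 13, '0': 0, '7': 7}
print(int_relabeling(["a", "1"]))        # {'a': 'a', '1': '1'}

# In Sage, the dictionary could then be applied with something like
#     h.relabel(perm=int_relabeling(h.vertices()), inplace=True)
```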

Nathann

comment:16 Changed 4 years ago by dcoudert

  • Branch changed from u/ncohen/18828 to u/dcoudert/18828

comment:17 Changed 4 years ago by dcoudert

  • Commit changed from c019634f9a18798786f80c2c3254b2263f58f7d8 to 254b36c7132562b01b8802a0f3da49c90f6487d5

I have pushed a small edit on the example. Hope it helps.

For the unicode problem, I have once used the following trick. Certainly not the best way to do it.

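import string  # Python 2 only: string.atoi and unicode do not exist in Python 3

# Pick a conversion depending on the type of the vertex labels.
if all(isinstance(u, unicode) for u in G):
    myaction = string.atoi
elif all(isinstance(u, str) for u in G):
    myaction = ZZ
else:
    myaction = lambda x: x

# Fall back to string labels if some label cannot be converted.
try:
    L = {u: myaction(u) for u in G}
except Exception:
    L = {u: str(u) for u in G}

G.relabel(perm=L, inplace=True)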

New commits:

fed2fd7 trac #18828: Merged with 6.8.beta7
254b36c trac #18828: fix and improve test/examples

comment:18 Changed 4 years ago by ncohen

Works for me !

comment:19 follow-up: Changed 4 years ago by dcoudert

  • Reviewers set to David Coudert
  • Status changed from needs_review to positive_review

So then good to go.

It would be nice to have a method for importing a graph from file, but I don't know where to put it: as a method of class Generic_Graph? as a method of (di)graphs generators?

comment:20 in reply to: ↑ 19 ; follow-up: Changed 4 years ago by ncohen

It would be nice to have a method for importing a graph from file, but I don't know where to put it: as a method of class Generic_Graph? as a method of (di)graphs generators?

The 'classiest way' would probably be something like Graph(filename). But adding formats to the current list scares me :-P

Nathann

comment:21 in reply to: ↑ 20 Changed 4 years ago by dcoudert

The 'classiest way' would probably be something like Graph(filename). But adding formats to the current list scares me :-P

I understand. Although you did an impressive cleaning, it's still hard. Well, we will think of something.

comment:22 Changed 4 years ago by vbraun

  • Branch changed from u/dcoudert/18828 to 254b36c7132562b01b8802a0f3da49c90f6487d5
  • Resolution set to fixed
  • Status changed from positive_review to closed