Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#27231 closed enhancement (fixed)

Interface nauty's directg command with sage

Reported by: vklein Owned by:
Priority: major Milestone: sage-8.7
Component: graph theory Keywords: thursdaysbdx
Cc: dcoudert, vdelecroix Merged in:
Authors: Vincent Klein Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: ccc2cd3 (Commits, GitHub, GitLab) Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by vklein)

Allow to call nauty's directg command from sage.

This is implemented as a DiGraph generator (file: digraph_generators.py).

Note to reviewers: Any idea for useful examples in documentation is welcome.

Change History (26)

comment:1 Changed 3 years ago by vklein

  • Branch set to u/vklein/27231

comment:2 Changed 3 years ago by vklein

  • Commit set to a538ebcfa229bd2f2a689e3b58add3002905d190
  • Description modified (diff)

New commits:

a538ebcTrac #27231: Add a digraph generator using ...

comment:3 Changed 3 years ago by git

  • Commit changed from a538ebcfa229bd2f2a689e3b58add3002905d190 to 90b6be75191f286a326f9e49ecc60f5acf6c9c75

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

90b6be7Trac #27231: Add a digraph generator using ...

comment:4 Changed 3 years ago by vklein

  • Cc dcoudert vdelecroix added
  • Status changed from new to needs_review

comment:5 follow-ups: Changed 3 years ago by dcoudert

This is a nice addition. Thank you. Below are some comments.

  • You are not returning a list but an iterator.
    -        Returns a list which creates digraphs from nauty's directg program.
    +        Return an iterator yielding digraphs using nauty's ``directg`` program.
    
  • Could you document the role of parameter graphs. It is not completely obvious.
  • -        - ``graphs`` (iterable) -- A :class:`Graph` or an iterable containing
    -          :class:`Graph`.
    +        - ``graphs`` -- a :class:`Graph` or an iterable containing :class:`Graph`
    
  • Also, add an empty line between parameters.
  • As for graphs.nauty_geng, you could add parameter debug.
  • I don't know which is best between the loop for l in out.split('\n'): and the while loop of graphs.nauty_geng. Are the programs different ?
  • -            if l != '' and l[0] == '&':
    +            if l and l[0] == '&':
    
  • directg and watercluster2 generate all possible orientations of the input graphs, include both directions. Currently, we have a tools for iterating over all possible orientations of a graph:
    • orientations return an iterator over orientations
    • strong_orientations_iterator return an iterator over all strong orientations of a graph G
    You can add a test to show that 1) directg/watercluster2 generates more graphs than what we get with orientations, and 2) that each digraph obtained with orientations is in the set of digraphs obtained with directg.

comment:6 in reply to: ↑ 5 Changed 3 years ago by vklein

Replying to dcoudert:

This is a nice addition. Thank you. Below are some comments.

  • You are not returning a list but an iterator.
    -        Returns a list which creates digraphs from nauty's directg program.
    +        Return an iterator yielding digraphs using nauty's ``directg`` program.
    


Yes my mistake ! It's a generator to be accurate. My first version used to return a list.
Thanks for noticing that.

  • I don't know which is best between the loop for l in out.split('\n'): and the while loop of graphs.nauty_geng. Are the programs different ?


The impacting difference between geng and directg is that directg need an input file and geng work with command parameter only.

That's why i use Popen.communicate as it allow to give an input "file" to directg

The drawback is, if i am right, that with communicate reading stdout is in one go where we can read line by line for geng. I will try a line by line read with the communicate output, just in case i am wrong.

No i was unaware of that. I implemented this ticket for a sage user needing directg. I will look at watercluster2 and modify my ticket with it if it works well.

Thanks for your comments.

Last edited 3 years ago by vklein (previous) (diff)

comment:7 Changed 3 years ago by vklein

  • Status changed from needs_review to needs_work

comment:8 Changed 3 years ago by vklein

Comparison watercluster2 and directg

1 - For generating/counting digraph without generating output watercluster2 is way faster.

(sage-sh) vklein@tuono:sage$ time geng -c 7 | directg -o -u
>A geng -cd1D6 n=7 e=6-21
>Z 853 graphs generated in 0.00 sec
>A directg -ou
>Z 853 graphs read from stdin; 2120098 digraphs generated; 1.71 sec

real    0m1.734s
user    0m1.716s
sys 0m0.000s
(sage-sh) vklein@tuono:sage$ time geng -c 7 | watercluster2 S
>A geng -cd1D6 n=7 e=6-21
>Z 853 graphs generated in 0.00 sec
Number of directed graphs: 2120098 

real    0m0.233s
user    0m0.234s
sys 0m0.000s

If you take the time to generate output into account the result is less clear.
watercluster2 has two output format T-code (option T) or a custom binary format (option B).
For the tests below the two functions used just read the command output.

sage: time out = digraphs.nauty_watercluster2(graphs.nauty_geng("-c 9 9:12"), op
....: tions="B S")
CPU times: user 2.42 s, sys: 571 ms, total: 2.99 s
Wall time: 7.12 s
sage: time out = digraphs.nauty_watercluster2(graphs.nauty_geng("-c 9 9:12"), op
....: tions="T S")
CPU times: user 5.65 s, sys: 1.9 s, total: 7.55 s
Wall time: 33.9 s
sage: time out = digraphs.nauty_directg(graphs.nauty_geng("-c 9 9:12"), options=
....: "-o")
CPU times: user 2.02 s, sys: 591 ms, total: 2.61 s
Wall time: 10.3 s

watercluster2 is slower than directg with T-code output and slightly faster with binary output.
My current problem with the binary output is that what i get is not consistent with the specifications (Which are in watercluster2 source code) and therefore i don't know how to build DiGraph? with that.

2 - Command options.

directg and watercluster2 have options with no equivalent :

directg's options:

   -e# | -e#:#  specify a value or range of the total number of arcs
   -f#  Use only the subgroup that fixes the first # vertices setwise
    -s#/#  Make only a fraction of the orientations: The first integer is
            the part number (first is 0) and the second is the number of
            parts. Splitting is done per input graph independently.

watercluster2's options:

The option ix restricts the maximum indegree to x.
The option oy restricts the maximum outdegree to y.

I don't know what is the most useful set of options between these two.

I guess we should stick with directg for now.

comment:9 Changed 3 years ago by dcoudert

Thanks for the benchmark.

I agree to stay with directg. You can add a .. TODO:: block explaining that a future task is to interface watercluster2`.

Also, can you do

-                yield DiGraph(l[1:])
+                yield DiGraph(l[1:], format='dig6')

and add tests like if not graphs: ..., if not input: ..., if not g: ... to avoid computations when the input graph/list of graphs is empty.

comment:10 in reply to: ↑ 5 Changed 3 years ago by vklein

Replying to dcoudert:

You can add a test to show that 1) directg/watercluster2 generates more graphs than what we get with orientations, and 2) that each digraph obtained with orientations is in the set of digraphs obtained with directg.

Well my results are :

sage: len(list(digraphs.nauty_directg(graphs.PetersenGraph())))
121545
sage: len(list(graphs.PetersenGraph().orientations()))
32768
sage: K5 = graphs.CompleteGraph(5)
sage: len(list(digraphs.nauty_directg(K5)))
582
sage: len(list(K5.orientations()))
1024

For K5 directg give me less digraphs than orientations. Is it an inconsistency ?

comment:11 Changed 3 years ago by git

  • Commit changed from 90b6be75191f286a326f9e49ecc60f5acf6c9c75 to 8080a064240470aee7fc4cc69a286e8e688e0c64

Branch pushed to git repo; I updated commit sha1. New commits:

8080a06Trac #27231: Implement reviewer's comments:

comment:12 Changed 3 years ago by vklein

Reviewer's comments implemented apart from comparison with orientations's set (see comment:10).

comment:13 Changed 3 years ago by vklein

  • Status changed from needs_work to needs_review

comment:14 Changed 3 years ago by git

  • Commit changed from 8080a064240470aee7fc4cc69a286e8e688e0c64 to ada4dcb6220968759d81d12ea86965e56ebc91e0

Branch pushed to git repo; I updated commit sha1. New commits:

ada4dcbTrac #27231: add nauty_directg method ...

comment:15 Changed 3 years ago by dcoudert

  • Missing empty line
    -        Return an iterator yielding digraphs using nauty's ``directg`` program.
    -        Description from directg --help:
    +        Return an iterator yielding digraphs using nauty's ``directg`` program.
    +
    +        Description from directg --help:
    
  • One small details for the test if not graphs:. If graphs = Graph(), then the test will be true and nothing is returned. If it is what you expect, this is fine. But if you expect that the method return DiGraph(), then an extra test like if isinstance(graphs, Graph):... is needed. I let you decide which is best.
  • Concerning the test with `orientations, I plotted graphs generated with a smaller example
    sage: sum(1 for _ in digraphs.nauty_directg(graphs.CompleteGraph(3)))
    7
    sage: sum(1 for _ in graphs.CompleteGraph(3).orientations())
    8
    sage: for d in digraphs.nauty_directg(graphs.CompleteGraph(3)):
    ....:     d.show()
    sage: for d in graphs.CompleteGraph(3).orientations():
    ....:     d.show()
    
    It appears that the digraphs generated with directg are non-isomorphic orientations of the graph, while orientations generates isomorphic digraphs. For instance [(0, 1), (0, 2), (1, 2)] and [(1, 0), (2, 0), (2, 1)] are different digraphs obtained with orientations but are isomorphic, so only the first one is produced by directg. So the test I proposed is not relevant.

However, you could add a .. SEEALSO: block with pointers to orientations and strong_orientations.

  • I have successfully tested this ticket with Python3 ;)

comment:16 Changed 3 years ago by git

  • Commit changed from ada4dcb6220968759d81d12ea86965e56ebc91e0 to 48465906e832ba5febee91c19268e88ee6fc9429

Branch pushed to git repo; I updated commit sha1. New commits:

4846590Trac #27231: support single input of empty Graph

comment:17 Changed 3 years ago by vklein

All comments implemented.
The case with graphs == Graph() was really a bug because directg manage the input of empty graphs (graph6 string ?) and return an empty digraph.
Note that to stick with this generator description i prefer to let directg do this job rather than doing a yield DiGraph() manually.

Thank you for your comments.

comment:18 Changed 3 years ago by git

  • Commit changed from 48465906e832ba5febee91c19268e88ee6fc9429 to 9289f5f547a45b611b60027936803b548089d283

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

9289f5fTrac #27231: support single input of empty Graph

comment:19 follow-up: Changed 3 years ago by dcoudert

Some more comments (should be the last one)

  • To be consistent, you must change in the top table:
    -    :meth:`~DiGraphGenerators.nauty_directg`       | Returns a digraph generator using Nauty's directg command.
    +    :meth:`~DiGraphGenerators.nauty_directg`       | Return an iterator yielding digraphs using nauty's ``directg`` program.
    
  • In the tests, I don't understand why the error is raised only after next(g) and not directly during the call to digraphs.nauty_directg. Is it normal ?
  • may be you should do the test on not graphs after the tests on input parameters ? Otherwise, a call to digraphs.nauty_directg([], options="-o -G") will not raise an error.
  • 80 column mode for comments
    -            # digraph6 specifications: http://users.cecs.anu.edu.au/~bdm/data/formats.txt
    +            # digraph6 specifications: 
    +            # http://users.cecs.anu.edu.au/~bdm/data/formats.txt
    

comment:20 in reply to: ↑ 19 Changed 3 years ago by vklein

Replying to dcoudert:

Some more comments (should be the last one)

  • In the tests, I don't understand why the error is raised only after next(g) and not directly during the call to digraphs.nauty_directg. Is it normal ?


Yes that's how python's generators works. Calling the generator's function create the generator object but don't begin to execute the code inside the function:

>>> def generator():
	print('hey')
	yield 1

	
>>> g = generator()
>>> g
<generator object generator at 0x00000198B7D22CF0>
>>> next(g)
hey
1

comment:21 Changed 3 years ago by dcoudert

  • Reviewers set to David Coudert

I didn't know that. Thanks.

So you just have to include my last comments.

comment:22 Changed 3 years ago by git

  • Commit changed from 9289f5f547a45b611b60027936803b548089d283 to ccc2cd321eb0f774276d22d0de4434249732a0b6

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

1969a6aTrac #27231: Add a digraph generator using ...
7f94176Trac #27231: Implement reviewer's comments:
8e302c6Trac #27231: add nauty_directg method ...
18d1ea3Trac #27231: support single input of empty Graph
ccc2cd3Trac #27231: Implements last reviewer's comments

comment:23 Changed 3 years ago by vklein

Rebase on sage8.7.beta3 and implement last reviewer's comments.

comment:24 Changed 3 years ago by dcoudert

  • Status changed from needs_review to positive_review

LGTM. Tested on py2 and py3.

comment:25 Changed 3 years ago by vbraun

  • Branch changed from u/vklein/27231 to ccc2cd321eb0f774276d22d0de4434249732a0b6
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:26 Changed 3 years ago by slelievre

  • Commit ccc2cd321eb0f774276d22d0de4434249732a0b6 deleted

Thanks. There is user demand for this, see

Follow-up ticket at #27686 for a documentation issue raised in that Ask Sage question.

Note: See TracTickets for help on using tickets.