Opened 2 years ago

Last modified 6 weeks ago

#22349 new enhancement

Deprecate sorting of Graph vertices and edges by default

Reported by: jdemeyer Owned by:
Priority: major Milestone:
Component: graph theory Keywords: python3, richcmp
Cc: Stefan, jmantysalo, jhpalmieri Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: u/jdemeyer/graph_vertices___should_not_be_sorted (Commits) Commit: aa0691f9fa7b03f9130b398923d92de3fb5fcf36
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

Why does Graph.vertices() need to sort the list of vertices? If the user wants a sorted list, they can always call sorted() anyway. The same for Graph.edges().

Sorting is broken badly now that #22029 is merged.

Change History (45)

comment:1 Changed 2 years ago by jdemeyer

  • Description modified (diff)

comment:2 Changed 2 years ago by jdemeyer

  • Description modified (diff)

comment:3 Changed 2 years ago by jdemeyer

  • Summary changed from Graph.vertices() should not sort by default to Graph.vertices() should not be sorted

comment:4 Changed 2 years ago by jdemeyer

  • Summary changed from Graph.vertices() should not be sorted to Deprecate sorting of Graph.vertices()

comment:5 Changed 2 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from Deprecate sorting of Graph.vertices() to Deprecate sorting of Graph vertices and edges

comment:6 Changed 2 years ago by jdemeyer

  • Summary changed from Deprecate sorting of Graph vertices and edges to Deprecate sorting of Graph vertices and edges by default

comment:7 Changed 2 years ago by jdemeyer

  • Branch set to u/jdemeyer/graph_vertices___should_not_be_sorted

comment:8 Changed 2 years ago by git

  • Commit set to 1c3a584d5f578ab4ff71969b79904ed1220ca4a1

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

1c3a584Deprecate default sorting of Graph vertices and edges

comment:9 follow-up: Changed 2 years ago by dcoudert

What you propose to do is not an easy task. I suspect that many algorithms assume that the vertices are sorted. For sure, many algorithms assume that the ordering is always the same when calling vertices() and vertex_iterator(), so I hope that what you propose preserves this strong assumption.

Is your long term plan to remove parameter sort from the methods or just to set its default value to false ? I don't know the potential problems of sorting with Python 3, so I'm not sure of the motivation.

Many many tests are broken

sage -t src/sage/graphs/generic_graph.py  # 45 doctests failed
sage -t src/sage/graphs/generators/smallgraphs.py  # 4 doctests failed
sage -t src/sage/graphs/strongly_regular_db.pyx  # 1 doctest failed
sage -t src/sage/graphs/graph.py  # 11 doctests failed
sage -t src/sage/graphs/base/static_sparse_graph.pyx  # 1 doctest failed
sage -t src/sage/graphs/generators/families.py  # 3 doctests failed
sage -t src/sage/graphs/digraph.py  # 10 doctests failed
sage -t src/sage/graphs/digraph_generators.py  # 3 doctests failed
sage -t src/sage/graphs/base/graph_backends.pyx  # 2 doctests failed
sage -t src/sage/graphs/generic_graph_pyx.pyx  # 3 doctests failed
sage -t src/sage/graphs/comparability.pyx  # 1 doctest failed
sage -t src/sage/graphs/bipartite_graph.py  # 3 doctests failed
sage -t src/sage/graphs/base/c_graph.pyx  # 1 doctest failed
sage -t src/sage/graphs/graph_decompositions/vertex_separation.pyx  # 4 doctests failed
sage -t src/sage/graphs/schnyder.py  # 1 doctest failed
sage -t src/sage/graphs/line_graph.py  # 3 doctests failed
sage -t src/sage/graphs/isgci.py  # 1 doctest failed
sage -t src/sage/graphs/graph_decompositions/graph_products.pyx  # 1 doctest failed
sage -t src/sage/quivers/path_semigroup.py  # 3 doctests failed
sage -t src/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx  # 1 doctest failed

for instance (random copy/paste)

File "src/sage/graphs/line_graph.py", line 466, in sage.graphs.line_graph.root_graph
Failed example:
    root_graph(graphs.DiamondGraph())
Expected:
    (Graph on 4 vertices, {0: (0, 3), 1: (0, 1), 2: (0, 2), 3: (1, 2)})
Got:
    (Graph on 4 vertices, {0: (1, 2), 1: (0, 1), 2: (0, 2), 3: (0, 3)})
**********************************************************************
File "src/sage/graphs/base/graph_backends.pyx", line 802, in sage.graphs.base.graph_backends.NetworkXGraphDeprecated.mutate
Failed example:
    G.edges(sort=False)
Exception raised:
    Traceback (most recent call last):
      File "/Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 498, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 861, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.graphs.base.graph_backends.NetworkXGraphDeprecated.mutate[5]>", line 1, in <module>
        G.edges(sort=False)
    TypeError: edges() got an unexpected keyword argument 'sort'
**********************************************************************
File "src/sage/graphs/digraph.py", line 3424, in sage.graphs.digraph.DiGraph.flow_polytope
Failed example:
    fl.vertices(sort=False)
Exception raised:
    Traceback (most recent call last):
      File "/Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 498, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 861, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.graphs.digraph.DiGraph.flow_polytope[19]>", line 1, in <module>
        fl.vertices(sort=False)
    TypeError: __call__() got an unexpected keyword argument 'sort'
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 20558, in sage.graphs.generic_graph.GenericGraph.?
Failed example:
    d.automorphism_group(algorithm='sage')
Expected:
    Permutation Group with generators [('01','02')('10','20')('11','22')('12','21')]
Got:
    Permutation Group with generators [()]
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 20197, in sage.graphs.generic_graph.GenericGraph.degree_to_cell
Failed example:
    G.degree_to_cell('111', cell)
Expected:
    0
Got:
    1

comment:10 in reply to: ↑ 9 Changed 2 years ago by jdemeyer

Replying to dcoudert:

What you propose to do is not an easy task.

I never claimed that it was easy. But it has to be done in order to support Python 3.

For sure, many algorithms assume that the ordering is always the same when calling vertices() and vertex_iterator(), so I hope that what you propose preserves this strong assumption.

It should remain the same, yes.

Is your long term plan to remove parameter sort from the methods or just to set its default value to false ?

I don't care much. I guess once the default is False, there will be little reason left to keep the sort keyword.

I don't know the potential problems of sorting with Python 3, so I'm not sure of the motivation.

The problem is that you cannot sort arbitrary things in Python 3. For example:

>>> sorted([1, "hello"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()

comment:11 follow-ups: Changed 2 years ago by dcoudert

Now I understand the need for this big change.

Is there a way to sort arbitrary things in Python 3 ? some trick ?

comment:12 in reply to: ↑ 11 Changed 2 years ago by jdemeyer

Replying to dcoudert:

Is there a way to sort arbitrary things in Python 3 ?

Of course there is. The question is: what would you expect from such "arbitrary" sorting?

And if it's arbitrary anyway: why would you want to "sort" in the first place?

Last edited 2 years ago by jdemeyer (previous) (diff)

comment:13 follow-up: Changed 2 years ago by mmezzarobba

As far as I understand, there are at least two different issues here:

  • some algorithms want a consistent and easy to test order on the vertices, typically (but probably not only) to loop over the unordered pairs of vertices normalized by putting the smalled vertex first;
  • in interactive use, when there is a natural order on the vertices (in practice, mostly when they are integers, strings, or possibly nested tuples of such things), people want to see the vertices, edges, etc. listed in an order that makes the output easy to read, and it would be a significant regression to take that away from them.

Additionally, it may well be the case that some algorithms also rely on vertices of a certain type (say tuples of integers) automatically coming out sorted in a particular order, I'm not sure.

comment:14 Changed 2 years ago by jdemeyer

  • Dependencies set to #22383

comment:15 Changed 2 years ago by git

  • Commit changed from 1c3a584d5f578ab4ff71969b79904ed1220ca4a1 to aa0691f9fa7b03f9130b398923d92de3fb5fcf36

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

aa0691fDeprecate default sorting of Graph vertices and edges

comment:16 Changed 2 years ago by Stefan

  • Cc Stefan added

comment:17 in reply to: ↑ 11 ; follow-ups: Changed 21 months ago by nbruin

Replying to dcoudert:

Now I understand the need for this big change.

Is there a way to sort arbitrary things in Python 3 ? some trick ?

one way is

sorted( ..., key=str )

You could of course use another key function that uses some sorting heuristic (something like the key function used by https://pypi.python.org/pypi/natsort)

Display functions in IPython have hooks for nice sorting (used for dicts and sets), so if graphs need to display their keys and edges in a nice way, that's probably the hook to go for.

Algorithms that need their edges sorted should either insist on sortable labels or sort by index tuples or something like that (something that is guaranteed to be sortable regardless of what the user decides to use as labels).

comment:18 in reply to: ↑ 13 Changed 21 months ago by jdemeyer

Replying to mmezzarobba:

  • in interactive use, when there is a natural order on the vertices (in practice, mostly when they are integers, strings, or possibly nested tuples of such things), people want to see the vertices, edges, etc. listed in an order that makes the output easy to read, and it would be a significant regression to take that away from them.

In that case, the best solution is probably to use a custom class for this. It would behave exactly like a Python list (or tuple) in all possible ways, except that it would be sorted on display.

Last edited 21 months ago by jdemeyer (previous) (diff)

comment:19 in reply to: ↑ 17 Changed 21 months ago by jdemeyer

Replying to nbruin:

Display functions in IPython have hooks for nice sorting (used for dicts and sets), so if graphs need to display their keys and edges in a nice way, that's probably the hook to go for.

Nils, I only just saw your comment now. But yes, this was exactly my point too.

comment:20 in reply to: ↑ 17 ; follow-up: Changed 21 months ago by mmezzarobba

Replying to nbruin:

Algorithms that need their edges sorted should either insist on sortable labels or sort by index tuples or something like that (something that is guaranteed to be sortable regardless of what the user decides to use as labels).

They could simply sort by id(), provided that the same label is not represented by several physically distinct objects that compare equal. Unfortunately, this probably does happen (see #22374).

comment:21 in reply to: ↑ 20 ; follow-up: Changed 21 months ago by jdemeyer

Replying to mmezzarobba:

They could simply sort by id()

If you want to sort by id(), what is the purpose of sorting at all? I mean, it's not sorting, it's putting in a deterministic (but otherwise arbitrary) order. Assuming that methods like vertices() return the vertices in a deterministic order, I don't see the point.

comment:22 in reply to: ↑ 21 Changed 21 months ago by mmezzarobba

Replying to jdemeyer:

Replying to mmezzarobba:

They could simply sort by id()

If you want to sort by id(), what is the purpose of sorting at all? I mean, it's not sorting, it's putting in a deterministic (but otherwise arbitrary) order. Assuming that methods like vertices() return the vertices in a deterministic order, I don't see the point.

All I'm saying is that some algorithms do want to access the vertices in some arbitrary deterministic order, typically in order to iterate over the unordered pairs {v,v'}, and, as far as I remember, currently rely on the output of vertices() etc. being sorted to do that. If all the methods these algorithms use to access the vertices return them in the same deterministic order, then indeed, there is no need to do anything, otherwise, sorting by id() may be a better option than requiring the labels to be ordered.

comment:23 Changed 16 months ago by jdemeyer

  • Keywords richcmp added
  • Milestone changed from sage-7.6 to sage-8.2

comment:24 Changed 13 months ago by chapoton

  • Keywords python3 added

comment:25 Changed 13 months ago by saraedum

  • Work issues set to merge conflicts

comment:26 Changed 13 months ago by jdemeyer

  • Work issues merge conflicts deleted

comment:27 Changed 11 months ago by chapoton

  • Dependencies #22383 deleted

comment:28 Changed 11 months ago by dcoudert

  • Milestone changed from sage-8.2 to sage-8.4

comment:29 Changed 11 months ago by jmantysalo

  • Cc jmantysalo added

comment:30 Changed 11 months ago by jhpalmieri

  • Cc jhpalmieri added

comment:31 Changed 11 months ago by dcoudert

Apart from the need for ensuring that the internal order of vertices/edges during iterations is always the same (many algorithms assume that we have such a fixed order, whatever it is) and doing our best to preserve a nice display for interactive use, I think we have another issue:

  • For an edge (u, v, label) in an undirected graph, we usually have u <= v. So somehow we have a sorting here and many algorithms assume that vertices are ordered (not all). If I understand well, it is a problem with Python3 if vertex labels are of different types. We must find a solution for that.
  • In many algorithms, we have tests like if u < v. Again a big problem.

So what can we do ?

The only good news here is that vertices must be immutable.

comment:32 Changed 11 months ago by dcoudert

#26169 Trial for not sorting multiple edges

comment:33 Changed 7 months ago by jdemeyer

Do we already have a plan for this ticket? I know that there have many graph-related Python 3 tickets, but I'm a bit lost on the overall plan.

comment:34 Changed 7 months ago by jdemeyer

I am asking because there are now only a handful of doctest failures remaining in #22029 and they are almost all due to sorting in graphs.

comment:35 Changed 7 months ago by dcoudert

There are 3 different points here: 1) sorting the list of vertices, 2) sorting the list of edges, and 3) sorting the vertices of an edge.

For 1), we made huge progresses in reducing the dependency of methods to .vertices(). I'm not sure how many methods still rely on it. Most of them now use the ordering given by list(G) without making specific assumption on that order. We create a mapping when needed, and pass it to some methods. We can certainly completely avoid that dependency, but can we do that without deprecation ?

For 2), similarly, we have reduced the number of methods relying on .edges(), plus we can use sort=False parameter. We can try and see...

3) is more tricky and I don't know how to do that yet.

comment:36 Changed 7 months ago by jdemeyer

For 1) there is still an issue with relabel(): #27027

comment:37 Changed 7 months ago by jdemeyer

And one more with graph_isom_equivalent_non_edge_labeled_graph(): #27029

comment:38 Changed 5 months ago by jdemeyer

  • Description modified (diff)
  • Milestone changed from sage-8.4 to sage-8.7

comment:39 Changed 5 months ago by dcoudert

We have now drastically reduced the dependency to .vertices() and .edges() in the graph module (except in doctests, but we can specify sort=True or just change the expected output). We must now identify the remaining tasks to do and schedule the work. I propose to do that in #26640 to get a full picture.

comment:40 Changed 5 months ago by jdemeyer

Here is a concrete proposal for this ticket:

  1. If no sort option is given (the default), give a deprecation and try to sort but fail gracefully if the input is not sortable.
  1. If a sort option is given, do as before.

comment:41 Changed 5 months ago by jdemeyer

I also like the idea that somebody posted on sage-devel (I don't remember who) that .vertices() and .edges() should return a "view" which could have various operations on it.

comment:42 Changed 5 months ago by dcoudert

It's Thierry (this message). I also like this idea. Should we do that here, before in another ticket, or later ?

Networkx now also use views networkx.Graph.edges.

comment:43 Changed 5 months ago by dcoudert

A proposal for an EdgeView in #27408.

comment:44 Changed 4 months ago by embray

  • Milestone changed from sage-8.7 to sage-8.8

Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)

comment:45 Changed 6 weeks ago by embray

  • Milestone sage-8.8 deleted

As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).

Note: See TracTickets for help on using tickets.