Opened 4 years ago

Closed 4 years ago

#18346 closed enhancement (fixed)

Easier handling of vertex labels in graph backends

Reported by: ncohen Owned by:
Priority: major Milestone: sage-6.7
Component: graph theory Keywords:
Cc: borassi, dcoudert, darij, vdelecroix Merged in:
Authors: Nathann Cohen Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: 6ffbb05 (Commits) Commit: 6ffbb05999f70fa2535f28aca08b203036664491
Dependencies: #18317 Stopgaps:

Description (last modified by ncohen)

This ticket addresses the following problem:

In the CGraph backends, three cdef functions are used all the time to deal with vertex labels, i.e. get_vertex, vertex_label, check_vertex. They are always called with the same arguments, which are all attributes of the backend that calls it. They should be method, so that the arguments are not needlessly repeated.

What this branch does:

  • The CGraph backends (i.e. SparseBackend, DenseBackend and StaticSparseGraph) and GenericGraphBackend are turned into cdef classes.
  • The three functions get_vertex, vertex_label, check_vertex become methods of CGraphBackend
  • It adds a method CGraphBackend.c_graph() to get the two cdef attributes _cg and _cg_rev. Turns out that some code in Sage accesses directly G._backend._cg.

This should have been sufficient, but there were (unexpected) consequences:

  • There is no automatic pickling for cdef classes (as instrospection does not work), and the doctests do not pass if that does not work. Consequently, I implemented a unique __reduce__ function in GenericGraphBackend which handles the pickling for all backends (sparse/dense/static/networkx). It also removes some duplicated code as a result.

This makes the code a bit clearer (and it is needed). Also, moving this pickling function above is good for the future, for there will be many modifications in the future to the data structures in Sage: I plan to merge CGraph and GenericBackend into only one class (but that's for later).

Nathann

P.S.: The branch is built to be reviewed commit by commit. In particular, the renamed file appears as trivial when looking at the first commit, but as a lot of + and - when looking at the branch's diff

Change History (30)

comment:1 Changed 4 years ago by ncohen

  • Branch set to public/18346
  • Commit set to 25765bb3c4cb4c5f49e5078d7e6e3968c62310e1
  • Component changed from PLEASE CHANGE to graph theory
  • Status changed from new to needs_review

New commits:

9e60492trac #18346: Make graph_backends.py into a .pyx
da8479etrac #18346: Move and indent three functions (nothing else is changed)
5b765e1trac #18346: Rename calls to [get_vertex, vertex_label, check_vertex] into self.method calls
2704ddatrac #18346: Turn the backends into cdef classes
e11d4d8trac #18346: New pickling function for GraphBackend
bd8d988trac #18346: Remove old __reduce__ functions and move useful doctests
30eb626trac #18346: Some python code accesses G._backend._cg directly
25765bbtrac #18346: A 'cdef class' is a 'type', not a 'class'

comment:2 Changed 4 years ago by ncohen

  • Description modified (diff)

comment:3 follow-up: Changed 4 years ago by vdelecroix

Hello,

  1. Would it be bad to access directly the ._cg or ._cg_rev attributes? (ok it seems to be for python code)
  1. Is there a description somewhere of the attributes vertex_ints and vertex_labels?
  1. In a cython file, there is no need to declare cdef object x or object x, just do x.

Vincent

comment:4 Changed 4 years ago by vdelecroix

  • Status changed from needs_review to needs_info

comment:5 in reply to: ↑ 3 ; follow-up: Changed 4 years ago by ncohen

  • Status changed from needs_info to needs_review

Hello,

  1. Would it be bad to access directly the ._cg or ._cg_rev attributes? (ok it seems to be for python code)

I do not think that it is if you do not modify them. Some functions (like isomorphism test) use it directly. The guy who wrote the code that does that knew what he was doing, unfortunately he is the only one :-P

  1. Is there a description somewhere of the attributes vertex_ints and vertex_labels?

Not that I know. I am not sure that I remember exactly what they do either. What I remember is that not all vertices have an associated label, so that you don't spend your time playing with a dictionary when your graph is defined over integers.

This being said, I plan to rewrite all this (and of course the new code will have a proper documentation). I spent several hours thinking of how this should all be done, and my hardest constraint was that the changes should be "reviewable". Which is not very easily achieved when you plan to merge two classes together :-P

The 'first step' was #18185, the second is this one (or #18317, but that's only doc). Then I will turn NetworkX backend into a CGraph, and deprecate a lot of things in there. And then... Merge the classes O_O;;

  1. In a cython file, there is no need to declare cdef object x or object x, just do x.

Okay. Well, I don't think I did anything like that but I may have met such a line somewhere. Really, this does not matter for the moment: this code will be heavily rewritten.

Nathann

comment:6 in reply to: ↑ 5 ; follow-up: Changed 4 years ago by vdelecroix

Hello,

Replying to ncohen:

  1. Is there a description somewhere of the attributes vertex_ints and vertex_labels?

Not that I know. I am not sure that I remember exactly what they do either. What I remember is that not all vertices have an associated label, so that you don't spend your time playing with a dictionary when your graph is defined over integers.

Actually you do since get_vertex or vertex_label are intensively used everywhere. And those methods do check if their argument belongs to some dictionary as a first step! But if I understand well, the aim of the ticket is precisely not to upgrade these two methods but just to move them.

  1. In a cython file, there is no need to declare cdef object x or object x, just do x.

Okay. Well, I don't think I did anything like that but I may have met such a line somewhere. Really, this does not matter for the moment: this code will be heavily rewritten.

Hum

+ cdef int get_vertex(self, object u) except ? -2
+ cdef object vertex_label(self, int u_int)

this would better be

cdef int get_vertex(self, u) except ? -2
cdef vertex_label(self, int u_int)

Vincent

comment:7 Changed 4 years ago by git

  • Commit changed from 25765bb3c4cb4c5f49e5078d7e6e3968c62310e1 to c626eec1f23ab4f1529efe1efa3f738b30589b74

Branch pushed to git repo; I updated commit sha1. New commits:

c626eectrac #18346: Without useless 'objects'

comment:8 in reply to: ↑ 6 ; follow-up: Changed 4 years ago by ncohen

Yo !

Actually you do since get_vertex or vertex_label are intensively used everywhere. And those methods do check if their argument belongs to some dictionary as a first step!

Oh really? But perhaps the dictionary does *not* contain the integers then? Anyway, does not matter much.

But if I understand well, the aim of the ticket is precisely not to upgrade these two methods but just to move them.

The aim of this ticket is to merge CGraph and Backend together. All that "backend" does is forward everything to CGraph. But I certainly cannot do all of this at once, it would be impossible to review. Plus because of the problems I met when writing this branch, I know that I should be wary of the isomorphism test code which works at a lower level than I expected.

Hum

+ cdef int get_vertex(self, object u) except ? -2
+ cdef object vertex_label(self, int u_int)

Fixed. Actually, those are the lines that I moved from functions to methods. You will find (almost) the same lines with a '-' in front of them, a couple of lines above.

Nathann

comment:9 in reply to: ↑ 8 ; follow-up: Changed 4 years ago by vdelecroix

Replying to ncohen:

The aim of this ticket is to merge CGraph and Backend together. All that "backend" does is forward everything to CGraph. But I certainly cannot do all of this at once, it would be impossible to review. Plus because of the problems I met when writing this branch, I know that I should be wary of the isomorphism test code which works at a lower level than I expected.

Will you keep these get_vertex and vertex_label forever? If you do so I will had a commit to optimize them. They are basically used everywhere.

Vincent

comment:10 in reply to: ↑ 9 Changed 4 years ago by ncohen

Will you keep these get_vertex and vertex_label forever? If you do so I will had a commit to optimize them. They are basically used everywhere.

I have no idea for the moment. If there is something you want to change, it really cannot hurt.

Nathann

comment:11 Changed 4 years ago by git

  • Commit changed from c626eec1f23ab4f1529efe1efa3f738b30589b74 to d921cd9b951e14dcf7085788a254b213d3c40dfd

Branch pushed to git repo; I updated commit sha1. New commits:

d921cd9Trac 18346: tiny cleanup and optimization

comment:12 Changed 4 years ago by vdelecroix

I win some very precious nano seconds and add a tiny bit of documentation for vertex_labels and vertex_ints. I will start reviewing your other commits.

Vincent

comment:13 Changed 4 years ago by ncohen

Heuuuuuuu... Sorry Vincent but I am not really sure that what this code needs is 10 more lines to convert integer types from one to another....

And if you remove some 'object', please remove them from the .pxd file too.

Last edited 4 years ago by ncohen (previous) (diff)

comment:14 follow-ups: Changed 4 years ago by ncohen

If this kind of cast from !Python/!Sage integer is really costly, perhaps we should have a cdef function that does it somewhere? Do I make a mistake if I say that you wrote this code in order to avoid having to raise an exception in many cases ?

Nathann

Last edited 4 years ago by ncohen (previous) (diff)

comment:15 in reply to: ↑ 14 ; follow-up: Changed 4 years ago by vdelecroix

Replying to ncohen:

If this kind of cast from !Python/!Sage integer is really costly, perhaps we should have a cdef function that does it somewhere?

That's a possibility, but it is not clear what should be the signature of the function. What if it fails?

Do I make a mistake if I say that you wrote this code in order to avoid having to raise an exception in many cases?

It is not only that: calling directyl mpz_get_si or PyInt_AS_LONG is faster than the cast which does call PyInt_AsLong. But that's true that I do not like this try/except.

comment:16 in reply to: ↑ 15 Changed 4 years ago by ncohen

Yo !

That's a possibility, but it is not clear what should be the signature of the function. What if it fails?

LONG_MAX ?

It is not only that: calling directyl mpz_get_si or PyInt_AS_LONG is faster than the cast which does call PyInt_AsLong. But that's true that I do not like this try/except.

Well, if you can turn it into an independent function in some other module (something integer-related) then no problem. This code is much more in need of clarity than nanoseconds at the moment :-P

Nathann

comment:17 Changed 4 years ago by git

  • Commit changed from d921cd9b951e14dcf7085788a254b213d3c40dfd to 6318944f3f03fc813e11b43d5f4b440389b87e07

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

efd07b7Trac 18346: optimization of get_vertex
6318944Trac 18346: some cleanup

comment:18 follow-up: Changed 4 years ago by vdelecroix

Hello,

Do whatever you want with efd07b7. The rest of the cleanup is isolated in 6318944.

Vincent

comment:19 in reply to: ↑ 18 ; follow-up: Changed 4 years ago by ncohen

Do whatever you want with efd07b7. The rest of the cleanup is isolated in 6318944.

If you do not plan to turn the integer tests into a an independent function, then can you discard it?

comment:20 in reply to: ↑ 19 Changed 4 years ago by vdelecroix

Replying to ncohen:

Do whatever you want with efd07b7. The rest of the cleanup is isolated in 6318944.

If you do not plan to turn the integer tests into a an independent function, then can you discard it?

yes

comment:21 Changed 4 years ago by git

  • Commit changed from 6318944f3f03fc813e11b43d5f4b440389b87e07 to e8df618e3a7fa04547c1cc1f4195eb837f679bdf

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e8df618Trac 18346: some cleanup

comment:22 follow-up: Changed 4 years ago by vdelecroix

  • Reviewers set to Vincent Delecroix

I am done with reviewing your commits and they are fine.

So if e8df618 is fine for you, then it can go.

comment:23 in reply to: ↑ 22 Changed 4 years ago by ncohen

Yoooooooooo !

I am done with reviewing your commits and they are fine.

So if e8df618 is fine for you, then it can go.

Oh. Great, thanks !! I was reviewing that commit. I should be done in 10 minutes.

Nathann

comment:24 Changed 4 years ago by ncohen

If you feel like messing with a new subfolder of src/, I cleaned a couple of things from finance/ in #18355 ;-)

comment:25 Changed 4 years ago by ncohen

  • Status changed from needs_review to positive_review

Okayokay, it's all good. Thank you very much for your help with these painful patches ^^;

Nathann

comment:26 Changed 4 years ago by vbraun

  • Status changed from positive_review to needs_work
sage -t --long src/sage/graphs/base/overview.py
**********************************************************************
File "src/sage/graphs/base/overview.py", line 50, in sage.graphs.base.overview
Failed example:
    Graph()._backend
Expected:
    <class 'sage.graphs.base.sparse_graph.SparseGraphBackend'>
Got:
    <type 'sage.graphs.base.sparse_graph.SparseGraphBackend'>
**********************************************************************
1 item had failures:
   1 of   2 in sage.graphs.base.overview
    [1 test, 1 failure, 0.01 s]

comment:27 in reply to: ↑ 14 Changed 4 years ago by vdelecroix

Replying to ncohen:

If this kind of cast from !Python/!Sage integer is really costly, perhaps we should have a cdef function that does it somewhere? Do I make a mistake if I say that you wrote this code in order to avoid having to raise an exception in many cases ?

The proper solution to "convert" a Python object u into a long is

cdef long u_long = PyNumber_Index(u)

see

https://groups.google.com/forum/#!topic/sage-support/0ATjCgQlq4c

Vincent

comment:28 Changed 4 years ago by git

  • Commit changed from e8df618e3a7fa04547c1cc1f4195eb837f679bdf to 6ffbb05999f70fa2535f28aca08b203036664491

Branch pushed to git repo; I updated commit sha1. New commits:

8d61e56trac #18317: General documentation about graph data structures
b7d9036trac #18317: Reviewer's comments
2c02e23trac #18317: Merged with 6.7.beta3
c6678cetrac #18317: Review
4378790trac #18317: Delete(s)
2caff2ftrac #18346: Merge with #18317
6ffbb05trac #18346; Broken doctest

comment:29 Changed 4 years ago by ncohen

  • Dependencies set to #18317
  • Status changed from needs_work to positive_review

comment:30 Changed 4 years ago by vbraun

  • Branch changed from public/18346 to 6ffbb05999f70fa2535f28aca08b203036664491
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.