Opened 6 years ago

Closed 5 years ago

#14589 closed enhancement (fixed)

binary matrices, dense graphs, and faster is_strongly_regular

Reported by: ncohen Owned by: jason, ncohen, rlm
Priority: major Milestone: sage-6.1
Component: graph theory Keywords:
Cc: Slani, Stefan, azi, dimpase Merged in:
Authors: Nathann Cohen Reviewers: Jernej Azarija
Report Upstream: N/A Work issues:
Branch: u/ncohen/14589 (Commits) Commit: b56444ba216f3c2e6d7f20e34daf80f5d5aec781
Dependencies: #14805 Stopgaps:

Description (last modified by ncohen)

As the title says !!

With the patch applied

sage: g = graphs.PaleyGraph(primes_first_n(70)[-1])
sage: %time g.is_strongly_regular()
CPU times: user 0.16 s, sys: 0.00 s, total: 0.16 s
Wall time: 0.16 s
True

Without it

sage: g = graphs.PaleyGraph(primes_first_n(70)[-1])
sage: %time g.is_strongly_regular()
CPU times: user 20.43 s, sys: 0.00 s, total: 20.43 s
Wall time: 20.44 s
True

Ahhhh.. It is so nice to implement an algorithm with the right data structure ! :-P

Nathann

Attachments (1)

trac_14589.patch (15.0 KB) - added by ncohen 6 years ago.

Download all attachments as: .zip

Change History (33)

comment:1 Changed 6 years ago by ncohen

  • Status changed from new to needs_review

comment:2 Changed 6 years ago by ncohen

  • Cc dimpase added; jernej dimpae removed
  • Description modified (diff)

comment:3 follow-up: Changed 6 years ago by dimpase

how about (complements of) disjoint unions of complete graphs with the same number of vertices? Are they s.r.g. for you?

comment:4 in reply to: ↑ 3 Changed 6 years ago by ncohen

Well, to me even a complete graph is strongly regular. We just had a discussion on Mathematical Dogma there : #14297. What do you think ?

In this case, as there are non-adjacent vertices, I'd say that the graph is strongly regular.... Dogma :-P

Nathann

comment:5 follow-up: Changed 6 years ago by azi

Nathann,

I think it would make much sense to test the "is_triangle_free" thing with this data structure as well. It should be much more efficient using the same approach.

comment:6 in reply to: ↑ 5 Changed 6 years ago by ncohen

I think it would make much sense to test the "is_triangle_free" thing with this data structure as well. It should be much more efficient using the same approach.

I don't know which kind of speedup we can expect, but that's definitely something that should be implemented, if only to run that test :-)

Could we make this another ticket ? Then you can implement it or I will, it's up to you !

Nathann

comment:7 Changed 6 years ago by azi

Sure I'll make make a ticket soon

comment:8 Changed 6 years ago by Stefan

Hmmm. I wonder how much of the speedup remains if you use a GF(2)-matrix or integer matrix instead of the bitset matrix. You're only doing get and set, and I'd say the real power of these bitset-matrices lies in row operations. You could use Sage's get_unsafe and set_unsafe to bypass range checking. It'll be slower than inline methods, but maybe not by much.

comment:9 follow-up: Changed 6 years ago by vdelecroix

Hello,

I am a bit naive but would like to understand the following

  • pxi against C why do you use a pxi file ? It would cause multiple compilation wherever it is included. I think a pure C file would be much more suited to that purpose (I know that this is the way it is for bitset)
  • mutability why don't you mimic what is done for DenseGraph to allow mutability (ie having a bitset of active vertices that allow to use any subset of 0, ..., n-1 as vertices). Virtually there will be the good number of vertices but in backend you treat your graph with many isolated vertices. Augmenting the number of vertices may then be done with realloc.
  • sparse graphs What do we do with current implementation of SparseGraph ?

comment:10 in reply to: ↑ 9 Changed 6 years ago by ncohen

Hellooooooo !

I am a bit naive but would like to understand the following

  • pxi against C why do you use a pxi file ?

If you compile the file independently then the functions cannot be inlined.

  • mutability why don't you mimic what is done for DenseGraph to allow mutability (ie having a bitset of active vertices that allow to use any subset of 0, ..., n-1 as vertices). Virtually there will be the good number of vertices but in backend you treat your graph with many isolated vertices. Augmenting the number of vertices may then be done with realloc.

Because I want to write simple things. It is not meant to replace a real graph backend, and it will probably be used later on by the future dense backend, but I would really like to have a simple structure somewhere, that does nothing but store bits.

  • sparse graphs What do we do with current implementation of SparseGraph ?

Well, first I will have to read your patch #14690. Then, each time we read this file it would be nice to add comments to it, to explain the parts of it that we have understood. Then we can do whatever we want with this backend : rewrite it or improve it if we can. We can't really throw it out right now, can we ? :-P

Nathann

Changed 6 years ago by ncohen

comment:11 Changed 6 years ago by ncohen

  • Dependencies set to #14805

comment:12 Changed 5 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:13 Changed 5 years ago by jason

Thanks for doing this. Often the first thing I do is to convert the graph to an array of bitsets and then work with that. I do this a lot in the minimum rank library, for example: https://github.com/jasongrout/minimum_rank/blob/master/zero_forcing_wavefront.pyx#L124

Next time I write some code like this, I'll have to look at this implementation. When I get some time to do research coding work, if this hasn't been reviewed, I'll have to review this!

comment:14 follow-up: Changed 5 years ago by azi

Hello!

As promised I took the time to review your patch. The entire thing is easy to read and looks good. The fact that it was not reviewed for so long makes me wonder if I am overlooking some non obvious caveat?

Anyways, before giving another check let me ask

  1. Line 67 - can we always assume that g.vertices() will return a sorted list of vertices? Also, it looks like this can be written in a better way?
  1. Line 48 function inline binary_matrix_free. Shouldn't you free m.rows[1],...,m.rows[n_rows] as well?
  1. Line 69 in inline binary_matrix_set0 the comment should really say set matrix entry to 0

Also off topic to this patch - am I the only one being tempted to change the function edges() and set edge_labels to false as default?

comment:15 in reply to: ↑ 14 ; follow-up: Changed 5 years ago by ncohen

Yooooooo !

As promised I took the time to review your patch.

Thaaaaaaaaaaanks !! ;-)

The entire thing is easy to read and looks good. The fact that it was not reviewed for so long makes me wonder if I am overlooking some non obvious caveat?

The "someone else with do it" syndrome. And the fact that with git the developpers as a whole probably suffered a loss of enthusiasm :-P

  1. Line 67 - can we always assume that g.vertices() will return a sorted list of vertices? Also, it looks like this can be written in a better way?

Unfortunately I think we can. Though I try not to rely on that. Did I do it in this patch ?

The problem is that while the vertices are indeed "sorted" before being returned, but that this sorting can be meaningless : the "natural" ordering on the vertices may be a partial ordering from time to time (i.e. when the vertices are set : graphs.KneserGraph) and well, that causes bugs from time to time.

  1. Line 48 function inline binary_matrix_free. Shouldn't you free m.rows[1],...,m.rows[n_rows] as well?

No, because only m.rows[0] is allocated on line 39. All of m.rows[i] point toward some part of this very long vector, but regardless of that if you deallocate m.rows[0] the whole segment is deallocated.

  1. Line 69 in inline binary_matrix_set0 the comment should really say set matrix entry to 0

Oh. Right :-PPPPPPPPP

Also off topic to this patch - am I the only one being tempted to change the function edges() and set edge_labels to false as default?

T_T

I hate labels. And I also type .edges(labels=False) much more often than .edges(). Well. I didn't try to change this kind of stuff because I consider that it is just "my own use of graphs"... But well, if many of us begin to hate labels/loops/multiple edges now that's a different story :-P The combinat/word guys may probably not agree, though... But really, I have no idea. I just never did that because I don't want to change stuff just because it makes it easier for me, without knowing how useful it can be for others.

Nathann

comment:16 Changed 5 years ago by ncohen

  • Branch set to u/ncohen/14589

Rebased. As a git branch.

Nathann

comment:17 Changed 5 years ago by git

  • Commit set to 9f0f8382c21eee8d76282661bbdc9e1dcfc597a8

Branch pushed to git repo; I updated commit sha1. New commits:

9f0f838trac #14589: binary matrices, dense graphs, and faster is_strongly_regular

comment:18 Changed 5 years ago by ncohen

Hmmmm.. bitset.pyx has been renamed since :-P

And actually.. We don't need it.

Nathann

comment:19 Changed 5 years ago by git

  • Commit changed from 9f0f8382c21eee8d76282661bbdc9e1dcfc597a8 to b56444ba216f3c2e6d7f20e34daf80f5d5aec781

Branch pushed to git repo; I updated commit sha1. New commits:

b56444btrac #14589: binary matrices, dense graphs, and faster is_strongly_regular

comment:20 in reply to: ↑ 15 ; follow-up: Changed 5 years ago by azi

Replying to ncohen:

Yooooooo !

As promised I took the time to review your patch.

Thaaaaaaaaaaanks !! ;-)

The entire thing is easy to read and looks good. The fact that it was not reviewed for so long makes me wonder if I am overlooking some non obvious caveat?

The "someone else with do it" syndrome. And the fact that with git the developpers as a whole probably suffered a loss of enthusiasm :-P

  1. Line 67 - can we always assume that g.vertices() will return a sorted list of vertices? Also, it looks like this can be written in a better way?

Unfortunately I think we can. Though I try not to rely on that. Did I do it in this patch ?

Yes didn't you?

The problem is that while the vertices are indeed "sorted" before being returned, but that this sorting can be meaningless : the "natural" ordering on the vertices may be a partial ordering from time to time (i.e. when the vertices are set : graphs.KneserGraph) and well, that causes bugs from time to time.

  1. Line 48 function inline binary_matrix_free. Shouldn't you free m.rows[1],...,m.rows[n_rows] as well?

No, because only m.rows[0] is allocated on line 39. All of m.rows[i] point toward some part of this very long vector, but regardless of that if you deallocate m.rows[0] the whole segment is deallocated.

Right! I completely misread something somewhere.

  1. Line 69 in inline binary_matrix_set0 the comment should really say set matrix entry to 0

Oh. Right :-PPPPPPPPP

Also off topic to this patch - am I the only one being tempted to change the function edges() and set edge_labels to false as default?

T_T

I hate labels. And I also type .edges(labels=False) much more often than .edges(). Well. I didn't try to change this kind of stuff because I consider that it is just "my own use of graphs"... But well, if many of us begin to hate labels/loops/multiple edges now that's a different story :-P The combinat/word guys may probably not agree, though... But really, I have no idea. I just never did that because I don't want to change stuff just because it makes it easier for me, without knowing how useful it can be for others.

For me they are veeeery annoying so I assumed they were for everyone .

Anyways, this is all I got to say about the patch. It looks good to me provided u changed the above things and ran the doctests (which I did not)

Best,

Jernej

Nathann

comment:21 in reply to: ↑ 20 ; follow-up: Changed 5 years ago by ncohen

Yes didn't you?

Well if I did I do not remember where :-P

For me they are veeeery annoying so I assumed they were for everyone.

Hmmmm.. It is true that a graph is a collection of pairs, and that you always have to explain "what the hell all these None are" when you show the result of g.edges() to beginners... :-/

Anyways, this is all I got to say about the patch. It looks good to me provided u changed the above things and ran the doctests (which I did not)

Well the patchbot complains but the two errors seem unrelated.

Nathann

comment:22 in reply to: ↑ 21 ; follow-up: Changed 5 years ago by azi

Replying to ncohen:

Yes didn't you?

Well if I did I do not remember where :-P

Line 67. g.vertices() == range(n)

For me they are veeeery annoying so I assumed they were for everyone.

Hmmmm.. It is true that a graph is a collection of pairs, and that you always have to explain "what the hell all these None are" when you show the result of g.edges() to beginners... :-/

Anyways, this is all I got to say about the patch. It looks good to me provided u changed the above things and ran the doctests (which I did not)

Well the patchbot complains but the two errors seem unrelated.

Nathann

comment:23 in reply to: ↑ 22 Changed 5 years ago by ncohen

Hellooo !

Line 67. g.vertices() == range(n)

Oh, right. But in this case the vertices are integers. It indeed relies on the fact that vertices are sorted (and they are), but there is no problem if the natural order on the vertices is not total. The equality will be false in that case.

Nathann

comment:24 Changed 5 years ago by azi

I see. Well if you feel safe about that (the documentation afaik does not guarantee sorted order of vertices) then this thing is fine with me.

comment:25 Changed 5 years ago by ncohen

Hmmmmmm... Well, we can re-sort it but I think it's overkill. If that ever changes we will have to look at all occurrences of g.vertices() anyway... And I really think that this isn't going to happen anytime soon O_o

Nathann

comment:26 follow-up: Changed 5 years ago by azi

Indeed. Of course there is the stupid, thing of checking that the sum is n(n+1)/2 but I think that looks even more weird in this context?

comment:27 in reply to: ↑ 26 Changed 5 years ago by ncohen

Indeed. Of course there is the stupid, thing of checking that the sum is n(n+1)/2 but I think that looks even more weird in this context?

O_o

Instead of that test, you mean ?

Well, we have no way to be sure that the vertices CAN be summed. I mean, this test is meant to check whether there is any need to store a label_to_int dictionary. If the vertices are 0...n-1 there is no need and so we do without. But if the vertices are... I don't know, anything ! Strings, Sets, partitions, tuples.. immutables graphs ? Well in that case we remember labels. And we cannot be sure that those can all be summed together.

Nathann

comment:28 Changed 5 years ago by azi

Yes sure! That's what I said its stupid cause you also have to make sure every element is actually an int.

As said this was more of a stupid comment than anything else, as far as I am concerned the patch is good to go!

comment:29 Changed 5 years ago by ncohen

Sooooooooooooooo positive review ?... :-P

Nathann

comment:30 Changed 5 years ago by azi

  • Status changed from needs_review to positive_review

comment:31 Changed 5 years ago by ncohen

  • Reviewers set to Jernej Azarija

Thaaaaaaaaaaanks !

At long last, I can test if a graph is strongly regular without fearing it might hang on forever :-P

Nathann

comment:32 Changed 5 years ago by vbraun

  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.