Opened 6 years ago

Closed 6 years ago

#18682 closed enhancement (fixed)

Add SetSystem.is_connected()

Reported by: Rudi Owned by: Rudi
Priority: major Milestone: sage-6.8
Component: matroid theory Keywords:
Cc: chaoxu, vdelecroix Merged in:
Authors: Rudi Pendavingh Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: 7b136ab (Commits) Commit: 7b136abb135ee7872ef4999b06d1893d79905f4a
Dependencies: Stopgaps:

Description

Add a routine for testing connectivity of a SetSystem?.

Change History (27)

comment:1 Changed 6 years ago by Rudi

  • Branch set to u/Rudi/add_setsystem_is_connected__

comment:2 Changed 6 years ago by Rudi

  • Cc chaoxu added
  • Commit set to b91c71c431875d347257e5d410534ed20b21b435

First try.

sage: from sage.matroids.set_system import SetSystem
sage: M = Matroid(MatrixSpace(GF(2), 500,1000).random_element())
sage: B = M.basis()
sage: %time FC = [M.fundamental_cocircuit(B,e) for e in B]
CPU times: user 27.5 ms, sys: 3.13 ms, total: 30.7 ms
Wall time: 30.4 ms
sage: %time S = SetSystem(list(M.groundset()), FC)
CPU times: user 5.06 ms, sys: 106 µs, total: 5.17 ms
Wall time: 5.18 ms
sage: timeit("S.is_connected()")
625 loops, best of 3: 12.8 µs per loop

New commits:

b91c71cAdded SetSystem.is_connected()

comment:3 follow-up: Changed 6 years ago by ncohen

  • Cc vdelecroix added

Hypergraph.is_connected() :-/

Guys, we are doing this work twice :-/

Nathann

comment:4 in reply to: ↑ 3 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

Hypergraph.is_connected() :-/

Guys, we are doing this work twice :-/

I'm helping to speed up the code in this ticket #18539, for that we just need a fast connectivity checker. I tried Sage's Hypergraph first, but look at this (same example as above):

sage: %time S = Hypergraph(points = list(M.groundset()), blocks =FC)
CPU times: user 1.17 s, sys: 20.9 ms, total: 1.19 s
Wall time: 1.18 s
sage: %time S.is_connected()
CPU times: user 50.7 s, sys: 593 ms, total: 51.3 s
Wall time: 50.7 s
True

Compared to SetSystem?, the Hypergraph I want takes 200 times longer to construct, and finding out if it is connected takes about 4 million times longer.

So what would you do in this case?

comment:5 in reply to: ↑ 4 ; follow-up: Changed 6 years ago by ncohen

Compared to SetSystem?, the Hypergraph I want takes 200 times longer to construct, and finding out if it is connected takes about 4 million times longer.

So what would you do in this case?

You got me wrong: I did not claim that Hypergraph's version was faster (just look at the code, you will see why it is slow), but I hope that I do not need to convince you that having two class with the same purpose (Hypergraph and SetSystem?) and developping them in parallel is a very very bad idea.

Nathann

comment:6 in reply to: ↑ 5 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

You got me wrong: I did not claim that Hypergraph's version was faster (just look at the code, you will see why it is slow), but I hope that I do not need to convince you that having two class with the same purpose (Hypergraph and SetSystem?) and developping them in parallel is a very very bad idea.

I agree, but what would you suggest? I do not want to open up SetSystem? for public use and on the other hand, to get what I want from Hypergraph I would need to make changes to it's internal datastructure, using bitsets from the ground up. This would affect nearly every line of code in 'incidence_structures.py'. Apart from it being a bit of work, I hesitate to make such radical changes to other people's work. What there is probably works well for the application it was written for.

So who are working that part of sage right now? They might want to consider re-implementing this stuff using bitsets themselves. Bitsets simply work for this kind of thing.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 6 years ago by ncohen

I agree, but what would you suggest? I do not want to open up SetSystem? for public use and on the other hand, to get what I want from Hypergraph I would need to make changes to it's internal datastructure, using bitsets from the ground up. This would affect nearly every line of code in 'incidence_structures.py'. Apart from it being a bit of work, I hesitate to make such radical changes to other people's work. What there is probably works well for the application it was written for.

HMmmm... Makes sense, but still.. Maybe we could have at least some generic class for the high-level functions :-/

So who are working that part of sage right now? They might want to consider re-implementing this stuff using bitsets themselves. Bitsets simply work for this kind of thing.

Not if you store 3-uniform hypergraphs on a large ground set. You would waste a lot of memory. It's like for graphs, you would need different data structures for different purposes :-/

Nathann

comment:8 in reply to: ↑ 7 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

So who are working that part of sage right now? They might want to consider re-implementing this stuff using bitsets themselves. Bitsets simply work for this kind of thing.

Not if you store 3-uniform hypergraphs on a large ground set. You would waste a lot of memory. It's like for graphs, you would need different data structures for different purposes :-/

Hmm, that's true. You need a sparse data structure too. So how about just replacing that incidence-matrix in there with one over GF(2)? There's dense and sparse versions of such matrices. I think I also saw a data_structures/bitset_matrix.pxi , perhaps use that for the dense version.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 6 years ago by ncohen

Hmm, that's true. You need a sparse data structure too. So how about just replacing that incidence-matrix in there with one over GF(2)? There's dense and sparse versions of such matrices. I think I also saw a data_structures/bitset_matrix.pxi , perhaps use that for the dense version.

If you think that by doing that we can unify the two classes then why not?

comment:10 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

If you think that by doing that we can unify the two classes then why not?

Replacing the incidence matrix would be a good start for speeding up the existing Hypergraph, but it will take a very long time before I can replace Setsystem. Not because the code would remain inefficient (it won't) but because I need something in the matroid code that on a very low level agrees with that code. (My SetSystems? are really just a thinly veiled arrays of bitsets, which I use to avoid unnecessary translation between bitsets and python sets in the matroid code.)

The alternative is that I at some point write a new class Hypergraph from the ground up, and then we work afterwards to make that class mimick the interface of the current class Hypergraph. Would that work for you?

comment:11 in reply to: ↑ 10 ; follow-up: Changed 6 years ago by ncohen

The alternative is that I at some point write a new class Hypergraph from the ground up, and then we work afterwards to make that class mimick the interface of the current class Hypergraph. Would that work for you?

Hmmm.. Do you think we could settle that with having some high-level Hypergraph classes, and some low-level hypergraph classes? That's how it works for graphs at the moment:

sage: graphs.PetersenGraph()._backend
<type 'sage.graphs.base.sparse_graph.SparseGraphBackend'>

This way we would have data-structure-specific code in data-structure-specific files, and high-level code in high-level files?

Nathann

comment:12 in reply to: ↑ 11 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

Hmmm.. Do you think we could settle that with having some high-level Hypergraph classes, and some low-level hypergraph classes?

Sounds right. So how about I come up with a barebones HypergraphDense? (or whatever name you like) at some point, and then we talk about how to integrate this in a parent class Hypergraph?

comment:13 in reply to: ↑ 12 Changed 6 years ago by ncohen

Yo !

Sounds right. So how about I come up with a barebones HypergraphDense? (or whatever name you like) at some point, and then we talk about how to integrate this in a parent class Hypergraph?

Hmmmm... Assuming that you did not mean "parent" like in "Parent/Element", then yes :-P But let's discuss the design a bit, first.

I am looking at matroid/set_system at the moment, and it seems that most of what you have inside are private methods. It really does not look like it should be used from the outside indeed O_o

Is there any reason why SetSystemIterator is a class? Could that be because it was written before 'yield' became a Cython keyword?

Nathann

comment:14 follow-up: Changed 6 years ago by ncohen

  • Reviewers set to Nathann Cohen

Hello again,

I agree with this ticket, and added a small commit at public/18682. It is only style and doc:

  • The first line of the docstring must be a short sentence
  • A link toward SetSystem instead of plain text
  • Comments in the code

It was a bit hard to read at first, for many variables hide their meaning:

  • _len is the number of subsets (it could have been the number of points)
  • _temp is a shipped bitset of size "number of points"

Anyway. If that's okay with you, then you can set this ticket to positive_review.

Nathann

comment:15 in reply to: ↑ 14 Changed 6 years ago by Rudi

  • Status changed from new to needs_review

Replying to ncohen:

Hello again,

I agree with this ticket, and added a small commit at public/18682. It is only style and doc:

  • The first line of the docstring must be a short sentence
  • A link toward SetSystem instead of plain text
  • Comments in the code

Agreed.

It was a bit hard to read at first, for many variables hide their meaning:

  • _len is the number of subsets (it could have been the number of points)
  • _temp is a shipped bitset of size "number of points"

The style is pretty terrible, I'll grant you that. I was learning python/cython while I was writing the whole thing, so there are several things I would do differently if I had to start over.

This in perhaps one of the reasons why making a entirely fresh start for Hypergraph (as opposed to re-using SetSystem?) is a good idea. This SetSystem? class essentially popped out of work on matroids and became much more of a separate entity than I had intended when I started writing the whole matroid code. So now it looks independent, but really isn't, with all kinds of backdoors for the matroid code I have to preserve.

On the positive side, I always did a lot of experimenting to see what actually gave the best efficiency. That experience should be reusable (even if it does create some uglyness in the code at times).

Anyway. If that's okay with you, then you can set this ticket to positive_review.

I will, thanks. My first review before I even asked for one! :)

comment:16 Changed 6 years ago by Rudi

First let's see how the patchbot likes it..

comment:17 Changed 6 years ago by ncohen

It likes it.

comment:18 Changed 6 years ago by Rudi

  • Status changed from needs_review to positive_review

Great!

comment:19 follow-up: Changed 6 years ago by ncohen

  • Status changed from positive_review to needs_work

Could you add my commit to your branch?

comment:20 in reply to: ↑ 19 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

Could you add my commit to your branch?

Sorry, I thought you had already committed these changes to this branch.

I'm afraid I'm too much of a git novice to know how to pull/merge your commit in public/18682. What git command do I give?

comment:21 in reply to: ↑ 20 Changed 6 years ago by ncohen

Helloooooooooo,

I'm afraid I'm too much of a git novice to know how to pull/merge your commit in public/18682. What git command do I give?

The most trivial method to do it is simply to change the branch's name to public/18682, though that's how you would learn the least.

You have several other ways to do it. The first would be, as you say, to merge my branch with yours. This would mean:

git pull trac public/18682

In theory, that should create a merge commit atop your branch and atop mine, containing the changes of both. In practice, as my branch is merely yours with an additional commit, that would just move your branch's tip where mine is.

You can then push your branch, as usual, at its current location.

Nathann

comment:22 Changed 6 years ago by git

  • Commit changed from b91c71c431875d347257e5d410534ed20b21b435 to 7b136abb135ee7872ef4999b06d1893d79905f4a

Branch pushed to git repo; I updated commit sha1. New commits:

69466b6trac #18682: Merged with 6.8.beta4
7b136abtrac #18682: Review

comment:23 Changed 6 years ago by Rudi

  • Status changed from needs_work to positive_review

comment:24 follow-up: Changed 6 years ago by ncohen

Perfect!

comment:25 in reply to: ↑ 24 ; follow-up: Changed 6 years ago by Rudi

Replying to ncohen:

Perfect!

I had a good teacher :)

So if I want to politely propose a change in the work someone else did on a ticket, then this the way to go, push my modified branch somewhere else as in git trac push public/some_name? Excellent, I was wondering about that before.

comment:26 in reply to: ↑ 25 Changed 6 years ago by ncohen

Helloooooooo,

So if I want to politely propose a change in the work someone else did on a ticket, then this the way to go, push my modified branch somewhere else as in git trac push public/some_name? Excellent, I was wondering about that before.

That's how I work, and I believe that it is indeed the "most polite way" to proceed. I always find it a bit unpleasant when I see that the branch's name has been changed by somebody else who pushed commits I don't like on a ticket I'm working on, sometimes without even explaining what they do.

By pushing that stuff to another branch, you propose them to the author and if he likes them he can merge them to his ticket. If he does not like them, you can either forget about them or propose them on another ticket of your own.

Nathann

comment:27 Changed 6 years ago by vbraun

  • Branch changed from u/Rudi/add_setsystem_is_connected__ to 7b136abb135ee7872ef4999b06d1893d79905f4a
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.