Opened 5 years ago
Closed 4 years ago
#24101 closed enhancement (fixed)
Implement Katz centrality and related methods for graphs
Reported by: | clienkaemper | Owned by: | gh-rajat1433 |
---|---|---|---|
Priority: | major | Milestone: | sage-8.7 |
Component: | graph theory | Keywords: | sd90 |
Cc: | Merged in: | ||
Authors: | Caitlin Lienkaemper, Rajat Mittal | Reviewers: | David Coudert |
Report Upstream: | N/A | Work issues: | |
Branch: | 1a87f58 (Commits, GitHub, GitLab) | Commit: | 1a87f58d508f68e838cac0ba92d47f5160b23bc4 |
Dependencies: | Stopgaps: |
Description
want to be able to compute Katz centrality, use Katz method for link prediction in graphs/networks (https://en.wikipedia.org/wiki/Katz_centrality)
Change History (46)
comment:1 Changed 5 years ago by
Branch: | → u/clienkaemper/implement_katz_centrality_and_related_methods_for_graphs |
---|
comment:2 Changed 5 years ago by
Commit: | → 99c2705ffb239f13cf58b6558ee5a66ba72f6e1c |
---|
comment:3 Changed 5 years ago by
Status: | new → needs_review |
---|
comment:4 Changed 5 years ago by
Reviewers: | → David Coudert |
---|---|
Status: | needs_review → needs_work |
Welcome to Sagemath
I have some comments to help you improving the code.
- unify
nonedges_only
andnonedgesonly
. Actually, #24094 usesnonedgesonly
. I don't know which is best, but should be the same. Returns the Katz matrix of G with parameter alpha.
->Return the Katz matrix of the graph with parameter alpha.
- only 1 empty line is needed after
INPUT::
`True`
->``True``
if true, eliminates common neighbors between adjacent vertices
. Could you add that the value for each edge is set to 0 ?the Katz matrix of the graph self with parameter alpha
->the Katz matrix of the graph with parameter alpha
G1 = Graph({1:[2,3], 2:[1,4], 3:[1,4], 4:[2,3]})
->G1 = graphs.CycleGraph(4)
?- since you don't have
G2
, you could renameG1
toG
- add a space between
*
and the text in*:meth:`~katz_centrality`
and the line after - add an empty line after
TESTS::
- The examples you give in the
TESTS
could be in the examples section. In a TESTS block, we expect examples raising errors, with the empty graph, etc. - the test
alpha <= 0
should be made before any other computation if nonedges_only == True:
->if nonedges_only:
in method def katz_centrality(self, alpha):
- you could add an optional parameter
u
to return the centrality of a specified vertex. For that, addu=None
and then the return statement becomes something likeif u is None: return K else: return K[u]
- you must indent the text in 80 columns mode. Some code editors are really helpful to do that without effort
- You return a list. Wouldn't it be better to return a dictionary keyed by vertices ? How to know which value corresponds to a node ?
comment:5 Changed 4 years ago by
Milestone: | sage-8.1 → sage-8.7 |
---|---|
Owner: | set to gh-rajat1433 |
comment:6 Changed 4 years ago by
You can either rebase this ticket on last develop branch (and fix the merge conflicts), or create a new branch based on last beta.
comment:7 Changed 4 years ago by
Branch: | u/clienkaemper/implement_katz_centrality_and_related_methods_for_graphs → u/gh-rajat1433/24101_katz_centrality |
---|---|
Commit: | 99c2705ffb239f13cf58b6558ee5a66ba72f6e1c |
comment:8 Changed 4 years ago by
Commit: | → c7fff6c43966ae48eeac2ee4b974a56875b5ccc7 |
---|
comment:9 Changed 4 years ago by
Status: | needs_work → needs_review |
---|
comment:10 Changed 4 years ago by
Several issues (I'm skipping the coding style issues for the moment, but please have a look at http://doc.sagemath.org/html/en/developer/coding_basics.html and check how other methods are written).
In method katz_matrix
:
- add some explanation of what is a Katz matrix
- please add parameter
vertices
similarly to method.adjacency_matrix()
. This is important as in the future we will no longer ensure that the list of vertices is sorted (due to Python 3).
In method katz_centrality
- add some explanation of what is the Katz centrality, plus a link to relevant wikipedia page
- in the code, choose an order like
verts = list(self)
, and pass it tokatz_matrix
. Here we don't need the list of vertices to be sorted, we just want to know the ordering, so we fix it. u not in self.vertices()
->u not in self
- You can simplify the construction of the dictionary
- for i in range(n): - Kdict[vert[i]] = katz_values[0][i] + Kdict = {u: katz_values[0][i] for i, u in enumerate(verts)}
- Do we really need to build dictionary
Kdict
whenu
is not None ?
Do not forget to check that all tests pass and that the documentation builds properly and displays nicely.
comment:11 Changed 4 years ago by
Commit: | c7fff6c43966ae48eeac2ee4b974a56875b5ccc7 → 66151a6333aef8c6af48bb28eb33c53041c87e0a |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
66151a6 | changes made in accordance to the review.
|
comment:12 follow-up: 15 Changed 4 years ago by
Thanks for the prompt reply and suggesting the changes. Done all the improvements mentioned above and also made sure to pass all the tests. I have done the changes as per required by the documentation in generic_graph.py file. But could not figure out how to view those changes as they will actually be looking in documentation. And is there any test command for checking if format is correct for documentation.
comment:13 Changed 4 years ago by
Commit: | 66151a6333aef8c6af48bb28eb33c53041c87e0a → 172082f5a200768a5f769cbc6373cb5a520b6c81 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
172082f | made additions for documentation.
|
comment:14 Changed 4 years ago by
Commit: | 172082f5a200768a5f769cbc6373cb5a520b6c81 → 47d85e37b4a64b10405088a34adfb7b5693d9ef3 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
47d85e3 | bug fix
|
comment:15 Changed 4 years ago by
I have done the changes as per required by the documentation in generic_graph.py file. But could not figure out how to view those changes as they will actually be looking in documentation. And is there any test command for checking if format is correct for documentation.
See http://doc.sagemath.org/html/en/developer/sage_manuals.html
You can rebuild the full documentation using
sage -b && sage --docbuild reference html
or rebuild only the documentation of the graph module
sage -b && sage --docbuild reference/graphs html
Then, you can look at the documentation with your web browser.
file://PATH_ON_YOUR_COMPUTER/sage/local/share/doc/sage/html/en/reference/graphs/index.html
For the coding style, please try to comply to https://www.python.org/dev/peps/pep-0008
comment:16 Changed 4 years ago by
Commit: | 47d85e37b4a64b10405088a34adfb7b5693d9ef3 → 67a23844be9cb5a0fce4293bf02d7ce616629c91 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
67a2384 | documentation build changes
|
comment:17 Changed 4 years ago by
I have made changes so as to build the documentation properly and also passed all the tests and tried to follow the coding convention. Can you please review the code so that we can finalize it.
comment:18 Changed 4 years ago by
- You must ensure that all comments are formatted in 80 columns mode. Some code editors can be configured to ease this task
- minor stylistic issue
- - ``nonedgesonly`` -- Boolean (default: ``True``); if True, value for each - edge present in the graph is set to zero. + - ``nonedgesonly`` -- boolean (default: ``True``); if ``True``, value for each + edge present in the graph is set to zero
- you can use
for u in self
orset(self)
orif u in self
, etc.- set(vertices) != set(self.vertex_iterator())): + set(vertices) != set(self)):
- pep8
- K = (In - alpha * A.transpose()).inverse()-In + K = (In - alpha * A.transpose()).inverse() - In
- Missing = onesmat - A- In + Missing = onesmat - A - In
- Please revise method
katz_centrality
: alignment, ensure that the code is correct, remove useless instructions, etc.
comment:19 Changed 4 years ago by
Commit: | 67a23844be9cb5a0fce4293bf02d7ce616629c91 → 27579a233a20e790ffc17b4273699b4fa0f17534 |
---|
comment:20 Changed 4 years ago by
Thanks for the review. Its been really a learning experience. I have started following alignment and pep8 seriously and I have fixed the issues above.
comment:21 Changed 4 years ago by
Please check carefully your code, that comments are in 80 columns mode, if possible that you follow the PEP8 coding style, that the comments are sufficient and helpful to the user, and that your code is best possible.
for instance, you have apparently not double check this part:
- if u : - idx = verts.index(u) - return katz_values[0][idx] - - Kdict = {} - Kdict = {u: katz_values[0][i] for i, u in enumerate(verts)} - - if u is None: - return Kdict - else : - return Kdict[u] + if u: + return katz_values[0][verts.index(u)] + return {u: katz_values[0][i] for i, u in enumerate(verts)}
comment:22 Changed 4 years ago by
Status: | needs_review → needs_work |
---|
comment:23 Changed 4 years ago by
Commit: | 27579a233a20e790ffc17b4273699b4fa0f17534 → f85cf1b95a6091ff0d04ec370c97dd77f1a25165 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
f85cf1b | improved code
|
comment:24 Changed 4 years ago by
Commit: | f85cf1b95a6091ff0d04ec370c97dd77f1a25165 → 42019acaea411c958b480c4f8cfc1f04ffc97cd4 |
---|
comment:25 Changed 4 years ago by
Status: | needs_work → needs_review |
---|
comment:26 Changed 4 years ago by
- except in the names of the method, use
Katz
and notkatz
everywhere. - remove trailing white spaces
- you should rephrase the description of method
katz_matrix
to first give the definition of this centrality measure, and then explain what's the returned matrix. - Avoid having too many empty lines like this. 1 is often enough
+ + +
- remove the
.
at the end of error messages if alpha <= 0:
->if alpha <= 0:
-
- We compute katz_centrality for the undirected 4-cycle again (note that + We compute the Katz centrality of a 4-cycle (note that
-
- Return the katz centrality of the vertex u of the graph. + Return the Katz centrality of vertex `u`.
- there is a
α
. It should be ok, but it might be safer to use\alpha
.
comment:27 Changed 4 years ago by
Commit: | 42019acaea411c958b480c4f8cfc1f04ffc97cd4 → cb9e45b198907ebbd354128cdddc30374b63fe9b |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
cb9e45b | fixed the coding style
|
comment:29 Changed 4 years ago by
- to use latex in the documentation, you must write
`\alpha`
, and not only\alpha
. Usingr"""
and then`...`
indicates to use latex.
- in
katz_centrality
, when you ask for the centrality of a vertexu
, you first computekatz_values = (M*matrix(QQ, n, 1, lambda i, j: 1)).transpose()
. However, if I understand well the definition of the Katz matrix, it suffices to make the some of the values of a particular row (or column), right ? If so, you could avoid the matrix multiplication when asking for a single value.
- also, a
.
and a- reciprocal of the spectral radius of the graph. (the maximum - absolute eigenvalue of the adjacency matrix ) + reciprocal of the spectral radius of the graph (the maximum + absolute eigenvalue of the adjacency matrix)
comment:30 Changed 4 years ago by
Commit: | cb9e45b198907ebbd354128cdddc30374b63fe9b → 3d9569b0792d34ec974ac7729c06762545e4710b |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
3d9569b | Improved Code
|
comment:31 Changed 4 years ago by
I have used sum function now and eliminated the multiplication step. I guess it give better complexity now in terms of space and time specially for a single vertex case.
comment:32 Changed 4 years ago by
This is much better now. We certainly same some computation time (but see point below). For the space complexity, it remains in O(n^2)
as we compute the Katz matrix in all cases.
- The test
if u and u not in self:
must be done before computingverts = list(self)
and more importantly the computation of the Katz matrix.
I have now more algorithmic questions (more interesting than coding style :P):
- What is the expected answer when the graph is not connected ? May be we can save computation time if the answer is always 0 for vertices in different components ?
- Same question for strongly connected digraphs? In the example, vertices 1, 2, 3, 7, 9 have no in-neighbors and vertex 4 has no out-neighbor. When you will get an answer, you should document the example explaining why some values are 0.
comment:33 Changed 4 years ago by
What is the expected answer when the graph is not connected ? May be we can save computation time if the answer is always 0 for vertices in different components ?
- When graph is not connected answer for a particular vertex is only dependent on the vertices in its connected component only and not in the other component. So yes we really can do optimize here. Will work on that.
Same question for strongly connected digraphs? In the example, vertices 1, 2, 3, 7, 9 have no in-neighbors and vertex 4 has no out-neighbor. When you will get an answer, you should document the example explaining why some values are 0.
- for strongly connected Digraph the answer will be non zero for every node. Also for any node with in-neighbors 0 the answer will be 0 as its the very essence of katz centrality to measure influence of an actor(node here) in a social network.
Will document the example with reasoning.
comment:34 Changed 4 years ago by
Commit: | 3d9569b0792d34ec974ac7729c06762545e4710b → c796415614414e13fcb9b80ca32b1f10f6512f61 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
c796415 | optimized
|
comment:35 Changed 4 years ago by
Commit: | c796415614414e13fcb9b80ca32b1f10f6512f61 → e2f562c2bba3c04e947df8813133d86cef61597a |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
e2f562c | optimized
|
comment:36 Changed 4 years ago by
- You can get the connected component containing vertex u directly using
C = G.connected_component_containing_vertex(u, sort=False)
.
Then you must compute the Katz matrix of the subgraph, i.e.,
self.subgraph(C)
.
if u: if self.is_connected(): G = self else: G = self.subgraph(self.connected_component_containing_vertex(u, sort=False)) verts = list(G) M = G.katz_matrix(alpha, nonedgesonly=False, vertices=verts) return sum(M[verts.index(u)])
- When you compute the Katz centrality of all vertices, you want the union of the Katz centralities of each subgraph. You can use
self.connected_components_subgraphs()
comment:37 Changed 4 years ago by
Commit: | e2f562c2bba3c04e947df8813133d86cef61597a → 13c28217b0d813ff0e4f23f789a5113d7b816935 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
13c2821 | optimized the code
|
comment:39 Changed 4 years ago by
Please double check your code and test it ! it contains obvious errors.
You can write directly for g in self.connected_components_subgraphs()
Add a doctest with a non connected graph, e.g., graphs.PathGraph(3) + graphs.PathGraph(4)
.
comment:40 Changed 4 years ago by
Commit: | 13c28217b0d813ff0e4f23f789a5113d7b816935 → f47a3dd2aac48aa6d400f1a89748844a92dea0c7 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
f47a3dd | fixed the errors
|
comment:41 Changed 4 years ago by
I am really sorry about the previous errors. Now its fixed and well tested.
comment:42 Changed 4 years ago by
Could you change the example to sage: (graphs.PathGraph(3) + graphs.PathGraph(4)).katz_centrality(1/20)
. This is to show that the values for the vertices in the P3 and in the P4 are the same than in previous examples.
comment:43 Changed 4 years ago by
Commit: | f47a3dd2aac48aa6d400f1a89748844a92dea0c7 → 1a87f58d508f68e838cac0ba92d47f5160b23bc4 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
1a87f58 | minor addition
|
comment:44 Changed 4 years ago by
Could you change the example to sage: (graphs.PathGraph?(3) + graphs.PathGraph?(4)).katz_centrality(1/20). This is to show that the values for the vertices in the P3 and in the P4 are the same than in previous examples.
Done!
comment:45 Changed 4 years ago by
Authors: | Caitlin Lienkaemper → Caitlin Lienkaemper, Rajat Mittal |
---|---|
Status: | needs_review → positive_review |
Further speed up improvements can certainly be obtained, but I don't know how.
For me this patch is good to go.
comment:46 Changed 4 years ago by
Branch: | u/gh-rajat1433/24101_katz_centrality → 1a87f58d508f68e838cac0ba92d47f5160b23bc4 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
Branch pushed to git repo; I updated commit sha1. New commits:
Adds functions for Katz centrality and related methods