#27480 closed enhancement (fixed)
Page Rank algorithm implementation for Graphs
Reported by:  ghrajat1433  Owned by:  ghrajat1433 

Priority:  major  Milestone:  sage8.8 
Component:  graph theory  Keywords:  pagerank 
Cc:  dcoudert  Merged in:  
Authors:  Rajat Mittal  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  07a884b (Commits, GitHub, GitLab)  Commit:  
Dependencies:  Stopgaps: 
Description
Page Rank computes the ranking of the nodes of the graph based on the structure of the incoming links. It is a useful metrics in graphs and can be quite useful. (https://towardsdatascience.com/graphsandpathspagerank54f180a1aa0a)
Below is a link to a thesis on this algorithm http://www.sagemath.org/files/thesis/augerithesis2008.pdf
It would be good to have its implementation in the Sage's graph module.
Change History (79)
comment:1 Changed 3 years ago by
 Cc dcoudert added; David Coudert removed
 Owner changed from (none) to ghrajat1433
comment:2 Changed 3 years ago by
comment:3 Changed 3 years ago by
I just found that networkx library already has Page Rank algorithm implemented in Python. Should I implement it in Sage freshly or should I make a function to use networkx implementation ? Or maybe we can write a faster cython implemetation of it. Need some suggetsions!
comment:4 Changed 3 years ago by
networkx has several methods to compute Page rank: a pure Python, one using numpy, another using scipy, etc.
The optional package igraph
also has an implementation of pagerank (see
igraph documentation for pagerank
(to install igraph, do sage i igraph
and sage i python_igraph
).
The best thing to do is to create a method including a parameter algorithm
, and that will call the different algorithms. So algorithm
could be None
,
'networkx',
'numpy',
'scipy',
'igraph'`.
comment:5 Changed 3 years ago by
When it is None, should an implementation in sage be used? And if so should I do an implementation in python or cython?
comment:6 Changed 3 years ago by
When it's None
, the method should choose the best available implementation. And the best implementation could depend on the size or density of the (di)graph. For instance, for very large graphs, methods based on matrix multiplication might not be appropriate (memory consumption), while such methods might be very efficient for small graphs. Some measurements are needed to decide.
Before deciding if a new implementation is needed, we must know if what we can easily get is fast enough or not.
comment:7 Changed 3 years ago by
 Branch set to u/ghrajat1433/27480_page_rank
comment:8 Changed 3 years ago by
 Dependencies set to #27496
comment:9 Changed 3 years ago by
I have installed igraph in my sage module. But however its pagerank algorithm is not able to work. It keeps on throwing a segmentation error. I use igraph_graph to convert sage graph into igraph. All other algorithms of igraph works fine. Following error message I get:
Cython backtrace  29 ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory. Traceback (most recent call last): File "<string>", line 56, in <module> File "/usr/lib/python3/distpackages/Cython/Debugger/libcython.py", line 689, in invoke for arg in string_to_argv(args): TypeError: argument 1 must be str, not bytes Saved trace to /home/rajat/.sage/crash_logs/crash_yLpPiW.log  Unhandled SIGSEGV: A segmentation fault occurred. This probably occurred because a *compiled* module has a bug in it and is not properly wrapped with sig_on(), sig_off(). Python will now terminate.
Regarding networkx,numpy and scipy, I have tested them on their runtime. Numpy does it by matrix multiplication and solving for eigenvalues, networkx has pure python implementation and scipy does matrix multiplication iteratively.
Following are the runtimes I got:
sage: G=nx.gnp_random_graph(40,0.01,directed=True) sage: %timeit nx.pagerank_numpy(G) 1000 loops, best of 3: 883 µs per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.7 ms per loop sage: %timeit nx.pagerank(G) 1000 loops, best of 3: 1.7 ms per loop sage: G=nx.gnp_random_graph(60,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 2.87 ms per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.87 ms per loop sage: %timeit nx.pagerank_numpy(G) 1000 loops, best of 3: 1.58 ms per loop sage: G=nx.gnp_random_graph(70,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 3.66 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 2.22 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 1.99 ms per loop sage: G=nx.gnp_random_graph(80,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 4.66 ms per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.93 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 3.3 ms per loop sage: G=nx.gnp_random_graph(100,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 7.07 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 2.29 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 4.63 ms per loop sage: G=nx.gnp_random_graph(400,0.01,directed=True) sage: %timeit nx.pagerank_numpy(G) 10 loops, best of 3: 175 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 3.9 ms per loop sage: %timeit nx.pagerank(G) 10 loops, best of 3: 53.2 ms per loop sage: G=nx.gnp_random_graph(4000,0.01,directed=True) sage: %timeit nx.pagerank(G) 1 loop, best of 3: 1.81 s per loop sage: %timeit nx.pagerank_scipy(G) 1 loop, best of 3: 206 ms per loop sage: %timeit nx.pagerank_numpy(G) 1 loop, best of 3: 1min 1s per loop
As per my analysis upto around 60 vertices numpy is fast but scipy is fastest after that.
comment:10 Changed 3 years ago by
I don't know what's the problem with igraph. I have installed igraph
and python_igraph
on my OSX laptop and I can do
sage: G = graphs.RandomGNP(1000, .01) sage: I = G.igraph_graph() sage: %timeit I.pagerank() 100 loops, best of 3: 1.92 ms per loop
Try recompiling sage (make build
or sage b
). If not working, you can ask for help on sagedevel. Explain what you did to install igraph, that other algorithms of igraph are working fine for you, and what's the error message you get.
comment:11 Changed 3 years ago by
Note that a random graph on 40 nodes with probability 0.01 has in average 8 edges... so your experiments are not very conclusive on small graphs. Nonetheless, scipy
seems the most efficient version.
comment:12 Changed 3 years ago by
More experiments with dense small graphs also suggest numpy to be best for small graphs
sage: G=nx.gnp_random_graph(40,0.5,directed=True) sage: **%timeit nx.pagerank_numpy(G) The slowest run took 86.64 times longer than the fastest. This could mean that an intermediate result is being cached. 1 loop, best of 3: 2.32 ms per loop** sage: %timeit nx.pagerank_scipy(G) The slowest run took 60.81 times longer than the fastest. This could mean that an intermediate result is being cached. 100 loops, best of 3: 2.41 ms per loop sage: %timeit nx.pagerank(G) 100 loops, best of 3: 12.5 ms per loop sage: G=nx.gnp_random_graph(60,0.5,directed=True) sage: %timeit nx.pagerank(G) 10 loops, best of 3: 27 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 3.47 ms per loop sage: **%timeit nx.pagerank_numpy(G) 100 loops, best of 3: 3.13 ms per loop**
comment:13 Changed 3 years ago by
 Milestone changed from sage8.7 to sage8.8
Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)
comment:14 Changed 3 years ago by
 Dependencies changed from #27496 to #27496, #27502
comment:15 Changed 3 years ago by
Page rank for Ipython with weighted edges doesn't seem to be working: However there is a parameter called weights in Pagerank method of Igraph , I looked at its code and doc but it is not clear to me if it is using it.
https://github.com/igraph/pythonigraph/blob/master/src/graphobject.c
https://igraph.org/python/doc/igraph.GraphBaseclass.html#personalized_pagerank
sage: G = Graph(6) sage: I = G.igraph_graph() sage: I.add_edge(0,0,weight=40) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 0, {'weight': 40}) sage: I.add_edge(1,2,weight=50) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 1, {'weight': 50}) sage: I.add_edge(2,3,weight=60) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 2, {'weight': 60}) sage: I.add_edge(0,3,weight=70) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 3, {'weight': 70}) sage: I.add_edge(3,4,weight=80) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 4, {'weight': 80}) sage: I.add_edge(4,5,weight=20) igraph.Edge(<igraph.Graph object at 0x7f338814e148>, 5, {'weight': 20}) sage: I.pagerank(weights='weight') [0.1380494948975056, 0.11517272680482316, 0.19683228912731204, 0.237940473238224, 0.196832289127312, 0.11517272680482313] sage: I.pagerank() [0.1380494948975056, 0.11517272680482316, 0.19683228912731204, 0.237940473238224, 0.196832289127312, 0.11517272680482313]
comment:16 Changed 3 years ago by
Following experiments show igraph's results to be the best due to its clanguage code.
sage: G = graphs.RandomGNP(40, 0.2) sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1000 loops, best of 3: 1.06 ms per loop sage: %timeit nx.pagerank_scipy(n) The slowest run took 37.63 times longer than the fastest. This could mean that an intermediate result is being cached. 100 loops, best of 3: 2.31 ms per loop sage: %timeit nx.pagerank(n) 100 loops, best of 3: 11.3 ms per loop sage: i = G.igraph_graph() sage: %timeit i.pagerank() 10000 loops, best of 3: **23.3 µs** per loop sage: G = graphs.RandomGNP(40, 0.8) sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1000 loops, best of 3: 1.33 ms per loop sage: %timeit nx.pagerank_scipy(n) 100 loops, best of 3: 2.16 ms per loop sage: %timeit nx.pagerank(n) 10 loops, best of 3: 20 ms per loop sage: i = G.igraph_graph() sage: %timeit i.pagerank() 10000 loops, best of 3: **39.1 µs** per loop sage: G = graphs.RandomGNP(4000, 0.5) sage: n = G.networkx_graph() sage: i = G.igraph_graph() sage: %timeit i.pagerank() 1 loop, best of 3: **1.03 s** per loop sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1 loop, best of 3: 1min 9s per loop sage: %timeit nx.pagerank_scipy(n) 1 loop, best of 3: 7.12 s per loop sage: %timeit nx.pagerank(n) ....: ....: Killed
comment:17 Changed 3 years ago by
The code of the pagerank algorithm is here https://github.com/igraph/igraph/blob/master/src/centrality.c
comment:18 Changed 3 years ago by
 Branch changed from u/ghrajat1433/27480_page_rank to u/ghrajat1433/27480_page_rank_implementation
comment:19 Changed 3 years ago by
 Commit set to ecbb4a7c73b93b8ff4326e018cb58fea330fdb97
Branch pushed to git repo; I updated commit sha1. New commits:
b2c82ec  Fixes cycle basis for multiedges

1a68ce1  Add doctests

142e50f  Merge branch 'develop' of git://trac.sagemath.org/sage into develop

8836f0d  checkpoint1

a2f38d9  merged

8433c2a  merged

b57b730  pagerank method

ecbb4a7  removed xtra lines

comment:20 Changed 3 years ago by
 Branch changed from u/ghrajat1433/27480_page_rank_implementation to u/ghrajat1433/27480_page_rank_algo
 Commit ecbb4a7c73b93b8ff4326e018cb58fea330fdb97 deleted
comment:21 Changed 3 years ago by
 Commit set to 142e50fa00bf1b76d4ec435278d3a9feec183b45
comment:22 Changed 3 years ago by
 Commit changed from 142e50fa00bf1b76d4ec435278d3a9feec183b45 to faafb7947fab662e4fef0f965e68e5b6d16d6062
Branch pushed to git repo; I updated commit sha1. New commits:
faafb79  branch corrected

comment:23 Changed 3 years ago by
 Commit changed from faafb7947fab662e4fef0f965e68e5b6d16d6062 to cdef2867ecae6aad616400562b2ee825181e603d
Branch pushed to git repo; I updated commit sha1. New commits:
cdef286  fix small mistakes

comment:24 Changed 3 years ago by
 Commit changed from cdef2867ecae6aad616400562b2ee825181e603d to b4e6b8c4937cba885bfdf81c8f84d67a0de2be24
Branch pushed to git repo; I updated commit sha1. New commits:
b4e6b8c  removing spaces

comment:25 Changed 3 years ago by
Sorry for the mess. Its due to update in the sage version now its corrected :)
comment:26 Changed 3 years ago by
I think that this is an issue on igraph side that its not able to work for weighted graphs in python. https://lists.nongnu.org/archive/html/igraphhelp/200808/msg00030.html
Still I am seeing through how the code of igraph can be fixed to take weights in pagerank. Else we can use other libraries like scipy for weighted case.
comment:27 Changed 3 years ago by
I think the problem is with the version or codebase of igraph which sage currently uses , maybe its not been updated.
Using sage:
sage: G = Graph(5) sage: I = G.igraph_graph() sage: I.add_edge(0,1,weight=50) igraph.Edge(<igraph.Graph object at 0x7fe144ca7a00>, 0, {'weight': 50}) sage: I.add_edge(1,2,weight=70) ....: igraph.Edge(<igraph.Graph object at 0x7fe144ca7a00>, 1, {'weight': 70}) sage: I.add_edge(2,4,weight=35) igraph.Edge(<igraph.Graph object at 0x7fe144ca7a00>, 2, {'weight': 35}) sage: ....: sage: I.add_edge(3,4,weight=35) igraph.Edge(<igraph.Graph object at 0x7fe144ca7a00>, 3, {'weight': 35}) sage: I.pagerank() [0.134527027027027, 0.24594594594594593, 0.23905405405405405, 0.134527027027027, 0.24594594594594593] sage: I.pagerank(weights='weight') [0.134527027027027, 0.24594594594594593, 0.23905405405405405, 0.134527027027027, 0.24594594594594593]
Using python terminal
>>> I = Graph(5) >>> I <igraph.Graph object at 0x7f5bcd250528> >>> I.summary() 'IGRAPH U 5 0  ' >>> I.add_edge(0,1,weight=50) >>> I.add_edge(1,2,weight=70) >>> I.add_edge(2,4,weight=35) >>> I.add_edge(3,4,weight=35) >>> I.pagerank() [0.134527027027027, 0.24594594594594593, 0.23905405405405405, 0.134527027027027, 0.24594594594594593] >>> I.pagerank(weights='weight') [0.1326579757790889, 0.2898578139644863, 0.2595856492098718, 0.11586448311914739, 0.20203407792740563]
comment:28 Changed 3 years ago by
Sage 8.8.beta0 has been released and it includes the 2 tickets marked as dependencies. So you can update you develop branch and rebuild this ticket using last beta.
If it is not possible to use weight with igraph, add a TODO block in the method to say that it should be done someday.
comment:29 Changed 3 years ago by
I have already updated to sage8.8 beta0 today only!
Please read comment 27, as I am able to use weight parameter of igraph in my python console but in sage its not working (don't know why this discrepancy is present).
comment:30 Changed 3 years ago by
 Dependencies #27496, #27502 deleted
comment:31 Changed 3 years ago by
Should I open a new ticket mentioning this discrepancy of igraph package in sage?
comment:32 Changed 3 years ago by
I don't understand what's going on. Have you tried other methods using weights?
comment:33 Changed 3 years ago by
Yes other algorithms like dijkstra are working fine with weights. I think its an old bug of igraph which is updated as can be seen by the results in my python shell however its not been updated in sage version of igraph.
Reference: https://lists.nongnu.org/archive/html/igraphhelp/200808/msg00030.html
comment:34 Changed 3 years ago by
May the version that has been updated in #27502 is not the same as the version you have on your computer ? upstream should update python_igraph to be at least Python3 compatible. It currently contains deprecated stuff, so don't pass python3 tests :(
comment:35 Changed 3 years ago by
 Commit changed from b4e6b8c4937cba885bfdf81c8f84d67a0de2be24 to 78b1f8b5e409ae8115667cccb34f3a5bb6fd78c4
comment:36 Changed 3 years ago by
 Keywords pagerank added
 Reviewers set to David Coudert
 Status changed from new to needs_review
comment:37 followup: ↓ 39 Changed 3 years ago by
I have used scipy in case of weighted graphs and igraph in case of unweighted as default case as per the results I got mentioned in the previous comments.
comment:38 Changed 3 years ago by
 Commit changed from 78b1f8b5e409ae8115667cccb34f3a5bb6fd78c4 to 112cac944df1d70ab96cebb323add4bcdf4c539f
Branch pushed to git repo; I updated commit sha1. New commits:
112cac9  correct igraph example

comment:39 in reply to: ↑ 37 Changed 3 years ago by
Replying to ghrajat1433:
I have used scipy in case of weighted graphs and igraph in case of unweighted as default case as per the results I got mentioned in the previous comments.
 Unfortunately, igraph is an optional package, so you can use it only if installed. See e.g.
clique_maximum
to know how to check if a package is installed.
 Also, I suggest you start checking if default should be used, and then select most suitable default algorithm.
 instead of
implementation
, you should usealgorithm
.
 You should raise an error if the given name is unknown
 the name of igraph is
igraph
, so don't useIgraph
. In fact, you could doalgorithm = algorithm.lower()
to avoid errors.
if self.order() == 0:
>if not self.order():
 this is not needed
import igraph
comment:40 Changed 3 years ago by
 Commit changed from 112cac944df1d70ab96cebb323add4bcdf4c539f to e7ae5da2138792384eabeae9217b115106c372ff
Branch pushed to git repo; I updated commit sha1. New commits:
e7ae5da  improved the code

comment:41 Changed 3 years ago by
As per commment 9, 12 and 16 I think it will be a good idea to have threshold value of number of nodes in the graph as less than 60 nodes numpy implementation should be used as default and greater than that scipy implementation should be used. If you have another idea on thresholding based on density or any other parameter I would like to know.
comment:42 Changed 3 years ago by
why 60? I though scipy was fast even for small graphs...
comment:43 Changed 3 years ago by
Due to the following results I suggested 60. For small graphs time taken by numpy is lesser than that of scipy.
sage: G=nx.gnp_random_graph(40,0.01,directed=True) sage: %timeit nx.pagerank_numpy(G) 1000 loops, best of 3: 883 µs per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.7 ms per loop sage: %timeit nx.pagerank(G) 1000 loops, best of 3: 1.7 ms per loop sage: G=nx.gnp_random_graph(60,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 2.87 ms per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.87 ms per loop sage: %timeit nx.pagerank_numpy(G) 1000 loops, best of 3: 1.58 ms per loop sage: G=nx.gnp_random_graph(70,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 3.66 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 2.22 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 1.99 ms per loop sage: G=nx.gnp_random_graph(80,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 4.66 ms per loop sage: %timeit nx.pagerank_scipy(G) 1000 loops, best of 3: 1.93 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 3.3 ms per loop sage: G=nx.gnp_random_graph(100,0.01,directed=True) sage: %timeit nx.pagerank(G) 100 loops, best of 3: 7.07 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 2.29 ms per loop sage: %timeit nx.pagerank_numpy(G) 100 loops, best of 3: 4.63 ms per loop sage: G=nx.gnp_random_graph(400,0.01,directed=True) sage: %timeit nx.pagerank_numpy(G) 10 loops, best of 3: 175 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 3.9 ms per loop sage: %timeit nx.pagerank(G) 10 loops, best of 3: 53.2 ms per loop sage: G=nx.gnp_random_graph(4000,0.01,directed=True) sage: %timeit nx.pagerank(G) 1 loop, best of 3: 1.81 s per loop sage: %timeit nx.pagerank_scipy(G) 1 loop, best of 3: 206 ms per loop sage: %timeit nx.pagerank_numpy(G) 1 loop, best of 3: 1min 1s per loop sage: G=nx.gnp_random_graph(40,0.5,directed=True) sage: **%timeit nx.pagerank_numpy(G) The slowest run took 86.64 times longer than the fastest. This could mean that an intermediate result is being cached. 1 loop, best of 3: 2.32 ms per loop** sage: %timeit nx.pagerank_scipy(G) The slowest run took 60.81 times longer than the fastest. This could mean that an intermediate result is being cached. 100 loops, best of 3: 2.41 ms per loop sage: %timeit nx.pagerank(G) 100 loops, best of 3: 12.5 ms per loop sage: G=nx.gnp_random_graph(60,0.5,directed=True) sage: %timeit nx.pagerank(G) 10 loops, best of 3: 27 ms per loop sage: %timeit nx.pagerank_scipy(G) 100 loops, best of 3: 3.47 ms per loop sage: **%timeit nx.pagerank_numpy(G) 100 loops, best of 3: 3.13 ms per loop** sage: G = graphs.RandomGNP(40, 0.2) sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1000 loops, best of 3: 1.06 ms per loop sage: %timeit nx.pagerank_scipy(n) The slowest run took 37.63 times longer than the fastest. This could mean that an intermediate result is being cached. 100 loops, best of 3: 2.31 ms per loop sage: %timeit nx.pagerank(n) 100 loops, best of 3: 11.3 ms per loop sage: i = G.igraph_graph() sage: %timeit i.pagerank() 10000 loops, best of 3: **23.3 µs** per loop sage: G = graphs.RandomGNP(40, 0.8) sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1000 loops, best of 3: 1.33 ms per loop sage: %timeit nx.pagerank_scipy(n) 100 loops, best of 3: 2.16 ms per loop sage: %timeit nx.pagerank(n) 10 loops, best of 3: 20 ms per loop sage: i = G.igraph_graph() sage: %timeit i.pagerank() 10000 loops, best of 3: **39.1 µs** per loop sage: G = graphs.RandomGNP(4000, 0.5) sage: n = G.networkx_graph() sage: i = G.igraph_graph() sage: %timeit i.pagerank() 1 loop, best of 3: **1.03 s** per loop sage: n = G.networkx_graph() sage: %timeit nx.pagerank_numpy(n) 1 loop, best of 3: 1min 9s per loop sage: %timeit nx.pagerank_scipy(n) 1 loop, best of 3: 7.12 s per loop
comment:44 Changed 3 years ago by
OK.
comment:45 Changed 3 years ago by
 Commit changed from e7ae5da2138792384eabeae9217b115106c372ff to 7364a6b00f70ae05ccc68c602e7797536c4d8a66
comment:46 Changed 3 years ago by
I have added the default values and completed the code.
comment:47 Changed 3 years ago by
 We usually use
vertex/vertices
and notnode/nodes
.

 Return the PageRank of the nodes in the graph. + Return the PageRank of the vertices of ``self``.
 This text might be more informative.
 PageRank calculates the ranking of nodes in the graph G based on the  structure of the incoming links. It is popularly used to rank web  pages. + PageRank is a centrality measure first used to rank web pages. + The PageRank algorithm outputs the probability distribution that + a random walker in the graph visits a vertex.
 for the default case, the simpler version is:
+ if algorithm: + algorithm = algorithm.lower() + elif self.order <= 60: + algorithm = 'numpy' + else: + algorithm = 'scipy'
 If the graph is not connected, you can certainly compute pagerank on each connected component. Before doing it, check how to do the normalization afterward. Indeed:
sage: import networkx sage: G = graphs.CycleGraph(4) sage: N = G.networkx_graph() sage: print(networkx.pagerank(N)) {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25} sage: N = (G+G).networkx_graph() sage: print(networkx.pagerank(N)) {0: 0.125, 1: 0.125, 2: 0.125, 3: 0.125, 4: 0.125, 5: 0.125, 6: 0.125, 7: 0.125} sage: H = graphs.CycleGraph(10) sage: N = H.networkx_graph() sage: print(networkx.pagerank(N)) {0: 0.1, 1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.1, 7: 0.1, 8: 0.1, 9: 0.1} sage: N = (G+H).networkx_graph() sage: print(networkx.pagerank(N)) {0: 0.07142857142857142, 1: 0.07142857142857142, 2: 0.07142857142857142, 3: 0.07142857142857142, 4: 0.07142857142857142, 5: 0.07142857142857142, 6: 0.07142857142857142, 7: 0.07142857142857142, 8: 0.07142857142857142, 9: 0.07142857142857142, 10: 0.07142857142857142, 11: 0.07142857142857142, 12: 0.07142857142857142, 13: 0.07142857142857142}
comment:48 Changed 3 years ago by
 Commit changed from 7364a6b00f70ae05ccc68c602e7797536c4d8a66 to 552a429952c404796d612ae9daeca9cc53df60f2
Branch pushed to git repo; I updated commit sha1. New commits:
552a429  improved code

comment:49 Changed 3 years ago by
optimizing the code for computing pagerank for connected components subgraph will work for undirected graphs as we can easily find the normalization factor but for digraphs the nodes can be dangling also in such a case we randomly connect that node to other nodes in our graph with some weight value of the edge. In such cases this optimization may not be desired.
comment:50 Changed 3 years ago by
it can also be let for a future ticket
comment:51 Changed 3 years ago by
For undirected graphs, I can do the optimization in this ticket only. :)
comment:52 Changed 3 years ago by
Even for undirected graphs with personalization parameter set the optimization may not work as can be seen from example below:
sage: g = Graph([(1,2,3),(2,3,5),(3,5,8),(2,5,13),(2,4,25)]) sage: personalization={} sage: personalization[1] = 8 sage: personalization[2] = 16 sage: personalization[3] = 24 sage: personalization[4] = 36 sage: personalization[5] = 25 sage: g.pagerank(by_weight=True, personalization=personalization) sage: personalization2={} sage: personalization2[1] = 84 sage: personalization2[2] = 42 sage: personalization2[3] = 30 sage: personalization2[4] = 70 sage: f = Graph([(1,2,2),(2,3,15),(2,4,22)]) sage: f.pagerank(by_weight=True,personalization=personalization2) {1: 0.07643670757326188, 2: 0.474528450109004, 3: 0.17504521830388875, 4: 0.27398962401384513} sage: q = g + f sage: personalization[0] = 8 sage: personalization[1] = 16 sage: personalization[2] = 24 sage: personalization[3] = 36 sage: personalization[4] = 25 sage: personalization[5] = 84 sage: personalization[6] = 42 sage: personalization[7] = 30 sage: personalization[8] = 70 sage: q.pagerank(by_weight=True,personalization=personalization) {0: 0.010763853970772248, 1: 0.12955307566026533, 2: 0.04384414464203576, 3: 0.07596743980618663, 4: 0.0652489170000004, 5: 0.05156608491296348, 6: 0.3201286892373546, 7: 0.11808892042931564, 8: 0.18483887434110574}
So I guess its best to keep the method as it is for now.
comment:53 Changed 3 years ago by
 no capital letter after
;
. So  ``alpha``  float (default: ``0.85``); Damping parameter for +  ``alpha``  float (default: ``0.85``); damping parameter for
 you should give slightly more details on what the parameters are. For instance, the PageRank? value for vertices without inneighbors is
1  damping
. Furthermore, thedamping
factor (this term is better than parameter here) is the probability of resetting the random walk to a uniform distribution in each step.
 As reported by the patchbot. Please check it
 .. SEEALSO: + .. SEEALSO::
 ...
 elif self.order <= 60: + elif self.order() <= 60:
 simplify code and avoid using
.vertices()
when not necessary (it sorts vertices) if by_weight:  I = self.igraph_graph(edge_attrs={'weight': [weight_function(e)  for e in self.edge_iterator()]})  page_rank = I.pagerank(damping=alpha, weights='weight')  return {v: page_rank[i] for i, v in enumerate(self.vertices())}  else:  I = self.igraph_graph()  page_rank = I.pagerank(damping=alpha)  return {v: page_rank[i] for i, v in enumerate(self.vertices())} + if by_weight: + I = self.igraph_graph(edge_attrs={'weight': [weight_function(e) + for e in self.edge_iterator()]}) + else: + I = self.igraph_graph() + page_rank = I.pagerank(damping=alpha, weights=weight) + return {v: page_rank[i] for i, v in enumerate(self)}
 Don't start error message with a capital letter
raise NotImplementedError("Only 'NetworkX', 'Numpy', 'Scipy', and 'igraph' are supported")
comment:54 Changed 3 years ago by
 Commit changed from 552a429952c404796d612ae9daeca9cc53df60f2 to c32242e8e8fa04a0fef0518853fb23e5b6accc26
comment:55 Changed 3 years ago by
I have added more info on damping parameter alpha and improved the code as per review.
comment:56 Changed 3 years ago by
on my system all tests pass! but why isn't patchbot reflecting the same here. I was just wondering about how much time it takes for patchbot to get invoked here?
$ ./sage t src/sage/graphs/geneic_graph.py Running doctests with ID 2019040720115005efbf4e. Git branch: u/ghrajat1433/27480_page_rank_algo Using optional=dochtml,igraph,memlimit,mpir,python2,python_igraph,sage Doctesting 1 file. sage t warnlong 24.3 src/sage/graphs/generic_graph.py [3391 tests, 26.98 s]  All tests passed!  Total time for all tests: 28.0 seconds cpu time: 19.8 seconds cumulative wall time: 27.0 seconds
comment:57 Changed 3 years ago by
 please don't forget my previous #comment:47
 :meth:`~GenericGraph.pagerank`  Return the PageRank of the nodes in the graph. + :meth:`~GenericGraph.pagerank`  Return the PageRank of the vertices of ``self``.
 Return the PageRank of the vertices in the graph. + Return the PageRank of the vertices of ``self``.
The documentation requires further improvements. I know that the documentations of networkx/igraph are not very clear, but we can do better.
 I don't understand what the
personalization
parameter is used for. We can at least clarify the text and be closer to what is explained in networkx  ``personalization``  dict (default: ``None``); the "personalization  vector" consisting of a dictionary with a key for every graph node  and nonzero personalization value for each node.  By default, a uniform distribution is used. +  ``personalization``  dict (default: ``None``); a dictionary keyed by + vertices associating to each vertex a value. The personalization can be + specified for a subset of the vertices only, and the sum of the values + must be nonzero. + By default (``None``), a uniform distribution is used.
 if you understand what is
dangling
, please improve the description.
 You should indicate the parameters that cannot be used with an algorithm or that can only be used with an algorithm.
 may be it would be better to start the list of parameters with the algorithms ? Well, think about it.
 the note
Note that
'networkx'
does not support multigraphs.
could be added directly with the in the line
NetworkX

...
 you could add a
TESTS
block with errors, specific cases, etc. For instance withpersonalization={0: 1, 1: 1}
Also, note that using a graph like a 4cycle is interesting to illustrate the effect of the various parameters. Indeed, the pagerank of each vertex is 0.25.
Note that the example you have is also interesting as it shows that the result is slightly different from an algorithm to another.
comment:58 Changed 3 years ago by
and for the patchbot, I don't know when it is invoked. It depends on availability and so on the number of users who are kindly offering computing resource for that and spending time maintaining the system.
comment:59 Changed 3 years ago by
 Commit changed from c32242e8e8fa04a0fef0518853fb23e5b6accc26 to 03dd4451465f5c057ca24a19c2df5372c2874794
comment:60 Changed 3 years ago by
 Commit changed from 03dd4451465f5c057ca24a19c2df5372c2874794 to 06dbcc6483839e071908b6de17e76160b773f24e
Branch pushed to git repo; I updated commit sha1. New commits:
06dbcc6  revert some changes

comment:61 Changed 3 years ago by
I have rebased my branch on 8.8 beta 1 and did the changes as mentioned in comment 57.
comment:62 Changed 3 years ago by
 Commit changed from 06dbcc6483839e071908b6de17e76160b773f24e to 30d493def189d6d71796749ef9ed921c9e0f8a19
Branch pushed to git repo; I updated commit sha1. New commits:
30d493d  remved unnecessary spaces

comment:63 Changed 3 years ago by
Further improvements are certainly possible, but it's already good. Do at least this one (plurals)
 Parameter ``alpha``, ``by_weight`` and ``weight_function`` is common  for all algorithms but ``personalization`` and ``dangling``  parameters are used only in ``NetworkX``, ``Numpy`` and ``Scipy``  implementations. + Parameters ``alpha``, ``by_weight`` and ``weight_function`` are common + to all algorithms. Parameters ``personalization`` and ``dangling`` + are used only by algorithms ``NetworkX``, ``Numpy`` and ``Scipy``.
comment:64 Changed 3 years ago by
 Commit changed from 30d493def189d6d71796749ef9ed921c9e0f8a19 to 804fe603a582470952b017d444ee8bb02925482f
Branch pushed to git repo; I updated commit sha1. New commits:
804fe60  improved note

comment:65 Changed 3 years ago by
Done the changes!
comment:67 Changed 3 years ago by
 Branch changed from u/ghrajat1433/27480_page_rank_algo to 804fe603a582470952b017d444ee8bb02925482f
 Resolution set to fixed
 Status changed from positive_review to closed
comment:68 Changed 3 years ago by
 Branch changed from 804fe603a582470952b017d444ee8bb02925482f to u/ghrajat1433/27480_page_rank_algo
 Resolution fixed deleted
 Status changed from closed to new
********************************************************************** File "src/sage/graphs/generic_graph.py", line 9536, in sage.graphs.generic_graph.GenericGraph.? Failed example: G.pagerank(alpha=0.50, algorithm="igraph") Exception raised: Traceback (most recent call last): File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 671, in _run self.compile_and_execute(example, compiler, test.globs) File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 1095, in compile_and_execute exec(compiled, globs) File "<doctest sage.graphs.generic_graph.GenericGraph.?[2]>", line 1, in <module> G.pagerank(alpha=RealNumber('0.50'), algorithm="igraph") File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.py", line 9668, in pagerank raise PackageNotFoundError("igraph") PackageNotFoundError: the package 'igraph' was not found. You can install it by running 'sage i igraph' in a shell ********************************************************************** File "src/sage/graphs/generic_graph.py", line 9581, in sage.graphs.generic_graph.GenericGraph.? Failed example: G.pagerank(algorithm="igraph") Exception raised: Traceback (most recent call last): File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 671, in _run self.compile_and_execute(example, compiler, test.globs) File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 1095, in compile_and_execute exec(compiled, globs) File "<doctest sage.graphs.generic_graph.GenericGraph.?[10]>", line 1, in <module> G.pagerank(algorithm="igraph") File "/var/lib/buildbot/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.py", line 9668, in pagerank raise PackageNotFoundError("igraph") PackageNotFoundError: the package 'igraph' was not found. You can install it by running 'sage i igraph' in a shell ********************************************************************** 1 item had failures: 2 of 890 in sage.graphs.generic_graph.GenericGraph.? [3393 tests, 2 failures, 36.96 s] **********************************************************************
comment:69 Changed 3 years ago by
@Rajat: add to each doctest with igraph # optional  python_igraph
, like this
 sage: G.pagerank(alpha=0.50, algorithm="igraph") + sage: G.pagerank(alpha=0.50, algorithm="igraph") # optional  python_igraph {0: 0.25, 1: 0.25, 2: 0.24999999999999997, 3: 0.24999999999999997}
and
 sage: G.pagerank(algorithm="igraph") + sage: G.pagerank(algorithm="igraph") # optional  python_igraph
@Volker: sorry for this.
comment:70 Changed 3 years ago by
 Commit changed from 804fe603a582470952b017d444ee8bb02925482f to f10629a7b9f0efb844de21eaddaf01f8492e59d2
Branch pushed to git repo; I updated commit sha1. New commits:
f10629a  added optional igraph parameter

comment:71 Changed 3 years ago by
Added the optional igraph thing. Understood the importance of it as it may fail tests on systems on which igraph is not installed..
comment:72 Changed 3 years ago by
 Status changed from new to needs_review
comment:73 Changed 3 years ago by
$ ./sage t src/sage/graphs/generic_graph.py Running doctests with ID 201904140952005ed75243. Git branch: u/ghrajat1433/27480_page_rank_algo Using optional=dochtml,igraph,memlimit,mpir,python2,python_igraph,sage Doctesting 1 file. sage t warnlong 24.3 src/sage/graphs/generic_graph.py [3402 tests, 27.42 s]  All tests passed!  Total time for all tests: 28.3 seconds cpu time: 19.0 seconds cumulative wall time: 27.4 seconds
On my system all tests passes but patchbot has some failed examples don't know why...
comment:74 Changed 3 years ago by
This is numerical noise due to floating point arithmetic. See http://doc.sagemath.org/html/en/developer/coding_basics.html#specialmarkuptoinfluencedoctests
A solution is to add # abs tol 1e9
to all doctests.
 sage: G.pagerank(algorithm="Numpy") + sage: G.pagerank(algorithm="Numpy") # abs tol 1e9
...
 sage: G.pagerank() + sage: G.pagerank() # abs tol 1e9
also for igraph
 sage: G.pagerank(alpha=0.50, algorithm="igraph") # optional  python_igraph + sage: G.pagerank(alpha=0.50, algorithm="igraph") # optional  python_igraph # abs tol 1e9
comment:75 Changed 3 years ago by
 Commit changed from f10629a7b9f0efb844de21eaddaf01f8492e59d2 to 07a884b8ce9f1557b9ac1d2c98fa72aa0eea1e1e
Branch pushed to git repo; I updated commit sha1. New commits:
07a884b  floating point flag included

comment:77 Changed 3 years ago by
 Branch changed from u/ghrajat1433/27480_page_rank_algo to 07a884b8ce9f1557b9ac1d2c98fa72aa0eea1e1e
 Resolution set to fixed
 Status changed from positive_review to closed
comment:78 followup: ↓ 79 Changed 2 years ago by
 Commit 07a884b8ce9f1557b9ac1d2c98fa72aa0eea1e1e deleted
There is an issue with igraph on the arando patchbot with 8.8.b4 and 8.8.b5:
********************************************************************** File "src/sage/graphs/generic_graph.py", line 9663, in sage.graphs.generic_graph.GenericGraph.? Failed example: G.pagerank(alpha=0.50, algorithm="igraph") # optional  python_igraph Expected: {0: 0.25, 1: 0.25, 2: 0.24999999999999997, 3: 0.24999999999999997} Got: {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25} ********************************************************************** 1 item had failures: 1 of 860 in sage.graphs.generic_graph.GenericGraph.? [3497 tests, 1 failure, 50.96 s]  sage t long src/sage/graphs/generic_graph.py # 1 doctest failed
comment:79 in reply to: ↑ 78 Changed 2 years ago by
I think I forgot to add # abs tol 1e9 to this test.
Thanks David for opening #27811 for fixing it
Replying to chapoton:
There is an issue with igraph on the arando patchbot with 8.8.b4 and 8.8.b5:
********************************************************************** File "src/sage/graphs/generic_graph.py", line 9663, in sage.graphs.generic_graph.GenericGraph.? Failed example: G.pagerank(alpha=0.50, algorithm="igraph") # optional  python_igraph Expected: {0: 0.25, 1: 0.25, 2: 0.24999999999999997, 3: 0.24999999999999997} Got: {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25} ********************************************************************** 1 item had failures: 1 of 860 in sage.graphs.generic_graph.GenericGraph.? [3497 tests, 1 failure, 50.96 s]  sage t long src/sage/graphs/generic_graph.py # 1 doctest failed
I have a brief idea about it(used it in my college project) and can implement it in Sage modules.