Hi all,

I would like to use the weighted version of PageRank for a directed graph in R. I’ve read the user manual and, if I’m understanding this correctly, it uses the weight w_{ij} of the edge going from i to j and then normalises by i's out-degree k_i^{\text{out}}. Is this the correct formula? Is there a way to change the normalisation?

PageRank for a directed weighted graph is given by (if my understanding is correct):

\mathbf{x} = [ \mathbf{I} - \text{d} \mathbf{WD} ^{-1}]^{-1} \mathbf{1}\;,

where \mathbf{x} is the vector of PageRanks, \mathbf{I} is an identity matrix, d is the dampening factor, \mathbf{W} is the weighted adjacency matrix with W_{ij} = w_{ij} indicating the existence of a link from i to j of weight w_{ij} and \mathbf{D} is a diagonal matrix with elements D_{ii} = \text{max}(k_i^{out}, 1), where k_i^{out} = \sum_j A_{ij} is the out-degree of node i, and \mathbf{1} is a vector of ones.

Ultimately, what I’m after is to be able to change \mathbf{D}, is this possible? If so, how?

Many thanks!

Andrea

See the C documentation for an intuitive description of PageRank through random walks:

https://igraph.org/c/html/latest/igraph-Structural.html#igraph_pagerank

The “normalization” you refer to is not arbitrary. The random walker follows out-edges with probabilities proportional to their weights. Thus the weight matrix must be divided not by degrees, but by strengths (i.e. sums of outgoing weights), in order to make it stochastic.

Let W_{ij} be the weight of the edge i \rightarrow j. Then we can express PageRank p through the following self-consistency equation:

d\; \underbrace{\mathbf{S}^{-1} \mathbf{W}^T p}_\text{random walk} + (1-d)\; \underbrace{\mathbf{1}/n}_\text{teleportation} = p

where \mathbf{S} is the diagonal matrix of out-strengths s_i = \sum_j W_{ij} and n is the number of vertices.

If you are using igraph from R, I strongly recommend that you use the development version (due to become 1.3) as it is based on C/igraph 0.9 where PageRank calculations have received many fixes. See the changelog for 0.9.0 and 0.9.1 for details: Releases · igraph/igraph · GitHub In short, if you work with simple graphs and don’t use a personalization vector, then everything should be with with the PRPACK method.

@andreabacilieri I’m curious why you would want to change the normalization. As @szhorvat explains, it is well justified and can be understood in terms of a random walker.

Here is a simpler analogy that I used when first teaching centrality metrics to my students: The eigenvector centrality formula is like saying “I’ve got $100, so I’m going to give $100 to each of my 5 friends.” This results in creating more centrality than merited in (for example) isolated complete subgraphs as the process repeats. PageRank’s normalization, in this analogy, recognizes that if you have $100 you can only give $20 to each friend, so centrality is conserved.

1 Like

Thank you! It was wired indeed that it would still use the out-degree as normalisation.

Because in economics PageRank is similar to 3 widely used metrics: output multipliers, input multipliers and influence vector. However, d = 1, no teleportation and normalised differently so that, for instance, the input multipliers use \sum_j Z_{ij} < 1 \forall i, where Z_{ij} = W_{ij}/O_i, O_i being the out-strength of i plus final demand. I have a huge network, so I was hoping to be able to exploit the code of iGraph, which is very fast, instead of having to take some time to code this.

I am not sure what you mean by this. In weighted graphs, igraph always uses the out-strength, never simply out-degrees.