I’m looking into implementing the rich-club coefficient, and have a few questions about how to build flexibility into this implementation:
Scope: seems like NetworkX’s version does not support weighted graphs, directed graphs, and graphs with parallel edges or self loops; so the goal is to implement the rich-club coefficient \phi(k)=\frac{2E_{>k}}{N_{>k}(N_{>k}-1)}
with those options as user-selectable parameters?
Weighted version: my interpretation is that the weighted extension simply replaces E_{>k} with the total sum of edge weights among those nodes?
Colizza et al. also mentioned \phi_{\text{unc}}(k)\underset{k, k_{\text{max}}\rightarrow\infty}{\sim}\frac{k^2}{\langle k \rangle N}
for random uncorrelated networks, which I didn’t entirely understand. Is this related to null models (so not relevant to the unnormalized version for this implementation)?
It looks like you already read up on this topic in detail, which is great.
Note that while the implementation will be simple, this feature does require some thought, and perhaps some literature search about what variants of the rich-club coefficient are in common use.
More or less, yes.
There could be a Boolean directed parameter. Setting this to false will simply ignore edge directions in directed graphs. Generally, the density calculation should be done appropriately based on whether the graph is directed or undirected. The density is the fraction of vertex pairs that are actually connected. In an undirected simple graph, there are n(n-1)/2 pairs. In a directed graph, we must consider ordered pairs, of which there are n(n-1).
If self-loops are allowed in principle, then there are n(n+1)/2 and n^2 unordered and ordered pairs, respectively.
Multi-edges: the classic density is not meaningful for multi-edges, but I suggest to use the same formula nevertheless, i.e. number of edges divided by number of pairs. This makes “densities” larger than 1 possible in multigraphs.
Weighted graphs: for this implementation, let’s treat weights the same as edge multiplicities. Again, use the same formula, but instead of the number of edges, we take the total edge weight.
Yes, let us use this. But note that the most common weighted rich club concept found in the literature is something entirely different. You can see it here: https://doi.org/10.1103/PhysRevLett.101.168702 Let us NOT implement this different weighted concept for now. That can be done later, separately.
Yes, this is the theoretical rich-club coefficient for a degree-constraint null model, although I believe this formula is accurate only for multigraphs. It is approximately valid for sparse simple graphs in which no degree is very high.
Let’s skip the normalization for now. There are many ways to normalize \phi depending on the choice of null model, so let’s not include this in this function.
Thanks! I just submitted a draft PR with the very basic skeletons for the two functions in a new rich_club.c file, function declarations in igraph_structural.h and the .c file added to CMakeLists.txt, following the wiki. I’m thinking more about the specific parameters and implementation. Please let me know if what I have currently is structured correctly/a good starting point!
I also noticed that a bunch of the azure-pipelines tests are failing, along the lines of The process '/usr/local/bin/cmake' failed with exit code 1. I’m not entirely sure what’s causing this — how should I fix it?