Foster et al. proposed four types of directed assortativity in a PNAS paper (http://www.pnas.org/cgi/doi/10.1073/pnas.0912671107, see Figure 1). They are also summarized in Barabási’s discussion of knn, Box 7.3 of his text (Network Science by Albert-László Barabási; he cites Foster et al. in the printed text but not in the online book). (Both figures are appended.)
The types are in-in, in-out, out-in, and out-out, defining whether one is correlating the in-degree or out-degree of the source vertex with the in- or out-degree of the target vertex. In all cases, their figures and mathematics indicate that the target vertex is found by following edges out of the source in the direction of the edge.
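To make the four cases concrete, here is a minimal Python sketch (not igraph code) of how I read the definitions: a Pearson correlation taken over edges, pairing the chosen degree of each edge's source with the chosen degree of its target. The function names and the toy edge list are mine, purely for illustration, and the reading of the definition is my assumption from the figures.

```python
from math import sqrt

def degree_tables(edges):
    """Return (in_degree, out_degree) dicts for a directed edge list."""
    in_deg, out_deg = {}, {}
    for u, v in edges:
        for node in (u, v):
            in_deg.setdefault(node, 0)
            out_deg.setdefault(node, 0)
        out_deg[u] += 1
        in_deg[v] += 1
    return in_deg, out_deg

def pearson(xs, ys):
    """Plain Pearson correlation; NaN if either side has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)

def directed_assortativity(edges, source_mode, target_mode):
    """Correlate the source's `source_mode` degree with the target's
    `target_mode` degree, always following each edge source -> target."""
    in_deg, out_deg = degree_tables(edges)
    table = {"in": in_deg, "out": out_deg}
    xs = [table[source_mode][u] for u, v in edges]
    ys = [table[target_mode][v] for u, v in edges]
    return pearson(xs, ys)

# Toy graph (hypothetical data): 0->1, 0->2, 1->2, 3->0
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
for s in ("in", "out"):
    for t in ("in", "out"):
        print(f"{s}-{t}:", directed_assortativity(toy_edges, s, t))
```

Note that in this sketch the direction of traversal is fixed (source to target) and only the two degree choices vary, which is the separation I was hoping to find in the knn parameters.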
When I first saw the igraph knn documentation (igraph R manual pages), I thought that “mode” would specify whether the in-degree or out-degree of the source vertex is used, and “neighbor.degree.mode” would specify whether the in- or out-degree of the target vertex is used for the correlations, so it would be simple to replicate the 4 cases of the above publications. But it turns out this is not quite right, as “mode” also specifies how to reach neighbors.
For mode: “the type of neighbors to consider in directed graphs. out considers out-neighbors, in considers in-neighbors”: I read this to mean “mode” is both how to find the neighbors AND which degree of the source vertex will be used for the k of knn(k). We can’t separate the choice of how to find neighbors and what is used for ego node k.
For neighbor.degree.mode: “The type of degree to average in directed graphs. out averages out-degrees, in averages in-degrees and all ignores edge directions for the degree calculation.”: This is clearly which degrees will be used to compute the knn(k) once the neighbors are reached: no problem here.
Assuming these readings are correct, I’m trying to figure out how to specify the 4 assortativities of Foster et al. I’ve got 3 of them but not the 4th, which is blocked by the conflation of how neighbors are found with which degree is used for the source node. The cases:
Pattern “out-in” is of interest because it is the situation specified by Newman’s Mixing Patterns in Networks, Physical Review E 2003, formula 25, and by the documentation of igraph::assortativity. We want to see whether the in-degree of vertices reached by following out-edges is correlated with the originating vertices’ out-degree. So we give knn mode=“out” to both reach the desired neighbors and use out-degree for “k” in knn(k), and neighbor.degree.mode=“in” to specify that the in-degree of neighbors is to be averaged for knn. No problem here.
Pattern “out-out” is also straightforward: mode=“out” both finds the out-neighbors and uses out-degree for k, while neighbor.degree.mode=“out” averages the neighbors’ out-degree.
The patterns starting with “in” puzzled me because the figures in the above-mentioned papers imply one is finding neighbors by following links in the out direction in all 4 cases. But I made some progress when I realized that the “in-in” situation is symmetric: mode=“in” will take us from the right hand node to the left hand node in the figures, going backwards over the link (as well as specifying the correct k), and neighbor.degree.mode=“in” computes the average in-degree of the left hand nodes so reached.
The one I am stuck on is “in-out”: Their figures show that we are reaching neighbors via out-edges, but we want to use the in-degree of the ego nodes we started at for the “k” of knn(k). Yet a single parameter “mode” specifies both how to reach neighbors and what k is, and it can’t have two values. If I use mode=“out” to get to the neighbors in the direction of the links, I am using out-degree rather than in-degree for k. If instead I use the reversed strategy and set mode=“in” to take us from the right hand node to the left hand node, that implies we are using in-degree on the right hand side, where we want out-degree as shown in the figure. Specifying (in, out) creates a situation not actually shown in their figures.
In summary, there is a conflict between how we reach neighbors and which degree to use for the “k” of knn(k). There does not seem to be a way to get neighbors via out-going edges but then correlate ego node in-degree with these neighbors.
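For reference, this is the computation I would like to be able to express, written as a minimal Python sketch rather than igraph code. The function `knn_by_k` and its three parameters (`follow`, `k_mode`, `neighbor_mode`) are hypothetical names of my own, not igraph arguments; the point is only that the direction used to reach neighbors is a separate choice from the degree used for k. Skipping vertices that have no neighbors in the followed direction is my assumption.

```python
from collections import defaultdict

def knn_by_k(edges, follow="out", k_mode="in", neighbor_mode="out"):
    """Average neighbor degree as a function of ego degree k.

    follow        : direction used to reach neighbors ("out" or "in")
    k_mode        : which degree of the ego vertex is used as k
    neighbor_mode : which degree of the neighbors is averaged
    """
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        out_deg[u] += 1
        in_deg[v] += 1
        succ[u].append(v)
        pred[v].append(u)

    degree = {"in": in_deg, "out": out_deg}
    neighbors = succ if follow == "out" else pred

    per_k = defaultdict(list)
    for node in nodes:
        nbrs = neighbors.get(node, [])
        if not nbrs:
            continue  # assumption: vertices with no such neighbors are skipped
        mean_nbr = sum(degree[neighbor_mode].get(w, 0) for w in nbrs) / len(nbrs)
        per_k[degree[k_mode].get(node, 0)].append(mean_nbr)

    # Average the per-vertex values within each k
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_k.items())}

# Toy graph (hypothetical data): 0->1, 0->2, 1->2, 3->0
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
# The "in-out" case: follow out-edges, k = ego in-degree, average neighbor out-degree
print(knn_by_k(toy_edges, follow="out", k_mode="in", neighbor_mode="out"))
```

With `follow="out"` and `k_mode="out"` this reduces to the out-in and out-out cases described above; the in-out case is the one that needs `follow` and `k_mode` to differ.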
My question is: Am I correct in this analysis, or am I missing something (after hours of puzzling over it)? Is there a way to get the “(in, out)” situation of Foster et al. with igraph::knn?
Thanks for your consideration,
Dan Suthers