Closeness centrality between i and j when j is an isolated vertex

Imagine that I want to calculate closeness centrality for vertex i, and several vertices are connected to i, but there is also an isolated vertex j in the graph. Then, according to the documentation, when summing the distances, the total number of vertices is substituted for the distance to j. I would expect zero to be used instead, because j should be excluded from the calculation.
Does anybody know why the total number of vertices is used instead of zero in this case?
What can I do if I want to use zero in the formula? Maybe there is a workaround?
Thanks for any help!
igraph R manual pages

The justification for using the number of vertices was that this was greater than the largest possible path length in the graph. However, in my opinion this was an ad-hoc choice.

The behaviour has already been changed in C/igraph 0.9 and will eventually propagate to the R interface [Update: The new behaviour is used in R/igraph 1.3 and later.]. See the C/igraph documentation for how this will work in the future: igraph Reference Manual

In my opinion, the real answer to your issue is that closeness centrality makes no sense for disconnected graphs. You may come up with some kind of generalization to the disconnected case, but you will need to consider very carefully how that alternative “closeness” measure can be interpreted, and whether it even fits your use case.

Using a zero distance is not the same as excluding j: adding a zero to a list of numbers does change the mean of the list.

To exclude j, you can remove it from the graph.
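A quick numeric illustration of the two points above, using made-up distances (the exact values are arbitrary):

```python
# Hypothetical distances from vertex i to the vertices it can reach.
distances_to_others = [1, 2, 2, 3]

# Excluding the isolated vertex j entirely:
mean_excluding_j = sum(distances_to_others) / len(distances_to_others)

# "Using zero" for j instead: the zero still counts in the denominator,
# so the mean distance (and hence the closeness score) changes.
with_zero = distances_to_others + [0]
mean_with_zero = sum(with_zero) / len(with_zero)

print(mean_excluding_j)  # 2.0
print(mean_with_zero)    # 1.6
```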

Hi! I am trying to estimate closeness centrality in a directed graph. I have problems with the new version of the package:

  1. closeness(graphs, mode = "out", cutoff = -1) ## with igraph 1.3.2

  2. closeness(graphs, mode = "out") ## with igraph 1.2.11
    In this version, cutoff is not available as an argument…

The main difference is that in the first case I have some nodes with value 1…

I think it is the cutoff… I tried other values (positive, negative, and 0), but the problem continues.

This is the output with 1.3.2:

[output not included]

and with the previous version:

[output not included]

It is not the cutoff, but more likely the handling of disconnected components. See the above discussion for what has changed. Check if the graph is connected (or strongly connected, in the directed case).
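For illustration, here is a language-neutral sketch of such a connectivity check (plain Python with two BFS passes; in practice you would use igraph's built-in connectivity test rather than writing this by hand):

```python
from collections import deque

def reachable(adj, src):
    """Set of vertices reachable from src by a BFS over directed edges."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def strongly_connected(adj):
    """True iff every vertex can reach every other vertex."""
    verts = set(adj)
    if not verts:
        return True
    root = next(iter(verts))
    if reachable(adj, root) != verts:
        return False
    # Reverse every edge; if root reaches everyone in the reversed graph,
    # then everyone reaches root in the original graph.
    radj = {v: [] for v in adj}
    for u, ws in adj.items():
        for w in ws:
            radj[w].append(u)
    return reachable(radj, root) == verts

cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}   # strongly connected
chain = {"a": ["b"], "b": ["c"], "c": []}      # only weakly connected
print(strongly_connected(cycle))  # True
print(strongly_connected(chain))  # False
```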

In short: if your graph is not connected, then closeness is not well-defined. Whatever software you use to compute closeness, check carefully what it does in the disconnected case, and think about whether that makes sense for your use case. Usually, it won’t.

Note that the closeness() function does not estimate closeness centrality. It computes the exact value.

In the unweighted case, this would happen if you have a component containing only two connected vertices.

Thanks for your response!!

The graph that I used is a dendritic network, and it is connected in one main direction.
It is weakly connected… so how should I use the metric to get closeness values for this kind of network?

Thanks!
Ana

As I understand, you are asking how to compute closeness in a directed tree.

I can’t really answer this without context. The real question is not “how to compute closeness”, but “how to interpret the measure you computed”. Or: “what measure is meaningful in a given situation?” igraph will compute the closeness of a vertex as the inverse of the mean distance to all other reachable vertices. It is up to you to decide whether this is useful/meaningful for your specific application.

ok, thanks for your response.

My question was because, in the previous version of the metric, even unreachable vertices were included in the calculation. So in a directed graph, when I computed out-closeness, even the most extreme vertices (i.e. in-degree = 0 and out-degree = 1) had an out-closeness value.
In the new version of the metric, those vertices are not counted, since they are not reachable.

Am I correct?…
thanks
Ana

In old versions the distance to unreachable vertices was considered to be numerically equal to the number of vertices, with the justification that this is larger than any possible distance in unweighted graphs. This is an ad-hoc choice that is in my opinion very dubious even for unweighted graphs, and unjustified in weighted ones.

Thanks!
It would be great if both versions of the metric were available in the new release.

Thanks again
Ana

I do not think that the old version is justified, or that it’s even fair to call it closeness. I would not support its re-inclusion because it will confuse people who are inexperienced in working with networks, as well as people who do not read the documentation and thus do not realize that what is computed is not standard closeness (which is likely the majority …). Can you explain why you think that the old version is meaningful?

However, you can certainly compute it yourself starting with the distance matrix (distances() function). Instead of excluding infinities when computing averages, replace them by the number of vertices.
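As an illustration of that workaround, here is a stdlib-Python sketch of the same arithmetic on a toy disconnected graph (in R you would take the matrix returned by distances() instead of running BFS by hand; the graph and vertex names are made up):

```python
from collections import deque

def bfs_distances(adj, src):
    """Unweighted shortest-path distances from src; None = unreachable."""
    dist = {v: None for v in adj}
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] is None:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Toy undirected graph: path a - b - c, plus an isolated vertex d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
n = len(adj)

def closeness(src, replace_unreachable=None):
    """Closeness as the inverse mean distance to the other vertices.

    replace_unreachable=None -> current behaviour: drop unreachable vertices.
    replace_unreachable=n    -> old behaviour: substitute the vertex count.
    """
    d = bfs_distances(adj, src)
    vals = [d[v] for v in adj if v != src]
    if replace_unreachable is None:
        vals = [x for x in vals if x is not None]
    else:
        vals = [replace_unreachable if x is None else x for x in vals]
    return len(vals) / sum(vals) if vals and sum(vals) else float("nan")

print(closeness("a"))                          # 2 / (1 + 2) ≈ 0.667
print(closeness("a", replace_unreachable=n))   # 3 / (1 + 2 + 4) ≈ 0.429
```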


The classic (standard) definition of closeness simply does not apply to disconnected graphs. Probably the best thing igraph could do here is to issue a warning whenever the graph is disconnected. One can certainly generalize the concept in various ways, but if doing so, then the question is: is the usual interpretation of the centrality scores still valid? Are vertices with higher scores “more central” in some sense?

R/igraph 1.2 implemented a generalization which I find unusual, which does not apply to weighted graphs at all, and which was not used anywhere else as far as I’m aware. R/igraph 1.3 uses a different generalization: In undirected graphs, it is equivalent to computing closeness separately for each connected component. But then the scores are not comparable between the different components: smaller components will have higher scores.
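A minimal worked example of why per-component scores are not comparable, using a hypothetical two-component graph:

```python
# Two components: a path a - b - c (3 vertices) and a single edge x - y.
# Per-component closeness = 1 / (mean distance to the other vertices of
# the same component).
closeness_x = 1 / 1.0              # x's only distance is d(x, y) = 1
closeness_a = 1 / ((1 + 2) / 2)    # a's distances are 1 (to b) and 2 (to c)

print(closeness_x)                 # 1.0
print(closeness_a)                 # ≈ 0.667
print(closeness_x > closeness_a)   # True: the smaller component scores higher
```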

networkx implements yet another generalization, due to Wasserman and Faust; see closeness_centrality — NetworkX 2.8.4 documentation. This has not caught on in the community either. They again compute the scores per component, then rescale with a certain function of the component size in an attempt to make them comparable. I do not find this well-justified either, as how the values scale with the component size depends on the specific graph structure.

Some people recommend using “harmonic centrality”, i.e. averaging not the shortest path lengths but their inverses. This way, the inverse distance to an unreachable vertex can be taken as 0.
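A small sketch of that convention, with a hypothetical distance list:

```python
import math

# Hypothetical distances from a vertex to the three other vertices;
# math.inf marks an unreachable vertex.
dist = [1.0, 2.0, math.inf]

# Harmonic centrality: mean of the *inverse* distances. 1 / inf == 0,
# so unreachable vertices contribute nothing and need no special-casing.
harmonic = sum(1.0 / d for d in dist) / len(dist)
print(harmonic)  # (1 + 0.5 + 0) / 3 = 0.5
```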


All that said, you are working with directed trees, which are a very special sort of graph. Instead of just trying to compute closeness, I would suggest thinking about what concept is suitable for measuring whatever you are trying to quantify in this network. I would think about this problem not as “how to compute closeness”, but “what should I measure in my network”.

Thanks for your comment and time!
I am going to think about it!

Hi!
To compare centrality metrics among graphs with different numbers of nodes, which normalization is better to use:

  1. node centrality / sum(centralities)
  2. node centrality / max(centralities)

in each graph?

Thanks!
Ana