Betweenness calculation with a directed and weighted graph

Hi all, I am new to network analysis.

I believe I understand that Kate is on the shortest path from Edwina to Anthony, as well as from Edwina to Benedicte, so I think I understand the value of 2.

However, as far as I can see, Anthony is on the shortest path between Benedicte and Edwina in both directions (2 paths), and on the shortest path between Kate and Benedicte in both directions (2 paths). So I would expect him to lie on four shortest paths, giving a betweenness of 4. Why does igraph report 5?

I have also tried other packages, and depending on the package I get 4 0 0 0, 6 0 0 0 or 2 0 0 0. In short, I am confused…

Can someone help me understand?

Thank you in advance for any assistance you can provide.

Nancy

Below is the script and the dataset.

# data --------------------------------------------------------------------
(sub=tibble::tribble(
      ~source,     ~target, ~weight,
    "Anthony", "Benedicte",       3,
    "Anthony",    "Edwina",       1,
    "Anthony",      "Kate",       2,
  "Benedicte",   "Anthony",       4,
     "Edwina",      "Kate",       2,
       "Kate",   "Anthony",       1,
       "Kate",    "Edwina",       4
  ))
#> # A tibble: 7 × 3
#>   source    target    weight
#>   <chr>     <chr>      <dbl>
#> 1 Anthony   Benedicte      3
#> 2 Anthony   Edwina         1
#> 3 Anthony   Kate           2
#> 4 Benedicte Anthony        4
#> 5 Edwina    Kate           2
#> 6 Kate      Anthony        1
#> 7 Kate      Edwina         4


(sub_with_numbers=tibble::tribble(
  ~source, ~target, ~weight,
        1,       2,       3,
        1,       3,       1,
        1,       4,       2,
        2,       1,       4,
        3,       4,       2,
        4,       1,       1,
        4,       3,       4
  ))
#> # A tibble: 7 × 3
#>   source target weight
#>    <dbl>  <dbl>  <dbl>
#> 1      1      2      3
#> 2      1      3      1
#> 3      1      4      2
#> 4      2      1      4
#> 5      3      4      2
#> 6      4      1      1
#> 7      4      3      4
# network --------------------------------------------------------------------
library(igraph)
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
# graph_from_data_frame() is the current name of graph.data.frame()
netsub <- graph_from_data_frame(sub, directed = TRUE)
E(netsub)$width <- sub$weight
plot(netsub)

# betweenness with igraph -------------------------------------------------
betweenness(netsub)
#>   Anthony Benedicte    Edwina      Kate 
#>         5         0         0         2

betweenness(netsub,directed=TRUE,weights=E(netsub)$weight)
#>   Anthony Benedicte    Edwina      Kate 
#>         5         0         0         2

Created on 2023-10-16 with reprex v2.0.2

You missed the shortest path from Kate to Edwina. That’s the fifth. The direct path K->E has total length 4, while the path K->A->E has total length 1+1=2, so it’s shorter and it passes through Anthony.
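If it helps, here is a minimal sketch for checking this yourself. It rebuilds the graph from the edge list in the question (the names `edges`, `g`, `d`, `sp` and `kate_to_edwina` are just mine) and asks igraph for the weighted distances and for the actual shortest path from Kate to Edwina.

```r
library(igraph)

# Edge list copied from the reprex in the question
edges <- data.frame(
  source = c("Anthony", "Anthony", "Anthony", "Benedicte", "Edwina", "Kate", "Kate"),
  target = c("Benedicte", "Edwina", "Kate", "Anthony", "Kate", "Anthony", "Edwina"),
  weight = c(3, 1, 2, 4, 2, 1, 4)
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Weighted distance matrix: rows are origins, columns are destinations
d <- distances(g, mode = "out", weights = E(g)$weight)
print(d)
d["Kate", "Edwina"]  # 2, not the 4 of the direct edge

# The vertices along the shortest path from Kate to Edwina
sp <- shortest_paths(g, from = "Kate", to = "Edwina",
                     mode = "out", weights = E(g)$weight)
kate_to_edwina <- as_ids(sp$vpath[[1]])
kate_to_edwina       # "Kate" "Anthony" "Edwina"
```

Every pair whose shortest path has an intermediate vertex contributes to that vertex's betweenness, so going through the distance matrix row by row is a reasonable way to recount the five paths by hand.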

The values you got from the other packages (4 0 0 0, 6 0 0 0, 2 0 0 0) are not correct for this network. Perhaps the graph was not constructed in the same way (with the same weights, same edge directions, etc.).

Thank you so much for your quick response.

In that case, maybe there’s something I didn’t understand about the social network. The values in the table are numbers of interactions. Put another way, a value of 4 between Kate and Edwina means that their relationship is stronger, so in this case the shortest route perhaps should not go through Anthony.

Perhaps in this case I need to use the inverse of the number of interactions to get a meaningful betweenness?

With the inverted weights I do understand the result: Anthony is on the shortest path from Benedicte to Edwina, from Benedicte to Kate, from Edwina to Benedicte and from Kate to Benedicte.

And Kate is on the shortest path from Edwina to Anthony and from Edwina to Benedicte and, because of the weighting, also from Benedicte to Edwina and from Anthony to Edwina (a path of 0.5 + 0.25 = 0.75 instead of 1).

Is this correct?

Thank you so much again for all your help.

This is a common problem with weights and path-based centralities (was just teaching about this). Often we use weight to indicate closeness, e.g., number of interactions as in your case. This works for degree-based centralities (strength, eigenvector, pagerank) but not for path-based centralities, which interpret weights as distances, so bigger numbers are further away, not closer.
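To make the "weights are distances" point concrete, here is a small sketch on the graph from the question (rebuilt from the posted edge list; the variable names are mine). As far as I know, `betweenness()` picks up an edge attribute called `weight` automatically when `weights` is left at its default, and `weights = NA` tells it to ignore the attribute.

```r
library(igraph)

# Edge list copied from the reprex in the question
edges <- data.frame(
  source = c("Anthony", "Anthony", "Anthony", "Benedicte", "Edwina", "Kate", "Kate"),
  target = c("Benedicte", "Edwina", "Kate", "Anthony", "Kate", "Anthony", "Edwina"),
  weight = c(3, 1, 2, 4, 2, 1, 4)
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Default: the 'weight' attribute is used and read as a distance
bw_weighted <- betweenness(g, directed = TRUE)

# weights = NA: every edge counts as one step
bw_unweighted <- betweenness(g, directed = TRUE, weights = NA)

rbind(weighted = bw_weighted, unweighted = bw_unweighted)
# weighted:   Anthony 5, Benedicte 0, Edwina 0, Kate 2
# unweighted: Anthony 4, Benedicte 0, Edwina 0, Kate 2
```

The unweighted result gives Anthony the 4 that was expected in the question. The fifth path only appears once the 4 on the Kate to Edwina edge is read as a long distance.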

Inverting the weights will indeed turn strong ties into short distances and weak ties into long ones, but it is a nonlinear transformation, so you need to decide whether that is OK. A linear alternative is to negate the weights and then add the max weight plus 1, so that the distances range from 1 upward (if you don’t add 1, the closest nodes will have a distance of “0”, and I’m not sure how the algorithms handle that).

Try plotting histograms of weights to see the difference. Something like:

# g is your igraph object (netsub in the reprex above)
# Existing weights
ew <- E(g)$weight
# "Reversed" weights (negated and then shifted to be positive)
rw <- (max(ew) + 1) - ew
# Inverted weights
iw <- 1/ew
hist(ew, breaks=100, main="Existing Weights")
hist(rw, breaks=100, main="Reversed Weights")
hist(iw, breaks=100, main="Inverted Weights")

You should have a theoretical or empirical reason for the choice between reversing/negating or inverting (or just computing unweighted). I’m not an expert on this and don’t know whether there is any literature on it. I have been wondering about this and hope others can comment.
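For the small graph in this thread, the two transformations can be compared directly. This is only a sketch on the posted edge list (variable names are mine), not a recommendation of one scheme over the other.

```r
library(igraph)

# Edge list copied from the reprex in the question
edges <- data.frame(
  source = c("Anthony", "Anthony", "Anthony", "Benedicte", "Edwina", "Kate", "Kate"),
  target = c("Benedicte", "Edwina", "Kate", "Anthony", "Kate", "Anthony", "Edwina"),
  weight = c(3, 1, 2, 4, 2, 1, 4)
)
g <- graph_from_data_frame(edges, directed = TRUE)

ew <- E(g)$weight          # interaction counts
iw <- 1 / ew               # inverted: nonlinear
rw <- (max(ew) + 1) - ew   # reversed: linear, smallest distance is 1

bw_inverted <- betweenness(g, directed = TRUE, weights = iw)
bw_reversed <- betweenness(g, directed = TRUE, weights = rw)

rbind(inverted = bw_inverted, reversed = bw_reversed)
# inverted: Anthony 4, Benedicte 0, Edwina 0, Kate 4
# reversed: Anthony 4, Benedicte 0, Edwina 0, Kate 3
```

The two schemes already disagree on Kate here. With reversed weights the paths Anthony->Edwina and Anthony->Kate->Edwina tie at length 4, so Kate only gets half credit for that pair (and likewise for Benedicte to Edwina), whereas with inverted weights the route through Kate is strictly shorter.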

Thank you very much for your quick reply, which put things into perspective. Thank you also very much for your thoughts on the inverse, which is not linear, and for your suggestion of a transformation that I’m going to try. And if at some point I come across a bit more literature on the subject, I won’t hesitate to share it with you. See you soon!