Output IGCommunitiesEdgeBetweenness

While running the IGCommunitiesEdgeBetweenness on Mathematica for weighted graphs, there appears the following warning message:

IGraphM::warning: community.c:471 - Membership vector will be selected based on the lowest modularity score.

IGraphM::warning: community.c:478 - Modularity calculation with weighted edge betweenness community detection might not make sense -- modularity treats edge weights as similarities while edge betwenness treats them as distances.

Does this mean, the algorithm is not taking into account the edge weights while calculating the edge betweenness as mentioned in the IGraphM documentation?

In my opinion, this function shouldn’t be used for weighted graphs. The warnings were added to the igraph core library as a quick-fix until a better weighted version can be implemented.


This function repeatedly finds the edge with the highest betweenness and removes it. Betweenness values are recomputed after each removal. Through the progressive removal of edges, the graph falls apart into connected components. This is used to build a dendrogram—each time a new component is separated, the dendrogram branches. For a more detailed description, look up the Girvan–Newman algorithm.

The entire dendrogram, as well as the order in which edges are removed, is stored in the IGClusterData object. To get a concrete clustering (community structure), we need to cut the dendrogram at a certain level. By default, igraph finds the dendrogram cut that maximizes modularity.

All this is fine for unweighted graphs. But when the graph is weighted, weights as interpreted as “lengths” for the betweenness calculation (i.e. path lengths are calculated by adding up weights). Thus, edges with high “weights” (i.e. long “lengths”) represent weak connections. This is the usual way to calculate weighted betweenness.

When calculating the modularity, high “weights” are interpreted as strong connections. This is again the usual way to calculate modularity.

As you can see, there is a mismatch between the two calculations. The dendrogram that the function returns is completely fine, but the automatically computed dendrogram cut is meaningless. You can use the dendrogram and determine where to cut it yourself, if you think that this type of community detection makes sense for your graph.


If what I wrote above doesn’t make sense for you, then simply don’t use this function with weighted graphs (at least not before taking the time to read up on how the algorithm works).

2 Likes

Thanks a lot @szhorvat. That makes sense. Can you explain briefly or refer to me an article or a blog that explains how to decipher the data inside IGCluster? Especially, the data stored inside the “HierarchicalClusters” property. I want to be able to use that to pick and choose to where to cut the dendrogram to arrive at the communities. Thanks in advance.

Unfortunately, there is not a very convenient way to do this at the moment. After the introduction of functions like ClusteringTree to Mathematica, I was expecting Wolfram to provide good built-in tools, but this is not happening, unfortunately … I’ll tell you what is possible below.


Generally, to explore an IGClusterData object, query its "Properties".

Needs["IGraphM`"]

SeedRandom[876];
g = RandomGraph[{10, 20}]

cl = IGCommunitiesEdgeBetweenness[g]
In[39]:= cl["Properties"]

Out[39]= {"Algorithm", "Bridges", "Communities", "EdgeBetweenness", \
"ElementCount", "Elements", "HierarchicalClusters", "Merges", \
"Modularity", "Properties", "RemovedEdges", "Tree"}

The properties are slightly different for each clustering methods. I’ll describe a few relevant ones here.

This is the algorithm used to obtain the clustering:

In[51]:= cl["Algorithm"]
Out[51]= "EdgeBetweenness"

cl["RemovedEdges"] is specific to this algorithm. It gives you the graph edges, in the same order in which they were removed.

cl["EdgeBetweenness"] gives the betweenness of each of these edges (in the same order as "RemovedEdges"), at the stage where it was removed.

"Brigges" gives you the steps at which the graph split into more components. In this example we get:

In[55]:= cl["Bridges"]
Out[55]= {20, 19, 17, 14, 13, 11, 9, 6, 1}

This means that the first, the 6th, the 9th, etc. edge removals have split the graph further. Notice that the order is reversed.

These three properties were all specific to the "EdgeBetweenness" algorithm. The following few are common to all hierarchical clustering methods, i.e. those that output a dendrogram.

This is the clustering tree:

cl["Tree"]

This property uses Mathematica’s built-in ClusteringTree function directly, and is thus only available in Mathematica versions which have this function (10.4 and later, I believe).

We can use the built-in Dendrogram function to plot a dendrogram:

Dendrogram[cl["Tree"]]

But we can’t do much else with this “clustering tree” object, unfortunately. Mathematica simply does not provide the functions.

An alternative is to use the old HierarchicalClustering package.

Needs["HierarchicalClustering`"]

We can now use cl["HierarchicalClusters"] to get a Cluster expression compatible with this package, and e.g. plot it:

DendrogramPlot[cl["HierarchicalClusters"], LeafLabels -> Automatic]

This package is a bit more capable than the built-in functions, so we can use it to split the dendrogram using ClusterSplit (look it up). But it appears to be a bit unwieldy. We can specify how many components to split the dendrogram into. Let’s choose three as an example:

ClusterSplit[cl["HierarchicalClusters"], 3]

{Cluster[Cluster[7, 4, 7, 1, 1], 3, 8, 2, 1], 

 Cluster[
  Cluster[
   Cluster[Cluster[Cluster[10, 8, 1, 1, 1], 6, 2, 2, 1], 5, 4, 3, 1], 
   9, 10, 4, 1], 1, 12, 5, 1], 

2}

The result is two Cluster objects and the single vertex 2. From the first two objects, we can extract the elements using ClusterFlatten, but we need a custom solution to handle the single vertex.

flatten[c_Cluster] := ClusterFlatten[c]
flatten[e_] := {e}
In[71]:= flatten /@ ClusterSplit[cl["HierarchicalClusters"], 3]

Out[71]= {{7, 4, 3}, {10, 8, 6, 5, 9, 1}, {2}}

So there it is, we got the result for 3 clusters. As you can see, it was not convenient, and we had to use a package which Wolfram appears to consider obsolete.

When I originally started to expose this functionality in IGraph/M, I just relied on the HierarchicalClustering because nothing else was available at the time, because implementing a full dendrogram handling functionality is a lot of work, and because I was expecting Mathematica to gain this functionality soon. Indeed, it did get ClusteringTree and Dendrogram, but as you can see, these are not very useful. 5 years after their introduction, we don’t even have built-ins to split a dendrogram!

Given the poor built-in support for dendrogram handling, I will look into adding some limited support directly into IGraph/M, but no promises … I don’t have the time to implement a complete dendrogram handling solution.

The relevant information is all in the "Merges" property.

In[83]:= cl["Merges"]
Out[83]= {{10, 8}, {11, 6}, {12, 5}, {7, 4}, {14, 3}, {13, 9}, {16, 1}, {15, 17}, {18, 2}}

It contains a list of pairs of dendrogram node indices. For the leaves, these are the same as the vertex indices of the graph. For internal nodes, they are higher numbers. To understand the list, cross-reference it with this dendrogram, which has the internal nodes labelled:

SetProperty[cl["Tree"], VertexLabels -> "Name"]

First, notice that we have 10 vertices, so vertex indices go from 1 to 10. {10, 8} means that the last split was separating vertices with indices 10 and 8. Or, reading the dendrogram from bottom to top, we merge 10 and 8 to obtain dendrogram node 11 (one plus the number of vertices). Then we merge 11 and 6 to obtain 12, and so on, until all vertices are merged together (assuming that the graph was connected).

1 Like

What is your operating system? If it is not Windows, I might be able to send you a prerelease build to test when some limited dendrogram functionality becomes available.

1 Like

Thanks! That’d be nice. I use Macbook pro which has the latest macOS. Also, thanks so much for the detailed answer.

OK, macOS is what I use too. I’ll let you know when I have something to share, but it’ll take a few days.

Awesome! Sure, no problem at all.