Unfortunately, there is not a very convenient way to do this at the moment. After the introduction of functions like ClusteringTree
to Mathematica, I was expecting Wolfram to provide good built-in tools, but this is not happening, unfortunately … I’ll tell you what is possible below.
Generally, to explore an IGClusterData
object, query its "Properties"
.
Needs["IGraphM`"]
SeedRandom[876];
g = RandomGraph[{10, 20}]
cl = IGCommunitiesEdgeBetweenness[g]
In[39]:= cl["Properties"]
Out[39]= {"Algorithm", "Bridges", "Communities", "EdgeBetweenness", \
"ElementCount", "Elements", "HierarchicalClusters", "Merges", \
"Modularity", "Properties", "RemovedEdges", "Tree"}
The properties are slightly different for each clustering methods. I’ll describe a few relevant ones here.
This is the algorithm used to obtain the clustering:
In[51]:= cl["Algorithm"]
Out[51]= "EdgeBetweenness"
cl["RemovedEdges"]
is specific to this algorithm. It gives you the graph edges, in the same order in which they were removed.
cl["EdgeBetweenness"]
gives the betweenness of each of these edges (in the same order as "RemovedEdges"
), at the stage where it was removed.
"Brigges"
gives you the steps at which the graph split into more components. In this example we get:
In[55]:= cl["Bridges"]
Out[55]= {20, 19, 17, 14, 13, 11, 9, 6, 1}
This means that the first, the 6th, the 9th, etc. edge removals have split the graph further. Notice that the order is reversed.
These three properties were all specific to the "EdgeBetweenness"
algorithm. The following few are common to all hierarchical clustering methods, i.e. those that output a dendrogram.
This is the clustering tree:
cl["Tree"]
This property uses Mathematica’s built-in ClusteringTree
function directly, and is thus only available in Mathematica versions which have this function (10.4 and later, I believe).
We can use the built-in Dendrogram
function to plot a dendrogram:
Dendrogram[cl["Tree"]]
But we can’t do much else with this “clustering tree” object, unfortunately. Mathematica simply does not provide the functions.
An alternative is to use the old HierarchicalClustering
package.
Needs["HierarchicalClustering`"]
We can now use cl["HierarchicalClusters"]
to get a Cluster
expression compatible with this package, and e.g. plot it:
DendrogramPlot[cl["HierarchicalClusters"], LeafLabels -> Automatic]
This package is a bit more capable than the built-in functions, so we can use it to split the dendrogram using ClusterSplit
(look it up). But it appears to be a bit unwieldy. We can specify how many components to split the dendrogram into. Let’s choose three as an example:
ClusterSplit[cl["HierarchicalClusters"], 3]
{Cluster[Cluster[7, 4, 7, 1, 1], 3, 8, 2, 1],
Cluster[
Cluster[
Cluster[Cluster[Cluster[10, 8, 1, 1, 1], 6, 2, 2, 1], 5, 4, 3, 1],
9, 10, 4, 1], 1, 12, 5, 1],
2}
The result is two Cluster
objects and the single vertex 2
. From the first two objects, we can extract the elements using ClusterFlatten
, but we need a custom solution to handle the single vertex.
flatten[c_Cluster] := ClusterFlatten[c]
flatten[e_] := {e}
In[71]:= flatten /@ ClusterSplit[cl["HierarchicalClusters"], 3]
Out[71]= {{7, 4, 3}, {10, 8, 6, 5, 9, 1}, {2}}
So there it is, we got the result for 3 clusters. As you can see, it was not convenient, and we had to use a package which Wolfram appears to consider obsolete.
When I originally started to expose this functionality in IGraph/M, I just relied on the HierarchicalClustering because nothing else was available at the time, because implementing a full dendrogram handling functionality is a lot of work, and because I was expecting Mathematica to gain this functionality soon. Indeed, it did get ClusteringTree
and Dendrogram
, but as you can see, these are not very useful. 5 years after their introduction, we don’t even have built-ins to split a dendrogram!
Given the poor built-in support for dendrogram handling, I will look into adding some limited support directly into IGraph/M, but no promises … I don’t have the time to implement a complete dendrogram handling solution.
The relevant information is all in the "Merges"
property.
In[83]:= cl["Merges"]
Out[83]= {{10, 8}, {11, 6}, {12, 5}, {7, 4}, {14, 3}, {13, 9}, {16, 1}, {15, 17}, {18, 2}}
It contains a list of pairs of dendrogram node indices. For the leaves, these are the same as the vertex indices of the graph. For internal nodes, they are higher numbers. To understand the list, cross-reference it with this dendrogram, which has the internal nodes labelled:
SetProperty[cl["Tree"], VertexLabels -> "Name"]
First, notice that we have 10 vertices, so vertex indices go from 1 to 10. {10, 8}
means that the last split was separating vertices with indices 10 and 8. Or, reading the dendrogram from bottom to top, we merge 10 and 8 to obtain dendrogram node 11 (one plus the number of vertices). Then we merge 11 and 6 to obtain 12, and so on, until all vertices are merged together (assuming that the graph was connected).