Hierarchical clustering using the Leiden algorithm

I have a graph G and applied the Leiden algorithm to it, which gave me 12 clusters. Since I want to do hierarchical clustering, I searched igraph and found methods such as subgraphs, where I can pass a cluster index and cluster each one individually. I first tried this with the Louvain algorithm: the layer 1 clusters were turned into a graph, which was then given as input to the second layer. That took a lot of time, so when I discovered Leiden I started applying it to my data instead. For the Leiden algorithm, how can the result leid1 be clustered again?


leid1 = G.community_leiden(objective_function="modularity", weights="WeightCount",
                           n_iterations=500)
# Modularity score of the first-layer partition (kept in its own variable so leid1 is not overwritten)
modularity1 = G.modularity(leid1, weights="WeightCount")


Data used for the algorithm

df_new.head()
          Source    Target  weight_cosine  WeightCount
644849  977782.0  958531.0       0.008294        43850
298835  958531.0  977782.0       0.008294        43850
465075  972787.0  972468.0       0.136831        39423
373608  972468.0  972787.0       0.136831        39423
297679  958531.0  954527.0       0.006997        31622

G = Graph.DataFrame(df_new, directed=False, vertices=skills_labels)

Please let me know whether igraph offers anything that would let me cluster the result of leid1 again, for all cluster indexes at once.

The result of calling community_leiden() on a graph is a VertexClustering object. Iterating over such an object gives you lists containing the vertices in each community:

g = Graph.GRG(100, 0.2)
clustering = g.community_leiden()
for members in clustering:
    print(members)

You can then use the induced_subgraph() method of the original graph to extract another graph that contains the vertices and edges within each cluster:

for members in clustering:
    sg = g.induced_subgraph(members)
    print(sg.get_edgelist())

Note that the IDs of vertices and edges are renumbered in the subgraph because they always have to start from zero and must be a continuous range. Nevertheless, you can use the vertex attributes to correlate the vertices of the subgraph back to the original vertices based on their IDs.

Thank you for your response. May I ask why the GRG method is applied to the graph?

g = Graph.GRG(100, 0.2)
clustering = g.community_leiden()
for members in clustering:
    print(members)

When I apply the code to graph G without GRG, I get clusters, but some of them are empty, which didn’t happen with my own code. And when I apply G.GRG and then community_leiden, I get empty members.

g = G.GRG(100, 0.2)
clustering = g.community_leiden()
for members in clustering:
    print(members)

This is the code I have followed. Please let me know if I am missing anything.

Graph.GRG() is just a quick way to construct a graph that usually has some kind of clustered structure (GRG stands for “geometric random graph”). It is my go-to generator whenever I need such a graph in an example. Other than that, there is no specific reason for it; I just needed something for illustrative purposes.

Thanks for the explanation.