I wondered what walktrap is using as a stopping criterion? In the paper of the algorithm it is described as a hierarchical clustering method so the smallest amount of clusters should always be 2. I however sometimes receive values >2 for graph.vcount()-len(dendro.merges)+1.
Please expand on the question, and explain it in more detail. Add specific code examples, explain what output you get and what you expect. When referring to the paper, be specific about which part you are referring to.
Keep in mind that people who read you question don’t have a fresh memory of this paper/method and are constrained in time.
In the paper it says at 1.1 Our approach and results:
“One obtains this way a hierarchical community structure that may
be represented as a tree called dendrogram. We propose such an algorithm, called Walktrap, which computes a community structure in
time O(mnH) where H is the height of the corresponding dendrogram”
Dendrogram will allow a split at every clustersize from number of clusters equal to number of vertices (Each vertice is its own cluster) to number of cluster equal to 1 (whole graph is one cluster).
dendro = graph.community_walktrap(steps=3)
min_cluster = graph.vcount()-len(dendro.merges)
I expect:
min_cluster=1
What I get for some graphs:
min_cluster>1
So there has to be a stopping criteria, so that the dendrogram is not completed.
I think understand what you mean now. The minimum number of clusters will be the same as the number of connected components. Can you check if the graphs where you see this are disconnected?
Thanks a lot that was exactly my problem and makes total sense