Improve cluster_leiden doc.

[igraph R manual pages]

g <- graph.famous("Zachary")
# By default CPM is used
g <- cluster_leiden(g, resolution_parameter=0.06)

The example above can easily mislead the casual reader. The result of cluster_leiden() is not a graph.

My suggestion is to modify the documentation so that the result is stored in its own variable, and to include a plot that shows the Leiden partition:

g <- graph.famous("Zachary")
# By default CPM is used
ldc <- cluster_leiden(g, resolution_parameter=0.06)
plot(g, mark.groups = ldc)
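
To make the point about the return type concrete, here is a minimal sketch, assuming a recent igraph version in which make_graph() and cluster_leiden() are available. It only checks the class of the result and reads the partition back out with the standard accessors; it is an illustration, not the proposed documentation text.

```r
library(igraph)

g <- make_graph("Zachary")   # same graph as graph.famous("Zachary")
ldc <- cluster_leiden(g, resolution_parameter = 0.06)

# The result is a "communities" object, not an igraph graph
class(ldc)
is_igraph(ldc)

# Accessors for the partition itself
membership(ldc)   # community id of each vertex
sizes(ldc)        # number of vertices in each community
length(ldc)       # number of communities
```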

Good point. Why don’t you open a pull request? We could certainly use help improving the documentation and examples. The example is located here in the code: rigraph/community.R at dev · igraph/rigraph · GitHub

I need to familiarize myself with GitHub. But this is a safe topic to start with.

Yes, this is safe. You cannot do any damage, as someone will need to approve changes before they go live. So don’t worry about that.

The simplest way to get started is to make the changes directly in your browser. You can find the file on GitHub (rigraph/community.R at dev · igraph/rigraph · GitHub) and use the “edit” button (pencil icon). I suggest you use this method for your first PR.

In the long term, if you plan to make more contributions (which I would encourage you to do), it will pay off to check out the git repository on your computer and work with it locally (instead of in the browser). If you are not very comfortable with git yet, GitHub Desktop may be the easiest way to get started.

There’s another thing, in addition to the documentation, which you could help with, if you like.

At one time you did some tests, verifying if functions behave well when they are passed a very large number of vertices. We are working on this, and it will take time to resolve.

However, other types of invalid input would be useful to test for. Specifically:

  • Do all functions behave well with non-simple graphs (multigraphs, graphs with self-loops)?
  • Do functions generally behave well with edge cases such as the zero-vertex graph, the one-vertex graph (either with zero edges or self-loops), disconnected graphs, etc.?
  • Do functions reject invalid input such as a negative number of vertices for a graph generator?

To my knowledge, all these should work well. If there is a function that does not handle such inputs, it should be fixed.
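
As an illustration of the kinds of input listed above, the following sketch builds one graph per category and confirms that each has the intended property. The variable names are made up for this example, and make_ring() merely stands in for "a graph generator"; whether a given generator rejects a negative vertex count is exactly what such a test would establish, so no particular outcome is asserted here.

```r
library(igraph)

# Edge-case inputs corresponding to the categories above
null_graph   <- make_empty_graph(0, directed = FALSE)
singleton    <- make_empty_graph(1, directed = FALSE)
self_loop    <- make_graph(c(1, 1), directed = FALSE)        # one vertex, one self-loop
multigraph   <- make_graph(c(1, 2, 1, 2), directed = FALSE)  # two parallel edges
disconnected <- disjoint_union(make_ring(3), make_ring(3))   # two components

# Confirm that the inputs really have the intended properties
stopifnot(gorder(null_graph) == 0,
          any(which_loop(self_loop)),
          any_multiple(multigraph),
          !is_connected(disconnected))

# Invalid input should raise an R error rather than crash or return garbage
rejected <- tryCatch({ make_ring(-1); FALSE }, error = function(e) TRUE)
print(rejected)
```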

Once again: What we cannot fix right now is very large numbers as input. We know that this fails and there is ongoing work to deal with this.

Doing such testing is of course rather boring, so I don’t expect that you would want to do it. But given that you have reported similar issues in the past, I wanted to let you know what kind of testing is the most useful at the moment.

I don’t mind testing. However, to do it systematically over a set of functions, it helps to have a list of functions, or a simple procedure to create the list yourself.
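
One simple way to build such a list, sketched here under the assumption that igraph is attached with library(), is to filter the package's exported names by prefix. The pattern "^cluster_" is only an example; other function families would need other patterns.

```r
library(igraph)

# All exported names of the attached igraph package
exported <- ls("package:igraph")

# Keep only the community detection functions
cluster_funs <- grep("^cluster_", exported, value = TRUE)
print(cluster_funs)
```

The resulting names still need suitable arguments per function, since several of them only accept undirected or connected graphs.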

I carried out a few tests:

func <-  c( "cluster_walktrap(g, steps=4)"
          , "if (components(g)$no == 1) cluster_spinglass(g)"
          , "cluster_infomap(g)"
          , "if (components(g)$no == 1) cluster_fluid_communities(as.undirected(simplify(g)), no.of.communities=10)"
          , "cluster_leading_eigen(as.undirected(g))"
          , "cluster_edge_betweenness(g)"
          , "cluster_fast_greedy(as.undirected(g))"
          , "cluster_label_prop(g)"
          , "cluster_louvain(as.undirected(g))"
          , "if (gorder(g)< 50) cluster_optimal(g)"
          , "if (components(g)$no == 1) clusters(g)"
          )

List of graphs

  • g <- make_empty_graph(0L, directed=FALSE)
  • g <- make_empty_graph(0L, directed=TRUE)
  • g <- make_empty_graph(1L, directed=FALSE)
  • g <- make_empty_graph(1L, directed=TRUE)
  • g <- make_empty_graph(2L, directed=FALSE)
  • g <- make_empty_graph(2L, directed=TRUE)
  • g <- graph_from_literal(1-1)
  • g <- graph_from_literal(1-+1)
  • g <- make_de_bruijn_graph(2,10) # 2 = alphabet size, 10 = all unique 10-sequences
  • g <- sample_gnm(n, n/2)
  • g <- make_de_bruijn_graph(2,10)
  • g <- make_de_bruijn_graph(2,8); g <- g + g
  • g <- make_de_bruijn_graph(2,8); g <- add_edges(g, c(t(get.edgelist(g))))

I found no irregularities.
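
For anyone who wants to repeat this, here is a rough sketch of how expression strings like the ones above could be evaluated against a list of graphs. It uses only a short excerpt of both lists, and run_one() is a hypothetical helper written for this example, not the code that was actually used for the tests.

```r
library(igraph)

# A short excerpt of the expression list and graph list from above
func <- c("cluster_walktrap(g, steps = 4)",
          "cluster_label_prop(g)",
          "if (components(g)$no == 1) cluster_spinglass(g)")
graphs <- list(
  null_undirected      = make_empty_graph(0L, directed = FALSE),
  singleton_undirected = make_empty_graph(1L, directed = FALSE),
  pair_undirected      = make_empty_graph(2L, directed = FALSE)
)

# Hypothetical helper: evaluate one expression string with `g` bound to a graph
# and return "ok" or the error message
run_one <- function(expr_text, g) {
  tryCatch({
    eval(parse(text = expr_text), list(g = g))
    "ok"
  }, error = function(e) paste("error:", conditionMessage(e)))
}

# One row per graph, one column per expression
results <- sapply(func, function(f) sapply(graphs, function(g) run_one(f, g)))
print(results)
```

Note that "ok" only means no R error was raised; as the spinglass case further down shows, the returned object still has to be inspected.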

On closer inspection I found an issue with Spinglass and the length function.

g <- make_empty_graph(1L)
clu <- cluster_spinglass(g)

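$membership
[1] 2

$csize
numeric(0)

$modularity
[1] NaN

$temperature
[1] 0.01

$algorithm
[1] "spinglass"

$vcount
[1] 1

attr(,"class")
[1] "communities"

length(clu)
[1] 2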

Community IDs must start at one. Here the $membership vector assigns the only vertex to community 2, which does not exist; $csize is numeric(0), which is wrong; and length() incorrectly reports 2.

Thanks, this seems like a bug in the C core of igraph, which contains a separate branch for handling null and singleton input graphs. I’ll look into this.

Bug now fixed in the master and develop branches of the C core; the fix will be released in the next patch version of the R interface. Until then, I’m afraid you’ll need to special-case null and singleton graphs (i.e. if vcount(g) < 2) when working with the spinglass clustering.
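
A possible way to do that special-casing is sketched below. safe_spinglass() is a hypothetical wrapper, not part of igraph, and returning a trivial one-community partition built with make_clusters() is just one reasonable choice; returning NULL or raising an error would work as well.

```r
library(igraph)

# Hypothetical wrapper (not part of igraph): avoid calling the spinglass
# code on graphs with fewer than two vertices
safe_spinglass <- function(g, ...) {
  if (vcount(g) < 2) {
    # Trivial partition: every vertex (if any) goes into community 1
    return(make_clusters(g,
                         membership = rep(1, vcount(g)),
                         algorithm = "spinglass",
                         modularity = FALSE))
  }
  cluster_spinglass(g, ...)
}

clu <- safe_spinglass(make_empty_graph(1, directed = FALSE))
clu$membership
```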

As you can see in the example, the length is also wrong (2).
For clarity I will make a separate item.

The problem with length() will probably be solved by the fix in the C core as well because length(clu) is essentially max(clu$membership) and it’s the membership vector that’s incorrect. This happens with my R after fixing the C core:

> g <- make_empty_graph(1, directed=F)
> clu <- cluster_leiden(g)
> length(clu)
[1] 1
> clu$membership
[1] 1
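
The relationship between length() and the membership vector described above can also be checked on a well-behaved graph. This is a small sketch; cluster_louvain() is used purely as an example of a method that returns a membership vector.

```r
library(igraph)

g <- make_graph("Zachary")
clu <- cluster_louvain(g)

# Community ids are expected to run from 1 to the number of communities,
# so the largest id should equal the number of communities
max(clu$membership)
length(clu)
length(sizes(clu))
```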

I would like to clarify the examples in help('-.igraph'). How do I find the source file so that I can edit it?