ig.degree.betweenness: A community detection algorithm for "more descriptive" communities

Hello everyone!

I am in the process of synthesizing everything I worked on this past year with SNA. Part of that work was developing a method that extends the Girvan-Newman algorithm by also considering node degree.
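For anyone who wants the gist before opening the repo, here is a minimal sketch of the idea in plain Python. It is not the package's R code. It assumes a simplified reading of the method: at each step, take the node with the highest degree, remove whichever of its edges has the highest edge betweenness, and at the end keep the partition with the best modularity on the original graph. The function names (`degree_guided_communities` and the helpers) are hypothetical, and the graph is treated as unweighted and undirected.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:  # BFS that counts shortest paths from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):  # push path dependencies back toward s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return {e: b / 2.0 for e, b in score.items()}  # each pair was counted twice


def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts


def modularity(adj, parts):
    """Newman modularity of a partition, measured on the graph `adj`."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for part in parts:
        inside = sum(1 for v in part for w in adj[v] if w in part) / 2
        total_degree = sum(len(adj[v]) for v in part)
        q += inside / m - (total_degree / (2 * m)) ** 2
    return q


def degree_guided_communities(edges):
    """Assumed simplification: prune the top-degree node's busiest edge each round."""
    original = {}
    for u, v in edges:
        original.setdefault(u, set()).add(v)
        original.setdefault(v, set()).add(u)
    work = {v: set(nbrs) for v, nbrs in original.items()}
    best_parts = components(work)
    best_q = modularity(original, best_parts)
    while any(work.values()):
        hub = max(work, key=lambda v: len(work[v]))  # highest-degree node
        scores = edge_betweenness(work)
        incident = (frozenset((hub, w)) for w in sorted(work[hub]))
        u, v = tuple(max(incident, key=scores.get))
        work[u].discard(v)
        work[v].discard(u)
        parts = components(work)
        q = modularity(original, parts)
        if q > best_q:  # keep the best partition seen so far
            best_q, best_parts = q, parts
    return best_parts, best_q


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
parts, q = degree_guided_communities(toy_edges)
print(sorted(sorted(p) for p in parts), round(q, 4))
```

On the toy graph the bridge is the first edge removed, so the two triangles come out as separate communities. Real networks with ties in degree or betweenness will depend on tie-breaking, which this sketch resolves by node order.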

Check out the GitHub repo here: benyamindsmith/ig.degree.betweenness, an R package for the implementation of the "Smith-Pittman" community detection algorithm. Compatible with the igraph ecosystem.

Working paper: [2411.01394] Centrality in Collaboration: A Novel Algorithm for Social Partitioning Gradients in Community Detection for Multiple Oncology Clinical Trial Enrollments

The code is written in R and definitely does not follow igraph best practices (e.g. graphs need to have named nodes for this to work, so there’s a utility function that names the nodes of unnamed graphs).

However, I’m pretty proud of the work and intend to submit a JOSS article soon to “market” the method.


Here’s a working example which I just did this week:

(The writeup was for a non-SNA audience, so please forgive the simplifications.)

I decided to try my hand at this past week’s TidyTuesday dataset: Monster Movies.

Taking a Social Network Analysis (SNA) approach, I decided to look at the relationships between genres and monster movies.

Legend:

  • Bigger nodes: more connections with a given genre
  • Thicker edges: stronger connections
  • Convex hulls/colored nodes: identified clusters
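As a rough illustration of how the node sizes and edge widths in the legend could be derived, here is a small Python sketch. It assumes that nodes are genres and that two genres are linked once for every movie that carries both labels; the movie rows are made up for the example and are not the TidyTuesday data, and the names `edge_weight` and `node_strength` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical rows: each movie with its genre labels (not the real dataset).
movies = {
    "Movie A": ["Horror", "Comedy"],
    "Movie B": ["Horror", "Comedy", "Animation"],
    "Movie C": ["Horror", "Sci-Fi"],
}

# Edge weight = number of movies in which two genres appear together
# (this would drive edge thickness).
edge_weight = Counter()
for genres in movies.values():
    for pair in combinations(sorted(set(genres)), 2):
        edge_weight[pair] += 1

# Node strength = sum of the weights of a genre's edges
# (this would drive node size).
node_strength = Counter()
for (a, b), weight in edge_weight.items():
    node_strength[a] += weight
    node_strength[b] += weight

print(edge_weight.most_common())
print(node_strength.most_common())
```

In this made-up example "Horror" ends up as the largest node, and the "Comedy"/"Horror" edge is the thickest because the pair co-occurs in two movies.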

The first visual is the constructed network.

The other visuals are clusters based on Girvan-Newman (edge betweenness), Louvain (direct modularity maximization) and Smith (that’s me!)-Pittman (node degree + edge betweenness). Paper: [2411.01394] Centrality in Collaboration: A Novel Algorithm for Social Partitioning Gradients in Community Detection for Multiple Oncology Clinical Trial Enrollments.

Girvan-Newman doesn’t tell any story (clustering everything into one group isn’t much of a story). Louvain might be telling us something about the strength of clustering, but it doesn’t necessarily speak to the reality of “monster” movie genre interactions. Smith-Pittman clustering tells the best story (albeit a biased one), with popular genres forming the primary working group, followed by more ambivalent smaller subgroups and outlier nodes.

This also aligns with the degree (popularity) distribution of the nodes (the bar graph), which is what our working paper asserts for certain contexts.
