It finds all simple cycles, i.e. closed paths with no repeating vertices.
In general, there can be a very large number of such cycles. The paper gives the count in a complete directed graph on $n$ vertices as
$$\sum_{i=1}^{n-1} \binom{n}{n-i+1}\,(n-i)!$$
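For example, for $n = 4$ this gives $\binom{4}{4}\,3! + \binom{4}{3}\,2! + \binom{4}{2}\,1! = 6 + 8 + 6 = 20$ elementary circuits.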
Thus a useful implementation would make it possible to output only some of the cycles: either implement it as an iterator (as in Python), or allow for a callback function that can terminate the enumeration early, as with some existing functions.
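As a rough sketch, such a callback in C could follow igraph’s usual handler pattern, where returning IGRAPH_STOP ends the enumeration early; the handler name, the counter_t struct and the exact parameter list below are hypothetical, not an actual igraph API:

```c
#include <igraph.h>

/* Hypothetical user data: stop after max_count cycles have been seen. */
typedef struct {
    igraph_integer_t seen;
    igraph_integer_t max_count;
} counter_t;

/* Hypothetical handler, called once per cycle found; the parameter list
   is illustrative. Returning IGRAPH_STOP terminates the enumeration
   early, as with igraph's existing callback functions. */
static igraph_error_t count_and_stop(const igraph_vector_int_t *vertices,
                                     void *arg) {
    counter_t *c = (counter_t *) arg;
    (void) vertices;  /* a real handler would record or process the cycle */
    if (++c->seen >= c->max_count) {
        return IGRAPH_STOP;
    }
    return IGRAPH_SUCCESS;
}
```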
Yes, indeed, I should have posted the exact reference.
I meant this algorithm, yes: Donald B. Johnson: Finding All the Elementary Circuits of a Directed Graph. SIAM Journal on Computing, Vol. 4, No. 1 (1975), pp. 77–84.
Another link to the article (for free, and legally): https://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF
For my particular use case, I didn’t encounter performance or memory issues related to the number of elementary circuits (my graphs typically have 1000+ vertices and 1000+ edges), so it was possible to output the complete list.
I agree there should be some restrictions for bigger graphs: on the length of the circuits perhaps?
[EDIT]: OK, that’s what is already done in igraph_cliques_callback.
As far as I am concerned, this would be a very welcome addition to igraph’s C core. If you would like to give it a go, the first step is to get familiar with igraph and its quirks: compile the development version and make sure you can compile small programs that use igraph. If you need help, feel free to ask (I suggest opening a new thread in the Development category). I’m sure you’ll have questions. There are also useful resources in the Wiki. Once you have gotten started with the basics, you can open a PR against the develop branch on GitHub and we can provide guidance there as you proceed with the implementation.
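For example, a minimal smoke test (a sketch assuming the current 0.10-style C API; the graph and printed values are arbitrary) could be:

```c
#include <stdio.h>
#include <igraph.h>

int main(void) {
    igraph_t graph;

    /* A small directed circular ring: 5 vertices, 5 edges. */
    igraph_ring(&graph, 5, IGRAPH_DIRECTED, /* mutual= */ 0, /* circular= */ 1);

    printf("Vertices: %ld, edges: %ld\n",
           (long) igraph_vcount(&graph),
           (long) igraph_ecount(&graph));

    igraph_destroy(&graph);
    return 0;
}
```

If this compiles, links (e.g. with `gcc test.c $(pkg-config --cflags --libs igraph)`) and prints the expected counts, your setup is ready.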
Implementing it in C is of course going to be quite a bit more work than doing it in R, so you should feel no obligation to do it. However, if you do decide to try, this would be a welcome addition.
As for incorporating an existing R implementation into the R interface, I can’t really comment on that, as I am not deeply involved with that interface. We are a bit short on resources, and short on R developers. To be frank, the first priority is to simply update the R interface to use the latest C core and get a release out. This is already taking up a lot of time. While I think it would be a good addition, I can’t personally make it happen, and I am not sure we have the resources right now to deal with it. Of course, it is possible that another maintainer would step up and be willing to incorporate the R implementation; however, there has been no response so far. Therefore, for the moment I would suggest publishing the R implementation as a separate package that depends on igraph (if you are looking to publish it). Once again, this is just my opinion, and I am saying it because we are really short on resources right now.
As I wrote 3 years ago, I implemented this algorithm in R without any issue and did the same in C (which could later be integrated into igraph), but I now have only one issue left with the latter implementation.
More precisely, I am struggling to understand why the B-lists (the ones that keep track of the nodes that were already visited and did not yield any circuit) don’t have the same structure as the adjacency list of the initial directed graph (i.e. the original graph, not the subgraphs derived from it afterwards). By “same structure” I mean that the number of neighbours of each node in the B-lists should be at most the number of neighbours of that node in the initial adjacency list.
However, in practice I noticed this is not the case (overflow errors and an incorrect number of circuits), but I don’t understand why (the theoretical problem).
As a consequence, my initial implementation (storing the B-lists on the stack as a single big array whose length equals the total number of edges of the initial graph) doesn’t work (the practical problem), but I first have to understand the underlying theory so that I can implement it correctly. Of course, knowing the maximum size of the big array beforehand would be much better, as it would avoid dynamic memory management in C (reallocations) and the related loss in execution speed.
BTW, in R I didn’t encounter/notice this issue because you don’t have to worry about memory management in that language: the B-lists could grow and shrink dynamically without any problem.
My question is: do you know whether there exists a non-trivial upper bound (a trivial one being the number of distinct nodes, for each individual B-list) either on the length of each B-list (an individual upper bound) or on the sum of the lengths of all B-lists over all vertices (a collective upper bound)? And why is this collective upper bound not equal to the number of edges of the initial graph?
I found a practical solution already in December 2023 but never took the time to post it here.
It might not be the best solution.
As mentioned on page 4 of Johnson’s original paper (the link was posted in my previous message), “B-lists are used to remember information obtained from searches of portions of the graph which do not yield an elementary circuit.”
Practically, the object ‘B’ (a collection of B-lists, one for each vertex) can be implemented as an adjacency list, i.e. as a two-dimensional list: a list of n vertices and, for each of these n vertices, a list of that vertex’s neighbours.
Heuristically, I replicated the structure of the adjacency list of the initial graph (not of the subsequent subgraphs) to instantiate and initialise the object ‘B’, and, while building ‘B’, I kept track of both the physical length of each B-list (initially equal to the length of the corresponding ‘branch’ of the adjacency list for a given vertex) and its logical length (i.e. the number of items actually stored at each vertex).
Whenever the logical length of a B-list becomes larger than its physical length, the B-list is reallocated so that it contains as many items as there are vertices (i.e. n); since the number of vertices n is a (trivial) upper bound, this reallocation, if it happens at all (in my sample of 300+ graphs, it happened on fewer than 10 of them), occurs at most once per B-list.
The alternative, which would avoid these dynamic reallocations entirely, would be to allocate B at once as an n × n table (i.e. n B-lists with n items each), but that would be far too memory-intensive for graphs with a large number of vertices (O(n²) storage space).
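For concreteness, here is a minimal C sketch of this scheme under the assumptions above; blist_t, blist_init and blist_push are hypothetical names chosen for illustration, not igraph API:

```c
#include <stdlib.h>

/* One B-list per vertex: it starts with the capacity of that vertex's
   adjacency list in the initial graph and grows (at most once) to n. */
typedef struct {
    long *items;    /* vertex ids stored in this B-list */
    long physical;  /* allocated capacity               */
    long logical;   /* number of items currently stored */
} blist_t;

/* Initialise B[v] with the same capacity as v's initial adjacency list. */
static int blist_init(blist_t *b, long degree) {
    b->physical = degree > 0 ? degree : 1;
    b->logical = 0;
    b->items = malloc((size_t) b->physical * sizeof *b->items);
    return b->items != NULL ? 0 : -1;
}

/* Append w; on overflow, reallocate once to the trivial upper bound n
   (the number of vertices), so no B-list is reallocated more than once. */
static int blist_push(blist_t *b, long w, long n) {
    if (b->logical == b->physical) {
        long *tmp = realloc(b->items, (size_t) n * sizeof *tmp);
        if (tmp == NULL) return -1;
        b->items = tmp;
        b->physical = n;
    }
    b->items[b->logical++] = w;
    return 0;
}
```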
I know, I even gave a few examples of graphs with self-loops to test the self-loop detection on GitHub in early November (issue #2692).
Looking forward to comparing the performance on my graphs database.