Isoclasses and motif finding for 5 and 6-vertex graphs

I was looking into extending the motif finder to support 5- and 6-vertex undirected motifs (of which there are 34 and 156, respectively, still fewer than the 218 directed 4-motifs we currently support).

I noticed that parts of the lookup tables are never used. Thus I cannot figure out their purpose.

For 3-vertex motifs, we have:

  • igraph_i_isoclass_3: This is never used. What is it for?
  • igraph_i_isoclass_3_idx: This assigns a unique bit to each edge, its purpose is clear.
  • igraph_i_isoclass2_3: This is the actual lookup table from a bitwise unsigned int-based encoding of adjacency matrices to “isoclasses”. Its purpose is clear.
  • igraph_i_classedges_3 and igraph_i_isographs_3: These are used for creating a graph from an “isoclass”. Again, it’s clear what they are for.

@Gabor Do you recall what igraph_i_isoclass_3 is for? If not, should we just remove these unused tables? Perhaps they come from the FANMOD software which is the original implementation of the motif finding algorithm used by igraph. Unfortunately, I can no longer find any downloads for FANMOD.

When I create the tables for 5- and 6-vertex graphs: Was there any particular logic to assigning isoclasses to each graph? Is there any particular rule I should stick to?

This is a reminder of the “isoclasses” for 3-vertex directed graphs. Note that it is not even sorted by edge count, with 8 having more edges than 9.

@Gabor Do you recall what igraph_i_isoclass_3 is for

I don’t remember.

Do you recall if it was borrowed from FANMOD or if you created the lookup tables from scratch?

@tamas @vtraag OK if I remove the unused parts of the lookup tables at the same time as adding the new ones for 5- and 6-vertex undirected motifs?

@szhorvat Sure, please go ahead.

As for the ordering, I don’t know whether there was any specific ruleset that we applied for graphs with 3 vertices, but the Graph Atlas uses these rules:

  1. in increasing order of number of vertices;
  2. for a fixed number of vertices, in increasing order of the number of edges;
  3. for fixed numbers of vertices and edges, in increasing order of the degree sequence, for example 111223 < 112222;
  4. for fixed degree sequence, in increasing number of automorphisms.

These rules are not sufficient to break all ties. For example, among 5-vertex graphs, these two would be equivalent:

They have an identical degree sequence: 1, 2, 2, 2, 3. And they both have 2 automorphisms: the trivial one and the vertical flip on the above drawings.

How about I just use the order in which Nauty’s geng tool generates the graphs? That is what I am going to use anyway. This is what it gives for 5 vertices:

EDIT: Or I can directly take them from the graph atlas and use the same order.

But note that the graph atlas does not have directed graphs while the nauty suite can still produce them using something like geng 3 | directg.

You mean copied? fanmod was not open source so I am fairly sure that I did not copy anything.

@szhorvat Just pick the order that is convenient for you then; I guess it’s the one based on nauty.

Thanks for the response Gábor!

In the meantime, I found some (modified) FANMOD sources here. Perhaps it was opened up at some point? I also looked more carefully at the paper, and the way it works is not through any lookup table, but by canonicalizing subgraphs using nauty. This way it is able to support up to size-8 motifs.

This is a good review paper from 2015 about motif finding, and it says that many others use nauty as well:

I was thinking about how we can extend igraph to work with larger motifs. We can easily add lookup-table support for size-5 and size-6 undirected ones. There are 1044 size-7 undirected ones and 9608 size-5 directed ones. Adding support for these within the same framework is technically possible, but perhaps not a great idea.

What we can do instead is to use Bliss to canonicalize subgraphs and return the result not as a vector, but as a graph-to-count mapping (“dictionary”). In this mapping, we can use some very compact representation of graphs, such as a bitwise packing of the adjacency matrix into an integer type. This allows storing 8-vertex directed or 11-vertex undirected graphs in a single 64-bit integer (or size-6 directed and size-8 undirected in a 32-bit integer). Since such a bitwise representation would be very cumbersome for users to work with from high-level interfaces, the public API could use nauty’s Graph6/Digraph6 formats instead. This is also a bitwise representation of the adjacency matrix, but it’s encoded into ASCII and it is supported by lots and lots of graph theory tools. We’d need to add a Graph6 encoder/decoder to igraph, but I already have the decoder implemented in IGraph/M, I just need to port it to the C core.

Does this sound good?

The one thing that might go wrong (and will likely go wrong) is that Bliss will be too slow for this task. Bliss is faster than nauty for hard problems and large graphs, but I expect that it has a high initialization cost, which would be an issue, since it needs to be called for every subgraph.

To be clear, this project on handling larger motifs is for the long term. As you might have guessed @tamas, it was prompted by the ISMAGS feature request.

Adding 5- and 6-vertex undirected motifs is easy and will be done for 0.10.

They published the sources at some point, with serious restrictions, definitely not open source, so I suggest you don’t read it.

1 Like

I figured out the ordering of graphs in the existing lookup tables:

All possible adjacency matrices are generated in a certain natural order. Whenever a new graph appears, that is not isomorphic to any previously generated ones, it is added to the list of graphs.

In other words, if you look at the igraph_i_isoclass2 tables, whenever a new, so-far-unseen number appears, it is precisely one larger than the largest number encountered so far.

I will keep to the same order, for consistency.

1 Like

Sounds good to me! But as you said, this is something for beyond 0.10 as we have plenty of open development threads for 0.10 already, and this is not API-breaking so it’s not necessary to push this into 0.10.