I was looking into extending the motif finder to support 5- and 6-vertex undirected motifs (of which there are 34 and 156, respectively, still fewer than the 218 directed 4-motifs we currently support).
igraph_i_isoclass_3: This is never used. What is it for?
igraph_i_isoclass_3_idx: This assigns a unique bit to each edge, its purpose is clear.
igraph_i_isoclass2_3: This is the actual lookup table from a bitwise unsigned int-based encoding of adjacency matrices to “isoclasses”. Its purpose is clear.
igraph_i_classedges_3 and igraph_i_isographs_3: These are used for creating a graph from an “isoclass”. Again, it’s clear what they are for.
@Gabor Do you recall what igraph_i_isoclass_3 is for? If not, should we just remove these unused tables? Perhaps they come from the FANMOD software which is the original implementation of the motif finding algorithm used by igraph. Unfortunately, I can no longer find any downloads for FANMOD.
When I create the tables for 5- and 6-vertex graphs: Was there any particular logic to assigning isoclasses to each graph? Is there any particular rule I should stick to?
This is a reminder of the “isoclasses” for 3-vertex directed graphs. Note that it is not even sorted by edge count, with 8 having more edges than 9.
As for the ordering, I don’t know whether there was any specific ruleset that we applied for graphs with 3 vertices, but the Graph Atlas uses these rules:
in increasing order of number of vertices;
for a fixed number of vertices, in increasing order of the number of edges;
for fixed numbers of vertices and edges, in increasing order of the degree sequence, for example 111223 < 112222;
for fixed degree sequence, in increasing number of automorphisms.
They have an identical degree sequence: 1, 2, 2, 2, 3. And they both have 2 automorphisms: the trivial one and the vertical flip on the above drawings.
How about I just use the order in which Nauty’s geng tool generates the graphs? That is what I am going to use anyway. This is what it gives for 5 vertices:
In the meantime, I found some (modified) FANMOD sources here. Perhaps it was opened up at some point? I also looked more carefully at the paper, and the way it works is not through any lookup table, but by canonicalizing subgraphs using nauty. This way it is able to support up to size-8 motifs.
This is a good review paper from 2015 about motif finding, and it says that many others use nauty as well:
I was thinking about how we can extend igraph to work with larger motifs. We can easily add lookup-table support for size-5 and size-6 undirected ones. There are 1044 size-7 undirected ones and 9608 size-5 directed ones. Adding support for these within the same framework is technically possible, but perhaps not a great idea.
What we can do instead is to use Bliss to canonicalize subgraphs and return the result not as a vector, but as a graph-to-count mapping (“dictionary”). In this mapping, we can use some very compact representation of graphs, such as a bitwise packing of the adjacency matrix into an integer type. This allows storing 8-vertex directed or 11-vertex undirected graphs in a single 64-bit integer (or size-6 directed and size-8 undirected in a 32-bit integer). Since such a bitwise representation would be very cumbersome for users to work with from high-level interfaces, the public API could use nauty’s Graph6/Digraph6 formats instead. This is also a bitwise representation of the adjacency matrix, but it’s encoded into ASCII and it is supported by lots and lots of graph theory tools. We’d need to add a Graph6 encoder/decoder to igraph, but I already have the decoder implemented in IGraph/M, I just need to port it to the C core.
Does this sound good?
The one thing that might go wrong (and will likely go wrong) is that Bliss will be too slow for this task. Bliss is faster than nauty for hard problems and large graphs, but I expect that it has a high initialization cost, which would be an issue, since it needs to be called for every subgraph.
To be clear, this project on handling larger motifs is for the long term. As you might have guessed @tamas, it was prompted by the ISMAGS feature request.
Adding 5- and 6-vertex undirected motifs is easy and will be done for 0.10.
I figured out the ordering of graphs in the existing lookup tables:
All possible adjacency matrices are generated in a certain natural order. Whenever a new graph appears, that is not isomorphic to any previously generated ones, it is added to the list of graphs.
In other words, if you look at the igraph_i_isoclass2 tables, whenever a new, so-far-unseen number appears, it is precisely one larger than the largest number encountered so far.
Sounds good to me! But as you said, this is something for beyond 0.10 as we have plenty of open development threads for 0.10 already, and this is not API-breaking so it’s not necessary to push this into 0.10.