Hello,
In my usage of igraph, I consistently extract attribute information from vertices in my graphs. These graphs can have an upwards of 1e6 vertices, with typically a degree of 2-3 per vertex.
I’ve found that it’s faster to extract vertex attribute information from vertices by first creating a subgraph of the vertices and then pulling information from them from that subgraph. Mainly, I’m posting because I want to ask if this observation in typical? Am I doing something incorrectly to be causing the vertex retrieval to be so slow?
See a minimally reproducible example below.
import igraph as ig
import numpy as np
from time import perf_counter as pf
### Graph setup ###
vs = 1000000
es = 1000000
edges = np.random.randint(0, vs, (es, 2))
g = ig.Graph()
g.add_vertices(vs)
g.vs['radius'] = np.random.randint(0, 20, vs)
g.add_edges(edges)
g.vs['coords'] = np.random.randint(0,1000, (vs, 3))
components = g.components()
### V selection attribute extraction ###
t = pf()
for component in components:
vs = g.vs[component]
coords = vs['coords']
print (f"Vertex isolation speed: {pf() - t:.2f} s")
### Subgraph attribute extraction ###
t = pf()
for component in components:
c_graph = g.subgraph(component)
coords = c_graph.vs['coords']
print (f"Subgraph isolation speed: {pf() - t:.2f} s")
And an example speed test on my computer.
Vertex isolation speed: 169.08 s
Subgraph isolation speed: 33.84 s
Sorry, Christmas season struck and I dropped the ball on this one. I’ll get back to this as soon as possible, hopefully some time next week. Thanks for your patience!
Okay, I think I’ve found the bottleneck deep in the Python-C glue code; g.vs[component] unnecessarily constructs a vector in C that contains all the integers from 0 to g.vcount()-1 and then indexes that; that’s why g.vs[component] is so slow. I will provide a fix for this in the next patch release. Thanks for reporting!
You can follow the issue in the issue tracker here.
For the time being - is this a problem with a recent update? I.e., do older versions of igraph have faster vertex access? If so, I might downgrade for the moment for the sake of speed. Thanks!
Sorry for the late reply, Christmas holidays I don’t think it’s a recent problem, older versions are likely to have the same issue. However, I plan to fix this for the next patch version in a few days so stay tuned.
The planned schedule is as follows:
Version 0.9.6 of the C core is released later today (if everything goes well with the CI tests of the various higher-level interfaces).
Then I’ll fix this bug, hopefully this week.
A new version of python-igraph is then expected towards the end of this week.
Okay, so now I have a patch that brings down the time of the “vertex isolation” case to 0.22s from 88.57s with 1 million vertices. I’ll test it a bit more, then fix the same thing in g.es[...] and then this will go into 0.9.9.