Subgraph vs g.vs.select behavior

Hello,
In my usage of igraph, I consistently extract attribute information from vertices in my graphs. These graphs can have an upwards of 1e6 vertices, with typically a degree of 2-3 per vertex.

I’ve found that it’s faster to extract vertex attribute information from vertices by first creating a subgraph of the vertices and then pulling information from them from that subgraph. Mainly, I’m posting because I want to ask if this observation in typical? Am I doing something incorrectly to be causing the vertex retrieval to be so slow?

See a minimally reproducible example below.

import igraph as ig
import numpy as np
from time import perf_counter as pf

### Graph setup ### 
vs = 1000000
es = 1000000

edges = np.random.randint(0, vs, (es, 2))

g = ig.Graph()
g.add_vertices(vs)
g.vs['radius'] = np.random.randint(0, 20, vs)
g.add_edges(edges)
g.vs['coords'] = np.random.randint(0,1000, (vs, 3))

components = g.components()

### V selection attribute extraction ### 
t = pf()
for component in components:
    vs = g.vs[component]
    coords = vs['coords']
print (f"Vertex isolation speed: {pf() - t:.2f} s")

### Subgraph attribute extraction ### 
t = pf()
for component in components:
    c_graph = g.subgraph(component)
    coords = c_graph.vs['coords']
print (f"Subgraph isolation speed: {pf() - t:.2f} s")

And an example speed test on my computer.

Vertex isolation speed: 169.08 s
Subgraph isolation speed: 33.84 s

I’ll take a look at this soon (probably tomorrow) and get back to you. It does not seem normal to me at first glance.

Sorry, Christmas season struck and I dropped the ball on this one. I’ll get back to this as soon as possible, hopefully some time next week. Thanks for your patience!

Okay, I think I’ve found the bottleneck deep in the Python-C glue code; g.vs[component] unnecessarily constructs a vector in C that contains all the integers from 0 to g.vcount()-1 and then indexes that; that’s why g.vs[component] is so slow. I will provide a fix for this in the next patch release. Thanks for reporting!

You can follow the issue in the issue tracker here.

1 Like

Thanks for the update tamas - will definitely be keeping an eye out for this update!

For the time being - is this a problem with a recent update? I.e., do older versions of igraph have faster vertex access? If so, I might downgrade for the moment for the sake of speed. Thanks!

Sorry for the late reply, Christmas holidays :slight_smile: I don’t think it’s a recent problem, older versions are likely to have the same issue. However, I plan to fix this for the next patch version in a few days so stay tuned.

The planned schedule is as follows:

  • Version 0.9.6 of the C core is released later today (if everything goes well with the CI tests of the various higher-level interfaces).
  • Then I’ll fix this bug, hopefully this week.
  • A new version of python-igraph is then expected towards the end of this week.
1 Like

Okay, so now I have a patch that brings down the time of the “vertex isolation” case to 0.22s from 88.57s with 1 million vertices. I’ll test it a bit more, then fix the same thing in g.es[...] and then this will go into 0.9.9.

1 Like

Awesome. I’ll be looking forward to the new release. Thanks tamas!

0.9.9 is already out. You can install it right now.

1 Like