Hi! Is there a way to tell if my igraph installation has OpenMP enabled?
Changing the OMP_NUM_THREADS variable doesn’t seem to change the performance of the personalized_pagerank function, so I think it is not enabled, but I would like to be sure!
@tamas, do you have any comments on this? I thought that the PyPI distribution was compiled with OpenMP, but it does indeed seem to use a single thread on my machine.
In theory they are built with OpenMP support, judging from the build logs on GitHub Actions (see here). Drilling down to the “Build wheels (manylinux)” part, it seems like OpenMP should be enabled (the build logs display OpenMP: yes). Also, examining _igraph.abi3.so within the downloaded wheel using ldd shows that it links to the OpenMP library, which is vendored within the wheel itself as igraph.libs/libgomp-a34b3233.so.1.0.0.
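The vendored-library check described above can be roughly automated. This is only a sketch: the helper names `find_openmp_libs` and `vendored_libs` are made up for illustration, and it assumes the Linux wheel layout mentioned above (an `igraph.libs` directory next to the package). An empty result does not prove OpenMP is absent, since the runtime could be linked statically, picked up from the system, or stored elsewhere on other platforms.

```python
import importlib.util
import re
from pathlib import Path

# Common OpenMP runtime names: GNU (libgomp), LLVM (libomp), Intel (libiomp5), MSVC (vcomp).
OPENMP_PATTERN = re.compile(r"libgomp|libomp|libiomp5|vcomp", re.IGNORECASE)


def find_openmp_libs(filenames):
    """Return the file names that look like an OpenMP runtime library."""
    return [name for name in filenames if OPENMP_PATTERN.search(name)]


def vendored_libs(package="igraph"):
    """List files in the '<package>.libs' directory next to the package, if any."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return []  # package not installed (or no regular __init__.py)
    site_dir = Path(spec.origin).parent.parent
    # Assumption: auditwheel-style layout with a sibling "<package>.libs" directory.
    libs_dir = site_dir / f"{package}.libs"
    if not libs_dir.is_dir():
        return []
    return sorted(p.name for p in libs_dir.iterdir())


if __name__ == "__main__":
    print(find_openmp_libs(vendored_libs()))
```

Running ldd on the extension module, as done above, remains the more direct check; this only inspects file names.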
So, all seems good to me. I only have a four-core Linux machine around to test things with, so I did this:
import igraph
g = igraph.Graph.GRG(1000000, 0.0025)
x = g.pagerank()
and watched the CPU usage during the calculation. With OMP_NUM_THREADS=4, the CPU usage went up to more than 270% during the PageRank calculation. During the graph generation it hovered stably at 100% as the GRG graph generation runs on a single thread only. So it seems to me that OpenMP is working.
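Instead of watching a system monitor, the same observation can be approximated from inside Python by comparing CPU time with wall-clock time. A minimal sketch follows; `cpu_utilization` is a hypothetical helper, not part of igraph. A ratio clearly above 1 suggests that several threads were busy, while a ratio near 1 suggests single-threaded execution.

```python
import time


def cpu_utilization(fn):
    """Run fn() and return (wall_seconds, cpu_seconds, cpu/wall ratio)."""
    wall_start = time.perf_counter()
    # process_time() is the user + system CPU time of the whole process,
    # so time spent in worker threads is included.
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


# Usage with igraph (assumes python-igraph is installed):
#   import igraph
#   g = igraph.Graph.GRG(1000000, 0.0025)
#   wall, cpu, ratio = cpu_utilization(g.pagerank)
#   print(f"wall={wall:.2f}s cpu={cpu:.2f}s ratio={ratio:.2f}")
```

Measuring only the pagerank() call this way keeps the single-threaded graph generation out of the number.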
I followed your process @tamas and it does seem like igraph OpenMP is working according to the OMP_NUM_THREADS variable.
Using the quick script below, changing OMP_NUM_THREADS from 1 to 128 makes the CPU usage look drastically different, but both settings took an average of 2.35 s per run. Is this to be expected?
import igraph
from tqdm import tqdm
g = igraph.Graph.GRG(1000000, 0.0025)
for i in tqdm(range(10)):
    x = g.pagerank()
A few possible explanations:
The code is dominated by the generation of the GRG graph and not by the call to pagerank().
PRPACK contains multiple implementations of the algorithm and it decides between them based on some criteria that I haven’t dived into in the source code. Not all the implementations are parallelized by OpenMP so you might not even be hitting the part of PRPACK that is parallelized.
The setup of OpenMP threads and the coordination between them might cost more than what you gain with parallelization at this graph size.
An extreme number of threads is not going to be productive. Try using only a few. I see a very clear improvement with 4 threads.
A good benchmark graph is igraph.Graph.Kautz(7,7).
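One way to compare a few thread counts on that graph is to run the benchmark in a fresh process per setting, since OpenMP runtimes typically read OMP_NUM_THREADS when they start up, so changing it after igraph has been imported may have no effect. The sketch below assumes python-igraph is installed for the child interpreter; `run_with_threads`, `benchmark` and `BENCH` are illustrative names only.

```python
import os
import subprocess
import sys


def run_with_threads(code, threads):
    """Run `code` in a fresh Python process with OMP_NUM_THREADS set; return its stdout."""
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Snippet executed in the child process (assumes python-igraph is importable there).
# Only the pagerank() call is timed, not the graph construction.
BENCH = """
import time, igraph
g = igraph.Graph.Kautz(7, 7)
start = time.perf_counter()
g.pagerank()
print(time.perf_counter() - start)
"""


def benchmark(thread_counts=(1, 2, 4)):
    """Return {threads: seconds spent in pagerank()} for each thread count."""
    return {n: float(run_with_threads(BENCH, n)) for n in thread_counts}
```

Calling `benchmark()` returns a mapping from thread count to the time spent in pagerank(). A single run per setting is noisy, so repeating each measurement a few times would give a fairer comparison.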
@tamas On macOS / arm64 I only see a single thread working when using the PyPI version. When I use the MacPorts version and enable OpenMP for it, I see 4 threads. Are you sure something’s not wrong with the macOS PyPI version?
You are right, arm64 wheels seem to be compiled without OpenMP. I strongly suspect that this is due to the fact that arm64 wheels are cross-compiled with cibuildwheel; maybe the build environment is not ready for OpenMP? I can compile igraph with OpenMP just fine on my Mac if I do it natively.