igraph is written in C89 (not C++), which does not have explicit support for inline functions. Usually, the compiler would automatically inline functions where possible, but it can not do this across compilation units—unless we use link-time optimization.
igraph does have several functions which perform such a small task that the time they take may be comparable to the overhead from the function call. For example, the random number generation functions can only return a single random number at a time.
In a preliminary test, I saw a speedup from 9.1 s to 5.1 s just by compiling with
-flto=thin on macOS. We should investigate if providing official binaries compiled with LTO is useful, e.g. for the Python interface. On macOS I already do this for the Mathematica interface.
Ideally, we should also try to eliminate this issue with very small functions that cannot be inlined, and make sure that such functions are not called in a loop when avoidable.
For example, in C++ it is safe (performance-wise) to do
for (int i=0; i < vec.size(); ++i) and call
.size() in each iteration, because it gets inlined anyway. Doing the same with
igraph_vector_size() would slow things down. But calling
vecor_size repeatedly is avoidably, while calling the RNG repeatedly is not.
This idea brings me to another topic that I’ve had in mind for a while: are there any well-established benchmarks that can be used to test the performance of a library like ours? If so, we should implement a few of these so we can test whether the optimizations we come up with actually make sense in the long run or not.
I know that there are a few benchmarks in
examples/benchmarks - are these compiled as part of the build process? If so, maybe we could have a
make benchmark goal in the Makefile so we can run these easily every now and then.
@szhorvat You can now pass
--enable-lto to the configure script before compilation to enable
-flto for the whole build. Untested, so beware. It’s not enabled by default.
Thanks for pointing it out. I didn’t know about this. I did it with CFLAGS, CXXFLAGS, LDFLAGS. To be specific, I used
I experimented quite a bit with LTO for igraph over the years, and one thing that I must note is that it tends to result in weird crashes that I could not debug. These could be due to real but subtle igraph bugs (e.g. code with undefined behaviour) or a bug in the compiler.
So far I use specifically
-flto=thin and only on macOS because I did not experience any issues with this setup yet.
Ah, you just added this today! I didn’t get this at first
Yes, good suggestion! If the hardware of the CI is stable, it would provide an easy way to run these benchmarks.