Update for matlab-igraph toolbox

Hi, I’ve started work on the toolbox again. As suggested in the previous post, I’ve moved from make to CMake, which is much cleaner. The toolbox now builds on macOS, Windows, and Linux. (The Windows build does not run yet, however. I’m not familiar with how Windows deals with libraries, but I think I need to ship some DLLs with it.) On Linux/macOS it is statically linked so that, at runtime, it doesn’t depend on any non-standard libraries. The build process is now automated with GitHub Actions for each OS. It also runs in MATLAB Online and has been submitted to the File Exchange, which in theory allows anyone to install it through MATLAB’s add-on interface.

There are still a few issues. For one, the File Exchange does not allow me to attach multiple toolboxes, so anyone not using Linux will have to manually grab the correct toolbox from the GitHub release. I could dump all the compiled files into a single toolbox and let MATLAB sort it out at runtime based on the architecture, but that would mean a significantly larger toolbox. I’m also having some issues with runtime dependencies. On Linux, MATLAB ships with an old version of libstdc++ and gives preference to the libraries it ships with, which leads to errors. I can either build igraph against an older version of libstdc++ (but I haven’t managed to get that to compile correctly) or manually remove the library shipped with MATLAB so that the system one is used. I also tried running it on an out-of-date lab computer and had issues with old versions of libm and libc, but I assume there’s nothing I can do on the distribution side to handle an end user having libraries that are too old.

After figuring out Windows, I intend to replace matrices as the default MATLAB-side datatype with MATLAB’s preexisting graph type, which allows for multigraphs and for attaching attributes/metadata. After that I will look into stimulus to see how much hand-written code I can replace with it.

Got Windows figured out by packaging the runtime DLLs with it. I thought I could do the same for the Linux issues, but it seems like MATLAB isn’t respecting the RPATHs. I can confirm the RPATH is set correctly and ldd shows the shared libs I ship with the toolbox, but inside MATLAB it looks elsewhere.

This is a heads up to keep an eye on the changelog for what will (fairly soon) become igraph 1.0. Feedback is always welcome. If there are issues that affect matlab-igraph specifically that you’d like to discuss, it’s best to do it as soon as possible.

Note that there are a number of changes related to attribute handling that are not yet in the changelog.

Thank you. I’ll start working on a dev branch for matlab-igraph to test out the dev branch of igraph and let you know if there are any major issues. Congrats on the milestone.

Was a little slow to get to this, but I’ve tried out the development branch and didn’t run into too many issues. The only difficulty I’ve found is that setting the default RNG now crashes MATLAB. But I can’t reproduce it in pure C, so it looks like it’s not an issue with igraph itself.

I haven’t dug into it too much, but the crash appears to occur when the RNG is called for the first time, not when setting it. It only occurs if the RNG is set in one MATLAB/mex call and then used in another, so I assume some memory is getting cleaned up somewhere that it shouldn’t be. It also only occurs if the RNG has been set explicitly; the built-in default works fine. It crashes reproducibly on the development branch but not on 0.10.12.

I’ll give it a closer look soon but thought it might be worth noting in case the other packages run into the same thing.

There were no changes that affect the RNG as far as I’m aware, so this is surprising. Are you able to figure out where exactly the crash happens using a debugger or AddressSanitizer?

AddressSanitizer is a little tricky to set up when loading a shared library into a program you don’t compile yourself, like MATLAB, but it’s doable. I do use it to debug igraph’s Mathematica interface. I described the basics on the igraph wiki page “Using sanitizers to find bugs” (igraph/igraph wiki on GitHub). You can refer to either the Mathematica or the Python section. Basically:

  • When you compile igraph, pass -DUSE_SANITIZERS=Address to CMake. This adds the necessary compiler flags.
  • When you compile your own code, you probably need to add the compiler flags yourself. For compilation, use -fsanitize=address -fno-omit-frame-pointer -g, and it is recommended to use the -Og or -O1 optimization level.
  • When you start MATLAB, use it from the command line, and pre-load the AddressSanitizer libraries using DYLD_INSERT_LIBRARIES on macOS or LD_PRELOAD on Linux. When using Clang (not GCC!), if you don’t do this, and try to load matlab-igraph anyway, MATLAB should crash and a message that includes the necessary LD_PRELOAD line should be printed.

Notes:

  • Doing this should be possible on Linux or macOS; I’m not sure about Windows.
  • GCC or Clang both work, but Clang might work better (e.g. better error messages when things go wrong).
  • Be sure to preload the correct ASan libraries, as there might be more than one copy on your machine. Use the same one that the matlab-igraph shared library is linking to.

Thanks. I was declaring the new RNG inside a function scope, so it was getting cleaned up. Moving the variable outside keeps it alive and fixed the issue. I traced it to e905376dd. There was a minor change to the behavior of igraph_rng_set_default: it now makes the default point at the new RNG instead of copying the new RNG into the default. Since I had defined my RNG in the wrong scope, the default ended up pointing at an RNG that no longer existed.

- * This function \em copies the internal structure of the given \type igraph_rng_t
- * object to igraph's internal default RNG structure. The structure itself
- * contains two pointers only, one to the "methods" of the RNG and one to the
- * memory buffer holding the internal state of the RNG. This means that if you
- * keep on generating random numbers from the RNG after setting it as the
- * default, it will affect the state of the default RNG as well because the two
- * share the same state pointer. However, do \em not expect
- * \ref igraph_rng_default() to return the same pointer as the one you passed
- * in here - the state is shared, but the entire structure is not.
+ * This function updates the default RNG used by igraph to be the one
+ * pointed to by \p rng, and returns a pointer to the previous default
+ * RNG. Future calls to \ref igraph_rng_default() will return the same
+ * pointer as \p rng. The RNG pointed to by \p rng must not be destroyed
+ * for as long as it is used as the default.

In either case, it was my mistake as the docs clearly state not to destroy the RNG.
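To make the lifetime problem concrete, here is a minimal self-contained sketch of the pattern. It does not use igraph at all: demo_rng_t, demo_rng_set_default and the two "entry point" functions are hypothetical stand-ins that only mimic the pointer-storing contract from the new docs above, and the generator is a toy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for igraph_rng_t. NOT the real igraph type. */
typedef struct {
    uint64_t state;
} demo_rng_t;

/* Stand-in for the library's internal "current default RNG" pointer. */
static demo_rng_t *default_rng = NULL;

/* Mimics the new contract: store the pointer (no copy) and
 * return the previous default. */
demo_rng_t *demo_rng_set_default(demo_rng_t *rng) {
    demo_rng_t *previous = default_rng;
    default_rng = rng;
    return previous;
}

demo_rng_t *demo_rng_default(void) {
    return default_rng;
}

/* File-scope storage lives as long as the library stays loaded, so it
 * survives between separate calls into the library. */
static demo_rng_t persistent_rng;

/* Hypothetical "set the RNG" entry point (think: one mex call). */
void demo_entry_set_rng(uint64_t seed) {
    /* The buggy version declared `demo_rng_t rng;` right here. That is a
     * stack variable which dies on return and leaves the default pointer
     * dangling. Using the file-scope object avoids that. */
    persistent_rng.state = seed;
    demo_rng_set_default(&persistent_rng);
}

/* Hypothetical "use the RNG" entry point (think: a later mex call). */
uint64_t demo_entry_use_rng(void) {
    demo_rng_t *rng = demo_rng_default();
    assert(rng != NULL);
    /* Toy 64-bit LCG step, only here so the state visibly changes. */
    rng->state = rng->state * UINT64_C(6364136223846793005)
                 + UINT64_C(1442695040888963407);
    return rng->state;
}
```

Calling demo_entry_set_rng(42) and then demo_entry_use_rng() from a different call is safe here because the object the default points to is still alive. With the copying behavior of the old igraph_rng_set_default, the stack-allocated version happened to work too, which is why the bug only showed up on the development branch.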

Also taking a closer look at rigraph’s rrandom.c file, I finally noticed it’s wrapping R’s internal RNG for igraph’s use. That’s really neat you can do that sort of thing. I’m still fairly new to C and am picking up a lot from working through the igraph library. I’ll try to add that in to matlab-igraph as the separation between matlab and igraph’s RNGs was a shortcoming I was aware of but thought I had to accept.

Yes, this is absolutely doable, and is intended to be done in high-level interfaces. I do it in the Mathematica interface as well (see here).

The minimum you need to do is to provide the get member function of the igraph_rng_type_t, and specify the number of bits it produces. The more bits you can provide, the better the performance will be, but the maximum possible number of bits is the size of igraph_integer_t, which by default is 32 bits on 32-bit systems and 64 bits on 64-bit systems.

Something to think about is if you want consistent behaviour between the 32-bit and 64-bit versions, in case you support both. If yes, don’t supply more than 32 bits. I do this in the Mathematica interface, and I rely on it for using the same test suite on both 32-bit and 64-bit platforms.

You may additionally provide specific implementations of samplers from various distributions. This can potentially improve performance. igraph does supply high quality samplers.
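As a rough illustration of the "get plus bit count" idea, here is a self-contained sketch. It deliberately does not include igraph’s headers: demo_rng_type_t is a simplified stand-in with only the fields discussed above (the real igraph_rng_type_t has more members), host_get is a toy xorshift32 playing the role of the host language’s generator, and demo_get_uint64 is just one plausible way a library could assemble wider values, not igraph’s actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for an RNG type descriptor (hypothetical). */
typedef struct {
    const char *name;
    uint8_t bits;                 /* random bits returned per get() call */
    uint64_t (*get)(void *state); /* returns `bits` uniformly random bits */
} demo_rng_type_t;

/* Toy host generator state: xorshift32, seed must be nonzero. */
typedef struct {
    uint32_t x;
} host_state_t;

/* The one function a wrapper has to supply: fetch bits from the host. */
uint64_t host_get(void *state) {
    host_state_t *s = (host_state_t *)state;
    assert(s->x != 0);
    s->x ^= s->x << 13;
    s->x ^= s->x >> 17;
    s->x ^= s->x << 5;
    return s->x;
}

/* Declares 32 bits per call, which keeps results identical on 32-bit
 * and 64-bit builds. */
const demo_rng_type_t demo_host_rng_type = { "host (demo)", 32, host_get };

/* Builds a 64-bit value from however many get() calls are needed.
 * Fewer bits per call means more calls, hence the performance remark. */
uint64_t demo_get_uint64(const demo_rng_type_t *type, void *state) {
    uint64_t result = 0;
    int filled;
    assert(type->bits > 0 && type->bits <= 64);
    if (type->bits == 64) {
        return type->get(state);
    }
    const uint64_t mask = (UINT64_C(1) << type->bits) - 1;
    for (filled = 0; filled < 64; filled += type->bits) {
        result = (result << type->bits) | (type->get(state) & mask);
    }
    return result;
}
```

With a 32-bit generator, demo_get_uint64 needs two get calls per 64-bit value; a generator declaring 64 bits would need one.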

I looked into it a bit, and MathWorks doesn’t expose the headers for their RNG functions (as far as I can tell; there are no hits for rand or rng in their header files). I found the symbols in a shared lib they ship (libmwmathrng.so), but actually using these functions would certainly be unsupported. Not sure if I would want to add that in. Joys of working with proprietary software.

I had a similar problem with Mathematica in that accessing the RNG from C wasn’t documented. But Mathematica has a feature to generate C code from Mma code. This code links to a runtime library, which is presumably compatible between versions. I looked at how this generated code accesses the RNG from the runtime library, and used the same.

This might or might not be an option for MATLAB, I just wanted to mention the possibility. I believe MATLAB has much better support for C code generation than Mathematica, and I think it also has a corresponding runtime library.

Code generation seemed promising but requires an expensive toolbox. If the code it generates can be compiled without the toolbox, it might be feasible; otherwise, I can’t use it with GitHub Actions.

For now I went with linking to the undocumented math RNG library. It turns out MathWorks doesn’t expose their check-user-interrupt function either, so I have to use undocumented functions anyway.

Except it doesn’t work on Windows. I’m not sure why but I keep getting undefined reference errors from the linker even though I’m providing it with the library that contains those functions. This works fine on Linux and macOS and the private interrupt library can be linked on Windows.

Not ideal, but for now I’m using calls from C to MATLAB as a fallback when compiling on Windows. It’s about two orders of magnitude slower than directly calling the C++ RNG functions. For many operations I don’t think it’ll be noticeable, and I left in the igraph.rng function to switch to one of the RNGs provided by igraph if needed.

Caching results by requesting several random numbers in a single MATLAB call significantly improves performance, but with the downside that I can’t use MATLAB’s random integer function, since the low and high ends of the range aren’t known ahead of time. And I would rather not juggle multiple random number caches that are all modifying the RNG’s state. So I would use a single cache of uniformly distributed numbers for the RNG’s get function and use igraph’s methods to convert that to integers/normal distribution/etc. Additionally, I would have to deal with a stale cache when the RNG gets reseeded, which likely means clearing the cache on every call from MATLAB to C. So for now I’ve decided not to bother with caching results.
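For the record, a minimal sketch of the kind of cache described above, in case it becomes worthwhile later. Everything here is hypothetical: the refill callback stands in for a single batched call into MATLAB, and demo_refill is a deterministic dummy so the example runs on its own.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 256

/* Hypothetical callback: fills `out` with `n` uniform doubles in [0, 1)
 * using one expensive call (e.g. a single call from C into MATLAB). */
typedef void (*refill_fn)(double *out, size_t n, void *ctx);

typedef struct {
    double values[CACHE_SIZE];
    size_t next;      /* index of next unused value; CACHE_SIZE = empty */
    refill_fn refill;
    void *ctx;
} uniform_cache_t;

void cache_init(uniform_cache_t *c, refill_fn refill, void *ctx) {
    assert(refill != NULL);
    c->next = CACHE_SIZE; /* start empty so the first get() refills */
    c->refill = refill;
    c->ctx = ctx;
}

/* Drop unused values. Would be called whenever the host RNG may have
 * been reseeded, e.g. at the start of every call from MATLAB to C. */
void cache_invalidate(uniform_cache_t *c) {
    c->next = CACHE_SIZE;
}

/* Return one uniform value, refilling the whole batch when empty. */
double cache_get(uniform_cache_t *c) {
    if (c->next == CACHE_SIZE) {
        c->refill(c->values, CACHE_SIZE, c->ctx);
        c->next = 0;
    }
    return c->values[c->next++];
}

/* Dummy source for demonstration: counts how often it is called and
 * writes an evenly spaced ramp instead of real random numbers. */
typedef struct {
    unsigned calls;
} demo_source_t;

void demo_refill(double *out, size_t n, void *ctx) {
    demo_source_t *src = (demo_source_t *)ctx;
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = (double)i / (double)n;
    }
    src->calls++;
}
```

The trade-off is visible in cache_invalidate: clearing on every entry from MATLAB throws away up to CACHE_SIZE - 1 values each time, so short calls that only need a handful of numbers would pay for a full batch anyway.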

Also now have support for MATLAB’s builtin graph class. Makes it much easier to keep track of graph data (i.e. node/edge attributes). Functions that create or read in a graph default to using the graph class but can return matrices if requested. Any function that accepts a graph can use a matrix or graph object interchangeably, and if the function accepts an attribute, it can be specified as either a standalone vector or the name of an attribute associated with a graph object. Similarly, if a function returns something that could be considered an attribute, that can be returned as a vector or it can modify the graph object.

The support for working with classes in MATLAB’s mex C API is very limited. I cannot access all the data needed in mex and cannot create a graph in mex, so I have to convert to and from a structure with MATLAB calls. This is generally fine but requires calling the MATLAB graph constructor, which is slow. I assume that for any graph big enough for the slowdown to be noticeable, it will be small compared to the time it takes to run other algorithms, so hopefully not an issue.

Now I’m just cleaning up the error/interruption handling, which I haven’t set up well yet. Then I’ll try generating code with stimulus.