Check if vertex exists, create it if it doesn't--igraph is Much Slower than Networkx

Given a name “n1” (string), I want to check whether there exists a vertex having the same name “n1” in a graph. If not, I will add “n1” to the graph.

Currently I have two methods:

  1. Time complexity: O(n).
if "n1" in g.vs["name"]:
    print("Already exists")
  1. O(1), but need to use “try…except”.
    id = g.vs.find("n1")
except ValueError: # does not exist

I wonder which one is better in terms of time, or if there exist other methods?

The answer also depends on how frequently you need to add a new vertex to the graph. If it happens rarely (i.e. you are adding or modifying edges most of the time, and very rarely you also need to add a new vertex), then you can convert g.vs["name"] into a Python set, use the in operator to test whether the name is already in there, and update the set every time you need to add a new vertex. Or, you can use the try..except syntax proposed in your second example, they should be the same in terms of performance if vertex additions are rare.

On the other hand, if vertex additions happen frequently, then maintaining a separate mapping from vertex name to vertex index is the way to go. This is because the Python interface simply invalidates its internal mapping from vertex name to vertex index when a new vertex is added or the name of a vertex is changed. So, if you have 10K nodes and you add one extra node, the mapping is invalidated and will be re-constructed from scratch the next time you try to refer to a vertex by its name. If you have a separate mapping and you know that you simply added a new vertex, you can update the external mapping on your own and not pay the cost of re-constructing the internal mapping at every call to g.vs.find().

The UniqueIdGenerator is a helper class that you may find useful if you care about performance and you are adding vertices frequently to a graph. Basically, you need to seed a UniqueIdGenerator with the existing names from your graph first:

id_gen = UniqueIdGenerator(initial=g.vs["name"])

and then simply use id_gen["foo"] to look up the ID corresponding to "foo". If "foo" already has an ID, the UniqueIdGenerator will give you that, otherwise it will give you the next available ID. You can then compare the received ID with g.vcount() to check whether you need to add a new vertex or not.

It is a good method to maintain an external set to record names of current vertices in the graph.

However, I find that it is not the essential part to influence the efficiency of adding new vertex to graph. As I have posted in another topic, the operation of adding vertices and edges to an existing graph in python-igraph is less efficient compared with networkx.

This is because in that other example you are adding edges one by one. igraph’s internal data structures are designed for static graphs so the cost of adding a single edge is almost the same as adding many of them. I’ll elaborate on this in the other topic.