most efficient way to refer to a vertex in a graph

As stated in the tutorial, each vertex has a unique number as its id, starting from 0 continuously to N-1, where N is the number of vertices in the graph.

However, the id for a vertex p may change if one vertex prior to it was deleted from the graph. So what is the most efficient way to refer to p after the delete operation?

Currently, I can only define a unique “name” attribute for each vertex beforeahead, and then use seq = g.vs(name_eq = unique_name_of_p) and seq[0] to get the reference of vertex p, which I intuitively think may be really inefficient, especially when dealing with a large graph.

I wonder whether igraph provides a more direct way to refer to a specific vertex, like a static and unique key so that the vertex can be referred directly?

1 Like

The name attribute is treated in a special way by the internals of the Python interface; in particular, the name attribute is indexed internally in a Python dict so looking up the ID of a node given its name attribute is an O(1) operation (on average). The index is invalidated if you mutate the graph or change the name attribute. This happens here and here in the C implementation of the Python interface.

Furthermore, there are dedicated code paths for handling g.vs.find(name="something"), which is the idiomatic way to get a handle for the Vertex object corresponding to the given name. (Note that Vertex objects are ephemeral, though, they are essentially a container for a graph and a vertex index, so they become invalid when you mutate the graph). So, g.vs.find(name="something") should be able to look up a vertex with a given name in constant amortized time.

Also, any method of the Graph object that accepts a single vertex index as an input should also accept a vertex name instead, and it will look up the index of the vertex corresponding the given name using the index. So, if you have a vertex with name equal to "A" and you want to get its degreee, you can simply call g.degree("A"). For example:

>>> from igraph import Graph
>>> g=Graph.GRG(25, 0.4)
>>> g.vs["name"] = list("ABCDEFGHIJKLMNOPQRSTUVWXY")
>>> g.vs.find("I").index
8
>>> g.degree("I")
9
>>> g.degree(8) == g.degree("I")
True

Note that the existence of the index assumes that the vertex names are unique, but this is not enforced in any way If you accidentally assign the same name to two vertices, the index will map the name to one of the vertices but it’s unspecified which one will be the “winner” (and you won’t get any warnings either).

1 Like

Thanks for your comprehensive reply. This solved my problem.

By the way, I find that there is the relative explanation in the tutorial. Maybe I missed the part when I read the tutorial. Anyway, thanks for your time and patience.

A further question: can we get the handle of an edge in a graph by using its name attribute in O(1)?

I failed to get the handle of an edge by the following code:

edge = g.es.find(edge_name)

However, the following line works, which I do not know whether it is executed in O(1).

edge = g.es.select(name_eq = edge_name)

Compared with getting the handle of vertex, I think it is OK if igraph does not support getting the handle of an edge directly by name in O(1). Currently I use edge = g.get_eid(v1, v2) to get the handle of an edge. I just want to confirm the inner mechanism.

The name attribute of edges is treated the same way as any other attribute, so no, there is no indexing behind that and g.es.select(name_eq=edge_name) would make a full scan. In theory it wouldn’t be too hard to add indexing for the edges as well but so far it has been a relatively uncommon scenario. g.get_eid(v1, v2) is definitely faster; O(1) to find the vertices by name and O(log(d)) for finding the edge among the edges incident on v1 (or v2).

If you feel like your code could really benefit from constant-time edge lookups by name, let me know and we’ll figure out something.

Thanks for the reply. I agree with you that “add indexing for the edges as well but so far it has been a relatively uncommon scenario”. Currently g.get_eid(v1, v2) is enough for me.

This story about edge names will become much more interesting if we start working with multigraphs and remove edges at the same time. Consider the following graph:

Distinguishing the two edges between 1 and 3 is important because removing one merges the green/orange, removing the other merges the blue/orange faces. Currently, we can only distinguish edges by index. If we repeatedly do such removals (face mergers), the edge indices will change. Thus the only reliable way to refer to edges consistently is to give them names. Then it follows that fast lookup by name might be a good idea.

Another application is moving between the original graph and its line graph.

I think we’re not quite there yet though, it’s not yet time to implement fast edge lookup by name. We need a more practical use case first.

2 Likes

A post was split to a new topic: Check if vertex exists, create it if it doesn’t