No vertex "names" in Pandas dataframe import

I’m having a problem when I create a graph from a Pandas dataframe: the vertices have no identification (or, at least, I don’t know how to get it).

I created a dataframe from this file, from the STRING website. The result is below:

                 protein1           protein2  neighborhood  fusion  cooccurence  coexpression  experimental  database  textmining  combined_score
0       5664.LmjF.01.0010  5664.LmjF.20.1620             0       0            0             0             0         0         784             784
1       5664.LmjF.01.0010  5664.LmjF.11.0010             0       0            0             0             0         0         784             784
2       5664.LmjF.01.0010  5664.LmjF.23.0030             0       0            0             0             0         0         784             784
3       5664.LmjF.01.0010  5664.LmjF.18.0010             0       0            0             0             0         0         784             784
4       5664.LmjF.01.0030  5664.LmjF.33.0810             0       0            0             0           235         0           0             235
...                   ...                ...           ...     ...          ...           ...           ...       ...         ...             ...
854913  5664.LmjF.36.7000  5664.LmjF.33.2720           163       0            0             0             0         0           0             163
854914  5664.LmjF.36.7000  5664.LmjF.33.0960           173       0            0             0             0         0           0             173
854915  5664.LmjF.36.7000  5664.LmjF.28.1970           167       0            0             0             0         0           0             167
854916  5664.LmjF.36.7000  5664.LmjF.24.0770           173       0            0             0             0         0           0             173
854917  5664.LmjF.36.7000  5664.LmjF.12.0280           172       0            0             0             0         0          64             191

Then I converted it to a graph object using:

>>> g = Graph.DataFrame(df, directed=False)

(“df” is the dataframe)

According to this excerpt from the igraph documentation,

def DataFrame(cls, edges, directed=True, vertices=None, use_vids=False):

Generates a graph from one or two dataframes.
Parameters   edges    pandas DataFrame containing edges and metadata. The first two
                      columns of this DataFrame contain the source and target vertices
                      for each edge. These indicate the vertex *names* rather than ids
                      unless `use_vids` is True and these are nonnegative integers.

(since I’m not using “use_vids”), I expected the contents of the first and second columns to appear as the vertex “names”. But that didn’t happen. For example, if I use

>>> g.vs[0]

I get:

igraph.Vertex(<igraph.Graph object at 0x7fd56cdc2c70>, 0, {})

(notice the empty braces, meaning the vertex has no attributes). And with g.summary(), I get this, with only edge attributes:

'IGRAPH U--- 4455 854918 -- \n+ attr: coexpression (e), combined_score (e), cooccurence (e), database (e), experimental (e), fusion (e), neighborhood (e), textmining (e)'

Am I doing something wrong? Or did I misunderstand how Graph.DataFrame works?

Answering my own question, in case someone else runs into the same situation: it was a problem with my igraph installation. I had an “igraph” package in my pip setup (I don’t remember why). So I just uninstalled it with:

pip uninstall igraph

And installed the correct package:

pip install python-igraph

Everything is fine now. :)
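
In case it helps anyone checking for the same mix-up, here is a minimal sketch that reports which of the two distribution names pip has installed. It uses only the standard library (`importlib.metadata`, so Python 3.8 or newer is assumed). The helper name `find_distributions` is my own and not part of igraph.

```python
from importlib import metadata


def find_distributions(names=("igraph", "python-igraph")):
    """Return {distribution name: version} for each name that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; leave it out of the result.
            pass
    return found


if __name__ == "__main__":
    installed = find_distributions()
    if not installed:
        print("Neither distribution is installed.")
    for name, version in installed.items():
        print(f"{name} {version}")
```

If the output lists a distribution you did not expect, that is a hint that the installation may be the culprit rather than the call to Graph.DataFrame.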
