I’m having a problem when I create a graph from a pandas DataFrame: the vertices have no identifier (or at least I can’t figure out how to retrieve it).
I created a DataFrame from this file, from the STRING website. The result is below:
protein1 protein2 neighborhood fusion cooccurence coexpression experimental database textmining combined_score
0 5664.LmjF.01.0010 5664.LmjF.20.1620 0 0 0 0 0 0 784 784
1 5664.LmjF.01.0010 5664.LmjF.11.0010 0 0 0 0 0 0 784 784
2 5664.LmjF.01.0010 5664.LmjF.23.0030 0 0 0 0 0 0 784 784
3 5664.LmjF.01.0010 5664.LmjF.18.0010 0 0 0 0 0 0 784 784
4 5664.LmjF.01.0030 5664.LmjF.33.0810 0 0 0 0 235 0 0 235
... ... ... ... ... ... ... ... ... ... ...
854913 5664.LmjF.36.7000 5664.LmjF.33.2720 163 0 0 0 0 0 0 163
854914 5664.LmjF.36.7000 5664.LmjF.33.0960 173 0 0 0 0 0 0 173
854915 5664.LmjF.36.7000 5664.LmjF.28.1970 167 0 0 0 0 0 0 167
854916 5664.LmjF.36.7000 5664.LmjF.24.0770 173 0 0 0 0 0 0 173
854917 5664.LmjF.36.7000 5664.LmjF.12.0280 172 0 0 0 0 0 64 191
Then I converted it to a graph object using:
>>> g = Graph.DataFrame(df, directed=False)
(“df” is the dataframe)
According to this excerpt from the igraph documentation,
def DataFrame(cls, edges, directed=True, vertices=None, use_vids=False):
Generates a graph from one or two dataframes.
Parameters edges pandas DataFrame containing edges and metadata. The first two
columns of this DataFrame contain the source and target vertices
for each edge. These indicate the vertex *names* rather than ids
unless `use_vids` is True and these are nonnegative integers.
(and since I’m not setting “use_vids”), I expected the contents of the first two columns to be stored as the vertex “names”. But that didn’t happen. For example, if I use
>>> g.vs[0]
I get:
igraph.Vertex(<igraph.Graph object at 0x7fd56cdc2c70>, 0, {})
(notice the empty braces, meaning the vertex has no attributes). And g.summary() lists only edge attributes:
'IGRAPH U--- 4455 854918 -- \n+ attr: coexpression (e), combined_score (e), cooccurence (e), database (e), experimental (e), fusion (e), neighborhood (e), textmining (e)'
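To double-check that I am reading the summary correctly, here is a small sketch that groups the listed attributes by kind, assuming the usual summary convention where (v) marks a vertex attribute, (e) an edge attribute and (g) a graph attribute. The helper name attributes_by_kind is hypothetical (it is mine, not part of igraph), and the pattern only handles attribute names made of letters, digits and underscores:

```python
import re

# The summary string exactly as printed above.
summary = (
    "IGRAPH U--- 4455 854918 -- \n+ attr: coexpression (e), combined_score (e), "
    "cooccurence (e), database (e), experimental (e), fusion (e), "
    "neighborhood (e), textmining (e)"
)


def attributes_by_kind(summary_text):
    """Group attribute names by their kind marker: g (graph), v (vertex), e (edge)."""
    kinds = {"g": [], "v": [], "e": []}
    # Each attribute is printed as "<name> (<kind>)".
    for name, kind in re.findall(r"(\w+) \(([gve])\)", summary_text):
        kinds[kind].append(name)
    return kinds


attrs = attributes_by_kind(summary)
print("vertex attributes:", attrs["v"])
print("edge attributes:", attrs["e"])
```

The vertex list comes back empty, which matches what g.vs[0] shows.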
Am I doing anything wrong? Or did I misunderstand how Graph.DataFrame works?
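To make my expectation concrete, here is a minimal pure-Python sketch of what I assumed happens with the first two columns. This is only my reading of the documentation excerpt, not igraph's actual implementation: build_name_index is a hypothetical helper, and numbering the vertices in order of first appearance is my own assumption (igraph may well order them differently).

```python
def build_name_index(edge_rows):
    """Map each distinct vertex name to an integer id (order of first appearance)."""
    name_to_id = {}
    edge_ids = []
    for source, target in edge_rows:
        for name in (source, target):
            if name not in name_to_id:
                name_to_id[name] = len(name_to_id)
        # Store the edge as a pair of integer ids instead of names.
        edge_ids.append((name_to_id[source], name_to_id[target]))
    return name_to_id, edge_ids


# A few (protein1, protein2) pairs taken from the DataFrame above.
edge_rows = [
    ("5664.LmjF.01.0010", "5664.LmjF.20.1620"),
    ("5664.LmjF.01.0010", "5664.LmjF.11.0010"),
    ("5664.LmjF.01.0030", "5664.LmjF.33.0810"),
]

name_to_id, edge_ids = build_name_index(edge_rows)

# What I expected to find as the per-vertex "name" attribute, indexed by id.
vertex_names = sorted(name_to_id, key=name_to_id.get)
print(vertex_names[0])
print(edge_ids)
```

In other words, I expected g.vs[0] to carry something like {'name': '5664.LmjF.01.0010'} rather than an empty dictionary.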