Update
You can now construct a graph directly from a pandas DataFrame using Graph.DataFrame.
Old answer
The easiest way is to use Graph.DictList, as follows:
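    import igraph as ig

    G = ig.Graph.DictList(
        vertices=nodes_df.to_dict('records'),
        edges=edges_df.to_dict('records'),
        directed=True,
        vertex_name_attr='id',                   # column of nodes_df holding the node identifier
        edge_foreign_keys=('source', 'target'))  # columns of edges_df holding the edge endpoints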
Here, vertex_name_attr refers to the column of nodes_df that contains the node identifier (assumed to be id here). The edge_foreign_keys refer to the columns of edges_df that contain the source and target identifiers of the edges. All other columns are automatically added as node or edge attributes. For example, if nodes_df has a column group, it becomes accessible as G.vs['group'], and if edges_df has a column weight, it becomes available as G.es['weight']. This assumes that none of the relevant columns is used as the index, since to_dict('records') does not include the index. The downside of this approach is that it is somewhat slow.
A somewhat faster variant is provided by:
    G = ig.Graph.TupleList(edges_df.values,
                           directed=True,
                           edge_attrs=['weight'] + list(edges_df.columns[3:]))

This assumes that the first three columns of edges_df are, respectively, the source, the target and the edge weight. The names of these columns are irrelevant; it only matters that they are the first three columns, in this exact order. The names in edge_attrs are matched by position to the columns after the source and target, so the third column is stored as weight and all remaining columns keep their own names. Note that weights=True cannot be combined with edge_attrs (TupleList raises a ValueError if both are given), so weight is listed explicitly.
We then still have to add the node attributes separately. ig.Graph.TupleList automatically keeps track of the node identifiers in the node attribute name, which we can use to look up the matching rows of nodes_df. Assuming you have an index set on the node identifier, you can do the following:
    for column in nodes_df:
        # Look up the rows in vertex order and pass the values as a plain list
        G.vs[column] = nodes_df.loc[G.vs['name'], column].tolist()
If you still need to set the node identifier as the index, you can do so with nodes_df = nodes_df.set_index('id'), assuming the node identifier is the column id.
If you still encounter issues, let us know.