Load nodes with attributes and edges from DataFrame to NetworkX
This is essentially the same answer, updated with some details filled in. The setup is nearly identical, except that the nodes are identified by name rather than by a numeric index, which addresses @LancelotHolmes's comment and makes the example more general:
import networkx as nx
import pandas as pd
linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                         'target': ['Bob', 'Cindy'],
                         'weight': [100, 50]})
nodeData = pd.DataFrame({'name': ['Amy', 'Bob', 'Cindy'],
                         'type': ['Foo', 'Bar', 'Baz'],
                         'gender': ['M', 'F', 'M']})
G = nx.from_pandas_edgelist(linkData, 'source', 'target', edge_attr=True, create_using=nx.DiGraph())
Here the True argument (the edge_attr parameter) tells NetworkX to keep all the remaining columns of linkData as edge attributes. In this case I've made the graph a DiGraph, but if you don't need a directed graph, you can pass a different graph type as the last argument (the create_using parameter).
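As a minimal sketch of switching graph types, the snippet below builds the same edge list once as an undirected graph and once as a directed one. The variable names G_undirected and G_directed are only illustrative, and the sketch assumes NetworkX 2.x or later, where from_pandas_edgelist is available:

```python
import networkx as nx
import pandas as pd

linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                         'target': ['Bob', 'Cindy'],
                         'weight': [100, 50]})

# Omitting create_using gives the default, an undirected nx.Graph.
G_undirected = nx.from_pandas_edgelist(linkData, 'source', 'target',
                                       edge_attr=True)

# Passing a DiGraph keeps the source -> target direction of each row.
G_directed = nx.from_pandas_edgelist(linkData, 'source', 'target',
                                     edge_attr=True,
                                     create_using=nx.DiGraph())

# An undirected edge can be looked up in either direction,
# a directed edge only from source to target.
print(G_undirected.has_edge('Bob', 'Amy'))
print(G_directed.has_edge('Bob', 'Amy'))
```

In both cases the weight column ends up as an edge attribute, because edge_attr=True copies every column other than source and target.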
Next, the rows of nodeData have to be matched to the nodes that were generated from linkData, and those nodes are keyed by name. So set the index of the nodeData dataframe to the name column before converting it to a dictionary, so that NetworkX 2.x can load it as the node attributes.
nx.set_node_attributes(G, nodeData.set_index('name').to_dict('index'))
This converts the whole nodeData dataframe into a dictionary of dictionaries: each outer key is a node name, and the inner dictionary holds that node's remaining columns as key:value pairs (i.e., normal node attributes where the node identifier is its name).
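To check that the attributes landed on the right nodes, the whole procedure can be run end to end and the result inspected. This is a sketch under the same assumptions as above (NetworkX 2.x or later); node_attrs is just an illustrative name for the intermediate dictionary:

```python
import networkx as nx
import pandas as pd

linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                         'target': ['Bob', 'Cindy'],
                         'weight': [100, 50]})
nodeData = pd.DataFrame({'name': ['Amy', 'Bob', 'Cindy'],
                         'type': ['Foo', 'Bar', 'Baz'],
                         'gender': ['M', 'F', 'M']})

G = nx.from_pandas_edgelist(linkData, 'source', 'target',
                            edge_attr=True, create_using=nx.DiGraph())

# Dictionary of dictionaries keyed by node name, e.g.
# {'Amy': {'type': 'Foo', 'gender': 'M'}, ...}
node_attrs = nodeData.set_index('name').to_dict('index')
nx.set_node_attributes(G, node_attrs)

# Inspect the nodes and edges together with their attributes.
for name, attrs in G.nodes(data=True):
    print(name, attrs)
for u, v, attrs in G.edges(data=True):
    print(u, v, attrs)
```

Note that set_node_attributes only touches nodes that already exist in the graph, so a name that appears in nodeData but in no edge would not be added as an isolated node by this call.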
Create the weighted graph from the edge table using nx.from_pandas_dataframe:
import networkx as nx
import pandas as pd
edges = pd.DataFrame({'source': [0, 1],
                      'target': [1, 2],
                      'weight': [100, 50]})
nodes = pd.DataFrame({'node': [0, 1, 2],
                      'name': ['Foo', 'Bar', 'Baz'],
                      'gender': ['M', 'F', 'M']})
G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight')
Then add the node attributes from dictionaries using set_node_attributes:
# Key each attribute by the node column (nx 1.x argument order: G, name, values)
nx.set_node_attributes(G, 'name', nodes.set_index('node')['name'].to_dict())
nx.set_node_attributes(G, 'gender', nodes.set_index('node')['gender'].to_dict())
Or iterate over the graph to add the node attributes:
for i in sorted(G.nodes()):
    G.node[i]['name'] = nodes.name[i]
    G.node[i]['gender'] = nodes.gender[i]
Update:
As of nx 2.0 the argument order of nx.set_node_attributes has changed to (G, values, name=None).
Using the example from above:
nx.set_node_attributes(G, nodes.set_index('node')['gender'].to_dict(), 'gender')
And as of nx 2.4, G.node[] is replaced by G.nodes[].
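Putting these updates together, the iteration approach might look like the sketch below on nx 2.4 or later. It assumes from_pandas_edgelist for building the graph, and node_table is an illustrative helper name; indexing it by the node column means the lookup goes by node id rather than by row position:

```python
import networkx as nx
import pandas as pd

edges = pd.DataFrame({'source': [0, 1],
                      'target': [1, 2],
                      'weight': [100, 50]})
nodes = pd.DataFrame({'node': [0, 1, 2],
                      'name': ['Foo', 'Bar', 'Baz'],
                      'gender': ['M', 'F', 'M']})

G = nx.from_pandas_edgelist(edges, 'source', 'target', edge_attr='weight')

# Look up attributes by node id, not by DataFrame row position.
node_table = nodes.set_index('node')

for i in G.nodes:
    # G.nodes[i] is the attribute dictionary of node i.
    G.nodes[i]['name'] = node_table.loc[i, 'name']
    G.nodes[i]['gender'] = node_table.loc[i, 'gender']
```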
A small remark:
from_pandas_dataframe no longer works in nx 2. This line
G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight')
should, I think, be written like this in nx 2 (the column names have to match the DataFrame exactly, so lowercase here):
G = nx.from_pandas_edgelist(edges, source='source', target='target', edge_attr='weight')