Pandas: A clean way to initialize a DataFrame with a list of namedtuples
In a similar vein to creating a Series from a namedtuple, you can use the _fields
attribute:
In [11]: Point = namedtuple('Point', ['x', 'y'])
In [12]: points = [Point(1, 2), Point(3, 4)]
In [13]: pd.DataFrame(points, columns=Point._fields)
Out[13]:
   x  y
0  1  2
1  3  4
This assumes all elements are of the same namedtuple type, in this example all Points.
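For reference, the session above can be written as a standalone script. This is a minimal sketch that only adds the imports the excerpt leaves out; the variable name df is chosen here for illustration.

```python
from collections import namedtuple

import pandas as pd

# Same namedtuple type and data as in the session above.
Point = namedtuple("Point", ["x", "y"])
points = [Point(1, 2), Point(3, 4)]

# Point._fields is the tuple ('x', 'y'); it supplies the column labels.
df = pd.DataFrame(points, columns=Point._fields)
print(df)
```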
The function you want is from_records. For namedtuple instances you must pass the _fields property of the namedtuple to the columns parameter of from_records, in addition to a list of namedtuples:
df = pd.DataFrame.from_records(
    [namedtuple_instance1, namedtuple_instance2],
    columns=namedtuple_type._fields
)
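A concrete version of that call might look like the following sketch. The Point type and the records list are illustrative stand-ins for namedtuple_type and the instances above, and the type check is an optional guard added here, not something from_records requires.

```python
from collections import namedtuple

import pandas as pd

# Illustrative namedtuple type standing in for namedtuple_type.
Point = namedtuple("Point", ["x", "y"])
records = [Point(1, 2), Point(3, 4)]

# Optional guard: the column labels only make sense if every record
# is an instance of the same namedtuple type.
if not all(isinstance(r, Point) for r in records):
    raise TypeError("all records must be Point instances")

df = pd.DataFrame.from_records(records, columns=Point._fields)
print(df)
```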
If you have a list of dictionaries, you can pass it directly, since the keys supply the column names:
df = pd.DataFrame.from_records([dict(a=1, b=2), dict(a=2, b=3)])
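The two approaches can be connected: a namedtuple can be turned into a dictionary with its _asdict() method, so a list of namedtuples can also go through the dictionary route. This is a sketch of one possible alternative, not something the answers above prescribe; Point and rows are illustrative names.

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", ["x", "y"])
points = [Point(1, 2), Point(3, 4)]

# _asdict() maps each field name to its value, so no columns argument
# is needed: the dictionary keys become the column labels.
rows = [p._asdict() for p in points]
df = pd.DataFrame.from_records(rows)
print(df)
```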