Build pandas data frame from list of numpy arrays
As @MaxGhenis pointed out in the comments, from_items is deprecated as of version 0.23 (and was removed in pandas 1.0). The link suggests using from_dict instead, so the old answer can be modified to:
pd.DataFrame.from_dict(dict(zip(names, data)))
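For anyone who wants to run this end to end, here is a minimal sketch. The definitions of names and data are assumptions reconstructed from the output shown further down (three identical arrays holding 0 to 9); substitute your own.

```python
import numpy as np
import pandas as pd

# Assumed inputs, reconstructed from the output shown in this answer.
names = ["data1", "data2", "data3"]
data = [np.arange(10) for _ in names]

# Pair each name with its array; every array becomes one column.
df = pd.DataFrame.from_dict(dict(zip(names, data)))
print(df)
```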
--------------------------------------------------OLD ANSWER-------------------------------------------------------------
I would use .from_items:
pd.DataFrame.from_items(zip(names, data))
which gives
   data1  data2  data3
0      0      0      0
1      1      1      1
2      2      2      2
3      3      3      3
4      4      4      4
5      5      5      5
6      6      6      6
7      7      7      7
8      8      8      8
9      9      9      9
That should also be faster than transposing:
%timeit pd.DataFrame.from_items(zip(names, data))
1000 loops, best of 3: 281 µs per loop
%timeit pd.DataFrame(data, index=names).T
1000 loops, best of 3: 730 µs per loop
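The numbers above come from IPython's %timeit on an older pandas, and from_items is not available in recent releases. To re-measure on your own machine, something like the following sketch should work in plain Python. It swaps in from_dict for from_items, the inputs are assumed, and the absolute timings will differ from those quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed inputs; not part of the original timings.
names = ["data1", "data2", "data3"]
data = [np.arange(10) for _ in names]


def via_dict():
    return pd.DataFrame.from_dict(dict(zip(names, data)))


def via_transpose():
    return pd.DataFrame(data, index=names).T


# Run each construction n times and report the mean time per call.
n = 200
t_dict = timeit.timeit(via_dict, number=n) / n
t_transpose = timeit.timeit(via_transpose, number=n) / n
print(f"from_dict: {t_dict * 1e6:.0f} µs per call")
print(f"transpose: {t_transpose * 1e6:.0f} µs per call")
```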
Adding a fourth column is then also fairly simple:
df['data4'] = range(1, 11)
which gives
   data1  data2  data3  data4
0      0      0      0      1
1      1      1      1      2
2      2      2      2      3
3      3      3      3      4
4      4      4      4      5
5      5      5      5      6
6      6      6      6      7
7      7      7      7      8
8      8      8      8      9
9      9      9      9     10
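One thing to keep in mind when adding a column this way is that the new values must have as many entries as the frame has rows. The sketch below uses the same assumed names and data as elsewhere in this answer and shows both the working case and the error you get on a length mismatch (the column name too_short is made up for illustration).

```python
import numpy as np
import pandas as pd

# Assumed inputs, matching the output shown in this answer.
names = ["data1", "data2", "data3"]
data = [np.arange(10) for _ in names]
df = pd.DataFrame(dict(zip(names, data)))

# Ten rows, ten new values: the assignment succeeds.
df["data4"] = range(1, 11)

# Three values for ten rows: pandas raises ValueError and adds nothing.
try:
    df["too_short"] = [1, 2, 3]
except ValueError as exc:
    print("rejected:", exc)
```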
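As mentioned by @jezrael in the comments, a third option is to pass a dict straight to the constructor. Beware that the dict alone does not guarantee column order on older Python versions, so pass columns=names to fix the order explicitly: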
pd.DataFrame(dict(zip(names, data)), columns=names)
Timing:
%timeit pd.DataFrame(dict(zip(names, data)))
1000 loops, best of 3: 281 µs per loop
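To see what the columns argument does in the dict-based variant, here is a small sketch with the same assumed names and data. Whatever order you pass in columns is the order you get back, regardless of how the dict was built.

```python
import numpy as np
import pandas as pd

# Assumed inputs, matching the output shown in this answer.
names = ["data1", "data2", "data3"]
data = [np.arange(10) for _ in names]
mapping = dict(zip(names, data))

# columns= selects the dict keys and fixes their order.
df = pd.DataFrame(mapping, columns=names)
reordered = pd.DataFrame(mapping, columns=names[::-1])

print(list(df.columns))
print(list(reordered.columns))
```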
There are many ways to solve your problem, but the easiest way seems to be df.T (T being shorthand for pandas.DataFrame.transpose):
>>> df = pd.DataFrame(data=data, index=names)
>>> df
       0  1  2  3  4  5  6  7  8  9
data1  0  1  2  3  4  5  6  7  8  9
data2  0  1  2  3  4  5  6  7  8  9
data3  0  1  2  3  4  5  6  7  8  9
>>> df.T
   data1  data2  data3
0      0      0      0
1      1      1      1
2      2      2      2
3      3      3      3
4      4      4      4
5      5      5      5
6      6      6      6
7      7      7      7
8      8      8      8
9      9      9      9
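As a runnable version of the session above, the following sketch defines names and data itself (both are assumptions based on the printed output) and checks that the transposed frame holds the same values as one built from a dict of columns.

```python
import numpy as np
import pandas as pd

# Assumed inputs, matching the output shown in this answer.
names = ["data1", "data2", "data3"]
data = [np.arange(10, dtype=np.int64) for _ in names]

# Each array becomes a row labelled by its name ...
wide = pd.DataFrame(data=data, index=names)
# ... and transposing turns those rows into columns.
tall = wide.T  # equivalent to wide.transpose()

# Column-wise construction for comparison.
via_dict = pd.DataFrame(dict(zip(names, data)))

print(wide.shape, tall.shape)
print(list(tall.columns) == list(via_dict.columns))
print(np.array_equal(tall.to_numpy(), via_dict.to_numpy()))
```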