"Pivot" a Pandas DataFrame into a 3D numpy array
This is doable with `df.pivot_table`. I added one more row to your sample so that both `Measurement_Type` values appear. Missing combinations are represented by `np.nan`.
Sample `df`:
       Date Site Measurement_Type Value
0  1/1/2020    A      Temperature  32.3
1  1/1/2020    A         Humidity   60%
2  1/2/2020    B         Humidity   70%
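To reproduce the snippets in this answer, the sample frame can be built as follows. This is a sketch that assumes `Value` is stored as strings, since the column mixes `32.3` with percentages such as `60%`:

```python
import pandas as pd

# Sample data from the table above; Value is kept as strings (assumption)
df = pd.DataFrame({
    "Date": ["1/1/2020", "1/1/2020", "1/2/2020"],
    "Site": ["A", "A", "B"],
    "Measurement_Type": ["Temperature", "Humidity", "Humidity"],
    "Value": ["32.3", "60%", "70%"],
})
print(df)
```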
Try the following:
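import numpy as np
import pandas as pd

# Full (Date, Measurement_Type) grid, so every combination gets a column
iix = pd.MultiIndex.from_product([np.unique(df.Date), np.unique(df.Measurement_Type)])
df_pivot = (df.pivot_table('Value', 'Site', ['Date', 'Measurement_Type'], aggfunc='first')
              .reindex(iix, axis=1))
# Collect each date's measurements into a list, then stack into a 3D array
# (note: groupby(..., axis=1) is deprecated since pandas 2.1)
arr = np.array(df_pivot.groupby(level=0, axis=1).agg(lambda x: [*x.values])
                       .to_numpy().tolist())
arr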
Out[1447]:
array([[['60%', '32.3'],
        [nan, nan]],

       [[nan, nan],
        ['70%', nan]]], dtype=object)
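Note that `DataFrame.groupby(..., axis=1)` is deprecated since pandas 2.1. Because the reindexed columns already form the full Date x Measurement_Type product in order, a plain `reshape` should give the same array. A minimal sketch, assuming the sample frame above with `Value` stored as strings; the names `dates` and `types` are introduced here for illustration only:

```python
import pandas as pd

# Sample data (assumption: Value is stored as strings)
df = pd.DataFrame({
    "Date": ["1/1/2020", "1/1/2020", "1/2/2020"],
    "Site": ["A", "A", "B"],
    "Measurement_Type": ["Temperature", "Humidity", "Humidity"],
    "Value": ["32.3", "60%", "70%"],
})

dates = sorted(df["Date"].unique())
types = sorted(df["Measurement_Type"].unique())
iix = pd.MultiIndex.from_product([dates, types])

# One row per site, one column per (date, type) pair, missing pairs as NaN
df_pivot = (df.pivot_table(values="Value", index="Site",
                           columns=["Date", "Measurement_Type"], aggfunc="first")
              .reindex(iix, axis=1))

# Split the column axis into (date, type) without groupby(axis=1)
arr = df_pivot.to_numpy().reshape(len(df_pivot), len(dates), len(types))
print(arr.shape)
```

The axes are ordered as (site, date, measurement type), matching the output shown above.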
Method 2: using `pivot_table` on different columns and numpy `reshape`
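# Full (Site, Date) grid, so every combination gets a row
iix_n = pd.MultiIndex.from_product([np.unique(df.Site), np.unique(df.Date)])
# Rows are (Site, Date) pairs and columns are measurement types;
# reshape splits the row axis back into separate site and date axes
arr = (df.pivot_table('Value', ['Site', 'Date'], 'Measurement_Type', aggfunc='first')
         .reindex(iix_n).to_numpy()
         .reshape(df.Site.nunique(), df.Date.nunique(), -1))
arr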
Out[1501]:
array([[['60%', '32.3'],
        [nan, nan]],

       [[nan, nan],
        ['70%', nan]]], dtype=object)
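The resulting array no longer carries its labels, so it can help to keep the sorted label lists around and translate labels into positions. The following is only a sketch under the same assumptions as the sample data (string `Value` column); `sites`, `dates`, `types`, and `pivot` are illustrative names, not part of the original code:

```python
import pandas as pd

# Sample data (assumption: Value is stored as strings)
df = pd.DataFrame({
    "Date": ["1/1/2020", "1/1/2020", "1/2/2020"],
    "Site": ["A", "A", "B"],
    "Measurement_Type": ["Temperature", "Humidity", "Humidity"],
    "Value": ["32.3", "60%", "70%"],
})

sites = sorted(df["Site"].unique())
dates = sorted(df["Date"].unique())
iix_n = pd.MultiIndex.from_product([sites, dates])

pivot = (df.pivot_table(values="Value", index=["Site", "Date"],
                        columns="Measurement_Type", aggfunc="first")
           .reindex(iix_n))
types = list(pivot.columns)  # label order of the last axis

arr = pivot.to_numpy().reshape(len(sites), len(dates), -1)

# Look up one cell by label instead of by position
value = arr[sites.index("B"), dates.index("1/2/2020"), types.index("Humidity")]
print(value)
```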