Finding the mean and standard deviation of a timedelta object in pandas df
Pandas mean() and other aggregation methods support a numeric_only=False parameter, which keeps non-numeric columns such as timedelta in the aggregation:
dropped.groupby('bank').mean(numeric_only=False)
Found here: Aggregations for Timedelta values in the Python DataFrame
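As a minimal sketch of the call above: the frame below is made-up sample data that only reuses the `dropped`, `bank` and `diff` names from the question, and it assumes a pandas version whose groupby `mean` accepts `numeric_only`.

```python
import pandas as pd

# Hypothetical sample data; only the names come from the question.
dropped = pd.DataFrame({
    "bank": ["A", "A", "B", "B"],
    "diff": pd.to_timedelta(["1 days", "3 days", "2 hours", "4 hours"]),
})

# numeric_only=False keeps the timedelta column in the aggregation.
means = dropped.groupby("bank").mean(numeric_only=False)
print(means)
```

The `diff` column of the result is still a timedelta, so no conversion back is needed.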
You need to convert the timedelta column to a numeric value, e.g. int64, via values. This is the most accurate option, because integer nanoseconds are the numeric representation of a timedelta:
import numpy as np
import pandas as pd

# Store each timedelta as its integer nanosecond count
dropped['new'] = dropped['diff'].values.astype(np.int64)
means = dropped.groupby('bank').mean()
means['new'] = pd.to_timedelta(means['new'])
std = dropped.groupby('bank').std()
std['new'] = pd.to_timedelta(std['new'])
Another solution is to convert the values to seconds with total_seconds, but that is less accurate because the result is a floating-point number:
# Durations as float seconds
dropped['new'] = dropped['diff'].dt.total_seconds()
# The group means in 'new' are in seconds
means = dropped.groupby('bank').mean()
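A small sketch of the seconds-based variant, including the conversion back to timedelta that the snippet above leaves out. The sample frame is again invented, and aggregating only the `new` column is a choice made for this illustration.

```python
import pandas as pd

# Hypothetical sample data; only the names come from the question.
dropped = pd.DataFrame({
    "bank": ["A", "A", "B", "B"],
    "diff": pd.to_timedelta(["1 days", "3 days", "2 hours", "4 hours"]),
})

# Durations as float seconds.
dropped["new"] = dropped["diff"].dt.total_seconds()

# Aggregate in seconds, then tell pandas the unit when converting back.
grouped = dropped.groupby("bank")["new"]
means = pd.to_timedelta(grouped.mean(), unit="s")
print(means)
```

Here `unit="s"` is required, because the numbers are seconds rather than the default nanoseconds.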