How to calculate mean values grouped on another column in Pandas
This is what groupby is for:
In [117]: df.groupby('StationID')['BiasTemp'].mean()
Out[117]:
StationID
BB         5.0
KEOPS      2.5
SS0279    15.0
Name: BiasTemp, dtype: float64
Here we group by the 'StationID' column, then select the 'BiasTemp' column and call mean on it.
There is a section in the docs on this functionality.
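For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The sample rows are an assumption: the values are made up purely so that the group means match the output shown above, and the name station_means is only illustrative.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real frame); values were picked
# only so that the per-station means come out as 5.0, 2.5 and 15.0.
df = pd.DataFrame({
    "StationID": ["BB", "KEOPS", "KEOPS", "SS0279"],
    "BiasTemp": [5.0, 2.0, 3.0, 15.0],
})

# Split the rows by station, select one column, then average within each group.
station_means = df.groupby("StationID")["BiasTemp"].mean()
print(station_means)
```

The result is a Series indexed by StationID, so a single station can be looked up with station_means["KEOPS"].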
This can be done as follows, which averages every remaining column for each station:
df.groupby('StationID').mean()
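Calling mean() on the whole grouped frame tries to average every non-grouping column, so a text column can get in the way. A small sketch of one way to handle that, assuming a frame with an extra numeric column and a text column (Humidity and Note are hypothetical names added for illustration):

```python
import pandas as pd

# Hypothetical frame: Humidity and Note are invented columns for illustration.
df = pd.DataFrame({
    "StationID": ["BB", "KEOPS", "KEOPS", "SS0279"],
    "BiasTemp": [5.0, 2.0, 3.0, 15.0],
    "Humidity": [40.0, 50.0, 70.0, 30.0],
    "Note": ["a", "b", "c", "d"],
})

# numeric_only=True restricts the aggregation to numeric columns,
# so the text column Note is left out of the result.
all_means = df.groupby("StationID").mean(numeric_only=True)
print(all_means)
```

If only one column is needed, selecting it first with ['BiasTemp'] as in the other answers avoids the question entirely.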
You could groupby on StationID and then take mean() on BiasTemp. To output a DataFrame, use as_index=False:
In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
Out[4]:
  StationID  BiasTemp
0        BB       5.0
1     KEOPS       2.5
2    SS0279      15.0
Without as_index=False, it returns a Series instead:
In [5]: df.groupby('StationID')['BiasTemp'].mean()
Out[5]:
StationID
BB         5.0
KEOPS      2.5
SS0279    15.0
Name: BiasTemp, dtype: float64
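To make the difference between the two return types concrete, here is a short sketch on made-up data (the sample rows and the variable names are assumptions, not the asker's frame). It also shows reset_index() as another way to turn the Series form into a DataFrame.

```python
import pandas as pd

# Hypothetical sample data chosen to reproduce the means shown above.
df = pd.DataFrame({
    "StationID": ["BB", "KEOPS", "KEOPS", "SS0279"],
    "BiasTemp": [5.0, 2.0, 3.0, 15.0],
})

# as_index=False keeps StationID as a regular column -> DataFrame.
as_frame = df.groupby("StationID", as_index=False)["BiasTemp"].mean()

# Default behaviour uses StationID as the index -> Series.
as_series = df.groupby("StationID")["BiasTemp"].mean()

# reset_index() moves the index back into a column, giving a DataFrame too.
from_series = as_series.reset_index()

print(type(as_frame).__name__, type(as_series).__name__)
print(from_series)
```

Which form is more convenient depends on what comes next: the Series is handy for lookups by station, while the DataFrame is easier to merge or export.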
Read more about groupby in this pydata tutorial.