mean calculation in pandas excluding zeros
df[df["Column_name"] != 0]["Column_name"].mean()
Or, if your column name is a valid Python identifier (no spaces, for example), you can use attribute access:
df[df.Column_name != 0].Column_name.mean()
Hopefully a future pandas version will accept this as a parameter of mean(), something like:
.mean(exclude=0)  # wishful thinking: not a parameter pandas currently offers
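In the meantime, a small wrapper gives roughly the same convenience. This is only a sketch: `mean_excluding` and its `exclude` argument are hypothetical names made up here, not part of pandas.

```python
import pandas as pd


def mean_excluding(series: pd.Series, exclude=0) -> float:
    """Hypothetical helper: mean of `series`, ignoring entries equal to `exclude`."""
    # Boolean mask keeps only the rows whose value differs from `exclude`.
    return series[series != exclude].mean()


df = pd.DataFrame({"Column_name": [1, 0, 2, 3, 0]})
result = mean_excluding(df["Column_name"])
print(result)  # 2.0, i.e. (1 + 2 + 3) / 3
```

If every value equals `exclude`, the filtered Series is empty and the mean comes back as NaN, so callers may want to handle that case.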
It also depends on the meaning of 0 in your data.
- If these are genuine '0' values, then your approach is fine.
- If '0' is a placeholder for a value that was not measured (i.e. 'NaN'), then it might make more sense to replace all '0' occurrences with 'NaN' first. The mean calculation then excludes NaN values by default.
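import numpy as np
import pandas as pd

df = pd.DataFrame([1, 0, 2, 3, 0], columns=['a'])
df = df.replace(0, np.nan)  # treat zeros as missing; np.nan also works on NumPy 2.x, where np.NaN was removed
df.mean()  # NaN values are skipped by default (skipna=True)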
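To see how much the interpretation of 0 matters, here is a rough comparison on the same toy data. It uses `Series.mask` rather than `replace`, which is just one possible choice; it hides the zeros for this one calculation and leaves `df` itself unchanged.

```python
import pandas as pd

df = pd.DataFrame([1, 0, 2, 3, 0], columns=["a"])

# Zeros treated as real measurements: all five values count.
mean_with_zeros = df["a"].mean()

# Zeros treated as "not measured": mask() turns them into NaN,
# and mean() skips NaN by default.
mean_without_zeros = df["a"].mask(df["a"] == 0).mean()

print(mean_with_zeros)     # 1.2
print(mean_without_zeros)  # 2.0
```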