Taking np.average while ignoring NaNs?
You can create a masked array like this:
import numpy as np
# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
data = np.array([[1,2,3], [4,5,np.nan], [np.nan,6,np.nan], [0,0,0]])
masked_data = np.ma.masked_array(data, np.isnan(data))
# calculate your weighted average here instead
weights = [1, 1, 1]
average = np.ma.average(masked_data, axis=1, weights=weights)
# this gives you the result
result = average.filled(np.nan)
print(result)
This outputs:
[ 2. 4.5 6. 0. ]
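With all weights equal to 1, the example above reduces to a plain mean of the unmasked values. As a rough sketch of what happens with unequal weights (the values `[1, 2, 3]` are illustrative assumptions, not taken from the question), the weight belonging to a masked entry is dropped from both the numerator and the normalization:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, np.nan], [np.nan, 6, np.nan], [0, 0, 0]])
masked_data = np.ma.masked_array(data, np.isnan(data))

# Illustrative, non-uniform weights (an assumption for this sketch).
weights = [1, 2, 3]

# Row 1 only uses the weights 1 and 2: (4*1 + 5*2) / (1 + 2).
# Row 2 only uses the weight 2:        (6*2) / 2.
weighted = np.ma.average(masked_data, axis=1, weights=weights).filled(np.nan)
print(weighted)
```

The call to `.filled(np.nan)` converts the masked result back to a regular array, so any entry that remains masked is reported as NaN.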
You can simply multiply the input array by the weights and sum along the specified axis, ignoring NaNs, with np.nansum. Thus, for your case, assuming the weights are to be used along axis=1 on the input array sst_filt, the summation would be:
np.nansum(sst_filt*weights,axis=1)
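To turn that sum into an average, divide it by the sum of the weights at the non-NaN positions only:
def nanaverage(A, weights, axis):
    # numerator: weighted sum with NaNs skipped; denominator: weights of the valid entries only
    return np.nansum(A*weights, axis=axis) / ((~np.isnan(A))*weights).sum(axis=axis)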
Sample run -
In [200]: sst_filt # 2D array case
Out[200]:
array([[ 0., 1.],
[ nan, 3.],
[ 4., 5.]])
In [201]: weights
Out[201]: array([ 0.25, 0.75])
In [202]: nanaverage(sst_filt,weights=weights,axis=1)
Out[202]: array([0.75, 3. , 4.75])
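For reference, here is a self-contained version of the sample run that can be pasted into a script. It also cross-checks the result against `np.ma.average` on a masked copy of the same array. The variable names `via_nansum` and `via_masked` are made up for this sketch. Note that `A*weights` relies on broadcasting, so 1-D weights line up with the last axis; for another axis the weights would presumably need reshaping first.

```python
import numpy as np

def nanaverage(A, weights, axis):
    # Weighted sum that skips NaNs, divided by the weights of the valid entries.
    valid = ~np.isnan(A)
    return np.nansum(A * weights, axis=axis) / (valid * weights).sum(axis=axis)

sst_filt = np.array([[0.0, 1.0],
                     [np.nan, 3.0],
                     [4.0, 5.0]])
weights = np.array([0.25, 0.75])

via_nansum = nanaverage(sst_filt, weights, axis=1)

# Cross-check with the masked-array approach.
masked = np.ma.masked_array(sst_filt, np.isnan(sst_filt))
via_masked = np.ma.average(masked, axis=1, weights=weights).filled(np.nan)

print(via_nansum)
print(via_masked)
```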
I'd probably just select the portion of the array that isn't NaN and then use those indices to select the weights too.
For example:
import numpy as np
data = np.random.rand(10)
weights = np.random.rand(10)
data[[2, 4, 8]] = np.nan
print(data)
# [ 0.32849204, 0.90310062, nan, 0.58580299, nan,
# 0.934721 , 0.44412978, 0.78804409, nan, 0.24942098]
ii = ~np.isnan(data)
print(ii)
# [ True True False True False True True True False True]
result = np.average(data[ii], weights=weights[ii])
print(result)
# .6470319
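Edit: I realized this won't work with two-dimensional arrays, because boolean indexing flattens the result. In that case, I'd probably just set both the values and the weights to zero wherever the data is NaN (the weights need the same shape as the data for this). This yields the same result as if those entries were not included in the calculation.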
Before running np.average:
nan_mask = np.isnan(data)  # compute the mask first; after zeroing data, np.isnan(data) is all False
data[nan_mask] = 0
weights[nan_mask] = 0
result = np.average(data, weights=weights)
Or create copies if you want to keep track of which indices were NaN.
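A minimal sketch of the copy-based variant for a two-dimensional array, assuming the weights have the same shape as the data (the arrays and the names `data_filled` and `weights_filled` are invented for illustration). `np.where` builds new arrays, so the originals and the NaN mask stay available afterwards:

```python
import numpy as np

# Illustrative 2D data with per-element weights of the same shape.
data = np.array([[1.0, 2.0, np.nan],
                 [4.0, np.nan, 6.0]])
weights = np.array([[1.0, 2.0, 3.0],
                    [1.0, 2.0, 3.0]])

nan_mask = np.isnan(data)

# Zero out values and weights at the NaN positions, leaving the inputs untouched.
data_filled = np.where(nan_mask, 0.0, data)
weights_filled = np.where(nan_mask, 0.0, weights)

# Row 0: (1*1 + 2*2) / (1 + 2); row 1: (4*1 + 6*3) / (1 + 3).
result = np.average(data_filled, weights=weights_filled, axis=1)
print(result)
```

One caveat: if an entire row is NaN, its weights sum to zero and `np.average` raises a `ZeroDivisionError`, so such rows would need to be handled separately.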