Compare Series containing None
None gets cast to NaN, and NaN has the property that it is not equal to itself:
In[54]:
b = pd.Series([None, None, 4, 5])
b
Out[54]:
0 NaN
1 NaN
2 4.0
3 5.0
dtype: float64
As you can see here:
In[55]:
b==b
Out[55]:
0 False
1 False
2 True
3 True
dtype: bool
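As a small sketch of the same behaviour outside the Series comparison, the snippet below checks NaN against itself and then uses `isna()` and `Series.equals()`, which are the pandas calls that handle missing values explicitly. The variable names (`nan_equals_itself`, `missing`, `same`) are only illustrative.

```python
import numpy as np
import pandas as pd

b = pd.Series([None, None, 4, 5])

# NaN follows IEEE 754 semantics: it never compares equal, even to itself.
nan_equals_itself = np.nan == np.nan

# isna() flags missing values directly instead of relying on the b != b trick.
missing = b.isna()

# Series.equals() treats NaNs in the same position as equal.
same = b.equals(b)

print(nan_equals_itself)
print(missing.tolist())
print(same)
```

Note that `equals()` returns a single bool for the whole Series, so it answers "are these two Series identical?" rather than giving a row-by-row result.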
I'm not sure of a clean way to make this comparison treat NaN as equal, although this works when comparing each row against the previous one:
In[68]:
( (b == b.shift()) | ( (b != b.shift()) & (b != b) ) )
Out[68]:
0 True
1 True
2 False
3 False
dtype: bool
You'll get a spurious True for the first row because when you shift
down you're comparing against a non-existent row, which is filled with NaN:
In[69]:
b.shift()
Out[69]:
0 NaN
1 NaN
2 NaN
3 4.0
dtype: float64
So the first row evaluates to True under the boolean logic, because the first row is NaN and so is the shifted series' first row.
To work around the first-row false positive, you could slice the result to ignore the first row:
In[70]:
( (b == b.shift()) | ( (b != b.shift()) & (b != b) ) )[1:]
Out[70]:
1 True
2 False
3 False
dtype: bool
As to why it gets cast: pandas tries to coerce the data to a compatible NumPy dtype. Here float is selected because the data mixes ints and None values, and None and NaN cannot be represented by ints.
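To see the coercion for yourself, a quick sketch like the one below compares the inferred dtypes. The last line uses pandas' nullable `"Int64"` extension dtype, which is not part of the original answer; it is shown only as one possible way to keep integers alongside missing values, assuming a pandas version that supports it.

```python
import pandas as pd

# Plain ints are stored with an integer dtype.
ints_only = pd.Series([4, 5])

# Adding None forces a float dtype, because NaN is a float value.
with_none = pd.Series([None, None, 4, 5])

# Assumption: the nullable "Int64" extension dtype is available;
# it stores missing entries as pd.NA instead of NaN.
nullable = pd.Series([None, None, 4, 5], dtype="Int64")

print(ints_only.dtype)
print(with_none.dtype)
print(nullable.dtype)
```

Be aware that comparisons on the nullable dtype return `pd.NA` for missing entries rather than False, so the boolean logic above would need adjusting if you switch dtypes.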
To get the same result as a in your example, you should overwrite the first row to False, as it should always fail:
In[78]:
result = pd.Series( ( (b == b.shift()) | ( (b != b.shift()) & (b != b) ) ) )
result.iloc[0] = False
result
Out[78]:
0 False
1 True
2 False
3 False
dtype: bool
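If you need this in more than one place, the logic could be wrapped in a helper along these lines. This is only a sketch: `equals_previous` is a hypothetical name, and it uses `isna()` on both sides rather than the `b != b` trick. That makes it slightly stricter than the expression above, which marks every NaN row as True even when the previous row holds a real value.

```python
import pandas as pd


def equals_previous(s: pd.Series) -> pd.Series:
    """Return True where a row matches the previous row, counting NaN == NaN as a match."""
    prev = s.shift()
    # Rows where both the current and the previous value are missing.
    both_missing = s.isna() & prev.isna()
    result = (s == prev) | both_missing
    if len(result) > 0:
        # The first row has no predecessor, so it can never match.
        result.iloc[0] = False
    return result


b = pd.Series([None, None, 4, 5])
result = equals_previous(b)
print(result)
```

For the Series `b` this gives the same values as Out[78]; the two approaches only differ when a NaN follows a non-NaN value.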