Querying for NaN and other names in Pandas

In general, you can refer to a local variable inside a query string as @local_variable_name, so something like

>>> pi = np.pi; nan = np.nan
>>> df = pd.DataFrame({"value": [3,4,9,10,11,np.nan,12]})
>>> df.query("(value < 10) and (value > @pi)")
   value
1      4
2      9

would work, but NaN isn't equal to itself, so value == @nan is False for every row, including the rows that actually hold NaN. One way to hack around this is to exploit exactly that property and use value != value as an isnan check. Comparing against @nan misses the NaN row:

>>> df.query("(value < 10) or (value == @nan)")
   value
0      3
1      4
2      9

whereas the self-comparison picks it up:

>>> df.query("(value < 10) or (value != value)")
   value
0      3
1      4
2      9
5    NaN
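
As a rough sanity check of the trick above, the sketch below compares the value != value mask with pandas' own isna() on the same sample frame. The variable names (self_neq_mask, isna_mask, and so on) are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [3, 4, 9, 10, 11, np.nan, 12]})

# NaN compares unequal to everything, including itself.
print(np.nan == np.nan)  # False
print(np.nan != np.nan)  # True

# The self-inequality mask should flag exactly the rows that isna() flags.
self_neq_mask = df["value"] != df["value"]
isna_mask = df["value"].isna()
assert self_neq_mask.equals(isna_mask)

# The same filter written as plain boolean indexing and as a query string.
mask_result = df[(df["value"] < 10) | self_neq_mask]
query_result = df.query("(value < 10) or (value != value)")
assert mask_result.equals(query_result)

print(query_result.index.tolist())  # [0, 1, 2, 5]
```

The self-comparison works in a query string because it only uses comparison operators, so it needs no method calls inside the expression.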

According to this answer you can use:

df.query('value < 10 | value.isnull()', engine='python')

I verified that it works.
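
For comparison, here is a minimal sketch that runs the isnull() form next to ordinary boolean indexing on the same sample data. The extra parentheses in the query string and the names with_isnull and with_mask are my own additions for readability, not something the linked answer prescribes.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [3, 4, 9, 10, 11, np.nan, 12]})

# Method call inside the query string, with engine="python" as in the answer above.
with_isnull = df.query("(value < 10) | value.isnull()", engine="python")

# The same filter written as boolean indexing, without query().
with_mask = df[(df["value"] < 10) | df["value"].isna()]

# Both approaches should select the same rows.
assert with_isnull.equals(with_mask)
print(with_isnull.index.tolist())  # [0, 1, 2, 5]
```

If the query string buys you nothing in readability, the boolean-indexing form avoids the engine question altogether.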
