Why is pandas.apply() executing on null elements?

pandas treats None and NaN as equivalent markers for a missing value, so replacing None with numpy.nan changes nothing: apply will still call the function on the NaN elements.

df[2] = numpy.nan
df.apply(lambda x: print(x))

Output: [1, 2]
        [2, 3, 4, 5]
        nan

You either have to check for a missing value inside the function you want to apply, or call dropna() first and apply the function to the result:

df.dropna().apply(lambda x: print(x))
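
Here is a minimal sketch of both options, assuming the object is a Series that holds lists plus one missing entry, as the output above suggests. The names s and safe_len are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the example: two lists and one missing entry.
s = pd.Series([[1, 2], [2, 3, 4, 5], np.nan], dtype=object)

def safe_len(value):
    """Return the length of a list, or NaN when the entry is missing."""
    # Test the type first: calling pd.isna on a list would return an
    # array of booleans rather than a single True/False.
    if isinstance(value, list):
        return len(value)
    return np.nan

# Option 1: the function itself handles the missing value.
lengths = s.apply(safe_len)

# Option 2: drop the missing entries before applying.
dropped_lengths = s.dropna().apply(len)
```

Option 1 keeps the original index and leaves NaN in place of the missing entry. Option 2 returns a shorter Series.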

Alternatively, use notnull(), which returns a boolean Series you can use as a mask:

df[df.notnull()].apply(lambda x: print(x))
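
To make the mask visible, here is a small sketch under the same assumption that the object is a Series (the sample data is made up). On a DataFrame, indexing with a boolean frame masks values instead of dropping rows, so this shortcut only behaves like dropna() on a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two lists and one missing entry.
s = pd.Series([[1, 2], [2, 3, 4, 5], np.nan], dtype=object)

mask = s.notnull()   # boolean Series, True where a value is present
filtered = s[mask]   # keeps only the entries where the mask is True

print(mask.tolist())
print(filtered.tolist())
```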

Please also read: http://pandas.pydata.org/pandas-docs/stable/missing_data.html

And specifically, this:

Warning:

One has to be mindful that in python (and numpy), the nan's don’t compare equal, but None's do. Note that Pandas/numpy uses the fact that np.nan != np.nan, and treats None like np.nan.
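
The comparison behaviour in that warning can be checked directly. This is only an illustrative sketch; the variable names are made up.

```python
import numpy as np
import pandas as pd

nan_equals_nan = np.nan == np.nan    # NaN never compares equal to itself
none_equals_none = None == None      # None does compare equal to None

# pandas reports both as missing.
nan_is_missing = pd.isna(np.nan)
none_is_missing = pd.isna(None)

# In a numeric Series, None is stored as NaN.
s = pd.Series([1.0, None])
missing_flags = s.isna().tolist()

print(nan_equals_nan, none_equals_none)   # False True
print(nan_is_missing, none_is_missing)    # True True
print(missing_flags)                      # [False, True]
```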
