Pandas: drop columns with all NaN's

From the dropna docstring:

    # drop the columns where all elements are NaN:

    >>> df.dropna(axis=1, how='all')
         A    B  D
    0  NaN  2.0  0
    1  3.0  4.0  1
    2  NaN  NaN  5

dropna() drops the null values and returns a new DataFrame; the original is left unchanged unless you pass inplace=True. Assign the result back to the original DataFrame:

fish_frame = fish_frame.dropna(axis=1, how='all')
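To see why the assignment matters, here is a minimal sketch. The frame below is a made-up stand-in for fish_frame (the column names and values are assumptions, not your data):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for fish_frame; columns and values are invented.
fish_frame = pd.DataFrame({
    "species": ["cod", "eel", "carp"],
    "weight": [1.2, np.nan, 3.4],
    "notes": [np.nan, np.nan, np.nan],  # every value is NaN
})

# Return value discarded: fish_frame itself is not modified.
fish_frame.dropna(axis=1, how="all")
cols_before = list(fish_frame.columns)

# Assigning the result back makes the change stick.
fish_frame = fish_frame.dropna(axis=1, how="all")
cols_after = list(fish_frame.columns)

print(cols_before)
print(cols_after)
```

Only "notes" is removed; "weight" stays because it still has non-NaN values.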

Referring to your code:

fish_frame.dropna(thresh=len(fish_frame) - 3, axis=1)

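thresh is the minimum number of non-NaN values a column must have in order to be kept, not the number of NaN's that triggers a drop. So with len(fish_frame) = 10 this keeps columns with at least 7 non-NaN values, i.e. it drops columns with more than 3 NaN's, which is what you described. Setting thresh=3 would instead keep every column with at least 3 non-NaN values. Either way, the result has to be assigned back to take effect.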
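A small check of how thresh behaves on a 10-row frame. The frame and its column names are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical 10-row frame with a known number of NaN's per column.
df = pd.DataFrame({
    "a": [1.0] * 10,                 # 0 NaN, 10 non-NaN
    "b": [np.nan] * 3 + [1.0] * 7,   # 3 NaN, 7 non-NaN
    "c": [np.nan] * 4 + [1.0] * 6,   # 4 NaN, 6 non-NaN
    "d": [np.nan] * 10,              # 10 NaN, 0 non-NaN
})

# thresh = 7: a column needs at least 7 non-NaN values to survive.
kept_len_minus_3 = list(df.dropna(axis=1, thresh=len(df) - 3).columns)

# thresh = 3: a column needs at least 3 non-NaN values to survive.
kept_3 = list(df.dropna(axis=1, thresh=3).columns)

print(kept_len_minus_3)
print(kept_3)
```

With thresh=len(df) - 3, column "c" (4 NaN's) is dropped while "b" (exactly 3 NaN's) is kept; with thresh=3, only the all-NaN column "d" is dropped.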


dropna() returns a new DataFrame by default (inplace=False), so the result has to be assigned to a variable for the change to persist in your code.

So for example,

fish_frame = fish_frame.dropna()

As to why your dropna is returning an empty DataFrame, I'd recommend you look at the "how" argument in the dropna method (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html). With the default how="any", a row or column is dropped as soon as it contains a single NA. Also bear in mind that for dropna, axis=0 drops rows and axis=1 drops columns.

So to remove columns where every value is NA, axis=1, how="all" should do the trick:

fish_frame = fish_frame.dropna(axis=1, how="all")

Finally, the "thresh" argument sets the minimum number of non-NA values a row or column must have in order to be kept. Recent pandas versions do not allow "thresh" and "how" to be passed together, so use it on its own. So

fish_frame = fish_frame.dropna(axis=1, thresh=3)

keeps only the columns that have at least three non-NA values.

Also, as Corley pointed out, how="any" is the default, so it does not need to be passed explicitly.
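If in doubt about which axis/how combination does what, it is easy to check on a tiny frame. The one below is made up for illustration; it also shows one way a plain dropna() can end up returning an empty DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "y" is entirely NaN, "x" has a single NaN.
df = pd.DataFrame({
    "x": [1.0, 2.0, np.nan],
    "y": [np.nan, np.nan, np.nan],
    "z": [4.0, 5.0, 6.0],
})

# Defaults (axis=0, how="any"): every row has a NaN in "y", so all rows go.
rows_any = df.dropna()

# axis=1 with the default how="any": drops every column containing a NaN.
cols_any = df.dropna(axis=1)

# axis=1, how="all": drops only the columns that are entirely NaN.
cols_all = df.dropna(axis=1, how="all")

print(rows_any.shape)
print(list(cols_any.columns))
print(list(cols_all.columns))
```

A single all-NaN column is enough to make the default row-wise dropna() discard every row, which is why dropping the all-NaN columns first is usually the step to start with.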