Remove NaN/NULL columns in a Pandas dataframe?
Another solution is to build a boolean DataFrame that is True at non-null positions and then keep the columns that have at least one True value. This removes the columns in which every value is NaN.
df = df.loc[:,df.notna().any(axis=0)]
If you instead want to remove columns that have at least one missing (NaN) value:
df = df.loc[:,df.notna().all(axis=0)]
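To make the difference between the two masks concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "b" is partly missing, "c" is entirely missing.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [1.0, np.nan, 3.0],
    "c": [np.nan, np.nan, np.nan],
})

# Keep columns with at least one non-null value (drops only "c").
no_all_nan = df.loc[:, df.notna().any(axis=0)]

# Keep columns with no null values at all (drops "b" and "c").
no_any_nan = df.loc[:, df.notna().all(axis=0)]

print(list(no_all_nan.columns))  # ['a', 'b']
print(list(no_any_nan.columns))  # ['a']
```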
This approach is also useful for removing columns that contain empty strings, zeros, or any other given value. For example:
df = df.loc[:,(df!='').all(axis=0)]
removes columns having at least one empty string.
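The same mask can be wrapped in a small helper so the value to filter on becomes a parameter. This is only a sketch: the helper name drop_columns_containing and the sample frames are hypothetical, not part of pandas. Note that NaN compares as not equal to any value, so columns containing NaN are not removed by this filter.

```python
import pandas as pd

# Hypothetical text data: "city" contains one empty string.
text_df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "city": ["Oslo", "", "Rome"],
})

# Hypothetical numeric data: "y" contains one zero.
num_df = pd.DataFrame({
    "x": [1, 2, 3],
    "y": [4, 0, 6],
})


def drop_columns_containing(frame, value):
    """Keep only the columns in which no cell equals `value` (illustrative helper)."""
    return frame.loc[:, (frame != value).all(axis=0)]


print(list(drop_columns_containing(text_df, "").columns))  # ['name']
print(list(drop_columns_containing(num_df, 0).columns))    # ['x']
```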
Yes, dropna. See http://pandas.pydata.org/pandas-docs/stable/missing_data.html and the DataFrame.dropna docstring:
Definition: DataFrame.dropna(self, axis=0, how='any', thresh=None, subset=None)
Docstring:
Return object with labels on given axis omitted where alternately any
or all of the data are missing

Parameters
----------
axis : {0, 1}
how : {'any', 'all'}
    any : if any NA values are present, drop that label
    all : if all values are NA, drop that label
thresh : int, default None
    int value : require that many non-NA values
subset : array-like
    Labels along other axis to consider, e.g. if you are dropping rows
    these would be a list of columns to include

Returns
-------
dropped : DataFrame
The specific command to run would be:
df = df.dropna(axis=1, how='all')
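As a rough illustration of how the how and thresh parameters behave when dropping columns, here is a short sketch on made-up data. The column names and values are assumptions. thresh is passed on its own here because recent pandas versions do not accept how and thresh together.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "b" has one non-NA value, "c" has none.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [1.0, np.nan, np.nan],
    "c": [np.nan, np.nan, np.nan],
})

# Drop columns in which every value is NA (drops "c").
all_dropped = df.dropna(axis=1, how="all")

# Drop columns in which any value is NA (drops "b" and "c").
any_dropped = df.dropna(axis=1, how="any")

# Keep only columns with at least 2 non-NA values (drops "b" and "c").
thresh_dropped = df.dropna(axis=1, thresh=2)

print(list(all_dropped.columns))     # ['a', 'b']
print(list(any_dropped.columns))     # ['a']
print(list(thresh_dropped.columns))  # ['a']
```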