Testing if a pandas DataFrame exists

Option 1 (my preferred option)

This is @Ami Tavory's approach.

Please select his answer if you like this approach.

It is very idiomatic Python to initialize a variable with None and then check for None before doing something with that variable.

df1 = None

if df1 is not None:
    print(df1.head())

Option 2

However, setting up an empty dataframe isn't at all a bad idea.

import pandas as pd

df1 = pd.DataFrame()

if not df1.empty:
    print(df1.head())
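
Options 1 and 2 can also be combined. The sketch below uses a hypothetical helper, `has_data` (my own name, not part of pandas or of the original answers), that treats both None and an empty DataFrame as "nothing". It also shows why a plain `if df1:` is not a substitute: pandas refuses to evaluate a DataFrame as a boolean and raises ValueError.

```python
import pandas as pd


def has_data(df):
    """Hypothetical helper: True only for a non-empty DataFrame."""
    # The None check comes first so that .empty is never called on None.
    return df is not None and not df.empty


results = {
    "none": has_data(None),
    "empty": has_data(pd.DataFrame()),
    "filled": has_data(pd.DataFrame({"a": [1, 2]})),
}
print(results)

# A bare truth test on a DataFrame is ambiguous, so pandas raises ValueError.
try:
    bool(pd.DataFrame())
    truthiness_raises = False
except ValueError:
    truthiness_raises = True
print(truthiness_raises)
```

Whether an empty DataFrame should count as "nothing" depends on your code; if an empty result is meaningful, stick with the plain `is not None` check.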

Option 3

Just try it.

try:
    print(df1.head())
# catch when df1 is None
except AttributeError:
    pass
# catch when it hasn't even been defined
except NameError:
    pass

Timing

When df1 is in its initialized state or doesn't exist at all

When df1 is a dataframe with something in it

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(25).reshape(5, 5), list('ABCDE'), list('abcde'))
df1


In my code, I have several variables which can either contain a pandas DataFrame or nothing at all.

The Pythonic way of indicating "nothing" is None, and the way to check for "not nothing" is:

if df1 is not None:
    ...

I am not sure how critical time is here, but since you measured things:

In [82]: t = timeit.Timer('if x is not None: pass', setup='x=None')

In [83]: t.timeit()
Out[83]: 0.022536039352416992

In [84]: t = timeit.Timer('if isinstance(x, type(None)): pass', setup='x=None')

In [85]: t.timeit()
Out[85]: 0.11571192741394043

So checking that something is not None is also faster than the isinstance alternative.
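
If you want to repeat the comparison outside IPython, a minimal standalone sketch could look like the following. The variable names and the `number` of repetitions are my own choices, and the absolute timings will differ by machine and Python version, so no expected numbers are shown.

```python
import timeit

# Number of repetitions per measurement (an arbitrary choice for this sketch).
REPEATS = 200000

# Identity comparison against None.
is_check = timeit.timeit("x is not None", setup="x = None", number=REPEATS)

# The isinstance alternative: two function calls per check.
isinstance_check = timeit.timeit(
    "isinstance(x, type(None))", setup="x = None", number=REPEATS
)

print("is not None:", is_check)
print("isinstance: ", isinstance_check)
```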


If the dataframe is stored as a dictionary value, you could test for its existence this way:

import pandas as pd

d = dict()
df = pd.DataFrame()

d['df'] = df

# None is already the default for dict.get; it is spelled out here for clarity
if d.get('df', None) is not None:
    # print the shape of the stored DataFrame
    print(d['df'].shape)
else:
    print('no df here')
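
Note that `dict.get` returns None both when the key is missing and when the key is present but holds None, so the same `is not None` check covers both cases. A small sketch, assuming a hypothetical helper `frame_shape` and made-up keys `'a'`, `'b'`, and `'c'`:

```python
import pandas as pd

frames = {
    "a": pd.DataFrame({"x": [1, 2, 3]}),  # a real DataFrame
    "b": None,                            # key present, but nothing stored
}


def frame_shape(frames, key):
    """Hypothetical helper: shape of the DataFrame under key, or None."""
    df = frames.get(key)  # None if the key is missing or its value is None
    return None if df is None else df.shape


# "c" is not in the dictionary at all.
shapes = {key: frame_shape(frames, key) for key in ("a", "b", "c")}
print(shapes)
```

If you need to tell a missing key apart from a stored None, test `key in frames` first.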