Cannot interpolate dataframe even if most of the data is filled
I had a similar problem. Recreating the DataFrame with an explicit float dtype (e.g. dtype='float32') fixed it:
df = pd.DataFrame(data=df.values, columns=cols, dtype='float32')
Check that your DataFrame has numeric dtypes, not object dtypes. The error TypeError: Cannot interpolate with all NaNs can occur if the DataFrame contains columns of object dtype. For example, if
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O')},
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                         '2016-01-21 20:06:24'])
then df.interpolate() raises the TypeError.
To check if your DataFrame has columns with object dtype, look at df.dtypes:
In [92]: df.dtypes
Out[92]:
A    object
dtype: object
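On a wider DataFrame it can be easier to list only the offending columns than to scan the full dtypes output. A minimal sketch, assuming a hypothetical frame with one object column and one float column (the column B is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'A' is stored as object, 'B' as float64.
df = pd.DataFrame({
    'A': np.array([1, np.nan, 30], dtype='O'),
    'B': [0.5, np.nan, 2.5],
})

# select_dtypes keeps only the columns whose dtype matches the filter.
object_cols = df.select_dtypes(include='object').columns.tolist()
print(object_cols)
```

Any column named in object_cols is a candidate for conversion before calling interpolate.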
To fix the problem, you need to ensure the DataFrame has numeric columns with native NumPy dtypes. Obviously, it would be best to build the DataFrame correctly from the very beginning. So the best solution depends on how you are building the DataFrame.
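As one illustration of building the DataFrame with a numeric dtype from the start, the sketch below passes an explicit float64 dtype when the column is created. The raw values and the use of pd.to_datetime for the index are assumptions for the example; how you apply this depends on where your data comes from.

```python
import pandas as pd

# Hypothetical raw values with one gap; None becomes NaN under a float dtype.
raw = [1, None, 30]
index = pd.to_datetime(['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                        '2016-01-21 20:06:24'])

# Declaring the dtype up front avoids ending up with an object column.
df = pd.DataFrame({'A': pd.Series(raw, index=index, dtype='float64')})
print(df.dtypes)

# With a float column and a DatetimeIndex, time-based interpolation can run.
result = df.interpolate(method='time')
print(result)
```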
A less appealing patch-up fix would be to use pd.to_numeric to convert the object arrays to numeric arrays after the fact:
for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')
With errors='coerce', any value that could not be converted to a number is converted to NaN. After calling pd.to_numeric on each column, notice that the dtype is now float64:
In [94]: df.dtypes
Out[94]:
A    float64
dtype: object
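To see what errors='coerce' does to values that are not numbers, here is a small sketch on a made-up Series of strings (the 'n/a' placeholder is an assumption, not something from the original data):

```python
import pandas as pd

# Hypothetical object Series: two numeric strings and one placeholder.
s = pd.Series(['1', 'n/a', '30'], dtype='object')

# The unparseable entry becomes NaN instead of raising an error.
converted = pd.to_numeric(s, errors='coerce')
print(converted)
print(converted.dtype)
```

Note that coercion silently turns bad values into gaps, so it is worth checking how many NaNs appear after the conversion.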
Once the DataFrame has numeric dtypes and a DatetimeIndex, df.interpolate(method='time') will work:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O')},
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                         '2016-01-21 20:06:24'])
# Convert each object column to a numeric dtype
for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')
# method='time' requires a DatetimeIndex
df.index = pd.DatetimeIndex(df.index)
df = df.interpolate(method='time')
print(df)
yields
                        A
2016-01-21 20:06:22   1.0
2016-01-21 20:06:23  15.5
2016-01-21 20:06:24  30.0
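The same steps can be bundled into a small function. This is only a sketch: interpolate_by_time is a hypothetical helper name, and the unevenly spaced timestamps are made up to show that method='time' weights by the time gaps rather than by row position.

```python
import numpy as np
import pandas as pd

def interpolate_by_time(frame):
    """Hypothetical helper: coerce to numeric, parse the index, interpolate."""
    # apply forwards errors='coerce' to pd.to_numeric for every column.
    out = frame.apply(pd.to_numeric, errors='coerce')
    out.index = pd.DatetimeIndex(out.index)
    return out.interpolate(method='time')

# The missing value sits 1 second after the first row and 3 seconds
# before the last one.
df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O')},
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                         '2016-01-21 20:06:26'])
filled = interpolate_by_time(df)
print(filled)
```

Because the gap is a quarter of the way through the interval, the filled value lands closer to 1 than to 30, rather than at the midpoint.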