Pandas: Reading Excel with merged cells

df = df.ffill(axis=0)  # forward-fill down each column; this fixed the missing row entries

You could use the Series.ffill method (Series.fillna(method='ffill') on older pandas versions) to forward-fill the NaN values:

df.index = pd.Series(df.index).ffill()

For example,

In [42]: df
Out[42]: 
       Sample    CD4   CD8
Day 1    8311  17.30  6.44
NaN      8312  13.60  3.50
NaN      8321  19.80  5.88
NaN      8322  13.50  4.09
Day 2    8311  16.00  4.92
NaN      8312   5.67  2.28
NaN      8321  13.00  4.34
NaN      8322  10.60  1.95

[8 rows x 3 columns]

In [43]: df.index = pd.Series(df.index).fillna(method='ffill')

In [44]: df
Out[44]: 
       Sample    CD4   CD8
Day 1    8311  17.30  6.44
Day 1    8312  13.60  3.50
Day 1    8321  19.80  5.88
Day 1    8322  13.50  4.09
Day 2    8311  16.00  4.92
Day 2    8312   5.67  2.28
Day 2    8321  13.00  4.34
Day 2    8322  10.60  1.95

[8 rows x 3 columns]
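
For anyone who wants to try this without the original spreadsheet, here is a minimal, self-contained sketch. The frame is built by hand to mimic what a merged first column looks like after parsing (it reuses four rows of the sample above), and it uses Series.ffill() rather than fillna(method='ffill'), which newer pandas versions deprecate.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the parsed sheet: merged cells show up as NaN
# in every row after the first one of each merged block.
df = pd.DataFrame(
    {
        "Sample": [8311, 8312, 8311, 8312],
        "CD4": [17.30, 13.60, 16.00, 5.67],
        "CD8": [6.44, 3.50, 4.92, 2.28],
    },
    index=["Day 1", np.nan, "Day 2", np.nan],
)

# An Index is wrapped in a Series so that ffill() is available,
# then the filled labels are assigned back as the new index.
df.index = pd.Series(df.index).ffill()

print(df)
```

Only the index is touched here, so any NaN in the data columns stays as it was.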

To casually come back 8 years later, pandas.read_excel() can solve this internally for you with the index_col parameter.

df = pd.read_excel('path_to_file.xlsx', index_col=[0])

Passing index_col as a list makes pandas treat those columns as a MultiIndex and forward-fill the blanks left by the merged cells. With a list of length one, pandas still fills in the blanks but creates a regular Index.
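
If the sheet has already been read without index_col, roughly the same result can be rebuilt by hand: forward-fill only the label columns, then promote them to the index. This is a sketch under assumptions, not a description of pandas internals. The column names Day and Group are hypothetical, and the frame is constructed in memory instead of being read from a file.

```python
import numpy as np
import pandas as pd

# Hypothetical result of pd.read_excel(...) without index_col, where the
# first two columns came from vertically merged cells.
raw = pd.DataFrame(
    {
        "Day": ["Day 1", np.nan, np.nan, "Day 2"],
        "Group": ["A", np.nan, "B", "A"],
        "CD4": [17.30, np.nan, 19.80, 16.00],
    }
)

label_cols = ["Day", "Group"]

# Fill only the label columns; the NaN in CD4 is genuinely missing data
# and is left alone.
raw[label_cols] = raw[label_cols].ffill()

# Promote the filled label columns to a two-level MultiIndex.
df = raw.set_index(label_cols)
print(df)
```

Restricting the fill to label_cols matters: calling ffill() on the whole frame would also overwrite the missing CD4 reading with the value from the row above.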
