Pandas read_excel: 'utf-8' codec can't decode byte 0xa8 in position 14: invalid start byte

The problem is that the original poster is calling read_excel with a file handle opened in text mode as the first argument. As demonstrated by the last responder, the simplest fix is to pass a string containing the filename and let pandas open the file itself.

I ran into this same error using:

df = pd.read_excel(open("file.xlsx",'r'))

but the correct call is:

df = pd.read_excel("file.xlsx")

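pandas can take an encoding argument when reading your Excel file. In your case you can use:

df = pd.read_excel('your_file.xlsx', encoding='utf-8')

or, if you want to match the system's encoding without any surprises, you can use:

import sys
df = pd.read_excel('your_file.xlsx', encoding=sys.getfilesystemencoding())

Note that sys.getfilesystemencoding() has to be called; wrapping it in quotes passes the literal text as the encoding name. Recent pandas releases may also reject the encoding keyword with a TypeError, in which case drop it.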

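Most probably you're using Python 3. In Python 2 this wouldn't happen, because open() in text mode returns the raw bytes there instead of decoding them.

xlsx files are binary (they are actually a set of XML files, but packed into a compressed ZIP archive), so you need to open them in binary mode. Use this call to open: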

open('1.xlsx', 'rb')
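
To see why text mode is the wrong choice, it can help to look inside the container. The sketch below uses only the standard library. It builds a tiny stand-in archive in memory (the member names and contents are made up for illustration, and the result is not a valid workbook), then inspects it the way one could inspect a real .xlsx file.

```python
import io
import zipfile

# Build a small ZIP archive in memory as a stand-in for an .xlsx file.
# The member names mimic a workbook layout; the contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")

raw = buffer.getvalue()

# ZIP data starts with the two-byte signature b"PK".
print(raw[:2])

# The same bytes can be reopened as an archive whose members are XML parts.
buffer.seek(0)
is_zip = zipfile.is_zipfile(buffer)
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    members = archive.namelist()

print(is_zip)
print(members)
```

On a real file, zipfile.ZipFile("file.xlsx").namelist() should list the XML parts in the same way, which is why the file has to be treated as bytes rather than text.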

There's no full traceback, but I imagine the UnicodeDecodeError comes from the file object, not from read_excel(). That happens because the stream of bytes can contain anything, but we don't want decoding to happen too soon; read_excel() must receive raw bytes and be able to process them.
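
The point about the file object can be checked without pandas. The following is a minimal sketch under stated assumptions: the 18-byte stub is invented for the demonstration (14 ASCII-range bytes shaped like the start of a ZIP header, then 0xa8), and the file name stub.xlsx is hypothetical. It is not a real workbook, only enough bytes to trigger the same kind of failure.

```python
import os
import tempfile

# 14 ASCII-range bytes shaped like the start of a ZIP local file header,
# followed by 0xa8. These bytes are invented for the demonstration.
stub = b"PK\x03\x04" + b"\x14\x00\x06\x00\x08\x00\x00\x00!\x00" + b"\xa8\x01\x02\x03"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stub.xlsx")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(stub)

    # Text mode: the file object itself tries to decode the bytes as UTF-8.
    text_error = None
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
    except UnicodeDecodeError as exc:
        text_error = exc

    # Binary mode: the bytes come back untouched, ready for a parser.
    with open(path, "rb") as f:
        raw = f.read()

print(text_error)
print(raw == stub)
```

The exception is raised by f.read() on the text-mode handle, before any Excel parsing could take place, which is consistent with the explanation above.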

Tags: Python, Pandas, Excel
