pdf to xlsx python code example

Example 1: convert pdf folder to excell pandas

import tabula
# Extaer los datos del pdf al DataFrame
df = tabula.read_pdf("inforatge.pdf")
# lo convierte en un csv llamdo out.csv codificado con utf-8
df.to_csv('out.csv', sep='\t', encoding='utf-8')

Example 2: convert a pdf folder to excell pandas

# import packages needed
import glob
import tabula

# transform the pdfs into excel files
for filepath in glob.iglob('C:/Users/myfolderwithpdfs/*.pdf'):
    tabula.convert_into(filepath, output_format="xlsx")