Python Camelot borderless table extraction issue
To improve the detected area, you can increase the edge_tol (default: 50) value to counter the effect of text being placed relatively far apart vertically. Larger edge_tol will lead to longer textedges being detected, leading to an improved guess of the table area. Let’s use a value of 500.
>>> import camelot
>>> import matplotlib.pyplot as plt
>>> tables = camelot.read_pdf('edge_tol.pdf', flavor='stream', edge_tol=500)
>>> camelot.plot(tables[0], kind='contour')
>>> plt.show()
>>> tables[0].df
Camelot uses the lattice flavor by default, which relies on clearly drawn lines separating the cells.
For tables without lines, use the stream flavor instead:
tables = camelot.read_pdf('your_file_name.pdf', flavor='stream')
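If it is not obvious which edge_tol works for a given PDF, one option is to try a few values and compare the results. The following is only a rough sketch: the helper name best_stream_tables, the candidate values, and the scoring rule (accuracy minus whitespace from each table's parsing_report) are assumptions made for illustration, not something Camelot prescribes. The reader argument is also hypothetical and exists only so the helper can be exercised without a real PDF.

```python
def best_stream_tables(path, edge_tols=(50, 200, 500), reader=None):
    """Try several edge_tol values with flavor='stream' and keep the best one.

    Returns (edge_tol, tables), or (None, None) if nothing was detected.
    """
    if reader is None:
        # Imported lazily so the helper can be tested without Camelot installed.
        import camelot
        reader = camelot.read_pdf

    best = None  # (score, edge_tol, tables)
    for tol in edge_tols:
        tables = reader(path, flavor="stream", edge_tol=tol)
        if len(tables) == 0:
            continue  # no table detected with this tolerance
        # Assumed heuristic: prefer high accuracy and little whitespace
        # in the first detected table.
        report = tables[0].parsing_report
        score = report["accuracy"] - report["whitespace"]
        if best is None or score > best[0]:
            best = (score, tol, tables)

    if best is None:
        return None, None
    return best[1], best[2]
```

Calling best_stream_tables('your_file_name.pdf') would then return the chosen edge_tol together with the tables, and tables[0].df gives the DataFrame as before. The score is a crude proxy, so it is still worth checking the result with camelot.plot(tables[0], kind='contour').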