How can I read a PDF file from inline raw_bytes (not from a file)?

You can wrap the bytes in io.BytesIO, which gives them a file-like interface that PyPDF2 can read:

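import io

import PyPDF2
import requests

url = 'http://www.asx.com.au/asxpdf/20171108/pdf/43p1l61zf2yct8.pdf'
response = requests.get(url)

# Wrap the downloaded bytes in an in-memory, file-like buffer.
with io.BytesIO(response.content) as open_pdf_file:
    # PdfReader is the current name; PyPDF2 3.0 removed PdfFileReader.
    read_pdf = PyPDF2.PdfReader(open_pdf_file)
    num_pages = len(read_pdf.pages)  # formerly getNumPages()
    print(num_pages)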
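
Before handing downloaded bytes to a parser, it can help to confirm they look like a PDF at all, since a failed request may return an HTML error page instead. Here is a minimal sketch using only the standard library, assuming the data begins with the usual %PDF- marker. The name looks_like_pdf is a hypothetical helper, not part of PyPDF2:

```python
import io

PDF_MAGIC = b"%PDF-"  # marker that begins a well-formed PDF file


def looks_like_pdf(raw_bytes):
    """Return True if the bytes start with the PDF header marker."""
    # BytesIO gives the bytes the same read/seek interface as an open file.
    with io.BytesIO(raw_bytes) as buffer:
        header = buffer.read(len(PDF_MAGIC))
    return header == PDF_MAGIC


print(looks_like_pdf(b"%PDF-1.4\n..."))      # True
print(looks_like_pdf(b"<html>404</html>"))   # False
```

A check like this only inspects the first few bytes, so it says nothing about whether the rest of the file is valid.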

P.S. When opening files or file-like objects, always use a context manager (a with statement) so that they are closed even if an error occurs.
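
To illustrate what the with statement does for an in-memory buffer, here is a small standard-library sketch. The raw_bytes value is a made-up stand-in for response.content:

```python
import io

raw_bytes = b"%PDF-1.4 example payload"  # stand-in for response.content

with io.BytesIO(raw_bytes) as buffer:
    first_bytes = buffer.read(5)
    print(buffer.closed)  # False: still usable inside the block

print(buffer.closed)      # True: closed automatically on exit
print(first_bytes)        # b'%PDF-'
```

Because the buffer is closed on exit, any reader object built on top of it should be used inside the with block, as a reader may fetch page data from the stream lazily.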

Try this (using the io module, plus a decryption step for encrypted PDFs):

import io

import PyPDF2
import requests

url = 'http://www.asx.com.au/asxpdf/20171103/pdf/43nyyw9r820c6r.pdf'
pdf_bytes = requests.get(url).content

# Keep the PDF in memory instead of writing it to disk.
reserve_pdf_on_memory = io.BytesIO(pdf_bytes)
# PdfReader is the current name; PyPDF2 3.0 removed PdfFileReader.
load_pdf = PyPDF2.PdfReader(reserve_pdf_on_memory)

# Some PDFs are encrypted with an empty user password; unlock them first.
if load_pdf.is_encrypted:
    load_pdf.decrypt("")

print(load_pdf.pages[0].extract_text())
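
The decrypt-then-extract step can also be pulled into a small function that reports a rejected password instead of failing later during extraction. This is only a sketch: first_page_text is a hypothetical helper, and it assumes a reader object with the newer PyPDF2 attribute names (is_encrypted, decrypt, pages, extract_text). Older releases spell these isEncrypted, getPage(n) and extractText().

```python
def first_page_text(reader, password=""):
    """Return the text of page 0, decrypting first when necessary."""
    if reader.is_encrypted:
        # decrypt() returns a falsy value (0) when the password is rejected.
        if not reader.decrypt(password):
            raise ValueError("could not decrypt PDF with the given password")
    return reader.pages[0].extract_text()
```

With PyPDF2 installed, it could be called as first_page_text(PyPDF2.PdfReader(io.BytesIO(pdf_bytes))), where pdf_bytes holds the downloaded content.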

Good Luck ... :)


