How do I use Python and lxml to parse a local html file?
If the file is local, you shouldn't be using requests; just open the file and read it in. The requests library expects to be talking to a web server.
from lxml import html

with open(r'C:\Users\...site_1.html', "r") as f:
    page = f.read()
tree = html.fromstring(page)
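You can also try using Beautiful Soup:
from bs4 import BeautifulSoup
# Naming the parser explicitly avoids the "no parser was explicitly specified" warning
with open("filepath", encoding="utf8") as f:
    soup = BeautifulSoup(f, "html.parser")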
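There is a better way to do it: use the parse function instead of fromstring. It takes the file path directly, so you don't have to open and read the file yourself, and it returns an ElementTree rather than a single element.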
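from lxml import html
# Raw string, so the backslashes in the Windows path are not treated as escapes
tree = html.parse(r"C:\Users\...site_1.html")
print(html.tostring(tree))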
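To compare the two lxml approaches side by side, here is a minimal sketch. It assumes lxml is installed, and it writes a small made-up page to a temporary file so it can run anywhere; the sample markup and the variable names are illustrative only and are not taken from the question.

```python
import os
import tempfile

from lxml import html  # third-party package: pip install lxml

# Hypothetical sample page, used only so this sketch is self-contained.
SAMPLE = (
    "<html><head><title>Site 1</title></head>"
    "<body><p>Hello</p><a href='/about'>About</a></body></html>"
)

# Write the sample to a temporary local file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(SAMPLE)
    path = tmp.name

try:
    # Approach 1: read the file yourself, then hand the bytes to fromstring.
    with open(path, "rb") as f:
        root_from_string = html.fromstring(f.read())

    # Approach 2: let lxml open the file; parse returns an ElementTree,
    # so call getroot() to get the <html> element.
    root_from_parse = html.parse(path).getroot()

    title_a = root_from_string.findtext(".//title")
    title_b = root_from_parse.findtext(".//title")
    links = [str(href) for href in root_from_parse.xpath("//a/@href")]
    paragraphs = [p.text_content() for p in root_from_parse.xpath("//p")]
finally:
    os.remove(path)  # clean up the temporary file

print(title_a, title_b, links, paragraphs)
```

Both routes end up with the same element tree here. Reading the file in binary mode for fromstring leaves encoding detection to lxml instead of relying on the platform's default text encoding.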