Python follow redirects and then download the page?

You might be better off with the Requests library, which has a better API for controlling redirect handling:

https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history
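As a rough sketch of what that page describes: a Requests response keeps the intermediate redirect responses in r.history, oldest first. The helper names redirect_chain and fetch_chain below are made up for illustration and are not part of Requests.

```python
import requests


def redirect_chain(response):
    """Return [(status_code, url), ...] for each redirect hop, then the final response."""
    # response.history holds the intermediate redirect responses, oldest first.
    hops = [(r.status_code, r.url) for r in response.history]
    # The response object itself is the last stop in the chain.
    hops.append((response.status_code, response.url))
    return hops


def fetch_chain(url, timeout=10.0):
    """GET the URL (redirects are followed by default) and report the chain."""
    response = requests.get(url, timeout=timeout)
    return redirect_chain(response)
```

Calling fetch_chain('http://github.com') should give a list with one entry per redirect followed by the final URL. The exact status codes depend on the server.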

Requests:

https://pypi.org/project/requests/ (urllib replacement for humans)

Use requests, as the other answer states. Here is an example. The final URL after any redirects is in r.url. In the examples below, http://github.com is redirected to https://github.com/.

For HEAD:

In [1]: import requests
   ...: r = requests.head('http://github.com', allow_redirects=True)
   ...: r.url

Out[1]: 'https://github.com/'

For GET:

In [1]: import requests
   ...: r = requests.get('http://github.com')
   ...: r.url

Out[1]: 'https://github.com/'

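Note that requests.head does not follow redirects by default, so for HEAD you have to pass allow_redirects=True. If you don't, you can still read the target from the Location header, but this is not advised: it only gives you the first hop, and the header value may be a relative URL.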

In [1]: import requests

In [2]: r = requests.head('http://github.com')

In [3]: r.headers.get('location')
Out[3]: 'https://github.com/'

To download the page you need GET. The body is then available as r.content (raw bytes) or r.text (decoded text).
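Putting the pieces together, here is a minimal sketch of "follow redirects, then download". The function name download_page, the optional session parameter, and the 10 second timeout are assumptions chosen for this example, not something the answers above prescribe.

```python
import requests


def download_page(url, timeout=10.0, session=None):
    """GET a URL, following redirects, and return (final_url, body_bytes)."""
    # Use the caller's Session if one is given, else the module-level API.
    http = session if session is not None else requests
    # GET follows redirects by default; allow_redirects=True makes that explicit.
    response = http.get(url, allow_redirects=True, timeout=timeout)
    # Raise an exception if the final response is a 4xx or 5xx error.
    response.raise_for_status()
    # response.url is the address after all redirects; content is the raw body.
    return response.url, response.content
```

For example, final_url, body = download_page('http://github.com') should leave the https address in final_url and the page bytes in body, which you can then write to a file opened in 'wb' mode.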
