How to get the original start_url in Scrapy (before redirect)
This gave me the original referer URL, i.e. which of my start_urls led to the URL that this request object was scraping:
req = response.request
# Header values are bytes; .get() returns None when no Referer was set
referer = req.headers.get('Referer')
referer_url = referer.decode('utf-8') if referer else None
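You can find what you need in response.request.meta['redirect_urls']. The first entry is the URL that was originally requested; the key is only set when a redirect actually happened.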
Quote from docs:
The urls which the request goes through (while being redirected) can be found in the redirect_urls Request.meta key.
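As a minimal sketch of how that key could be used, assuming the first entry of redirect_urls is the URL you originally requested: the helper name original_url and the example.com URLs below are made up for illustration, and plain dicts stand in for response.request.meta so the snippet runs without Scrapy installed.

```python
def original_url(meta, current_url):
    """Return the URL that was requested before any redirects.

    `meta` is expected to look like Scrapy's response.request.meta.
    The 'redirect_urls' key is only present when a redirect took place.
    """
    redirect_urls = meta.get('redirect_urls')
    # First entry = URL originally requested. Without the key, no redirect
    # occurred, so the current URL is already the original one.
    return redirect_urls[0] if redirect_urls else current_url


# Hypothetical meta dicts standing in for response.request.meta
redirected_meta = {'redirect_urls': ['http://example.com/a', 'http://example.com/b']}
print(original_url(redirected_meta, 'http://example.com/c'))
print(original_url({}, 'http://example.com/c'))
```

Inside a spider callback you would probably call it as original_url(response.request.meta, response.url).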
Hope that helps.