How to add Headers to Scrapy CrawlSpider Requests?
You can pass the Referer header manually to each request using the headers argument:
yield Request(url=..., callback=..., headers={'Referer': ...})
RefererMiddleware does the same automatically, taking the Referer URL from the response that generated the request.
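You have to enable the spider middleware that populates the Referer header on requests generated from each response. See the documentation for scrapy.contrib.spidermiddleware.referer.RefererMiddleware (in Scrapy 1.0 and later the path is scrapy.spidermiddlewares.referer.RefererMiddleware).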
In short, you need to add this middleware to your project's settings file.
SPIDER_MIDDLEWARES = {
    # The value is the middleware's order; 700 is Scrapy's default for RefererMiddleware.
    'scrapy.contrib.spidermiddleware.referer.RefererMiddleware': 700,
}
Then, in your response parsing method, you can use response.request.headers.get('Referer', None) to get the referer. Note that the header name is spelled 'Referer'; looking up 'Referrer' returns None.