How to add Headers to Scrapy CrawlSpider Requests?

You can pass the Referer header manually to each request using the headers argument:

yield Request(url=..., callback=..., headers={'Referer': ...})

RefererMiddleware does the same automatically, taking the referrer URL from the response that generated the request.


You have to enable the spider middleware that populates the Referer header of requests. See the documentation for scrapy.contrib.spidermiddleware.referer.RefererMiddleware (since Scrapy 1.0 the class lives at scrapy.spidermiddlewares.referer.RefererMiddleware).

In short, you need to add this middleware to your project's settings file.

SPIDER_MIDDLEWARES = {
    # The value is the middleware's order, not a boolean; 700 is its position in Scrapy's defaults.
    'scrapy.contrib.spidermiddleware.referer.RefererMiddleware': 700,
}

Then, in your response parsing method, you can get the referer with response.request.headers.get('Referer'). The header name is spelled "Referer", and the call returns None if the header was not set.
