How to bypass Cloudflare bot/DDoS protection in Scrapy?
Obviously the best way to do this would be to whitelist your IP in Cloudflare; if that isn't an option, let me recommend the cloudflare-scrape library. You can use it to get the cookie tokens, then send those cookies with your Scrapy requests back to the server.
So I executed the JavaScript from Python with the help of cloudflare-scrape.
You need to add the following code to your scraper:
import cfscrape
from scrapy import Request

# Inside your spider class:
def start_requests(self):
    for url in self.start_urls:
        token, agent = cfscrape.get_tokens(url, 'Your preferred user agent, _optional_')
        yield Request(url=url, cookies=token, headers={'User-Agent': agent})
alongside your parsing functions. And that's it!
Of course, you need to install cloudflare-scrape first and import it into your spider. You also need a JavaScript execution engine installed. I already had Node.js, and it worked without complaints.
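If it helps, here is a minimal sketch of the same idea with the token handling pulled out into a small helper, so it can be checked without hitting the network. The helper name `cloudflare_request_kwargs` is my own invention and not part of Scrapy or cloudflare-scrape; it only assumes that `get_tokens` behaves like `cfscrape.get_tokens`, i.e. it returns a cookie dict and the user agent string that was used. The important detail is that the request has to go out with the same User-Agent that obtained the cookies, otherwise Cloudflare is likely to reject them.

```python
def cloudflare_request_kwargs(url, get_tokens, user_agent=None):
    """Build keyword arguments for a scrapy.Request from Cloudflare tokens.

    Hypothetical helper, for illustration only. `get_tokens` is assumed to
    work like cfscrape.get_tokens: called as get_tokens(url, user_agent=...),
    it returns a (cookie_dict, user_agent_string) pair.
    """
    tokens, agent = get_tokens(url, user_agent=user_agent)
    return {
        "url": url,
        "cookies": tokens,
        # Reuse the exact user agent that was used to obtain the cookies.
        "headers": {"User-Agent": agent},
    }


# Usage inside a spider (requires scrapy and cfscrape to be installed):
#
#     import cfscrape
#     from scrapy import Request
#
#     def start_requests(self):
#         for url in self.start_urls:
#             yield Request(**cloudflare_request_kwargs(url, cfscrape.get_tokens))
```

Passing the token function in as an argument is just a design choice to keep the helper testable with a stand-in; in a real spider you would hand it `cfscrape.get_tokens` as in the comment above.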