How to use PyCharm to debug Scrapy projects
You just need to do this.
Create a Python file in the crawler folder of your project, next to scrapy.cfg. I used main.py.
- Project
  - Crawler
    - Crawler
      - Spiders
      - ...
    - main.py
    - scrapy.cfg
Inside your main.py, put the code below (replace spider with the name of your spider).
from scrapy import cmdline
cmdline.execute("scrapy crawl spider".split())
Then create a "Run Configuration" that runs your main.py.
Once you start that configuration in debug mode, any breakpoint you put in your code will be hit.
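If your spider takes arguments, the same main.py idea can be stretched a little. This is only a sketch: build_argv and main are names I made up, and the scrapy import is kept inside main so the helper can be tried without scrapy installed. The -a name=value form is the usual way to pass spider arguments to scrapy crawl.

```python
def build_argv(spider_name, spider_args=None):
    """Build the argument list handed to scrapy.cmdline.execute.

    build_argv is a hypothetical helper, not part of Scrapy.
    """
    argv = ["scrapy", "crawl", spider_name]
    # Each spider argument becomes "-a name=value" on the command line.
    for name, value in (spider_args or {}).items():
        argv += ["-a", f"{name}={value}"]
    return argv


def main():
    # Imported here so build_argv can be used without scrapy installed.
    from scrapy import cmdline

    # "spider" is a placeholder; use your own spider's name.
    cmdline.execute(build_argv("spider"))
```

Call main() at the bottom of main.py and point the Run Configuration at that file, same as before.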
The scrapy command is a Python script, which means you can start it from inside PyCharm.
When you examine the scrapy binary (which scrapy) you will notice that this is actually a Python script:
#!/usr/bin/python
from scrapy.cmdline import execute
execute()
This means that a command like
scrapy crawl IcecatCrawler
can also be executed like this:
python /Library/Python/2.7/site-packages/scrapy/cmdline.py crawl IcecatCrawler
Try to find the scrapy.cmdline module (the cmdline.py file) on your system.
In my case the location was here: /Library/Python/2.7/site-packages/scrapy/cmdline.py
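If you are not sure where cmdline.py lives, you can ask Python itself. A small sketch, assuming you run it with the same interpreter your PyCharm project uses; module_path is just a helper name I picked.

```python
import importlib.util


def module_path(module_name):
    """Return the file path of an importable module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package (here: scrapy) is not installed.
        return None
    return spec.origin if spec is not None else None


# Prints the path to cmdline.py, or None if scrapy is missing from this interpreter.
print(module_path("scrapy.cmdline"))
```

Whatever path this prints is the one to use as the script in the run/debug configuration.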
Create a run/debug configuration inside PyCharm with that file as the script. Fill the script parameters with the scrapy command and spider, in this case crawl IcecatCrawler.
Put your breakpoints anywhere in your crawling code and it should work™.
As of PyCharm 2018.1 this became a lot easier. You can now select Module name in your project's Run/Debug Configuration. Set this to scrapy.cmdline, put the scrapy command and spider (for example crawl IcecatCrawler) in Parameters, and set the Working directory to the root dir of the scrapy project (the one with settings.py in it).
Now you can add breakpoints to debug your code.
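For the curious, running a module by name is roughly what python -m scrapy.cmdline crawl IcecatCrawler does on the command line. Here is a sketch of the same thing from plain Python using the standard runpy module; run_module_as_script is a name I made up, and this is an illustration of the idea, not what PyCharm does internally.

```python
import runpy
import sys


def run_module_as_script(module_name, args):
    """Run a module roughly like `python -m module_name args...` would."""
    old_argv = sys.argv
    # The module sees the args as its command line; runpy fills in argv[0].
    sys.argv = [module_name] + list(args)
    try:
        runpy.run_module(module_name, run_name="__main__", alter_sys=True)
    finally:
        # Restore the caller's argv even if the module exits via sys.exit().
        sys.argv = old_argv
```

With scrapy installed and the working directory inside the project, run_module_as_script("scrapy.cmdline", ["crawl", "IcecatCrawler"]) should start the crawl. Expect a SystemExit at the end, since the scrapy command exits when it is done.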