Scrapy: overwrite JSON files instead of appending to them
There is a flag that overwrites the output file instead of appending to it: pass the file path with the -O option instead of -o. For example:
scrapy crawl myspider -O /path/to/json/my.json
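The same behaviour can probably be configured in the project settings rather than on the command line. The following is a minimal sketch, assuming a Scrapy version recent enough to support the FEEDS setting and its "overwrite" feed option; the path is the placeholder from the command above, not a real location.

```python
# settings.py (the same dict could also go in a spider's custom_settings)
# Assumption: the installed Scrapy version supports the "overwrite" feed option.
FEEDS = {
    # Placeholder path taken from the example command above.
    "/path/to/json/my.json": {
        "format": "json",
        # True replaces an existing file; False appends to it.
        "overwrite": True,
    },
}
```

With this in place, a plain `scrapy crawl myspider` should write the feed without needing -O on every run.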
More information:
$ scrapy crawl --help
Usage
=====
  scrapy crawl [options] <spider>

Run a spider

Options
=======
--help, -h              show this help message and exit
-a NAME=VALUE           set spider argument (may be repeated)
--output=FILE, -o FILE  append scraped items to the end of FILE (use - for
                        stdout)
--overwrite-output=FILE, -O FILE
                        dump scraped items into FILE, overwriting any existing
                        file
--output-format=FORMAT, -t FORMAT
                        format to use for dumping items

Global Options
--------------
--logfile=FILE          log file. if omitted stderr will be used
--loglevel=LEVEL, -L LEVEL
                        log level (default: DEBUG)
--nolog                 disable logging completely
--profile=FILE          write python cProfile stats to FILE
--pidfile=FILE          write process ID to FILE
--set=NAME=VALUE, -s NAME=VALUE
                        set/override setting (may be repeated)
--pdb                   enable pdb on failure
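Alternatively, write the items to stdout and let the shell redirection (>) overwrite the file. With -o - there is no file extension to infer the format from, so -t json sets it explicitly, and --nolog disables logging:
scrapy crawl myspider -t json --nolog -o - > "/path/to/json/my.json"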