Scrapy - Silently drop an item

The proper way to do this appears to be to implement a custom LogFormatter for your project and lower the logging level used for dropped items.

Example:

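# Legacy API: scrapy.log was Scrapy's own logging module before Scrapy 1.0.
from scrapy import log
from scrapy import logformatter

class PoliteLogFormatter(logformatter.LogFormatter):
    def dropped(self, item, exception, response, spider):
        # Same message as the stock formatter, but logged at DEBUG
        # instead of the default WARNING.
        return {
            'level': log.DEBUG,
            'format': logformatter.DROPPEDFMT,
            'exception': exception,
            'item': item,
        }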

Then point LOG_FORMATTER at that class in your settings file, for example:

LOG_FORMATTER = 'apps.crawler.spiders.PoliteLogFormatter'
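
Note that the formatter only lowers the level of the "Dropped" message; whether it is actually printed still depends on the LOG_LEVEL setting, which defaults to DEBUG. A possible settings fragment, reusing the module path from above (adjust it to wherever your class lives) and assuming you are fine with hiding all other DEBUG output as well:

```python
# settings.py (sketch): combine the custom formatter with a LOG_LEVEL
# that is higher than the level the formatter assigns to dropped items.
LOG_FORMATTER = 'apps.crawler.spiders.PoliteLogFormatter'

# The formatter logs drops at DEBUG, so INFO (or higher) filters them out.
LOG_LEVEL = 'INFO'
```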

I had bad luck simply returning None from the pipeline instead of raising DropItem: the None is passed on as the item and caused exceptions in later pipelines.
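
To illustrate why returning None goes wrong, here is a simplified model of a pipeline chain in plain Python. It does not use Scrapy at all: DropItem is a local stand-in for scrapy.exceptions.DropItem, and run_chain, filter_returning_none, filter_raising and add_total are hypothetical names made up for this sketch. The only assumption taken from Scrapy is that each pipeline receives whatever the previous one returned.

```python
class DropItem(Exception):
    """Local stand-in for scrapy.exceptions.DropItem (illustration only)."""


def filter_returning_none(item):
    # Problematic: tries to "drop" the item by returning None.
    if item.get('price') is None:
        return None
    return item


def filter_raising(item):
    # Preferred: signal the drop with an exception.
    if item.get('price') is None:
        raise DropItem("missing price")
    return item


def add_total(item):
    # A later stage that expects a real item.
    item['total'] = item['price'] * item.get('quantity', 1)
    return item


def run_chain(item, stages):
    # Each stage gets the previous stage's return value;
    # a DropItem exception stops the chain for this item.
    for stage in stages:
        try:
            item = stage(item)
        except DropItem as exc:
            return "dropped: %s" % exc
    return item
```

With filter_raising the chain stops cleanly at the first stage. With filter_returning_none the next stage receives None and fails with a TypeError when it tries to index it.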


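In recent Scrapy versions this has changed a bit: the standard logging module replaces scrapy.log, the message constant is called DROPPEDMSG, and the returned dict uses 'msg' and 'args' keys. I copied the code from @jimmytheleaf and updated it to work with recent Scrapy: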

import logging
from scrapy import logformatter


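class PoliteLogFormatter(logformatter.LogFormatter):
    def dropped(self, item, exception, response, spider):
        # Same message as the stock formatter, but below the default WARNING.
        # INFO messages still appear unless LOG_LEVEL is WARNING or higher;
        # use logging.DEBUG here if you want to hide them with LOG_LEVEL = 'INFO'.
        return {
            'level': logging.INFO,
            'msg': logformatter.DROPPEDMSG,
            'args': {
                'exception': exception,
                'item': item,
            }
        }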
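
To see how the 'level', 'msg' and 'args' keys interact with the configured log level, here is a rough sketch using only the standard logging module. It is not Scrapy's actual code path: DROPPED_MSG is an assumed stand-in for logformatter.DROPPEDMSG, and dropped_entry, ListHandler and emitted_messages are hypothetical helpers for this demonstration.

```python
import logging

# Assumed stand-in for scrapy.logformatter.DROPPEDMSG (illustration only).
DROPPED_MSG = "Dropped: %(exception)s\n%(item)s"


def dropped_entry(item, exception, level):
    # Same shape as the dict returned by PoliteLogFormatter.dropped().
    return {
        'level': level,
        'msg': DROPPED_MSG,
        'args': {'exception': exception, 'item': item},
    }


class ListHandler(logging.Handler):
    """Collects formatted messages in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def emitted_messages(entry_level, log_level):
    # Use a dedicated logger per combination so runs do not interfere.
    logger = logging.getLogger("dropped_demo.%s.%s" % (entry_level, log_level))
    logger.propagate = False
    logger.setLevel(log_level)
    handler = ListHandler()
    logger.addHandler(handler)
    entry = dropped_entry({'id': 1}, "duplicate", entry_level)
    # A single dict argument fills the %(name)s placeholders in the message.
    logger.log(entry['level'], entry['msg'], entry['args'])
    logger.removeHandler(handler)
    return handler.messages


# Logged at WARNING, the drop shows up when the log level is INFO ...
print(emitted_messages(logging.WARNING, logging.INFO))
# ... but logged at DEBUG it is filtered out.
print(emitted_messages(logging.DEBUG, logging.INFO))
```

The point of the sketch is that the formatter alone does not silence anything; the message disappears only once its level is below the level the logging setup lets through.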

Tags:

Python

Scrapy