How to check if an RSS feed has been updated in Python?
For "good" feeds you can use the HTTP ETag and Last-Modified / If-Modified-Since mechanism, which is described here: http://www.kbcafe.com/rss/rssfeedstate.html
But some servers don't support it, so in that case you simply need to check the post dates and ids to see whether you already have those posts in your DB.
Each feed item has an identifier, in item.id. Track those, together with their .updated (or .updated_parsed) entry, to check for new items.
So, check whether you have already seen the item (via item.id) or whether it has been updated since the last time you checked (via item.updated or item.updated_parsed).
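The check described above could look roughly like the sketch below. The function name find_new_or_updated and the seen dict are made up for illustration; in practice seen would be backed by your DB. Falling back to the entry's link when it has no id is my own assumption, not something feedparser requires.

```python
import calendar


def find_new_or_updated(entries, seen):
    """Return the entries that are new or have changed since the last check.

    `seen` (hypothetical store) maps item id -> last known update time in
    seconds since the epoch, or None if the item carried no date.
    It is updated in place; persisting it is up to the caller.
    """
    changed = []
    for entry in entries:
        # Assumption: fall back to the link when the feed omits ids.
        item_id = entry.get("id") or entry.get("link")
        if item_id is None:
            continue  # nothing stable to track this entry by

        # updated_parsed is a UTC time.struct_time; timegm turns it into
        # epoch seconds so two versions of an item can be compared.
        parsed = entry.get("updated_parsed")
        stamp = calendar.timegm(parsed) if parsed else None

        is_new = item_id not in seen
        is_updated = (
            not is_new
            and stamp is not None
            and (seen[item_id] is None or stamp > seen[item_id])
        )
        if is_new or is_updated:
            changed.append(entry)
            seen[item_id] = stamp
    return changed
```

With feedparser you would call it as find_new_or_updated(feedparser.parse(url).entries, seen). Items without any date can only be detected as new, never as updated.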
Do make sure you take advantage of feedparser's ETag support to check for changed feed contents, though. This only saves you from downloading feeds with no new items; you still need to detect which items have been added or updated when you get a fresh copy of the feed.
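Here is a minimal sketch of such a conditional fetch. feedparser.parse() accepts etag and modified arguments and exposes the HTTP status on the result. The function name fetch_if_changed, the state dict, and the parse parameter are hypothetical; parse is only there so the function can be tried out without network access.

```python
def fetch_if_changed(url, state, parse=None):
    """Fetch `url` only if it changed since the validators saved in `state`.

    Returns the parsed feed, or None when the server answers
    304 Not Modified. `state` is a plain dict that you keep per feed.
    """
    if parse is None:
        import feedparser  # third-party package: pip install feedparser
        parse = feedparser.parse

    # Send the validators from the previous fetch (None on the first run).
    feed = parse(url, etag=state.get("etag"), modified=state.get("modified"))

    if getattr(feed, "status", None) == 304:
        return None  # nothing changed, no body was downloaded

    # Remember the validators for next time; they stay None if the server
    # does not send them, in which case every fetch is a full download.
    state["etag"] = getattr(feed, "etag", None)
    state["modified"] = getattr(feed, "modified", None)
    return feed
```

When this returns a feed rather than None, you still need to go through its entries and compare ids and update times against what you have stored.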