Anyone know of a tool to detect and report on repeating patterns in a log file?
Solution 1:
Splunk works wonders for this sort of stuff. I use it internally to gather all the logs and do quick searches via its excellent browser-based interface.
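To give a flavor of what those quick searches look like, here is a hypothetical Splunk search (assuming your logs are indexed with sourcetype=syslog; the field names are illustrative) that groups repeated messages and counts them:

```
sourcetype=syslog error
| stats count by host, source
| sort -count
```

Splunk's stats and sort commands make it easy to surface the patterns that repeat most often.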
Solution 2:
syslog-ng has a feature named patterndb. You can define patterns and match log entries against them in real time, then route the matching entries to separate log files.
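As a rough sketch of what a patterndb rule file looks like (the ruleset name, ids, and field names here are hypothetical, and the exact schema depends on your syslog-ng version, so check the docs before using this):

```xml
<patterndb version="4" pub_date="2011-01-01">
  <ruleset name="ssh-example" id="1">
    <!-- Only messages from this program are matched against the rules below -->
    <pattern>sshd</pattern>
    <rules>
      <rule provider="example" id="1" class="system">
        <patterns>
          <!-- @ESTRING:name:delim@ parsers capture variable parts as named fields -->
          <pattern>Accepted password for @ESTRING:username: @from @ESTRING:client_ip: @</pattern>
        </patterns>
      </rule>
    </rules>
  </ruleset>
</patterndb>
```

Once a rule matches, you can filter on the rule's class or captured fields in your syslog-ng config and send those entries wherever you like.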
Solution 3:
I've heard of people applying Bayesian filtering to log files to separate interesting events from routine entries. They repurposed spam filters: routine, uninteresting entries were treated as "good" (ham) while the unusual ones were treated as "spam", and that classification let them sift through the noise.
It sounds a lot like machine learning stuff to me, but then again I've not seen it in action, only heard of it over beers.
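For the curious, the idea can be sketched in a few lines of Python. This is a minimal naive Bayes classifier written from scratch (the training lines and labels below are made up for illustration; a real setup would train on your own labeled log history):

```python
import math
from collections import Counter

def train(lines_by_label):
    """Count word frequencies per label (e.g. 'routine' vs 'unusual')."""
    counts = {label: Counter() for label in lines_by_label}
    totals = {}
    for label, lines in lines_by_label.items():
        for line in lines:
            counts[label].update(line.lower().split())
        totals[label] = sum(counts[label].values())
    vocab = set()
    for c in counts.values():
        vocab.update(c)
    return counts, totals, len(vocab)

def scores(line, counts, totals, vocab_size):
    """Log-probability of the line under each label, with Laplace smoothing."""
    result = {}
    for label in counts:
        s = 0.0
        for word in line.lower().split():
            s += math.log((counts[label][word] + 1) /
                          (totals[label] + vocab_size))
        result[label] = s
    return result

def classify(line, model):
    s = scores(line, *model)
    return max(s, key=s.get)

# Hypothetical training data: routine entries vs. ones worth a look.
model = train({
    "routine": ["session opened for user root",
                "connection from 10.0.0.1 accepted"],
    "unusual": ["failed password for invalid user admin",
                "segfault in httpd worker"],
})
print(classify("failed password for user guest", model))  # -> unusual
```

A real deployment would use a proper library and far more training data, but the principle is the same: once the filter has seen enough "routine" lines, anything it scores as "spam" is worth a human look.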
Solution 4:
While looking into syslog-ng and patterndb (+1 to that answer, above), I encountered a web-based tool called ELSA: http://code.google.com/p/enterprise-log-search-and-archive/. It's F/OSS, written in Perl, with a web interface, and reportedly very fast.
I haven't tried it yet, but once I'm done filtering using patterndb, I'll be trying ELSA.