Should we keep logs forever to investigate past data breaches?
There is no "correct" answer to your question, unfortunately. Data retention policies are specific to the needs of an organization, and are often implemented out of necessity to comply with various legal requirements, which vary depending on the nature of the data being stored and the jurisdiction the data falls under.
Retaining log data can allow for post-mortem analysis if a breach is discovered, as you're alluding to in your question. However, retained data can also be a security risk in its own right if the logs contain sensitive information, so the log files themselves may need to be secured. The other obvious factor in play is the cost of keeping the logs. Depending on availability requirements, some backup solutions may be more cost-effective than others, such as keeping old logs offsite on tape storage rather than using disk redundancy.
10 Years
Storing logs is cheap: they are usually plain text (ASCII or Unicode) and compress easily for long-term archival.
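To illustrate how well plain-text logs compress, here is a minimal Python sketch using only the standard library. The function names `compress_log` and `compression_ratio` are made up for this example, and the ratio you actually see will depend on how repetitive your logs are.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(src: Path) -> Path:
    """Gzip a log file next to the original and return the archive path."""
    dst = src.with_name(src.name + ".gz")
    # Stream the file so large logs are never loaded fully into memory.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def compression_ratio(src: Path, dst: Path) -> float:
    """Original size divided by compressed size (higher is better)."""
    return src.stat().st_size / dst.stat().st_size
```

The original file is left in place on purpose, so you can verify the archive before deleting anything.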
Keeping your logs is better than purging them, for reasons you can't anticipate.
At a minimum, a ten-year retention policy is an industry best practice for US-based businesses, since the statute of limitations under federal law and in most states is at most a decade for all but the gravest crimes against persons.
Specific industry sectors go further: medical providers, including hospitals, retain health records and the corresponding electronic log data for 50 years.
Telecommunications providers such as NYNEX (acquired by Verizon) and other "Baby Bells" retained their pen registers, the logs of their subscribers' phone calls, forever.
Records retention, mirroring, and safe off-site archival are practices that every sizeable organization has to tackle, but they become routine once implemented.
If you're a service provider, a hosting company, or in any way a custodian of personally identifiable data, a 10-year retention policy will keep you in compliance with every well-known, industry-accepted security standard, including PCI-DSS and the rest of the phonebook of industry best practices.
Demonstrating a uniform, ironclad retention policy helps a business quickly settle the topic during the RFP selection process and establishes you as "up to par".
Storing these log files indefinitely MAY BE illegal in the EU. I say MAY BE because the new data protection legislation (the GDPR) comes into effect in May 2018 and some areas are still unclear. However, the rules are as follows:
If you don't have explicit consent (which, I presume, you don't), you are allowed to keep personal data only for purposes allowed by the law. Keeping log files for the purpose of investigating data breaches is allowed, since the following exception applies: "processing is necessary in order to protect the vital interests of the data subject or of another natural person".
However, you are still bound by the principle of proportionality, so you can store log data only to the extent that it is "necessary". At some point, the usefulness of the data is only theoretical, so the legal ground for processing disappears. There is no hard-set limit, but in any case the burden of proof is on you: you have to prove that storing the log files is necessary to protect security.
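In practice, a bounded retention period is usually enforced by a scheduled purge job. The sketch below is one way to do that in Python. The `RETENTION_DAYS` value, the `*.log*` file pattern, and the function names are assumptions made for illustration. The regulation does not prescribe a number, so the window is whatever you can justify as necessary.

```python
import time
from pathlib import Path
from typing import List, Optional

# Hypothetical retention window; choose a value you can justify as "necessary".
RETENTION_DAYS = 365


def expired_logs(log_dir: Path, retention_days: int,
                 now: Optional[float] = None) -> List[Path]:
    """Return log files whose last modification is older than the window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    return sorted(
        p for p in log_dir.glob("*.log*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_expired(log_dir: Path, retention_days: int = RETENTION_DAYS,
                  now: Optional[float] = None) -> List[Path]:
    """Delete expired log files and return the paths that were removed."""
    removed = expired_logs(log_dir, retention_days, now)
    for path in removed:
        path.unlink()
    return removed
```

Running `expired_logs` on its own first gives you a dry run, and recording the list that `purge_expired` returns is one way to show that the policy is actually being applied.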
You should be concerned about this even if you are operating in the US, since this regulation applies very widely (for example, if you have clients in the EU).
Anyway, there is a way around this regulation: if your logs don't contain personal data (e.g. the user cannot be identified), the regulation does not apply. However, since an IP address is considered personal data (eve