What is caching?
You will most likely read about caching in the context of web applications. Because of the nature of the Web, caching can make a big performance difference.
Consider the following:
A web page request gets to the web server, which passes the request on to the application server, which executes some code that renders the page, which needs to turn to the database to dynamically retrieve data.
This model does not scale well, because as the number of requests for the page goes up, the server has to do the same thing over and over again, for every request.
This becomes even more of an issue if the web server, application server, and database run on separate hardware and communicate with each other over the network.
If you have a large number of users hitting this page, it makes sense to not go all the way through to the database for every request. Instead, you resort to caching at different levels.
Resultset Cache
Resultset caching means storing the results of a database query along with the query in the application. Every time a web page generates a query, the application checks whether the results are already cached, and if they are, pulls them from an in-memory data set instead. The application still has to render the page.
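A minimal sketch of the idea in Java (the class, the `runQuery` placeholder, and the choice of the SQL string itself as the cache key are all illustrative, not taken from any particular framework):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Result-set cache: the key is the query, the value is the cached result rows.
public class ResultsetCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    public List<String> fetch(String sql) {
        // Check the in-memory cache first; only hit the database on a miss.
        return cache.computeIfAbsent(sql, this::runQuery);
    }

    private List<String> runQuery(String sql) {
        // Placeholder for the real database call (JDBC, an ORM, etc.).
        return List.of("row 1", "row 2");
    }
}
```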
Component Cache
A web page is composed of different components - pagelets, or whatever you may want to call them. A component caching strategy must know which parameters were used to request the component. For instance, a little "Latest News" bar on the site uses the user's geographical location or preference to show local news. Consequently, if the news for a location is already cached, the component does not need to be rendered and can be pulled from the cache.
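A rough sketch of that, assuming a hypothetical `renderNewsBar` method that produces the component's HTML for a given location:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Component cache: the key is the parameter the component was requested with
// (the user's location), the value is the rendered HTML fragment.
public class NewsBarCache {
    private final Map<String, String> byLocation = new ConcurrentHashMap<>();

    public String newsBarFor(String location) {
        String cached = byLocation.get(location);
        if (cached != null) {
            return cached; // already rendered for this location
        }
        String html = renderNewsBar(location);
        byLocation.put(location, html);
        return html;
    }

    private String renderNewsBar(String location) {
        // Placeholder for the real rendering logic.
        return "<div class=\"news\">Latest news for " + location + "</div>";
    }
}
```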
Page Cache
One strategy for caching entire pages is to store the query string and/or header parameters along with the completely rendered HTML. The file system is fast enough for this - it is still far less expensive for a web server to read a file than to call the application server to have the page rendered. In this case, every user who sends the same query string gets the same cached content.
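A sketch of that strategy; the hashing scheme and file layout are just one possible choice, and `renderPage` stands in for the call to the application server:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Page cache on the file system: the fully rendered HTML for a given query
// string is written to a file named after a hash of that query string.
public class PageCache {
    private final Path cacheDir; // assumed to exist already

    public PageCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    public String pageFor(String queryString) throws Exception {
        Path file = cacheDir.resolve(hash(queryString) + ".html");
        if (Files.exists(file)) {
            // Serving the file is far cheaper than calling the application server.
            return Files.readString(file);
        }
        String html = renderPage(queryString); // placeholder for the app-server call
        Files.writeString(file, html);
        return html;
    }

    private String renderPage(String queryString) {
        return "<html><body>Rendered for " + queryString + "</body></html>";
    }

    private String hash(String s) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(s.getBytes(StandardCharsets.UTF_8));
        return HexFormat.of().formatHex(digest);
    }
}
```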
Combining these caching strategies intelligently is the only way to create really scalable web apps for large numbers of concurrent users. As you can easily see, the risk is that if a piece of content in the cache cannot be uniquely identified by its key, people will start to see the wrong content. This can get pretty complicated, particularly when users have sessions and there is a security context.
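One way to reduce that risk is to fold every input that can change the output into the key itself. A tiny illustration (the fields are hypothetical - use whatever actually influences your rendering):

```java
// Two requests may only share cached content when every input that affects the
// rendered output matches. A record gives equals/hashCode for free, so instances
// can be used directly as map keys.
public record PageCacheKey(String path, String queryString, String locale, String userRole) {}
```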
Caching is just the practice of storing data in and retrieving data from a high-performance store (usually memory) either explicitly or implicitly.
Let me explain. Memory is faster to access than a file, a remote URL (usually), a database or any other external store of information you like. So if the cost of using one of those external resources is significant, you may benefit from caching to increase performance.
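For example, a hand-rolled in-memory cache in front of an expensive remote read might look like this (for illustration only; in practice you'd more likely reach for an existing cache library):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The first fetch of a URL pays the network cost; later reads come from memory.
public class UrlCache {
    private final HttpClient client = HttpClient.newHttpClient();
    private final Map<URI, String> bodies = new ConcurrentHashMap<>();

    public String fetch(URI uri) {
        return bodies.computeIfAbsent(uri, this::download);
    }

    private String download(URI uri) {
        try {
            HttpRequest request = HttpRequest.newBuilder(uri).build();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```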
Knuth once said that premature optimization is the root of all evil. Well, premature caching is the root of all headaches as far as I'm concerned. Don't solve a problem until you have a problem. Every decision you make comes at a cost: you pay to implement it now and pay again to change it later, so the longer you can put off making a decision and changing your system, the better.
So first identify that you actually have a problem and where it is. Profiling, logging and other forms of performance testing will help you here. I can't stress enough how important this step is. The number of times I've seen people "optimize" something that isn't a problem is staggering.
Ok, so you have a performance problem. Say your pages are running a query that takes a long time. If it's a read then you have a number of options:
- Run the query as a separate process and put the result into a cache. All pages simply access the cache. You can update the cached version as often as is appropriate (once a day, once a week, once every 5 seconds, whatever suits the data);
- Cache transparently through your persistence provider, ORM or whatever. Of course this depends on what technology you're using. Hibernate and Ibatis for example support query result caching;
- Have your pages run the query if the result isn't in the cache (or it's "stale", meaning it was calculated longer ago than the specified "age") and put it into the cache. This has concurrency problems: if two (or more) separate processes all decide they need to update the result, you end up running the same (expensive) query several times at once. You can handle this by locking the cache, but that creates another performance problem. You can also fall back to the concurrency methods in your language (eg the Java 5 concurrency APIs); a sketch of this approach follows this list.
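A sketch of that third option, using `ConcurrentHashMap.compute` to make the check-and-refresh atomic per key so that concurrent requests for the same stale entry don't all run the query (the names and the staleness rule are illustrative):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Compute on a miss, or when the entry is older than maxAge. compute() makes the
// whole lookup-and-refresh atomic for a key, so two threads asking for the same
// stale key wait for each other instead of both running the expensive query,
// while requests for other keys proceed in parallel.
public class LazyQueryCache<V> {

    private record Entry<T>(T value, Instant loadedAt) {}

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration maxAge;

    public LazyQueryCache(Duration maxAge) {
        this.maxAge = maxAge;
    }

    public V get(String key, Supplier<V> expensiveQuery) {
        return entries.compute(key, (k, current) -> {
            boolean stale = current == null
                    || current.loadedAt().isBefore(Instant.now().minus(maxAge));
            return stale ? new Entry<>(expensiveQuery.get(), Instant.now()) : current;
        }).value();
    }
}
```

The trade-off is the one mentioned above: `compute` blocks other updates to that key while the query runs, so a very slow query still makes callers wait; dedicated cache libraries handle this (plus eviction, clustering and so on) more gracefully.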
If it's an update (or updates take place that need to be reflected in your read cache) then it's a little more complicated, because it's no good having an old value in the cache and a newer value in the database, giving your pages an inconsistent view of the data. But broadly speaking there are four approaches to this:
- Update the cache and then queue a request to update the relevant store;
- Write-through caching: the cache provider may offer a mechanism to persist the update and block the caller until that change is made;
- Write-behind caching: the same as write-through caching, except it doesn't block the caller; the update happens asynchronously and separately (both variants are sketched after this list); and
- Persistence as a Service models: this assumes your caching mechanism supports some kind of observability (ie cache event listeners). Basically an entirely separate process--unknown to the caller--listens for cache updates and persists them as necessary.
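A hand-rolled sketch contrasting the write-through and write-behind variants; `persist` is a placeholder for the database write, and a real cache provider would supply these hooks itself rather than you writing them like this:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WritableCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Write-through: update the cache and block the caller until the store is updated too.
    public void putWriteThrough(K key, V value) {
        cache.put(key, value);
        persist(key, value);
    }

    // Write-behind: update the cache and return immediately; the store is
    // updated asynchronously on a separate thread.
    public void putWriteBehind(K key, V value) {
        cache.put(key, value);
        writer.submit(() -> persist(key, value));
    }

    private void persist(K key, V value) {
        // Placeholder for the actual database/persistence call.
        System.out.println("persisted " + key + " = " + value);
    }
}
```

With write-behind, a caller can observe a cached value that hasn't reached the database yet, which is exactly the consistency trade-off described above.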
Which of the above methodologies you choose will depend a lot on your requirements, what technologies you're using and a whole host of other factors (eg is clustering and failover support required?).
It's hard to be more specific than that and give you guidance on what to do without knowing much more detail about your problem (like whether or not you have a problem).