Drupal - Real world experience in scaling and tuning performance

Markdorison's answer is basically the accepted method of attacking this problem. I'll take that a little further.

Once you have Pressflow (for D6) or Drupal 7, Memcached and Varnish all working nicely together, you'll need to write a custom VCL file. There are free ones available that make good starting points, but you always need to tune them for your own site.

To get Varnish to work optimally, make sure you start it with -s malloc,xG (where x is the cache size in gigabytes) rather than the default of -s file,/path/to/file. Also have Varnish cache static items for as long as you can.

If you have more than one web server, remove the ETag header from the backend responses in VCL, since each server can generate a different ETag for the same file. I also remove Expires and simply rely on Age and max-age in the headers to get browsers back to the site.
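The real work here is done in VCL, but the header handling described above can be modelled in a few lines of Python. This is only a sketch of the idea, not Varnish configuration: the function name `normalize_backend_headers` and the sample headers are made up for illustration.

```python
def normalize_backend_headers(headers):
    """Return a copy of the headers without ETag and Expires.

    Header names are compared case-insensitively. Cache-Control
    (and therefore max-age) is left untouched.
    """
    dropped = {"etag", "expires"}
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in dropped
    }


# Hypothetical backend response headers, for illustration only.
backend = {
    "Content-Type": "text/css",
    "ETag": '"abc123"',
    "Expires": "Sun, 19 Nov 1978 05:00:00 GMT",
    "Cache-Control": "public, max-age=3600",
}

cleaned = normalize_backend_headers(backend)
print(sorted(cleaned))  # ['Cache-Control', 'Content-Type']
```

With ETag and Expires gone, the browser only has max-age (plus the Age header that Varnish adds) to decide when to come back, which is the behaviour described above.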

Version 1.5 is still (as of 3rd March 2011) the fastest version of the Memcache module from Drupal.org. I typically deploy it using a single bin per server, which cuts down the TCP traffic that connections to multiple bins generate at large scale.

On the "Performance" page, set the caching mode to external and set a max age. That makes Drupal send the correct headers to a caching proxy such as Varnish.

If you can't get certain pages to cache properly in Varnish, check out blog posts on the web that detail how to inspect the requests. Here is an example post I wrote a while back: "What is stopping Varnish and Drupal Pressflow from caching anonymous users page views"

You should pick InnoDB (or one of its equivalents from other providers, such as XtraDB) for MySQL and move all tables into it. Then check out this blog post for basic tuning advice: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/

Having a large buffer pool is fundamentally important. When load testing the site, turn on the slow query log. At first you probably want to capture queries taking longer than 50 ms. Then tune those queries and repeatedly lower the slow log threshold until most queries are using indexes and executing fairly quickly.
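To make the "lower the threshold step by step" loop concrete, here is a small Python sketch. It assumes you set the threshold through MySQL's `long_query_time` variable, which is expressed in seconds (whether fractional values are accepted depends on your MySQL version). The query names, timings and the 50/25/10 ms schedule are invented for the example.

```python
def to_long_query_time(threshold_ms):
    """Convert a threshold in milliseconds to seconds for long_query_time."""
    return threshold_ms / 1000.0


def captured(samples, threshold_ms):
    """Names of the queries that run longer than the threshold."""
    return [name for name, ms in samples if ms > threshold_ms]


# Hypothetical (query name, duration in ms) pairs from a load test.
samples = [
    ("node_listing", 120.0),
    ("taxonomy_join", 35.0),
    ("cache_lookup", 2.0),
]

# Start at 50 ms and tighten the threshold after each tuning pass.
for threshold_ms in (50, 25, 10):
    seconds = to_long_query_time(threshold_ms)
    print(f"long_query_time={seconds}: {captured(samples, threshold_ms)}")
```

Each pass surfaces a new batch of queries to tune; once a pass comes back nearly empty, lower the threshold again.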

Other basics include installing APC for PHP. If you go for FastCGI rather than mod_php, spend some time making the APC cache shared across the PHP instances by configuring a good wrapper script. Also make sure that the APC cache lives in a memory-mapped file to squeeze every last bit out of PHP.


I would recommend starting with Pressflow (if using Drupal 6), Memcache, Varnish, and some form of Content Distribution Network (CDN) such as Akamai. The end result should be as few of those users as possible actually hitting your origin server.

If you have parts of the page that you are not able to cache for non-anonymous users (things that are specific to that user, "Welcome userX" etc.), you can explore options to populate these pieces of the page such as asynchronous callbacks or edge side includes.

If you have a smaller group of internal users (such as a group of editors) that need to be able to view an uncached version of the site, I would recommend exposing an uncached version of your site at a different URL (protected behind a VPN or equivalent if possible).


2500 hits per second over a day -- if by "hit" you mean "page delivered" then that's 216 million pages a day. Let me tell you this: you do not have 216 million pages a day. I love these clients...
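For anyone who wants to check the arithmetic, it is just the hit rate multiplied by the number of seconds in a day. The helper name `pages_per_day` is only there for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_day(hits_per_second):
    """Pages delivered in a day at a constant hit rate."""
    return hits_per_second * SECONDS_PER_DAY


print(pages_per_day(2500))  # 216000000
```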

That said, raw traffic data does not say anything on its own. The advice in this thread about Varnish / CDN is sound if all you have is anonymous traffic, but if you have logged-in traffic, you are facing a challenge. Before spending an ungodly amount of time and effort to solve a problem, make sure you have a problem. 2500 hits per second? Bing gets less than that, you realize that, right?