Optimize Drupal with External Caching
The best way -- perhaps the only way -- to fully optimize Drupal is with external caching.
Here's why:
- Modules, modules, more modules
- Database writes
- Poorly optimized MySQL queries
Modules, modules, more modules
Modules slow Drupal down drastically[*]. The more modules installed, the worse it gets. Most sites that I've seen have hundreds of modules to provide their functionality.
Database writes
Drupal writes to the database constantly. Access logs, error logs, view counts... It's possible to turn these off, but functionality may be lost.
Poorly optimized MySQL queries
Some SQL queries cannot be optimized. For example, this query runs on one site's home page (and on all paginated "node" lists):
SELECT DISTINCT(n.nid), n.sticky, n.title, n.created
FROM node n
INNER JOIN term_node tn0 ON n.vid = tn0.vid
WHERE n.STATUS = 1 AND tn0.tid IN (X)
ORDER BY n.sticky DESC, n.created DESC
LIMIT Y, Z
That query cannot be optimized due to the combination of DISTINCT, JOIN, ORDER BY and LIMIT. You might be able to hack the code to remove the DISTINCT (and filter out duplicates in PHP), but given that any module can alter even core queries with a hook, it's very difficult to do so. Furthermore, many tables -- like cache tables used by both core and 3rd-party modules -- contain TEXT fields, which force temporary tables to be created on disk rather than in RAM.
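To make the "filter out duplicates in application code" idea concrete, here is a minimal sketch. It is written in Python purely for illustration (in Drupal the equivalent would live in PHP), and the function name, row layout, and sample data are all hypothetical rather than anything from Drupal itself.

```python
def dedupe_rows(rows, limit):
    """Keep the first occurrence of each nid, in order, up to `limit` rows.

    `rows` stands in for the result of the query with DISTINCT removed;
    the dict keys mirror the selected columns (an assumption of this sketch).
    """
    seen = set()
    unique = []
    for row in rows:
        if len(unique) >= limit:
            break  # the page is full, ignore the rest
        if row["nid"] in seen:
            continue  # duplicate produced by the JOIN, skip it
        seen.add(row["nid"])
        unique.append(row)
    return unique


# Hypothetical rows, already sorted by sticky DESC, created DESC.
rows = [
    {"nid": 7, "sticky": 1, "title": "Pinned", "created": 300},
    {"nid": 7, "sticky": 1, "title": "Pinned", "created": 300},
    {"nid": 4, "sticky": 0, "title": "Newer", "created": 200},
    {"nid": 2, "sticky": 0, "title": "Older", "created": 100},
]
page = dedupe_rows(rows, limit=2)
```

Note the catch: because duplicates use up rows, the query would have to fetch more than one page's worth of rows, and LIMIT offsets for later pages no longer line up. That is part of why this kind of hack is hard to get right.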
The solution
The only solution I've seen work for busy sites with a lot of content is a combination of a caching reverse proxy, like Varnish, and memcached. Support for both is provided by Pressflow, a drop-in replacement for Drupal.
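The general idea behind putting a cache like memcached in front of the database can be sketched as follows. This is a generic cache-aside illustration in Python, not Pressflow's or Drupal's actual implementation; the TTLCache class is an in-process stand-in for memcached, and every name in it is hypothetical.

```python
import time


class TTLCache:
    """In-process stand-in for memcached: key -> (expiry time, value)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired, treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (self._clock() + ttl, value)


def cached_fetch(cache, key, loader, ttl=60):
    """Return the cached value, calling `loader` only on a miss.

    A loader result of None is never cached in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = loader()  # the expensive part, e.g. a slow SQL query
        cache.set(key, value, ttl)
    return value


db_calls = []


def load_front_page():
    """Hypothetical expensive query; records each call for demonstration."""
    db_calls.append(1)
    return ["node/7", "node/4"]


cache = TTLCache()
first = cached_fetch(cache, "front_page", load_front_page)
second = cached_fetch(cache, "front_page", load_front_page)
```

The second call is served from the cache, so the expensive loader runs only once. A reverse proxy like Varnish applies the same principle one level up, caching whole HTTP responses so that many requests never reach Drupal at all.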
[*] The 2bits article is old (but still true) and has more recent updates at the bottom.
