How To Improve Website Performance (With Drupal, PHP, MySQL and Apache)

Recently, when the site for my new website security scanning startup launched, I was inundated with traffic, receiving several thousand hits in the space of a few hours. Coupled with several website security scans my site was running, the load was more than my server could handle, forcing me to restart Apache several times. After this, I decided to focus on tuning my application’s performance, measured by requests served per second and page load time. In this post, I’ll cover how I tuned the following tech stack components to improve results:

  1. Infrastructure (Hardware) improvements
  2. Apache Settings
  3. PHP
  4. Application performance

Prepare yourself: this is a very in-depth tuning article, so grab a cup of coffee and get your tuning hat on!

Web Application Performance Tuning Methodology

When tuning server-based applications, it is best to use a pyramid approach. Rather than immediately profiling or tuning specific functions or code, it’s better to focus on places in the technology stack where a single change improves as many downstream components as possible. Thus, whenever I approach a tuning problem, I start with the hardware it’s running on, and progress up the stack until I get to code profiling and specific code bottlenecks.

Website Tuning Priority Matrix (pyramid)

For my current tuning exercise, I needed a way to benchmark various changes I made to the environment, and settled on two tools:

Request/second benchmarking: Apache Benchmark (ab) for tuning server requests per second, running with 50 concurrent users and 1000 requests.

Page load speed: Web Page Test, since it allows for an independent look at page load speed.

The fundamental problem I am trying to solve is dealing with high traffic loads, so I am more concerned with the Apache benchmark results than page load speed right now, though I want to improve loading speed as much as possible as well.
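The ab run described above (1000 requests, 50 at a time) can be sketched roughly as follows. The URL is a placeholder and requests_per_second is a hypothetical helper name, not something from ab itself; it only pulls the headline figure out of ab's report so that runs are easy to compare.

```shell
# Hypothetical helper: extract the "Requests per second" value
# from an ab report read on stdin.
requests_per_second() {
    awk -F: '/^Requests per second/ { split($2, parts, " "); print parts[1] }'
}

# Example run (placeholder URL): 1000 requests total, 50 concurrent.
# ab -n 1000 -c 50 http://www.example.com/ | requests_per_second
```

Running the same command before and after each change, several times each, gives a rough sense of whether a difference is real or just noise.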

I am running my site on a VPS with a dedicated 1.5GHz processor, 1GB RAM, and CentOS 4.9. The application stack uses MySQL for the database, PHP with Apache for the web server, and Drupal 6 with normal caching as the application framework. My development server runs on Fedora 14, so all commands shown are the ones used in Fedora, while benchmarks took place against production.

Tuning the Hardware

Hardware is generally not something which can be tuned – we can pay to upgrade it, or potentially upgrade the operating system to take advantage of new hardware optimizations. In my case, the VPS I run on was purchased several years ago, and has not been upgraded since. I looked at various other hosting providers to see what options I had in terms of better hardware at the same price point. Once I saw some competitive offerings, I went back to my hosting company and negotiated with them for new hardware. As a result, they offered to upgrade me to new hardware without changing my pricing structure! All the benchmarks were performed on the old hardware prior to upgrading, but the same general results should hold true. When I have completed the hardware upgrade, I will update this post with the results.

Tuning Apache – Enable GZIP, modify configuration

There are a few things which are relatively painless to implement with Apache. One of these is enabling gzip. I actually thought I had this enabled already, but it pays to check using YSlow or a similar tool. I also tuned a few key Apache parameters. The result? Page load time decreased by nearly 32%, while requests served per second barely moved (they remained within the same confidence interval).


GZIP Page Speed Optimizations Realized
How to enable GZIP with Apache

Enabling gzip is a very easy process. Open your Apache configuration file, and make sure mod_deflate is enabled. You should see a line like this (if you don’t, add it):

LoadModule deflate_module modules/mod_deflate.so

You generally want to compress everything except images (which are compressed already), so add the following lines to your httpd.conf file, and restart Apache. You may want to customize this using the Apache documentation.

# Insert filter
SetOutputFilter DEFLATE
# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
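Besides YSlow, one way to confirm that compression is really on after the restart is to request a page with an Accept-Encoding header and inspect the response headers. This is a minimal sketch, assuming curl is installed; the URL is a placeholder and is_gzipped is a hypothetical helper name.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and
# succeeds (exit status 0) if the response was gzip-encoded.
is_gzipped() {
    tr -d '\r' | grep -qi '^content-encoding:.*gzip'
}

# Example (placeholder URL): discard the body, dump headers to stdout.
# curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://www.example.com/ \
#     | is_gzipped && echo "gzip is on"
```

Trying the same check against an image URL should fail, since the configuration above excludes gif, jpg, and png files.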

Apache Configuration Tuning

There are a few key parameters which should also be tuned:

KeepAlive – Set it to On. This reduces connection overhead as users load pages.

KeepAliveTimeout – Set this to something reasonably low, 1.5 to 2 times your page load time.

TimeOut – The default is around 2 minutes – way too long to tie up a single process. I set mine to 20 seconds.

StartServers – The number of child processes Apache starts with. I set this to 2. Most sites should do fine with 2-5 startup processes. Busy sites should tune based on the number and speed of CPUs.

MinSpareServers – How many idle processes should be running to quickly accept new connections. I keep this at one since I am CPU bound, and it has no major impact on connection speed with my current traffic levels.

MaxSpareServers – I set this to 2, so I don’t have many spare servers sitting around after a traffic burst.

MaxClients – The maximum number of clients Apache will serve at once before adding new connection requests to the queue. The main constraint Apache usually hits is server memory, so a good rule of thumb is to set this based on the memory used per Apache process and your system’s free memory. I did the following:

1)      See the memory used by each Apache process (the RSS column, in KB):

ps -eafly | grep '[h]ttpd' | awk '{print $8}' | sort -n

2)      Check free memory on the system

free

Divide the free memory from step 2 by the average per-process figure from step 1 to get an idea of what MaxClients should be. Keep in mind that most Unix systems use otherwise free memory as a file cache, so you should take this into account when determining what is safely usable by Apache (real free memory is free memory + system cache).
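The arithmetic can be sketched as a couple of small shell functions. The function names are hypothetical, and the memory figure you pass in is whatever you judge safely usable by Apache (free memory plus cache, minus some headroom). Resident size also counts memory shared between Apache children, so treat the result as a conservative starting point rather than a precise answer.

```shell
# Hypothetical helper: average a list of per-process sizes (KB), one per line on stdin.
average_kb() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%d\n", sum / n }'
}

# Hypothetical helper: usable memory (KB) divided by average process size (KB),
# rounded down.
estimate_max_clients() {
    usable_kb=$1
    avg_kb=$2
    echo $(( usable_kb / avg_kb ))
}

# Example on a live server (600000 KB usable is an assumed figure):
# avg=$(ps -C httpd -o rss= | average_kb)
# estimate_max_clients 600000 "$avg"
```

With 600000 KB usable and processes averaging 18000 KB, this gives 33.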

Tuning PHP

Drupal and other content management systems are known to be a bit demanding on server resources, using up significant CPU and memory per Apache process. My Drupal site was taking upwards of 18MB per process, and WordPress used around 12MB.

The greatest PHP speed boost will generally be gained through opcode caching and full page caching. Opcode caching keeps the compiled form of PHP scripts in memory, to avoid the overhead of parsing and compiling a script every time it is called. Full page caching goes a step further and caches the entire output of a page, serving it as if it were static content. Blogs which are updated once a week, for instance, could cache all pages for a week at a time and serve them as if they were static content, giving huge performance boosts (although the cached version would need to be invalidated after every comment). My site is too dynamic to allow for full page caching, but sites with largely static content may find it useful.

Enabling opcode caching on the server

To enable opcode caching, I benchmarked three systems: Zend Server Community Edition, eAccelerator, and APC. Each will automatically begin working once installed without any extra work. I also took additional factors into account such as availability and API offerings.

APC is developed under the PHP project and installed as an add-in module through PECL. It offers an excellent API for caching your own data in shared memory. Supposedly, it will make it into PHP core in a future release.

To install:

yum install php-devel                           #install pre-req
pecl install apc                                #install APC with pecl
echo "extension=apc.so" > /etc/php.d/apc.ini    #enable APC
/etc/init.d/httpd restart                       #restart apache

Zend is a well-known performance server with a powerful API and development framework. There is an open source community edition server available, which includes a nice API, and which is the version I tested with.

To install, Zend has a great guide for various platforms.

eAccelerator is the continuation of MMCache, though in the latest version the caching API was removed. Now it essentially only performs opcode caching.

To install eAccelerator:

yum install php-eaccelerator

Opcode caching benchmark results

Overall, APC gave me the best results, increasing requests per second by 98%, compared to 87% for eAccelerator and only 51% for Zend CE. All slightly increased page load times, but not by a statistically significant amount.

PHP Opcode benchmark improvements

I also wanted to use partial page caching for commonly used elements in my code, which is possible using both APC and Zend. Given the APIs offered by these tools, I removed eAccelerator from the running and chose APC as my opcode caching framework. After monitoring APC for some time, I found I needed to increase the cache size to 64MB from the default 32MB to prevent fragmentation.
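The cache size is controlled by the apc.shm_size setting. Below is a minimal sketch of adding it to the ini file created during installation; set_apc_shm_size is a hypothetical helper name. Whether the value needs a unit suffix depends on the APC version (older releases take a plain number of megabytes, newer ones expect something like 64M), so check the documentation for the version you installed.

```shell
# Hypothetical helper: append an apc.shm_size line to the given ini file.
# The caller supplies the value in the form their APC version expects.
set_apc_shm_size() {
    ini_file=$1
    size=$2
    printf 'apc.shm_size=%s\n' "$size" >> "$ini_file"
}

# Example (path from the install step above; restart Apache afterwards):
# set_apc_shm_size /etc/php.d/apc.ini 64M
```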

Additional benchmarking considerations

I can’t say these figures will hold true for every site, since other benchmarks have shown different results for Drupal overall. Most notably, 2bits benchmarked APC vs. eAccelerator and APC vs. Zend CE.

Zend also published a benchmark study (using their commercial server) which showed even better results than shown by 2bits.

Finally, on my development server, eAccelerator performed the best, hitting more than 200 requests/second, compared to 150 for APC and 110 for Zend, so do some testing on your own install to see what will work for you.

There are other PHP optimization methods I didn’t cover, some of which are covered in a great set of slides for Apache and PHP performance tuning.

Application Tuning

Because I use Drupal and am not willing to modify core, I can generally only tune my own custom code and modules. I profiled several of the highest traffic pages using Xdebug but found that Drupal functions were using up more than 95% of the system resources. Thus, I started to look at how I could reduce the number of function calls in my application. I wrote my own custom theme, and two places with a substantial number of function calls are the generation of the site header and footer. The footer always remains the same for users, but the header and menus are different for every user and page, due to title and meta tags. Thus, I decided to use APC shared memory caching to cache header and footer output on a per-user and per-page basis. I also wanted my code to keep working without APC, in case it was disabled or not working for whatever reason.

To accomplish this, I first removed the header and footer generation code from my page.tpl.php (the main template file in Drupal) and put them into their own include files. Then, I generated the header and footer sections into a variable, and finally stored that variable in the APC cache if it was available. On the next page request, I return the cached version instead of regenerating it.

Header generation include file:

/* Generate the entire page header and store it in a
 * single variable for caching. Actual code
 * not shown, since there is a lot.
 */
$custom_header = 'lots of HTML and PHP';

Top section of page.tpl.php (header cache):

//make sure APC is enabled and working
if(function_exists('apc_fetch'))
{
	//check to see if we have a header in the cache for this user and this page
	$custom_header_apc = apc_fetch("header-{$user->uid}-{$head_title}");
	if($custom_header_apc === false)	//nothing in cache
	{
		//include file generates a header variable custom_header
		require_once 'generate_header.php';
		//store the header in the cache for next time
		apc_store("header-{$user->uid}-{$head_title}",$custom_header, 24*3600);
		//prepare to print the header
		$custom_header_apc = $custom_header;
	}
	echo $custom_header_apc;	//send the header
}
else	//APC is not installed
{
	//just generate the header variable and print it
	require_once 'generate_header.php';
	echo $custom_header;
}

This resulted in an additional 15% gain in requests served per second, with no change in page load time.

Performance Improvements from Shared Memory Cache

Additional gains could be made if I cached node contents, or entire pages. Nearly all of my high traffic pages are not cacheable, as they are different for each user and may vary greatly during a single user session, but the more static a site, the more can be cached.

What Else?

Notably missing from this analysis is MySQL tuning. I left this off since MySQL currently takes up no more than 15% of server resources during high load periods, and most queries were optimized at the time of writing them, along with the data model. However, database tuning should be an integral part of holistic application tuning in general, so be sure to check the MySQL slow query log, and make sure all SQL joins use indexes where possible when doing your own application tuning.

Are there other areas readers have modified to see improved requests per second?

Need Help with Drupal or Performance Tuning?

I consult on Drupal and performance tuning. Contact me for details.

    Comments

    • Awesome post! When I read this line, I chuckled a bit:

      18MB per process, and WordPress used around 12MB

      If you come from Rails (where one instance of Ruby can take 100-300 MB), these numbers are insane.

      Also, Slowcop is a useful tool for quick performance tests.

    • I’d use mod_worker with mod_fcgid at least.

    • Love the look of your hosting service Orien. I sent your support team a few questions about it, but from the look of it, CleverKite may be a better option than the new hardware being offered by my current provider.

    • See total memory used by each Apache thread

      This is a very tricky subject, most of the children memory is shared. See http://stackoverflow.com/questions/131303/linux-how-to-measure-actual-memory-usage-of-an-application-or-process

      I think you omitted operating system tuning (but a dedicated server with root access is needed for most of it). If you have high load choosing right filesystem, process scheduler or memory settings (swappiness etc.) can make a big difference.

    • What about stopping Apache from looking for .htaccess in every directory it is accessing, removing it hitting the disk?

    • Good thinking! Disallowing .htaccess should definitely yield some improvements. It only works if you have full control of a server though, since shared hosting typically doesn’t allow it. With my VPS, I already had it disabled since I do all the configuration in the main conf files.
