Pragmatic Performance Tuning in Drupal

One of the Drupal based applications we manage received user feedback that performance was not meeting their expectations.  Even though my background is primarily Rails and Java, I was excited to take up this challenge.  Before I could really get down to business I needed a couple of tools for the job. This post is about creating those performance tools inspired by the following tips from The Pragmatic Programmer. 

  • Tip 20 Keep Knowledge in Plain Text
  • Tip 21 Use the Power of Command Shells
  • Tip 28 Learn a Text Manipulation Language  

Tip 20 Keep Knowledge in Plain Text

 There is a great module that a lot of Drupal developers use http://drupal.org/project/devel The module is great for development in that it shows the developer how many queries fire, how long they took, and if they are repeated queries.  Drupal seems to be very chatty with the database and keeping this under control can really help with performance.  

For this particular problem I was coming in after the development was complete and hoping to fine tune or refactor the existing code base. I wanted a logfile that could show me the queries fired per page, what the most expensive queries were, the functions that called those queries, and the ability to reproduce them in other environments (development, QA, and Production).  Patrick Shaughnessy (http://patshaughnessy.net/) gave me some quick pointers on the Drupal framework allowing me to build an easy logfile strategy.

Step 1: Build a Switch for logging

I wanted to leave the logging code in place, but provide a mechanism to quickly turn the logging on and off.  I did not want to store this in the variable table as I suspected most of the performance issues were database related.  I resolved to create a simple global in global_settings.php

# Turn on performance tuning in the logfile
global $performance_logging;
$performance_logging  = 0;

Step 2: Create a debug function

I put mine in index.php, but there is probably a better location for common methods:

function debug($s) {
  global $performance_logging;
  if ($performance_logging) {
    $file = fopen("/tmp/performance.log", "a");
    fwrite($file, $s . "\n");
    fclose($file);
  }
}

Step 3: Place instrumentation in the code to log performance

First lets log how long it takes to build the entire page.  This can be done quickly by placing the following code in the index.php before the bootstrap include.

$then = microtime(true);  # Mark the beginning of the page
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

Then again at the bottom of the page, with a call to our debug function to record the time.  

$elapsed_time = microtime(true) - $then;
if ($performance_logging) {
  debug(sprintf("%f ENDPAGE %s",$elapsed_time,$_GET['q']));
}

When this is shown in the logfile it will appear like this:

0.358291 ENDPAGE node/3

The time in seconds, a tag to say the page has ended, and the parameter that was used to build the page.

Step 4: Record the database performance metrics

 In the database.mysql.inc file all database transactions will flow through this method.  We can log how long these database calls took with a couple of pieces of code in the _db_query method around the call to mysql_query.

Add the following code to the _db_query function:

if ($performance_logging) { $then = microtime(true); }
  $result = mysql_query($query, $active_db);
  if ($performance_logging) {
    $elapsed_time = microtime(true) - $then;
    $bt = debug_backtrace();
    debug(sprintf("%f -- %s -- %s",$elapsed_time, $bt[2]['function'], $query));
  }
}

This will log the time in seconds, a tag to indicate the function called, and the query that was executed to the database.  

Step 5: The Results

Now when a page is rendered we will get a log file like below:

0.000069 -- community_is_private -- select group_nid from og_ancestry where nid = 13430
0.000076 -- community_is_private -- select selective from og where nid = 13429
0.000599 -- taxonomy_get_vocabularies -- SELECT v.*, n.type FROM vocabulary v LEFT JOIN vocabulary_node_types n ON v.vid = n.vid ORDER BY v.weight, v.name
...  A Lot more lines ...
0.000144 -- get_homepage -- SELECT * FROM og_home WHERE gid = 3
0.000448 -- statistics_exit -- UPDATE node_counter SET daycount = daycount + 1, totalcount = totalcount + 1, timestamp = 1275527799 WHERE nid = 3
0.000246 -- statistics_exit -- INSERT INTO accesslog (title, path, url, hostname, uid, sid, timer, timestamp) values('Spaces', 'node/3', '', '127.0.0.1', 14149, '983da4b0314d4dfca4c919acfcbbd33b', 346, 1275527799)
0.358291 ENDPAGE node/3

Column Breakdown:

  1. The amount of time the database query took to execute
  2. The method the database query was executed from
  3. The actual SQL statement

Tip 21 Use the Power of Command Shells

This simple logfile is incredibly powerful for instrumenting the application to perform performance tuning with just a handful of Unix commands

How do I look at just the queries on one page?

  • Simply truncate the logfile, view the page, and re-open the logfile
  • The logfile can be truncated by the Unix command

cp /dev/null /tmp/performance.log

What are the most expensive queries on a page?

  • Truncate the logfile (above)
  • Look at the page in question
  • Execute the following Unix command to sort the file numerically, in reverse order, and look at the top 5

sort -rn /tmp/performance.log | head -5  

How many queries were executed to build the page?

  • Truncate the logfile (above)
  • Look at the page in question
  • Execute the following Unix command to see the number of lines in the file (which is the number of queires on that page)

wc /tmp/performance.log

Tip 28 Learn a Text Manipulation Language 

Now that there is a powerful flow of information, here are a couple more tools to figure out what is going on from a performance perspective:

Summary of application performance, right from the server!

- Use perl as a text parsing language from the command line and you can ascertain a lot of information.  Perl is almost always installed on the server and is a powerful text manipulation tool. Here is one such use case:

perl -F"--" -lane 'if (/ENDPAGE/) {print "$F[1] $t";$t=0;next;} $t+=$F[0]'  mckwiki_debug.txt

This will return something like this:

0.227044 ENDPAGE node/86 0.020057
2.461857 ENDPAGE node/3 2.128639
0.354404 ENDPAGE node/3 0.1329
0.295097 ENDPAGE og 0.03797

  • The first column is the total time the page took to build (including php time and mysql query time)
  • The second column can be ignored, just a place holder to trigger that a page has ended
  • The third column is the url that was called
  • The fourth column is the total time in database calls

Breaking down the command:

Perl and Ruby both provide the capability to execute scripts directly from the command line.  This can be quite powerful when trying to quickly analyze a problem.  Of course if a repeatable solution is found, then the inline script should be converted into a file based script for reusing later.  This is how this script above works.

perl    # The script interpreter
-F"--"  # this option indicates that for every line of text seen, split the line into an array delimited by "--"
-lane   
    # -e = Execute the next string passed as a script, for example perl -e 'print "hello world"'
    # -n = Automatically loop through each line of a file (last argument)
    # -a = Automatically parse each line of text into an array (using the -F argument above as a delim)
    # -l = Automatically put a "\n" at the end of each line of output

I find that I am always using this four options, so I just remember -lane (nice easy phrase)

Create SQL performance scripts from the logfile:

Another handy tool is a list of all the SQL statements in an executable file to help identify performance in other environments:

I normally truncate the log file first before generating this script.  This way I know the page I view has just the SQL statements for that page when I execute them on another server.

perl -F"--" -lane 'print $F[2] if $F[2]'  mckwiki_debug.txt > performance_debug.sql

This will create a file with all the SQL statements for that page.  

 

In the next article we will explore how these tools can be used to fine tune a Drupal application for performance.