Puppet Drupal recipes

Drupal, Puppet. Puppet, meet Drupal

Puppet and Drupal make a great combination. Drupal is an amazing tool for quickly constructing attractive, functional web sites. It lets you manage large numbers of web sites from a single installation, and (via add-on modules) provides almost any CMS or blog feature you could want.

However, like any powerful tool, Drupal takes some learning. Managing Drupal servers also takes discipline, or they soon descend into a chaotic mess: the Drupal sysadmin can end up navigating a spaghetti of ad-hoc symlinks and struggling to upgrade, maintain, monitor and back up a large Drupal installation. Aegir can help with this (I’ll look at Aegir vs. Puppet in a future article) but first we need to get Drupal itself under control.

Fortunately, Puppet can help you tame Drupal and use the power of configuration management to bring your Drupal sites under control. In this article I’ll explain some techniques and Puppet recipes I use to manage Drupal sites and servers, including my own sites, this one among them!

Installing Drupal with Puppet

Firstly, the server where Drupal is running (or where you plan to set up Drupal) needs to be under Puppet control. If you are already confident with Puppet, all you need to do is add a node definition for the server to your Puppet manifests, and then add the recipes from this article. If you’re new to Puppet, start with my Puppet tutorial series.

For this example I’ll assume we’re setting up a new Drupal install. Although there are packages for Drupal, I recommend that you install from source, because Drupal is frequently updated and one of the biggest causes of security problems with Drupal sites is using outdated core versions.

Download the latest release from the Drupal download page. At the moment that will be Drupal 6.x, but Drupal 7 is in alpha and will soon be officially released. Unless there is a specific reason (such as custom modules) to use Drupal 5.x, I always recommend 6.x or higher for production sites.

I suggest unpacking the release in the /var/www/drupal directory.
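
If you would rather have Puppet do the download and unpack as well, something like the following might work. It is only a sketch: the class name drupal::core, the version number, and the tarball URL are my assumptions for illustration, so check the download page for the current release. It also assumes /var/www/drupal already exists (the directory list later in this article takes care of that) and that your tar supports --strip-components.

```puppet
# Hypothetical class: fetches a Drupal core tarball and unpacks it into
# /var/www/drupal. The version and URL below are illustrative assumptions.
class drupal::core {
  $drupal_version = "6.19"

  exec { "download-drupal":
    cwd     => "/root",
    command => "/usr/bin/wget http://ftp.drupal.org/files/projects/drupal-${drupal_version}.tar.gz",
    creates => "/root/drupal-${drupal_version}.tar.gz",
  }

  # The tarball unpacks into drupal-<version>/, so strip that leading
  # directory and extract straight into /var/www/drupal.
  exec { "unpack-drupal":
    command => "/bin/tar xzf /root/drupal-${drupal_version}.tar.gz -C /var/www/drupal --strip-components=1",
    creates => "/var/www/drupal/index.php",
    require => Exec["download-drupal"],
  }
}
```

Because of the creates parameter, the unpack only runs on a fresh install. Upgrading an existing core needs more care than this sketch gives it.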

Setting up a Drupal multisite installation

Although you can have a separate copy of the Drupal code for each web site you’re serving, I recommend that you use Drupal’s powerful multisite feature: this saves disk space, avoids duplicating effort, and makes it easy to keep up with core updates.

All you need to do is create a directory sites/example.com, if the site is named example.com. Copy the settings.php file from the sites/default directory and enter the appropriate settings for your site (normally just the database name). For each extra site you create, add a directory for it in sites. See the Drupal multisite howtos for more information.
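
The per-site directory and its settings.php can be put under Puppet control too. Here is a minimal sketch, assuming you keep an edited copy of settings.php for each site on the Puppet server; the define name drupal::sitedir and the source path are hypothetical.

```puppet
# Hypothetical define: creates sites/<name> and deploys a per-site
# settings.php from the Puppet server. The source path is an assumption.
define drupal::sitedir() {
  file { "/var/www/drupal/sites/${name}":
    ensure => directory,
  }

  file { "/var/www/drupal/sites/${name}/settings.php":
    source  => "puppet:///sites/${name}/settings.php",
    require => File["/var/www/drupal/sites/${name}"],
  }
}

drupal::sitedir { "example.com": }
```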

Puppet and Drush

Because Drupal has a modular architecture, the core is kept small, but for most sites you will need to download and use some additional modules to enable things like CAPTCHAs, comment notifications, or image galleries. Fortunately, there is a superb automation tool available for Drupal called Drush (the Drupal shell). Puppet plus Drush is a great combo.

Drush makes automating Drupal with Puppet a breeze. Here’s an example manifest for installing Drush and making it available:

class drupal::drush {
  exec { "download-drush":
    cwd => "/root",
    command => "/usr/bin/wget http://ftp.drupal.org/files/projects/drush-All-Versions-2.1.tar.gz",
    creates => "/root/drush-All-Versions-2.1.tar.gz",
  }
  
  exec { "install-drush":
    cwd => "/var/www/drupal/sites/all/modules",
    command => "/bin/tar xvzf /root/drush-All-Versions-2.1.tar.gz",
    creates => "/var/www/drupal/sites/all/modules/drush",
    require => [ Exec["download-drush"], File["/var/www/drupal/sites/all/modules"] ],
  }
  
  file { "/usr/local/bin/drush":
    ensure => "/var/www/drupal/sites/all/modules/drush/drush",
  }
}

This downloads the Drush tarball, unpacks it in your modules directory, and creates a symlink so that the drush command is available on your path.

Puppet PHP configuration

There are a few dependencies for Drupal: you need PHP, a MySQL database, and a web server (I use Apache, but any server capable of running PHP apps will do, such as Nginx).

For Drush you will need at least PHP 5.2 (if you’re using CentOS, see Install PHP 5.2 on CentOS 5.2 using Yum). You will also need the php-mysql, php-gd, and php-mbstring extensions, which are likely available as packages for your platform.

You will also probably need to increase the default PHP memory limit of 8MB to at least 256MB (more if you have it). You can do this by having Puppet deploy a php.ini file with this setting to the machine. Once this is done, Puppet will update php.ini any time you change it. Here’s my PHP-for-Drupal Puppet recipe:

class php {
  package { [ "php", "php-mysql", "php-gd", "php-mbstring" ]:
    ensure => installed,
  }
  
  file { "/etc/php.ini":
    source => "puppet:///php/php.ini",
  }
}

And here’s the php.ini file:

memory_limit = 256M      ; Maximum amount of memory a script may consume (8MB)

A common cause of Drupal errors is PHP running out of memory, so try increasing this setting if you encounter mysterious problems.

You’ll also need to make sure various directories exist for installing modules into:

file { [ "/var/www/drupal/",
         "/var/www/drupal/sites",
         "/var/www/drupal/sites/all/",
         "/var/www/drupal/sites/all/modules" ]:
  ensure => directory,
}

Creating a Drupal site instance with Puppet

In a Drupal multisite environment, in addition to the per-site directory in sites, you need the following things for each new site instance:

  • A MySQL database for the site, with appropriate user permissions for Drupal
  • A virtual host file for your web server, mapping the domain name to the Drupal instance

We can have Puppet create all these things for us, so each site can be configured in a minimal way:

drupal::site { "example.com":
  db => "drupal_example",
}

and Puppet will know what to do. Here’s a manifest that will do this:

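define site( $db ) {
  # Password for the shared 'drupal' MySQL user.
  $drupal_password = "i_love_drupal"

  # Apache vhost for this site; reload Apache whenever it changes.
  file { "/etc/httpd/conf.d/$name.conf":
    source => "puppet:///sites/$name.conf",
    notify => Service["httpd"],
  }

  # Daily Drupal cron run, at an hour derived from the site name.
  cron { "drupal-cron-$name":
    command => "wget -O - -q -t 1 http://$name/cron.php",
    hour => inline_template("<%= name.hash % 24 %>"),
    minute => "00",
  }
  
  # Create the database and grant access, unless the drupal user can
  # already connect to it. $mysql_password (the MySQL root password)
  # is not set here and must come from an enclosing scope.
  exec { "create-$name-db":
    unless => "/usr/bin/mysql -udrupal -p$drupal_password $db",
    command => "/usr/bin/mysql -uroot -p$mysql_password -e \"create database $db; \
      grant all on $db.* to drupal@localhost identified by '$drupal_password';\"",
    require => Service["mysqld"],
  }
}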

This will have Puppet do a MySQL ‘create database if not exists’ operation - if the database is already present, the ‘unless’ command will succeed and so the exec will not run. (In the code example above, I’ve used backslashes to break long lines in the Puppet manifest. You may wish to take these out and reformat.)

Note that the cron resource assigns each site’s cron job to an hour derived from a hash of the site name (name.hash % 24), so the time looks random but depends only on the name. This helps spread the load on your server, especially if it’s hosting many sites, rather than hitting them all at once.

Drupal vhost

There are three resources defined here for each site. First, an Apache vhost config file (you could use Nginx config snippets instead). This will obviously vary depending on your site, but here’s a typical config that I use:

<VirtualHost *:80>
    ServerName example.com
    ServerAdmin john@example.com
    DocumentRoot /var/www/drupal
    ErrorLog logs/example.com-error_log
    CustomLog logs/example.com-access_log common
	
    Redirect 302 /files http://example.com/sites/example.com/files
  
    <Directory /var/www/drupal>
        Allow from all
        Options +Includes +Indexes +FollowSymLinks
        AllowOverride all
    </Directory>
</VirtualHost>

<VirtualHost *:80>
    ServerName www.example.com
    Redirect 301 / http://example.com/
</VirtualHost>

This virtual host redirects requests for www.example.com to example.com.

It also points requests for example.com/files/foo to the appropriate files directory in your site. This way, you can construct shorter URLs for resources in your files directory.

The vhost config could also be a Puppet template, interpolating the site name into the virtual host in various places, but in the case of an existing site you will probably want to stick with the static config file that you have.
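
For a new site, here is a rough sketch of the generated approach. To keep it self-contained it uses plain Puppet string interpolation rather than a separate ERB template file, and the define name drupal::vhost is hypothetical. It reproduces the config above without the ServerAdmin line.

```puppet
# Hypothetical define: builds the vhost from the site name instead of
# shipping a static file per site.
define drupal::vhost() {
  file { "/etc/httpd/conf.d/${name}.conf":
    content => "<VirtualHost *:80>
    ServerName ${name}
    DocumentRoot /var/www/drupal
    ErrorLog logs/${name}-error_log
    CustomLog logs/${name}-access_log common

    Redirect 302 /files http://${name}/sites/${name}/files

    <Directory /var/www/drupal>
        Allow from all
        Options +Includes +Indexes +FollowSymLinks
        AllowOverride all
    </Directory>
</VirtualHost>

<VirtualHost *:80>
    ServerName www.${name}
    Redirect 301 / http://${name}/
</VirtualHost>
",
    notify  => Service["httpd"],
  }
}

drupal::vhost { "example.com": }
```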

Drupal cron update

Secondly, the manifest creates a cron job which will request the Drupal cron page for your site, once a day (adjust this if you want the site updated more frequently). The timing of the cron job is staggered so that not all your sites will update at once.
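
One caveat: the staggering depends on Ruby’s String#hash returning the same value on every Puppet run, and newer Ruby versions seed that hash per process, so the hour may not stay put. A possible alternative, sketched below, is Puppet’s built-in fqdn_rand function, which derives a stable number from the host name plus a seed. The define name is hypothetical, and seed support may depend on your Puppet version, so treat this as an assumption to check against your own setup.

```puppet
# Hypothetical define: the same cron job, with hour and minute both
# derived from the site name via fqdn_rand (stable across runs).
define drupal::staggered_cron() {
  cron { "drupal-cron-${name}":
    command => "/usr/bin/wget -O - -q -t 1 http://${name}/cron.php",
    hour    => fqdn_rand(24, $name),
    minute  => fqdn_rand(60, "${name}-minute"),
  }
}

drupal::staggered_cron { "example.com": }
```

Staggering the minute as well as the hour spreads things out further when you have more than a couple of dozen sites.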

Drupal database

Finally, Puppet creates a database for the Drupal site, complete with grants for the Drupal db user (note the use of unless to make sure this only runs if it’s needed).

Managing Drupal modules with Puppet

With a multisite Drupal setup, you can either keep modules in the per-site directory, or in sites/all/modules, in which case they’re accessible to all sites. I prefer the latter.

One of the many powerful features of Drush is installing modules with a single command:

define module() {
  exec { "install-module-$name":
    cwd => "/var/www/drupal/sites/all/modules",
    command => "/usr/local/bin/drush dl $name",
    creates => "/var/www/drupal/sites/all/modules/$name",
  }
}

Then you can write a global manifest for your modules:

class drupal::modules {
  drupal::module { [ "admin_menu",
                     "cck",
                     "comment_notify",
                     "contact_forms",
                     "filefield",
                     "google_analytics",
                     "imagecache",
                     "nodewords",
                     "views",
                     "views_attach",
                     "views_bulk_operations",
                     "weblinks" ]: }
}

Or you can include a module in the manifest for a particular site, to make it explicit which site requires it:

class sites::mygreatsite {
  include drupal
  drupal::site { "mygreatsite.com": 
    db => "drupal_mygreatsite",
  }
  drupal::module { "webform": }
}

Now when you find you need module foo, you can just add it into the list of modules and run Puppet, then refresh the admin/modules page for your site and enable it. You can do the same thing with themes (Drush also handles these).
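
A theme define might look like this. It is a sketch that simply mirrors the module define above: the name drupal::theme is hypothetical, it assumes /var/www/drupal/sites/all/themes already exists, and it relies on drush dl unpacking into the current directory in the same way the module recipe does.

```puppet
# Hypothetical define: downloads a theme with Drush, mirroring the
# module define. Assumes sites/all/themes already exists.
define drupal::theme() {
  exec { "install-theme-${name}":
    cwd     => "/var/www/drupal/sites/all/themes",
    command => "/usr/local/bin/drush dl ${name}",
    creates => "/var/www/drupal/sites/all/themes/${name}",
  }
}

drupal::theme { "zen": }
```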

You should also back up your sites, of course, and you can have Puppet run Drush to generate a SQL dump of the site database which can go into your backups directory. For more useful things you can do with Drush, including keeping your Drupal core up to date, run drush on the command line. Almost any maintenance operation on your Drupal site can be done with Puppet Drush commands.
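
As a rough sketch of the backup idea, a cron resource per site could run the dump nightly. The define name, the schedule, and the /var/backups/drupal directory are assumptions, and the Drush command name and options shown here (sql-dump, --result-file) differ between Drush versions, so check drush help on your system before relying on it.

```puppet
# Hypothetical define: nightly SQL dump of one site's database via Drush.
# Command name and options vary by Drush version; the backup directory
# is an assumption and must already exist.
define drupal::backup() {
  cron { "drupal-backup-${name}":
    command => "/usr/local/bin/drush -r /var/www/drupal -l http://${name} sql-dump --result-file=/var/backups/drupal/${name}.sql",
    hour    => "3",
    minute  => "30",
  }
}

drupal::backup { "example.com": }
```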

Now there’s no excuse

I hope you find this article useful in putting together your own Puppet Drupal recipes. The code shown here is by no means the last word in automating Drupal, and I hope if you extend it or come up with ingenious new ways to manage Drupal with Puppet, you’ll comment here and let me know. Happy Drupalling!

Cool, but...

Security-wise, this might impose some problems if you have any tech-savvy users logged onto your servers (or even just people able to upload some sort of PHP shell):

command => "/usr/bin/mysql -uroot -p$mysql_password -e \"create database $db; \
grant all on $db.* to drupal@localhost identified by '$drupal_password';\"",

What if someone monitors your process list? That’s where your MySQL root credentials will show up if you do it this way.

Re: Cool, but...

Fair point Chris, but this command only runs once and will be in the process list for a fraction of a second while it executes. If you’re worried that people might be logged on to your servers who shouldn’t be, you probably have bigger problems.

You could always wrap this command in a shell script which Puppet calls, but then the root credentials will be visible in the shell script. Unfortunately, I don’t think there’s a perfect solution to this problem.

Yes, unfortunately, there's

Yes, unfortunately, there’s not much room for other options. Fortunately, though, some MySQL client implementations “x” out the passwords in their argv’s - but not on all platforms. D’oh.
Had a short look at some other opinions on this topic and found some good reading, e.g. http://www.lenzg.net/archives/256-Basic-MySQL-Security-Providing-passwor… or http://www.lenzg.net/index.php?url=archives/257-More-on-MySQL-password-s….

Anyway: Thanks, John, for sharing your great article, will definitely have another look at it (in a TYPO3 context)!

MySQL security

If you are paranoid enough then you can use a defaults file for mysql together with --defaults-file to avoid passing passwords on the command line.

Bit overkill for the creation but perhaps not for the “unless” parameter.
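
Roughly, that could look like the sketch below. The class and define names are hypothetical, $mysql_password is assumed to be set elsewhere as in the article, and the grant statement is left out for brevity. Note that mysql expects --defaults-file to be the first option on the command line.

```puppet
# Hypothetical class: root credentials live in a file only root can read,
# so they never appear in the process list.
class drupal::mysql_defaults {
  file { "/root/.my.cnf":
    owner   => "root",
    group   => "root",
    mode    => "0600",
    content => "[client]\nuser=root\npassword=${mysql_password}\n",
  }
}

# Hypothetical define: creates the database named by the resource title,
# unless it can already be selected.
define drupal::database() {
  exec { "create-${name}-db":
    unless  => "/usr/bin/mysql --defaults-file=/root/.my.cnf -e 'use ${name}'",
    command => "/usr/bin/mysql --defaults-file=/root/.my.cnf -e 'create database ${name}'",
    require => [ File["/root/.my.cnf"], Service["mysqld"] ],
  }
}

include drupal::mysql_defaults
drupal::database { "drupal_example": }
```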

Not come across drush before, that looks very handy. Another well written article.

Re: MySQL security

Drush is brilliant - I’m so glad I discovered it, because I was all set to write it if it didn’t exist! (Someone else has, apparently, and he’s probably not the only one.)

The next step is using Aegir to manage all my sites in one place. I’m looking forward to that!

The right tool for the job?

I do frequent updates and releases of Drupal sites for my clients and, without a doubt, the sort of automation and instant integration that you’re describing here is really pivotal in freeing developers, release managers and devops from the drudgery and pitfalls of doing it all by hand. Staying agile, with either a large or small “A”, demands one-click integrations.

I usually use traditional build scripting technologies such as ANT, or more recently Fabric, to deploy installations. As you mention here, drush is inevitably part of the deployment pipeline, when one is releasing Drupal sites.

Your use of Puppet, while undoubtedly clever and workable, seems slightly out of place. When scripting with Puppet, aren’t we essentially in a systems/os domain rather than application space? When we install our Drupal sites on servers, aren’t we then in a “stack” domain, be it LAMP, WAMP, XAMPP and similar? Using Puppet as you’ve described feels a bit like changing the radio station in your car with a torque wrench or tyre iron.

Ultimately, it really depends on who is managing the Drupal sites. In a datacentre or managed hosting environment, the setup you described is spot-on for the devops and sysadmins who need to provision boxen. But for the developers and release managers out there I think ANT/Fabric/make/Maven are the way to go.

Thought-provoking post. Keep ‘em coming.

Re: The right tool for the job?

Branden,

Thanks very much for that - you’re asking absolutely the right question. Where’s the boundary between sysadmin and dev responsibilities for things that need to be on the server? Perhaps this is a topic for an article in itself.

I think different organisations draw the line in different places. Maybe there shouldn’t be a line!

Drush as a local command

I like to have drush as a local command, not somewhere in the DOCROOT of one of the drupal stacks. My puppet code would be like:

class drupal::drush {
  exec { "download-drush":
    cwd => "/root",
    command => "/usr/bin/wget http://ftp.drupal.org/files/projects/drush-All-Versions-2.1.tar.gz",
    creates => "/root/drush-All-Versions-2.1.tar.gz",
  }
  
  exec { "install-drush":
    cwd => "/usr/local",
    command => "/bin/tar xvzf /root/drush-All-Versions-2.1.tar.gz",
    creates => "/usr/local/drush",
    require => [ Exec["download-drush"], File["/usr/local"] ],
  }
  
  exec { "symlink-drush":
    command => "/bin/ln -s /usr/local/drush/drush /usr/local/bin/drush",
    creates => "/usr/local/bin/drush",
  }
}

FWIW - you don’t need to run

FWIW - you don’t need to run an exec for a sym link. In fact it’s a nice one liner when using the file resource:

file { "/etc/my_sym_location": ensure => "/etc/local/bin/my_real_location" }

That will do the same as:
ln -s /etc/local/bin/my_real_location /etc/my_sym_location

Quite right, and I’ve no idea

Quite right, and I’ve no idea why I forgot that :D

I’ve updated the article. Thanks.

Updating drupal itself?

Do you have any thoughts on how to update drupal?

drush dl drupal

will get the latest version to the $CWD, but would be nice to know what version it downloaded so that we can use that to install the modules and styles underneath (ls -tr drupal-* | tail -1 might be one method I suppose - anything you’re aware of that’s more elegant?)

Debian and other distributions as well as architecture.

The path of conf.d changes as does the service name.

# This is not 100% correct, but should get you started making flexible manifest files.
$httpd = $operatingsystem ? {
  solaris => unknown,
  redhat  => httpd,
  debian  => apache2,
  ubuntu  => apache2,
  default => none
}
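
To show how such a selector might be put to use, here is a sketch that picks both the service name and the config directory and then feeds them to the resources. The $httpd_confdir variable and the Debian/Ubuntu path are assumptions about how Apache is packaged on those systems, so adjust them for your distribution.

```puppet
# Service name per platform (anything unlisted falls back to httpd here).
$httpd = $operatingsystem ? {
  debian  => "apache2",
  ubuntu  => "apache2",
  default => "httpd",
}

# Hypothetical variable: where vhost snippets go on each platform.
# The Debian/Ubuntu path is an assumption.
$httpd_confdir = $operatingsystem ? {
  debian  => "/etc/apache2/conf.d",
  ubuntu  => "/etc/apache2/conf.d",
  default => "/etc/httpd/conf.d",
}

service { $httpd:
  ensure => running,
}

file { "${httpd_confdir}/example.com.conf":
  source => "puppet:///sites/example.com.conf",
  notify => Service[$httpd],
}
```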

For sure Drupal is the best

Nice article. For sure, Drupal needs some learning. But at the same time it is very easy to find the information about it. I’ve downloaded lots of books in Drupal at shared files SE. So, the only thing that is demanded is some time to read, understand and practise it.

Indeed.

Indeed.
