Deploying PHP Websites with Capistrano

Have you ever had hugely ambitious plans for a website project, only to let it fall by the wayside soon after deployment because pushing incremental improvements was a total pain in the behind? If so, I'll venture a guess that you were using FTP to update the website files. FTP is like the bag of sand Indiana Jones used to try and swap for the monkey statue at the beginning of "Raiders of the Lost Ark". The bag feels so good and familiar in your hands, and with just a little tweaking you're absolutely convinced it's going to do the job. But things didn't go as expected and now the giant but incredibly smooth boulder is rolling directly towards you.

FTP is an unacceptable solution because ongoing, incremental improvements to a website will only occur if the deployment barrier is sufficiently low. Because FTP is slow, stupid, and lacking an undo feature, I can guarantee your updates will be occasional, tentative, and painful. Deploying framework-based websites is even more tenuous because many solutions require additional tweaking after the files have been updated. For instance, a Zend Framework-driven website will run in development mode on your laptop but in production mode on your server. The mode is typically set in an .htaccess file via a variable named APPLICATION_ENV. If you're using FTP to blindly transfer the files over, then you'll need to SSH in following deployment and update that variable. Every single time.

Fortunately, an alternative deployment solution exists called Capistrano which resolves all of FTP's issues quite nicely. Not only can you use Capistrano to securely and efficiently deploy changes to your Zend Framework website, but it's also possible to roll back your changes should a problem arise. I'll show you how to configure your development environment to use Capistrano, freeing you from ever having to worry again about a botched website deployment.

Installing Capistrano

Capistrano is an open source automation tool originally written for and primarily used by the Rails community. However, it is perfectly suitable for use with other languages, PHP included. Because Capistrano is written in Ruby, you'll need to install Ruby on your development machine. If you're running OS X or most versions of Linux, then Ruby is likely already installed. If you're running Windows, the easiest way to install Ruby is via the Ruby Installer for Windows.

Once Ruby is installed, you'll use the RubyGems package manager to install Capistrano and another gem called Railsless Deploy, which hides many of the Rails-specific features otherwise bundled into Capistrano. Although Railsless Deploy is not strictly necessary, installing it will dramatically reduce the number of Capistrano tasks otherwise presented, all of which would be useless to you anyway because they are intended for use in conjunction with Rails projects.

RubyGems is bundled with Ruby, meaning if you've installed Ruby then RubyGems is also available. Open up a terminal window and execute the following command to install Capistrano:

$ gem install capistrano

Next, install Railsless Deploy using the following command:

$ gem install railsless-deploy

Once both gems are installed, you should be able to display a list of available Capistrano tasks:

$ cap -T
cap deploy               # Deploys your project.
cap deploy:check         # Test deployment dependencies.
cap deploy:cleanup       # Clean up old releases.
cap deploy:cold          # Deploys and starts a `cold' application.
cap deploy:pending       # Displays the commits since your last…
cap deploy:pending:diff  # Displays the `diff' since your last…
cap deploy:rollback      # Rolls back to a previous version and…
cap deploy:rollback:code # Rolls back to the previously deployed…
cap deploy:setup         # Prepares one or more servers for depl…
cap deploy:symlink       # Updates the symlink to the most recen…
cap deploy:update        # Copies your project and updates the s…
cap deploy:update_code   # Copies your project to the remote ser…
cap deploy:upload        # Copy files to the currently deployed…
cap invoke               # Invoke a single command on the remote…
cap shell                # Begin an interactive Capistrano sessi…

Configuring Public Key Authentication

The final general configuration step you'll need to take is configuring key-based authentication. Key-based authentication allows a client to securely connect to a remote server without requiring the client to provide a password, by instead relying on public-key authentication to verify the client's identity.

Public-key cryptography works by generating a pair of keys, one public and another private, and then transferring a copy of the public key to the remote server. When the client attempts to connect to the remote server, the server will challenge the client by asking the client to generate a unique signature using the private key. This signature can only be verified by the public key, meaning the server can use this technique to verify that the client is allowed to connect. As you might imagine, some fairly heady mathematics are involved in this process, and I'm not even going to attempt an explanation; the bottom line is that configuring public-key authentication is quite useful because it means you don't have to be bothered with supplying a password every time you want to SSH into a remote server.
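If you'd like to see the challenge-and-signature idea in action without any of the math, here is a minimal Ruby sketch using the standard openssl library. It only illustrates the principle and is not how SSH itself performs the handshake; the key size, the random challenge, and the variable names are all assumptions I've chosen for the example.

```ruby
require 'openssl'
require 'securerandom'

# Illustration only: the real SSH handshake is more involved than this.
# The client owns the key pair; the server holds only the public half.
client_key = OpenSSL::PKey::RSA.new(2048)
server_public_key = OpenSSL::PKey::RSA.new(client_key.public_key.to_pem)

# The server issues a random challenge (a hypothetical 32-byte value).
challenge = SecureRandom.random_bytes(32)

# The client signs the challenge with its private key...
signature = client_key.sign(OpenSSL::Digest.new('SHA256'), challenge)

# ...and the server checks that signature using the public key alone.
accepted = server_public_key.verify(OpenSSL::Digest.new('SHA256'), signature, challenge)

# The same signature must not verify against different data.
rejected = server_public_key.verify(OpenSSL::Digest.new('SHA256'), signature, 'tampered')

puts "genuine challenge accepted:  #{accepted}"
puts "tampered challenge accepted: #{rejected}"
```

Notice that the server-side object never contains the private key, yet it can still tell a genuine signature from a bogus one.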

Configuring public-key authentication is also important when setting up Capistrano to automate the deployment process, because otherwise you'll have to configure Capistrano to provide a password every time you want to deploy the latest changes to your website.

If you're running a Linux/Unix-based system, creating a key pair is a pretty simple process. Although I won't be covering the configuration process for Windows or OS X-based systems, I nonetheless suggest carefully reading this section as it likely won't stray too far from the steps you'll need to follow. Start by executing the following command to generate your public and private keys:

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/wjgilmore/.ssh/id_rsa):

Unless you have good reasons for overriding the default key name and location, go ahead and accept the default. Next you'll be greeted with the following prompt:

Enter passphrase (empty for no passphrase):

Some tutorials promote entering an empty passphrase (password); however, I discourage this because should your private key ever be stolen, the thief could use the private key to connect to any server possessing your public key. Instead, you can have your cake and eat it too by defining a passphrase and then using a service called ssh-agent to cache your passphrase, meaning you won't have to provide it each time you log in to the remote server. Therefore choose a passphrase which is difficult to guess but one you won't forget.

Once you've defined and confirmed a passphrase, your public and private keys will be created. You'll next want to securely copy your public key to the remote server. This is probably easiest done using the scp utility:

$ scp ~/.ssh/id_rsa.pub username@remote:publickey.txt

You'll need to replace username and remote with the remote server's username and address, respectively. Next SSH into the server and add the key to the authorized_keys file:

$ ssh username@remote
...
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ cat ~/publickey.txt >> ~/.ssh/authorized_keys
$ rm ~/publickey.txt
$ chmod 600 ~/.ssh/*

You should now be able to log in to the remote server; however, rather than provide your account password you'll provide the passphrase defined when you created the key pair:

$ ssh username@remote
Enter passphrase for key '/home/wjgilmore/.ssh/id_rsa':

Of course, entering a passphrase each time you log in defeats the purpose of using public-key authentication to forgo entering a password, doesn't it? Thankfully, you can securely store this passphrase using ssh-agent, which will automatically supply it when the client connects to the server. Cache your passphrase by executing the following commands:

$ ssh-agent bash
$ ssh-add
Enter passphrase for /home/wjgilmore/.ssh/id_rsa:
Identity added: /home/wjgilmore/.ssh/id_rsa (/home/wjgilmore/.ssh/id_rsa)

Try logging in to your remote server again and this time you'll be whisked right to the remote terminal, with no need to enter your passphrase! However, in order to forgo having to manually start ssh-agent every time your client boots, you'll want to configure it so that it starts up automatically. If you happen to be running Ubuntu, then ssh-agent is already configured to automatically start. This may not be the case on other operating systems; however, in my experience configuring ssh-agent to automatically start is a very easy process. A quick search should turn up all of the information you require.

Creating the Deployment File

With these general configuration steps out of the way, it's time to ready your website for deployment. You'll only need to carry out these steps once per project, all of which are thankfully quite straightforward.

The first step involves creating a file called Capfile (no extension) which resides in your project's home directory. The Capfile is essentially Capistrano's bootstrap, responsible for loading needed resources and defining any custom deployment-related tasks. This file will also retrieve any project-specific settings, such as the location of the project repository and the name of the remote server which hosts the production website. I'll explain how to define these project-specific settings in a moment.

Capistrano will by default look for the Capfile in the directory where the previously discussed cap command is executed, and if not found will begin searching up the directory tree for the file. This is because if you are using Capistrano to deploy multiple websites, then it will make sense to define a single Capfile in your projects' root directory. Just to keep things simple, I suggest placing this file in your project home directory for now. Also, because we're using the Railsless Deploy gem to streamline Capistrano, our Capfile looks a tad different than those you'll find for the typical Rails project:

require 'rubygems'
require 'railsless-deploy'
load    'config/deploy.rb'
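The upward search for a Capfile described earlier is easy to picture in plain Ruby. The find_capfile helper below is a hypothetical name of my own, written purely for illustration; it is not Capistrano's own lookup code, just a sketch of the same idea.

```ruby
require 'pathname'

# Walk from start_dir up to the filesystem root and return the path of
# the first Capfile found, or nil when none exists. Hypothetical helper
# for illustration only; this is not Capistrano's implementation.
def find_capfile(start_dir)
  Pathname.new(start_dir).expand_path.ascend do |dir|
    candidate = dir + 'Capfile'
    return candidate.to_s if candidate.file?
  end
  nil
end

# Prints the nearest Capfile path, or nil if there isn't one.
puts find_capfile(Dir.pwd).inspect
```

Because the nearest file wins, a Capfile in a project's home directory would take precedence over one shared by several projects higher up the tree.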

Notice the third line of the Capfile refers to a file called deploy.rb which resides in a directory named config. This file contains the aforementioned project-specific settings, including which version control solution (if any) is used to manage the project, the remote server domain, and the remote server directory to which the project will be deployed, among others. The deploy.rb file I use to deploy my projects is presented next. I've commented the code; however, if you're looking for a thorough explanation consider picking up a copy of my book, Easy PHP Websites with the Zend Framework.

# What is the name of the local application?
set :application, "gamenomad.wjgilmore.com"

# What user is connecting to the remote server?
set :user, "wjgilmore"

# Where is the local repository?
set :repository,  "file:///var/www/dev.wjgames.com"

# What is the production server domain?
role :web, "gamenomad.wjgilmore.com"

# What remote directory hosts the production website?
set :deploy_to,   "/home/wjgilmorecom/gamenomad.wjgilmore.com"

# Is sudo required to manipulate files on the remote server?
set :use_sudo, false

# What version control solution does the project use?
set :scm,        :git
set :branch,     'master'

# How are the project files being transferred?
set :deploy_via, :copy

# Maintain a local repository cache. Speeds up the copy process.
set :copy_cache, true

# Ignore any local files?
set :copy_exclude, %w(.git)

# This task symlinks the proper .htaccess file to ensure the
# production server's APPLICATION_ENV var is set to production
task :create_symlinks, :roles => :web do
  run "rm #{current_release}/public/.htaccess"
  # The command is split across two adjacent string literals so that
  # it reaches the remote shell as a single line.
  run "ln -s #{current_release}/production/.htaccess " \
      "#{current_release}/public/.htaccess"
end

# After the release has been finalized,
# create the .htaccess symlink
after "deploy:finalize_update", :create_symlinks

Readying Your Remote Server

As I mentioned at the beginning of this post, one of the great aspects of Capistrano is the ability to roll back your deployment to the previous version should something go wrong. This is possible because (when using the copy strategy) Capistrano will store multiple versions of your website on the remote server, and link to the latest version via a symbolic link named current which resides in the directory defined by the :deploy_to setting found in your deploy.rb file. These versions are stored in a directory called releases, also located in the :deploy_to directory. Each version is stored in a directory whose name reflects the date and time at which the release was deployed. For instance, a deployment which occurred on February 24, 2011 at 13:37:27 Eastern will be stored in a directory named 20110224183727 (these timestamps are stored using Greenwich Mean Time, which is five hours ahead of Eastern Standard Time).
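The naming scheme is simple enough to reproduce in a few lines of Ruby. In the sketch below, the release_name and release_time helpers are hypothetical names of my own, and the "%Y%m%d%H%M%S" layout is inferred from the example directory name rather than taken from Capistrano's source.

```ruby
# Build a release directory name from a time, always expressed in UTC.
# The "%Y%m%d%H%M%S" layout is inferred from the example directory name.
def release_name(time)
  time.getutc.strftime('%Y%m%d%H%M%S')
end

# Recover the UTC deployment time from a release directory name.
def release_time(name)
  parts = name.match(/\A(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})\z/)
  raise ArgumentError, "not a release name: #{name}" unless parts

  Time.utc(*parts.captures.map(&:to_i))
end

puts release_name(Time.utc(2011, 2, 24, 18, 37, 27))  # 20110224183727
puts release_time('20110224183727')                   # 2011-02-24 18:37:27 UTC
```

A handy side effect of this format is that sorting the directory names alphabetically also sorts them chronologically, so the newest release is always the last one in the list.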

Additionally, Capistrano will create a directory called shared which also resides in the :deploy_to directory. This directory is useful for storing custom user avatars, cache data, and anything else you don't want overwritten when a new version of the website is deployed. You can then hook a custom task after Capistrano's deploy:finalize_update task to create symbolic links into the shared directory, just as was done with the .htaccess file.
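To give you an idea of what such a task would need to do, here is a small plain-Ruby sketch which builds the shell commands for linking a few directories into shared. The shared_symlink_commands helper, the example paths, and the directory names are all hypothetical; adapt them to whatever your project actually keeps between releases.

```ruby
# Build the shell commands which replace each listed directory inside a
# fresh release with a symlink into the shared directory. The helper
# name, paths, and directory names are hypothetical examples.
def shared_symlink_commands(shared_path, release_path, dirs)
  dirs.flat_map do |dir|
    [
      "rm -rf #{release_path}/#{dir}",
      "ln -nfs #{shared_path}/#{dir} #{release_path}/#{dir}"
    ]
  end
end

puts shared_symlink_commands(
  '/var/www/example.com/shared',
  '/var/www/example.com/releases/20110224184826',
  %w[public/avatars data/cache]
)
```

Inside deploy.rb, a task modeled on create_symlinks could pass each of these strings to run, presumably using Capistrano's own path settings in place of the hard-coded example paths shown here.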

Therefore, given that my :deploy_to directory is set to /home/wjgilmore/gamenomad.wjgilmore.com, the directory contents will look similar to this:

current -> /home/wjgilmore/gamenomad.wjgilmore.com/releases/20110224184826
releases
  20110224181647/
  20110224183727/
  20110224184826/
shared

Capistrano can create the releases and shared directories for you, something you'll want to do when you're ready to deploy your website for the first time. Create these directories using the deploy:setup command, as demonstrated here:

$ cap deploy:setup

Deploying Your Project

Now comes the fun part. To deploy your project, execute the following command:

$ cap deploy

If you've followed the instructions I've provided so far verbatim, remember that Capistrano will be deploying your latest committed changes. Whether you've saved the files is irrelevant, as Capistrano only cares about committed files.

Presuming everything is properly configured, the changes should be immediately available via your production server. If something went wrong, Capistrano will complain in the fairly verbose status messages which appear when you execute the deploy command. Notably you'll probably see something about rolling back the changes made during the current deployment attempt, which Capistrano will automatically do should it detect that something has gone wrong.

Rolling Back Your Project

One of Capistrano's greatest features is its ability to revert, or roll back, a deployment to the previous version should you notice something just isn't working as you expected. This is possible because, as I mentioned earlier, Capistrano stores multiple versions of the website on the production server, meaning returning to an earlier version is as simple as removing the symbolic link to the most recently deployed version and then creating a new symbolic link which points to the previous version.

To roll back your website to the previously deployed version, just use the deploy:rollback command:

$ cap deploy:rollback
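If you're curious what the symbolic link swap amounts to, the following Ruby sketch performs it against a throwaway directory tree on your own machine. It illustrates the idea only and is not Capistrano's implementation; the rollback_symlink helper is a hypothetical name, and I'm assuming the previous release is simply the second-newest directory under releases.

```ruby
require 'fileutils'
require 'tmpdir'

# Point the "current" symlink at the release before the newest one.
# Illustration of the idea only, not Capistrano's implementation.
def rollback_symlink(deploy_to)
  releases = Dir.children(File.join(deploy_to, 'releases')).sort
  raise 'no earlier release to roll back to' if releases.size < 2

  previous = File.join(deploy_to, 'releases', releases[-2])
  current  = File.join(deploy_to, 'current')
  File.unlink(current) if File.symlink?(current)  # remove the old link
  File.symlink(previous, current)                 # link to the previous release
  previous
end

# Try it against a temporary directory which mimics :deploy_to.
Dir.mktmpdir do |deploy_to|
  %w[20110224181647 20110224183727 20110224184826].each do |name|
    FileUtils.mkdir_p(File.join(deploy_to, 'releases', name))
  end
  File.symlink(File.join(deploy_to, 'releases', '20110224184826'),
               File.join(deploy_to, 'current'))

  rollback_symlink(deploy_to)
  puts File.readlink(File.join(deploy_to, 'current'))
end
```

The printed path ends in releases/20110224183727, the release deployed just before the newest one. None of the website files are copied or modified, which is why a rollback is nearly instantaneous.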

Conclusion

If this is your first encounter with Capistrano, I'd imagine reading this blog post is akin to finding a pot of gold at the end of the rainbow (or perhaps a monkey statue). Putting all of the pieces in place can be a bit confusing at first; however, once they are, I guarantee you'll never let a project fall by the wayside again (at least because of deployment hindrances). If you have any questions, I'm happy to help! E-mail me at wj AT wjgilmore.com.