Categories
Cloud Developer Tips

Creating Consistent Snapshots of a Live Instance with XFS on a Boot-from-EBS AMI

Eric Hammond has taught us how to create consistent snapshots of EBS volumes. Amazon has allowed us to use EBS snapshots as AMIs, providing a persistent root filesystem. Wouldn’t it be great if you could use both of these techniques together, to take a consistent snapshot of the root filesystem without stopping the instance? Read on for my instructions on how to create an XFS-formatted boot-from-EBS AMI, which allows consistent live snapshots of the root filesystem to be created.

The technique presented below owes its success to the Canonical Ubuntu team, who created a kernel image that already contains XFS support. That’s why these instructions use the official Canonical Ubuntu 9.10 Karmic Koala AMI – because it has XFS support built in. There may be other AKIs out there with XFS support built in – if so, the technique should work with them, too.

How to Do It

The general steps are as follows:

  1. Run an instance and set it up the way you like.
  2. Create an XFS-formatted EBS volume.
  3. Copy the contents of the instance’s root filesystem to the EBS volume.
  4. Unmount the EBS volume, snapshot it, and register it as an AMI.
  5. Launch an instance of the new AMI.

More details on each of these steps follow.

1. Run an instance and set it up the way you like.

As mentioned above, I use the official Canonical Ubuntu 9.10 Karmic Koala AMI (currently ami-1515f67c for 32-bit architecture – see the table on Alestic.com for the most current Ubuntu AMI IDs).

ami=ami-1515f67c
security_groups=default
keypair=my-keypair
instance_type=m1.small
ec2-run-instances $ami -t $instance_type -g $security_groups -k $keypair

Wait until the ec2-describe-instances command shows the instance is running and then ssh into it:

ssh -i my-keypair ubuntu@ec2-1-2-3-4.amazonaws.com
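
If you would rather script the waiting than re-run ec2-describe-instances by hand, here is a minimal polling sketch. The helper name wait_for_pattern is hypothetical (it is not part of the EC2 API tools), and matching on the word "running" anywhere in the output is a simplifying assumption.

```shell
# wait_for_pattern PATTERN INTERVAL TRIES COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until its output contains PATTERN.
# Returns 0 on a match, 1 if TRIES attempts pass without one.
wait_for_pattern() {
    pattern=$1
    interval=$2
    tries=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        if "$@" 2>/dev/null | grep -q -- "$pattern"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Example (assumes $instance holds the ID printed by ec2-run-instances):
#   wait_for_pattern running 10 60 ec2-describe-instances "$instance"
```

The same helper can wait for a volume to become available or a snapshot to complete by changing the pattern and the command.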

Now that you’re in, set up the instance’s root filesystem the way you want. Don’t forget that you probably want to run

sudo apt-get update

to allow you to pull in the latest packages.

In our case we’ll want to install ec2-consistent-snapshot, as per Eric Hammond’s article:

codename=$(lsb_release -cs)
echo "deb http://ppa.launchpad.net/alestic/ppa/ubuntu $codename main" | sudo tee /etc/apt/sources.list.d/alestic-ppa.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys BE09C571
sudo apt-get update
sudo apt-get install -y ec2-consistent-snapshot
sudo PERL_MM_USE_DEFAULT=1 cpan Net::Amazon::EC2

2. Create an XFS-formatted EBS volume.

First, install the XFS tools:

sudo apt-get install -y xfsprogs

These utilities allow you to format filesystems using XFS and to freeze and unfreeze the XFS filesystem. They are not necessary in order to read from XFS filesystems, but we want these programs installed on the AMI we create because they are used in the process of creating a consistent snapshot.

Next, create an EBS volume in the availability zone your instance is running in. I use a 10GB volume, but you can use any size and grow it later using this technique. This command is run on your local machine:

ec2-create-volume --size 10 -z $zone

Wait until the ec2-describe-volumes command shows the volume is available and then attach it to the instance:

ec2-attach-volume $volume --instance $instance --device /dev/sdh
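
The commands above assume that $zone, $volume, and $instance are already set. One way to fill them in is to capture the ID from the output of the command that creates the resource. The sketch below assumes the EC2 API tools print a tag such as VOLUME or INSTANCE in the first column and the ID in the second; the helper name id_from_output is hypothetical.

```shell
# id_from_output TAG
# Hypothetical helper: reads EC2 API tools output on stdin and prints the
# second field of the first line whose first field equals TAG
# (for example VOLUME, INSTANCE, or SNAPSHOT).
id_from_output() {
    awk -v tag="$1" '$1 == tag { print $2; exit }'
}

# Example (run on your local machine, with $zone and $instance already set):
#   volume=$(ec2-create-volume --size 10 -z "$zone" | id_from_output VOLUME)
#   ec2-attach-volume "$volume" --instance "$instance" --device /dev/sdh
```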

Back on the instance, format the volume with XFS:

sudo mkfs.xfs /dev/sdh
sudo mkdir -m 000 /vol
sudo mount -t xfs /dev/sdh /vol

Now you should have an XFS-formatted EBS volume, ready for you to copy the contents of the instance’s root filesystem.

3. Copy the contents of the instance’s root filesystem to the EBS volume.

Here’s the command to copy over the entire root filesystem, preserving soft-links, onto the mounted EBS volume – but ignoring the volume itself:

sudo rsync -avx --exclude /vol / /vol

My command reports that it copied about 444 MB to the EBS volume.

4. Unmount the EBS volume, snapshot it, and register it as an AMI.

You’re ready to create the AMI. On the instance do this:

sudo umount /vol

Now, back on your local machine, create the snapshot:

ec2-create-snapshot $volume

Once ec2-describe-snapshots shows the snapshot is 100% complete, you can register it as an AMI. The AKI and ARI values used here should match the AKI and ARI that the instance is running – in this case, they are the default Canonical AKI and ARI for this AMI. Note that I give a descriptive “name” and “description” for the new AMI – this will make your life easier as the number of AMIs you create grows. Another note: some AMIs (such as the Ubuntu 10.04 Lucid AMIs) do not have a ramdisk, so skip the --ramdisk $ramdisk arguments if you’ve used such an AMI.

kernel=aki-5f15f636
ramdisk=ari-0915f660
description="Ubuntu 9.10 Karmic formatted with XFS"
ami_name=ubuntu-9.10-32-bit-ami-1515f67c-xfs
ec2-register --snapshot $snapshot --kernel $kernel --ramdisk $ramdisk "--description=$description" --name=$ami_name --architecture i386 --root-device-name /dev/sda1 --block-device-mapping /dev/sda2=ephemeral0

This displays the newly registered AMI ID – let’s say it’s ami-00000000.
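
Since the ramdisk argument is optional, it can help to assemble the command conditionally. The following is only a sketch: preview_register_cmd is a hypothetical helper that prints the ec2-register command line for review instead of running it, using the same options as the command above.

```shell
# preview_register_cmd SNAPSHOT KERNEL RAMDISK AMI_NAME DESCRIPTION
# Hypothetical helper: prints (does not run) the ec2-register command.
# Pass "" as RAMDISK for AMIs that have no ramdisk.
preview_register_cmd() {
    snapshot=$1
    kernel=$2
    ramdisk=$3
    ami_name=$4
    description=$5
    set -- ec2-register --snapshot "$snapshot" --kernel "$kernel"
    # Only add --ramdisk when a ramdisk ID was supplied
    if [ -n "$ramdisk" ]; then
        set -- "$@" --ramdisk "$ramdisk"
    fi
    set -- "$@" "--description=$description" "--name=$ami_name" \
        --architecture i386 --root-device-name /dev/sda1 \
        --block-device-mapping /dev/sda2=ephemeral0
    # The printed line is a preview only: the description is shown unquoted
    printf '%s\n' "$*"
}

# Example:
#   preview_register_cmd "$snapshot" aki-5f15f636 "" "$ami_name" "$description"
```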

5. Launch an instance of the new AMI.

Here comes the moment of truth. Launch an instance of the newly registered AMI:

ami=ami-00000000
security_groups=default
keypair=my-keypair
instance_type=m1.small
ec2-run-instances $ami -t $instance_type -g $security_groups -k $keypair

Again, wait until ec2-describe-instances shows it is running and ssh into it:

ssh -i my-keypair ubuntu@ec2-5-6-7-8.amazonaws.com

Now, on the instance, you should be able to see that the root filesystem is XFS with the mount command. The output should contain:

/dev/sda1 on / type xfs (rw)
...

We did it! Let’s create a consistent snapshot of the root filesystem. Look back into the output of ec2-describe-instances to determine the volume ID of the root volume for the instance.

sudo ec2-consistent-snapshot --aws-access-key-id $aws_access_key_id --aws-secret-access-key $aws_secret_access_key --xfs-filesystem / $volumeID

The command should display the snapshot ID of the snapshot that was created.
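
To avoid reading the root volume ID off the screen, you can filter the ec2-describe-instances output. This sketch assumes the output contains a line of the form BLOCKDEVICE, device name, volume ID for each attached EBS volume; check the output of your version of the tools before relying on it. The helper name root_volume_id is hypothetical.

```shell
# root_volume_id [DEVICE]
# Hypothetical helper: reads ec2-describe-instances output on stdin and
# prints the volume ID attached at DEVICE (default /dev/sda1).
# Assumes lines shaped like: BLOCKDEVICE <device> <volume-id> ...
root_volume_id() {
    awk -v dev="${1:-/dev/sda1}" \
        '$1 == "BLOCKDEVICE" && $2 == dev { print $3; exit }'
}

# Example (run on your local machine):
#   volumeID=$(ec2-describe-instances "$instance" | root_volume_id)
```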

Using ec2-consistent-snapshot and an XFS-formatted EBS AMI, you can create snapshots of the running instance without stopping it. Please comment below if you find this helpful, or with any other feedback.


Mount an EBS Volume Created from Snapshot at Startup

There are many posts about how to mount an EBS volume to your EC2 instance during the startup process. But requiring an instance to use a specific EBS volume has limitations that make the technique unsuitable for large-scale use. In this article I present a more flexible technique that uses an EBS snapshot instead.

Update December 2009: Amazon released native support for automatically mounting EBS volumes created from a snapshot, as part of supporting the boot-from-EBS feature. The techniques in this article are no longer necessary. But they’re cool anyway.


Limitations of Mounting an EBS Volume on Instance Startup

Mounting an EBS volume at startup is relatively straightforward (see the above-referenced posts for details). The main features of the procedure are:

  • The instance uses the EC2 API tools to attach the specified EBS volume. These tools, in turn, require Java and your EC2 credentials – your certificate and private key.
  • Ideally, the AMI contains a hook to allow the EBS volume ID to be specified dynamically at startup time, either as a parameter in the user-data or retrieved from S3 or SimpleDB.
  • The AMI should already contain (or its startup scripts should create) the appropriate references to the necessary locations on the mounted EBS volume. For example, if the volume is mounted at /vol, then /var/lib/mysql (the default MySQL data directory) might be soft-linked (ln -s) or mount --binded to /vol/var/lib/mysql. Alternatively, the applications can be configured to use the locations on the mounted volume directly.

There are many benefits to mounting an EBS volume at instance startup:

  • Avoid the need to burn a new AMI when the content on the instance’s disks changes.
  • Gain the redundancy provided by an EBS volume.
  • Gain the point-in-time backups provided by EBS snapshots.
  • Avoid the need to store the updated content into S3 before instance shutdown.
  • Avoid the need to copy and reconstitute the content from S3 to the instance during startup.
  • Avoid paying for an instance to be always-on.

But mounting an EBS volume at startup also has important limitations:

  • Instances must be launched in the same availability zone as the EBS volume. EBS volumes are availability-zone specific, and are only usable by instances running in the same availability zone. Large-scale deployments use instances in multiple availability zones to mitigate risk, so limiting the deployment to a single availability zone is not reasonable.
  • There is no way to create multiple instances that each have the same EBS volume attached. When you need multiple instances that each have the same data, one EBS volume will not do the trick.
  • As a corollary to the previous point, it is difficult to create Auto Scaling groups of AMIs that mount an EBS volume automatically because each instance needs its own EBS volume.
  • It is difficult to automate the startup of a replacement instance when the original instance still has the EBS volume attached. Large-scale deployments need to be able to handle failure automatically because instances will fail. Sometimes instances will fail in a not-nice way, leaving the EBS volume attached. Detaching the EBS volume may require manual intervention, which is something that should be avoided if at all possible for large-scale deployments.

These limitations make the technique of mounting an EBS volume at startup unsuitable for large-scale deployments.

The Alternative: Mount an EBS Volume Created from a Snapshot at Startup

Instead of specifying an EBS volume to mount at startup, we can specify an EBS snapshot. At startup the instance creates a new EBS volume from the given snapshot and attaches the new volume to itself. The basic startup flow looks like this:

  1. If there is a volume already attached at the target location, do nothing – this is a reboot. Otherwise, continue to step 2.
  2. Create a new EBS volume from the specified snapshot. This requires the following:
    • Java
    • The EC2 API tools
    • The EC2 account’s certificate and private key
    • The EBS snapshot ID
  3. Attach the newly-created EBS volume and mount it to the mount point.
  4. Restore any filesystem pointers, if necessary, to point to the proper locations beneath the EBS volume’s mount point.

Like the technique of mounting an EBS volume, this technique should ideally support specifying the snapshot ID dynamically at instance startup time, perhaps via the user-data or retrieved from S3 or SimpleDB.
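
The startup flow above can be sketched as a small shell function. This is an illustration, not the script used later in this article: plan_mount_from_snapshot is a hypothetical name, it only prints the commands it would run, and it assumes the device node exists only while a volume is attached.

```shell
# plan_mount_from_snapshot SNAPSHOT ZONE INSTANCE DEVICE MOUNT_DIR [EXPORT_DIR...]
# Hypothetical helper: prints the commands the startup flow would run.
# It executes none of them.
plan_mount_from_snapshot() {
    snapshot=$1 zone=$2 instance=$3 device=$4 mount_dir=$5
    shift 5
    # Step 1 (assumption: the device node exists only when a volume is attached)
    if [ -e "$device" ]; then
        echo "# $device is present: treat this as a reboot, nothing to do"
        return 0
    fi
    # Step 2: create the volume in the zone this instance runs in
    echo "ec2-create-volume --snapshot $snapshot -z $zone"
    # Step 3: attach and mount; NEW_VOLUME_ID stands for the ID printed by step 2
    echo "ec2-attach-volume NEW_VOLUME_ID -i $instance -d $device"
    echo "mkdir -p $mount_dir"
    echo "mount $device $mount_dir"
    # Step 4: map each exported directory back onto the root filesystem
    for dir in "$@"; do
        echo "mount --bind $mount_dir$dir $dir"
    done
}

# Example:
#   plan_mount_from_snapshot snap-00000000 us-east-1c i-22222222 /dev/sdh /vol /var/lib/mysql
```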

Why Mount an EBS Volume Created from a Snapshot at Startup?

As outlined above, the procedure is simple and it offers the following benefits:

  • Instances need not be launched in the same availability zone as the EBS volume. However, instances are limited to using EBS snapshots that are in the same region (US or EU).
  • Instances no longer need to rely on a specific EBS volume being available.
  • Multiple instances can be launched easily, each of which will automatically configure itself with its own EBS volume made from the snapshot.
  • Costs can be reduced by allowing “duplicate” EBS volumes to be provisioned only when they are needed by an instance. “Duplicate” EBS volumes are created on demand, and can also (optionally) be deleted during instance termination. Previously, you needed to keep around as many EBS volumes as the maximum number of simultaneous instances you would use.
  • Large-scale deployments requiring content on an EBS volume are easy to build.

Here are some cool things that are made possible by this technique:

  • MySQL replication slave (or cluster member) launching can be made more efficient. By specifying a recent snapshot of the master database’s EBS volume, the new MySQL slave instance will begin its life already containing most of the data. This slave will demand fewer resources from the master instance and will take less time to catch-up to the master. If you do plan to use this technique for launching MySQL slaves, see Eric Hammond’s article on EBS snapshots of a MySQL slave database in EC2 for some sage words of advice.
  • Auto Scaling can launch instances that mount content stored in EBS at startup. If the auto-scaled instances all need to access common content that is stored in EBS, this technique allows you to duplicate that content onto each auto-scaled instance automatically. And, if the instance gets the snapshot ID from its user-data at startup, you can easily change the snapshot ID for auto-scaled instances by updating the launch configuration.

I am currently exploring how to combine this technique with the one discussed in my article about how to boot the entire instance from an EBS volume. Combining these approaches could provide the ability to “boot from a snapshot”, allowing you to relate to bootable snapshots the same way you think about AMIs. Stay tuned to this blog for an article on this approach.

Caveats

Sounds great, huh? Despite these benefits, this technique can introduce a new problem: too many EBS volumes. As you may know, AWS limits the number of EBS volumes you can create to 20 (initially, and you can request a higher limit). This technique creates a new EBS volume each time an instance starts up, so your account will accumulate many EBS volumes. Plus, each volume will be almost indistinguishable from the others, making them difficult to track.

One potential way to distinguish the EBS volumes would be the ability to tag them via the API: Each instance would tag the volume upon creation, and these tags would be visible in the management interface to provide information about the origin of the volume. Unfortunately the EC2 API does not offer a way to tag EBS volumes. Until that feature is supported, use the ElasticFox Firefox extension to tag EBS volumes manually. I find it helpful to tag volumes with the creating instance’s ID and the instance’s “tag” security groups (see my article on using security groups to tag instances). ElasticFox displays the snapshot ID from which the volume was created and its creation timestamp, which are also useful to know.

As already hinted at, you will still need to think about what to do when the newly-created EBS volumes are no longer in use by the instance that created them. If you know you won’t need them, have a script to detach and delete the volume during instance shutdown (but not shutdown-before-reboot). Be aware that if an instance fails to terminate nicely the attached EBS volume may still exist and you will be charged for it.

In any case, make sure you keep track of your EBS volumes because the cost of keeping them around can add up quickly.
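
One rough way to keep track is to list the volumes that were created from a given snapshot and are no longer attached to anything. The sketch below filters ec2-describe-volumes output and assumes tab-separated VOLUME lines with the columns in this order: tag, volume ID, size, snapshot ID, zone, status, timestamp. Verify that layout against your version of the tools. The helper name unattached_volumes_from is hypothetical.

```shell
# unattached_volumes_from SNAPSHOT_ID
# Hypothetical helper: reads ec2-describe-volumes output on stdin and prints
# the IDs of volumes created from SNAPSHOT_ID whose status is "available".
# Assumed columns (tab-separated): VOLUME id size snapshot zone status time
unattached_volumes_from() {
    awk -F '\t' -v snap="$1" \
        '$1 == "VOLUME" && $4 == snap && $6 == "available" { print $2 }'
}

# Example (run on your local machine):
#   ec2-describe-volumes | unattached_volumes_from snap-00000000
```

Review the resulting list before deleting anything, since a volume in the "available" state may still hold data you want.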

How to Mount an EBS Volume Created from a Snapshot on Startup

Now for the detailed instructions. Please note that the instructions below have been tested on Ubuntu 8.04, and should work on Debian or Ubuntu systems. If you are running a Red Hat-based system such as CentOS then some of the procedure may need to be adjusted accordingly.

There are four parts of getting set up:

  • Setting up the Original Instance with an EBS volume
  • Creating the EBS Snapshot
  • Preparing the AMI
  • Launching a New Instance

In the last step the new instance will create a new volume from the specified snapshot and mount it during startup.

Setting Up the Original Instance with an EBS volume

[Note: this section is based on the fine article about running MySQL with EBS by Eric Hammond.]

Start out with an EC2 instance booted from an AMI that you like. I recommend one of the Alestic Ubuntu Hardy 8.04 Server AMIs. The instance will be assigned an instance ID (in this example i-11111111) and a public IP address (in this example 1.1.1.1).

ec2-run-instances -z us-east-1a --key MyKeypair ami-0772946e

ec2-describe-instances i-11111111

Once the ec2-describe-instances output shows that the instance is running, continue by creating an EBS volume. This command creates a 1 GB volume in the us-east-1a availability zone, which is the same zone in which the instance was launched. The volume will be assigned a volume ID (in this example vol-00000000).

ec2-create-volume -z us-east-1a -s 1
ec2-describe-volumes vol-00000000

Once the ec2-describe-volumes output shows that the volume is available, attach it to the instance:

ec2-attach-volume -d /dev/sdh -i i-11111111 vol-00000000

Next we can log into the instance and set it up. The following will install MySQL and the XFS filesystem drivers, and then format and mount the EBS volume. When prompted, specify a MySQL root password. If you are running a Canonical Ubuntu AMI you need to change the ssh username from root to ubuntu in these commands.

ssh -i id_rsa-MyKeypair root@1.1.1.1

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y xfsprogs mysql-server
sudo modprobe xfs
sudo mkfs.xfs /dev/sdh
sudo mkdir /vol
sudo mount -t xfs -o noatime /dev/sdh /vol

The EBS volume is now attached and formatted and MySQL is installed, so now we configure MySQL to use the EBS volume for its data, configuration, and logs:

sudo /etc/init.d/mysql stop
# Due to a minor MySQL bug this may be necessary - does not hurt
sudo killall mysqld_safe
export EBS_MOUNT_DIR=/vol
export EBS_EXPORTS="/etc/mysql /var/lib/mysql /var/log/mysql"
for i in $EBS_EXPORTS
do
EBS_MOUNTED_EXPORT_DIR="$EBS_MOUNT_DIR""$i"
sudo mkdir -p `dirname "$EBS_MOUNTED_EXPORT_DIR"`
sudo mv $i `dirname "$EBS_MOUNTED_EXPORT_DIR"`
sudo mkdir $i
sudo mount --bind "$EBS_MOUNTED_EXPORT_DIR" "$i"
done
sudo /etc/init.d/mysql start
# Needed later to hold our credentials for bundling an AMI
sudo -H mkdir ~/.ec2

Before we go on, we’ll make sure the EBS volume is being used by MySQL. The data directory on the EBS volume is /vol/var/lib/mysql so we should expect new databases to be created there.

mysql -u root -p -e "create database db_on_ebs"
ls -l /vol/var/lib/mysql/

The listing should show that the new directory db_on_ebs was created. This proves that MySQL is using the EBS volume for its data store.

Creating the EBS Snapshot

All the above steps prepare the original instance and the EBS volume for being snapshotted. The following procedure can be used to snapshot the volume.

On the instance perform the following to stop MySQL and unmount the EBS volume:

sudo /etc/init.d/mysql stop
sudo umount /etc/mysql
sudo umount /var/log/mysql
sudo umount /var/lib/mysql
sudo umount /vol

Then, from your local machine, create a snapshot as follows. Remember the snapshot ID for later (snap-00000000 in this example).

ec2-create-snapshot vol-00000000

The snapshot is in progress and you can check its status with the ec2-describe-snapshots command.

Preparing the AMI

At this point in the procedure we have the following set up already:

  • an instance that uses an EBS volume for MySQL files.
  • an EBS volume attached to that instance, having the MySQL files on it.
  • an EBS snapshot of that volume.

Now we are ready to prepare the instance for becoming an AMI. This AMI, when launched, will be able to create a new EBS volume from the snapshot and mount it at startup time.

First, from your local machine, copy your credentials to the EC2 instance:

scp -i id_rsa-MyKeypair pk-whatever1234567890.pem cert-whatever1234567890.pem root@1.1.1.1:~/.ec2/

Back on the EC2 instance install Java (skipping the annoying interactive license agreement) and the EC2 API tools:

export DEBIAN_FRONTEND=noninteractive
echo sun-java6-jdk shared/accepted-sun-dlj-v1-1 select true | sudo /usr/bin/debconf-set-selections
echo sun-java6-jre shared/accepted-sun-dlj-v1-1 select true | sudo /usr/bin/debconf-set-selections
sudo -E apt-get install -y unzip sun-java6-jdk
sudo -H wget -O ~/ec2-api-tools.zip http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip && \
cd ~ && unzip ec2-api-tools.zip && ln -s ec2-api-tools-1.3-36506 ec2-api-tools

Note: Future versions of the EC2 API tools will have a different version number, and the above command will need to change accordingly.
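
To avoid hard-coding the version number, you could create the symlink from whatever versioned directory the zip file unpacks to. This is a sketch under the assumption that the archive unpacks to a single directory named ec2-api-tools-VERSION; link_api_tools is a hypothetical helper name.

```shell
# link_api_tools DIR
# Hypothetical helper: inside DIR, points the symlink ec2-api-tools at the
# unpacked ec2-api-tools-<version> directory (the last one in sorted order
# if there are several). Runs in a subshell so the caller's working
# directory is left unchanged.
link_api_tools() (
    cd "$1" || exit 1
    latest=""
    for d in ec2-api-tools-*/; do
        if [ -d "$d" ]; then
            latest=${d%/}
        fi
    done
    [ -n "$latest" ] || exit 1
    ln -sfn "$latest" ec2-api-tools
)

# Example, after unzipping ec2-api-tools.zip in your home directory:
#   link_api_tools ~
```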

Next, set up the script that does the create-volume-and-mount-it magic at startup. Download it from here with the following command:

sudo curl -Lo /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
https://sites.google.com/site/shlomosfiles/clouddevelopertips/create-ebs-vol-from-snapshot-and-mount?attredirects=0

The script has a number of items to customize:

  • The EC2 account credentials: Put a pointer to your private key and certificate file into the script in the appropriate place. If you followed the above instructions these will be in /root/.ec2. Make sure the credentials are located on the instance’s root partition in order to ensure the keys are bundled into the AMI.
  • The snapshot ID. This too can either be hard-coded into the script or, even better, provided as part of the user-data. It is controlled by the EBS_VOL_FROM_SNAPSHOT_ID setting. See below for an example of how to specify and override this value via the user-data.
  • The JAVA_HOME directory. This is the location of the Java installation. On most linux distributions this should point to /usr/lib/jvm/java-6-sun .
  • The EC2_HOME directory. This is the location where the EC2 API tools are installed. If you followed the procedure above this will be /root/ec2-api-tools .
  • The device attach point for the EBS volume. This is controlled by the EBS_ATTACH_DEVICE setting, and is /dev/sdh in these instructions.
  • The filesystem mount directory for the EBS volume. This is controlled by the EBS_MOUNT_DIR setting, and is /vol in these instructions.
  • The directories to be exported from the EBS volume. These are the directories that will be “mapped” to the root filesystem via mount --bind. These are specified in the EBS_EXPORTS setting.
  • If you are creating an AMI for the EU region, uncomment the line export EC2_URL=https://eu-west-1.ec2.amazonaws.com by removing the leading #.
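
For orientation, the customized settings might end up looking something like the fragment below if you followed the instructions in this article. The EBS_*, JAVA_HOME, EC2_HOME, and EC2_URL names come from the list above. EC2_PRIVATE_KEY and EC2_CERT are the environment variables read by the EC2 API tools, and it is an assumption that the downloaded script uses them for the credentials; check the script itself.

```shell
# Credentials (assumed variable names; the script may name these differently)
EC2_PRIVATE_KEY=/root/.ec2/pk-whatever1234567890.pem
EC2_CERT=/root/.ec2/cert-whatever1234567890.pem

# Default snapshot ID; can be overridden via the user-data
EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000

# Tool locations used in this article
JAVA_HOME=/usr/lib/jvm/java-6-sun
EC2_HOME=/root/ec2-api-tools

# Where the new volume is attached and mounted, and what it exports
EBS_ATTACH_DEVICE=/dev/sdh
EBS_MOUNT_DIR=/vol
EBS_EXPORTS="/etc/mysql /var/lib/mysql /var/log/mysql"

# Uncomment for the EU region
# export EC2_URL=https://eu-west-1.ec2.amazonaws.com

export EC2_PRIVATE_KEY EC2_CERT JAVA_HOME EC2_HOME
```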

Once you customize the script, set it up to run upon startup as follows:

sudo chmod +x /etc/init.d/create-ebs-vol-from-snapshot-and-mount
sudo update-rc.d create-ebs-vol-from-snapshot-and-mount start 89 S .

As mentioned above, if you do not want the newly-created EBS volume to persist after the instance terminates you can configure the script to be run on shutdown, allowing it to delete the volume. One way of doing this is to create the AMI with a shutdown hook already in place. To do this:

sudo ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

Alternatively, you can defer this decision to instance launch time, by passing in the above command via a user-data script – see below for more on this.

Remember: Running this script as part of the shutdown process as described above will delete the EBS volume. If you do not want this to happen automatically, don’t execute the above command. If you mistakenly ran the above command you can fix things as follows:

sudo rm /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

Next is a little cleanup before bundling:

# New instances need their own host keys generated at first boot
chmod +x ec2-ssh-host-key-gen
# New instances should not contain leftovers from this instance
sudo rm -f /root/.*hist*
sudo rm -f /var/log/*.gz
sudo find /var/log -name mysql -prune -o -type f -print | \
while read i; do sudo cp /dev/null $i; done

The instance is now ready to be bundled into an AMI, uploaded, and registered. The commands below show this process. For more about the considerations when bundling an AMI see this article by Eric Hammond.

bucket=com.mybucket.images
prefix=ubuntu-hardy-32bit-attach-snapshot-at-startup-for-mysql-20090805
export AWS_USER_ID=MY-USER-ID
export AWS_ACCESS_KEY_ID=MY-ACCESS-KEY
export AWS_SECRET_ACCESS_KEY=MY-SECRET-ACCESS-KEY
arch=i386

sudo -E ec2-bundle-vol \
-r $arch \
-d /mnt \
-p $prefix \
-u $AWS_USER_ID \
-k ~/.ec2/pk-*.pem \
-c ~/.ec2/cert-*.pem \
-s 10240 \
-e /mnt,/tmp,/root/.ssh
ec2-upload-bundle \
-b $bucket \
-m /mnt/$prefix.manifest.xml \
-a $AWS_ACCESS_KEY_ID \
-s $AWS_SECRET_ACCESS_KEY

Once the bundle has uploaded successfully, register it from your local machine as follows:

bucket=com.mybucket.images
prefix=ubuntu-hardy-32bit-attach-snapshot-at-startup-for-mysql-20090805
ec2-register $bucket/$prefix.manifest.xml

The ec2-register command displays an AMI ID (ami-012345678 in this example).

We are finally ready to test it out!

Launching a New Instance

Now we are ready to launch a new instance that creates and mounts an EBS volume from the snapshot. The snapshot ID is configurable via the user-data payload specified at instance launch time. Here is an example user-data payload showing how to specify the snapshot ID:

EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000

Note that the format of the user-data payload is compatible with the Running User-Data Scripts technique – just make sure the first line of the user-data payload begins with a hashbang #! and that the EBS_VOL_FROM_SNAPSHOT_ID setting is located somewhere in the payload, at the beginning of a line.
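
To illustrate the "at the beginning of a line" rule, here is a minimal sketch of how a startup script could pull the setting out of the user-data. It is not taken from the downloaded script, and snapshot_id_from_user_data is a hypothetical name.

```shell
# snapshot_id_from_user_data
# Hypothetical helper: reads a user-data payload on stdin and prints the value
# of the first EBS_VOL_FROM_SNAPSHOT_ID=... setting that starts a line.
snapshot_id_from_user_data() {
    sed -n 's/^EBS_VOL_FROM_SNAPSHOT_ID=//p' | head -n 1
}

# Example, on a running instance:
#   curl -s http://169.254.169.254/latest/user-data | snapshot_id_from_user_data
```

Because the match is anchored to the start of the line, an indented or commented-out setting is ignored.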

Launch an instance of the AMI with the user-data specifying the snapshot ID, in a different availability zone. The instance will be assigned an instance ID (in this example i-22222222) and a public IP address (in this example 2.2.2.2).

ec2-run-instances -z us-east-1c --key MyKeypair \
-d "EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000" ami-012345678

ec2-describe-instances i-22222222

Once the ec2-describe-instances output shows that the instance is running, check for the new EBS volume that should have been created from the snapshot (in this example, vol-22222222) in the new availability zone.

ec2-describe-volumes

Finally, ssh into the instance and verify that it is now working from the new EBS volume:

ssh -i id_rsa-MyKeypair root@2.2.2.2

mysql -u root -p -e "show databases"

You should see the db_on_ebs database in the results. This demonstrates that the startup sequence successfully created a new EBS volume, attached and mounted it, and set itself up to use the MySQL data on the EBS volume.

Cleaning Up

Don’t forget to clean up the pieces of this procedure when you no longer need them:

# the original instance
ec2-terminate-instances i-11111111
# the original EBS volume
ec2-delete-volume vol-00000000
# the instance that created a new volume from the snapshot
ec2-terminate-instances i-22222222

If you set up the shutdown hook to delete the EBS volume then you can verify that this works by checking that the ec2-describe-volumes output no longer contains the new EBS volume. Otherwise, delete it manually:

# the new volume created from the snapshot
ec2-delete-volume vol-22222222

And don’t forget to un-register the AMI and delete the files from S3 when you are done. These steps are not shown.

Making Changes to the Configuration

Now that you have a configuration using EBS snapshots which is easily scalable to any availability zone, how do you make changes to it?

Let’s say you want to add a web server to the AMI and your web server’s static content to the EBS volume. (I generally don’t recommend storing your web-layer data in the same place as your database storage, but this example serves as a useful illustration.) You would need to do the following:

  1. Launch an instance of the AMI specifying the snapshot ID in the user-data.
  2. Install the web server on the instance.
  3. Put your web server’s static content onto the instance (perhaps from S3) and test that the web server works.
  4. Stop the web server.
  5. Move the web server’s static content to the EBS volume.
  6. “mount --bind” the EBS locations to the original directories without adding entries to /etc/fstab.
  7. Restart the web server and test that the web server still works.
  8. Edit the startup script, adding entries for the web server’s directories to EBS_EXPORTS.
  9. Stop the web server and unmount (umount) all the mount bind directories and the EBS volume.
  10. Remove the mount bind and /vol entries for the EBS exported directories from /etc/fstab.
  11. Perform the cleanup prior to bundling.
  12. Bundle and upload the new AMI.
  13. Create a new snapshot of the EBS volume.
  14. Change your deployment configurations to start using the new AMI and the new snapshot ID.

If you decide that you would like the automatically-created EBS volumes to be deleted when the instances terminate, you have two ways to do this:

  • Execute this command
    sudo ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
    /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

    and rebundle the AMI.
  • Pass the above command to the instance via a user-data script. The user-data could also specify the snapshot ID, and might look like this:

    #! /bin/bash
    EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000
    ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
    /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

The technique of mounting an EBS volume created from a snapshot at startup was born of necessity: I needed a way to allow many instances across availability zones to share the same content which lives on an EBS drive. This article shows how you can apply the technique to your deployments. If you also find this technique useful, please share it in the comments!

Thanks

Eric Hammond reviewed early drafts of this article and provided valuable feedback. Thanks!


Boot EC2 Instances from EBS

EBS offers the ability to have a “virtual disk drive” whose lifetime is independent of any EC2 instance. You can attach EBS drives to any running instance, and any changes made to the drive will persist even after the instance is terminated. You can also set up an EBS drive to be the root filesystem for an instance – giving you the benefits of an always-on instance but without paying for it when it’s not in use – and this article shows you how to do that. I also explore how to estimate the cost savings you can achieve using this solution.

I’m not the first person to think of this idea: credit goes to AWS forum users rickdane for starting the discussion and N. Martin and Troy Volin for posts in this AWS forum thread that describe most of the heavy-lifting – this solution is largely based on their work.

Update December 2009: With the introduction of native support for boot-from-EBS AMIs, the techniques in this article are largely unnecessary. But they’re still pretty cool.

Note: this article assumes you are comfortable in the linux shell and familiar with the EC2 AMI tools and EC2 API tools.

Why Boot EC2 Instances from EBS?

My company runs applications in EC2, and we test new application features in EC2 (after testing them locally) before we deploy them to our production environment. At first, each time we had a new feature to test out, we would construct a testing environment as follows:

  1. Launch a new instance of our base AMI (based on an Alestic Ubuntu Hardy Server AMI, which I highly recommend, and customized with our own basic application server stack).
  2. Make the necessary environmental changes (e.g. adding monitoring services) to the test instance.
  3. Deploy the application (the one with new features we need to test) to the test instance.
  4. Deploy the test database from S3 to the test instance.
  5. Update the test database with any schema or data changes necessary.
  6. Test and debug the application.
  7. Save the updated test database to S3 for later use.
  8. Terminate the test instance.

This process worked great at first (even if it was a manual process). Our application is a WAR and a bunch of jar files, so they were easily uploaded to the test instance. We stored our test database in S3 as a gzipped MySQL mysqldump file, and it was easy to import into / export from MySQL. That is, until we got to the point where our test database was big enough for it to take a long time – over an hour – to reconstitute. At that point it became very annoying to bring up the test environment.

The last thing you want, as a developer, is a test environment that is difficult or annoying to set up. You want testing to be easy, quick, and inexpensive (otherwise you will start looking for shortcuts to avoid testing, which is not good for quality). Once our test environment began to require over an hour just to set up, we realized it was no longer serving our needs. I began researching a setup that:

  • Is ready-to-go within a short time (less than five minutes)
  • Already contains the most recent environment (tools, application, and database)
  • Only costs money when it is being used

Basically, we wanted the benefits of a server that is always-on but without paying for it when we weren’t using it.

That’s the “why”. Read on for the “what” and the “how”.

Ingredients and Tools

The solution consists of two pieces:

  • An AMI that can boot from an EBS drive (the “boot AMI”)
  • An EBS drive that contains the bootable linux stack and, later, anything else you add (the “bootable EBS volume”)

The main tool used to put the pieces together is the Linux utility pivot_root. In this setup it is run by the /sbin/init process (the startup process itself, pid 1), and it “swaps” the root filesystem out from under the OS and replaces it with a directory you specify. Here that directory is the mount point of the EBS drive, so the root directory of the EBS drive becomes the new root filesystem.

EBS volumes are attached to specific devices (/dev/sdb through /dev/sdp), and you’ll need to choose what attach device you want the boot AMI to use. We could, theoretically, build the boot AMI to read the attach device name from the user-data provided at launch time. But this would require bringing up networking in order to download the user-data to the instance, which is complicated. Instead of doing this we will hardcode the attach device into the boot AMI. You will need to remember what device you chose in order to know where to attach the EBS drive when you launch an instance of the boot AMI. A good way to remember the chosen attach device is to name the boot AMI accordingly, for example boot-from-EBS-to-dev-sdp-32bit-20090624.

As part of choosing the attach device you’ll need to know the mknod minor number associated with the device. mknod minor numbers begin at 0 (for /dev/sda) and progress in increments of 16 (so /dev/sdb is number 16, /dev/sdc is number 32, etc.) until /dev/sdp (which is number 240). mknod minor numbers are detailed in the Documentation/devices.txt file in the linux kernel source code. In the procedure below I use /dev/sdp (“p” for “pivot_root”), with mknod minor number 240.
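The mapping from device name to minor number is simple enough to compute rather than look up. Here is a minimal sketch; the function name sd_minor is made up for illustration, and it only handles the whole-disk names /dev/sda through /dev/sdp discussed above.

```shell
#!/bin/sh
# sd_minor DEVICE: print the mknod minor number for /dev/sda../dev/sdp.
# Hypothetical helper for illustration; it is not part of the boot AMI.
sd_minor() {
    letter=${1#/dev/sd}                 # "/dev/sdp" -> "p"
    case $letter in
        [a-p]) ;;                       # only sda through sdp are handled
        *) echo "unsupported device: $1" >&2; return 1 ;;
    esac
    code=$(printf '%d' "'$letter")      # character code, e.g. 112 for "p"
    echo $(( (code - 97) * 16 ))        # 97 is the character code of "a"
}
```

For example, `sd_minor /dev/sdp` prints 240 and `sd_minor /dev/sdb` prints 16.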

The EC2 AMI tools are useful in preparing the bootable EBS drive: they can create an image file from an existing instance. These tools will be used to create a temporary image containing the bootable linux stack copied from the running instance, and this temporary image will then be copied to the EBS drive to make it bootable.

How to Set it Up

Here is an outline of the setup process:

  1. Set up a bootable EBS volume.
  2. Set up the boot AMI.

The detailed setup instructions follow.

Setting up a bootable EBS volume:

  1. Launch an instance of an AMI you like. I use the Alestic Ubuntu Hardy Server AMI.
  2. Create an EBS volume in the same availability zone as the instance you launched. This volume should be large enough to fit everything you plan to put on the root partition (later, not now). It can even exceed 10GB: unlike the “real” root filesystem on EC2 instances which is limited to 10GB, the bootable EBS volume is not limited to this size.
  3. Once the EBS volume is created, attach it to the instance on /dev/sdp.
  4. SSH into the instance as root and format the EBS volume:
    mkfs.ext3 /dev/sdp
  5. Mount the EBS volume to the instance’s filesystem:
    mkdir /ebs && mount /dev/sdp /ebs
  6. Make an image bundle containing the root filesystem:
    mkdir /mnt/prImage
    ec2-bundle-vol -c cert -k key -u user -e /ebs -r i386 -d /mnt/prImage

    These arguments are the same as you would use to bundle an AMI – please see the Developer Guide: Bundling a Unix or Linux AMI for details. The certificate and private key credentials should be copied to the instance in the /mnt partition so they don’t get bundled into the image and copied to the bootable EBS volume. If you’re creating a 64-bit image you’ll need to substitute -r x86_64 instead.
  7. Copy the contents of the image to the EBS drive:
    mkdir /mnt/img-mnt
    mount -o loop /mnt/prImage/image /mnt/img-mnt
    rsync -a /mnt/img-mnt/ /ebs/
    umount /mnt/img-mnt
  8. Prepare the EBS volume for being the target of pivot_root. First, edit /ebs/etc/fstab, changing the entry for / (the root filesystem) from /dev/sda1 to /dev/sdp. Next,
    mkdir /ebs/old-root
    rm /ebs/root/.ssh/authorized_keys
    umount /ebs
    rmdir /ebs
  9. Detach the EBS volume from your instance.

Your bootable EBS volume is now ready. You might want to take a snapshot of it. Don’t terminate the EC2 instance yet, because it will be used to create the boot AMI.
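If you would rather script the fstab change from step 8 than edit the file by hand, something like the following should work. It is a sketch that assumes the root entry is a line whose first field is /dev/sda1 and whose second field is /; the function name retarget_root is my own invention.

```shell
#!/bin/sh
# retarget_root FSTAB NEWDEV: point the "/" entry of FSTAB at NEWDEV.
# Hypothetical helper; assumes the current root device is /dev/sda1.
retarget_root() {
    fstab=$1
    newdev=$2
    tmp="$fstab.tmp.$$"
    # Rewrite only the line whose device is /dev/sda1 and whose mount
    # point is "/"; every other line is copied through unchanged.
    awk -v dev="$newdev" '
        $1 == "/dev/sda1" && $2 == "/" { sub(/\/dev\/sda1/, dev) }
        { print }
    ' "$fstab" > "$tmp" || return 1
    # Copy the result back so the original file keeps its permissions.
    cat "$tmp" > "$fstab" && rm -f "$tmp"
}
```

With the volume still mounted on /ebs, the call would be `retarget_root /ebs/etc/fstab /dev/sdp`.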

Setting up the boot AMI:

  1. Create the mount point where the bootable EBS will be mounted:
    mkdir /new-root
  2. Replace the original /sbin/init with a new one. First, rename the original:
    mv /sbin/init /sbin/init.old
    Then, copy this file into /sbin/init, as follows:
    curl -o /sbin/init -L 'https://sites.google.com/site/shlomosfiles/clouddevelopertips/init?attredirects=0'
    As mentioned above, if you choose an attach device different from /dev/sdp you should edit the new /sbin/init file to assign DEVNO the corresponding mknod minor number.
    Finally, make it executable:
    chmod 755 /sbin/init
  3. Clean up the current SSH public key that was used to login to the instance:
    rm /root/.ssh/authorized_keys
    Not so fast! Once you perform this step you will no longer be able to SSH into the instance. Instead, you can move the SSH authorized_keys file out of the way, as follows:
    mv /root/.ssh/authorized_keys /mnt
  4. Bundle, upload, and register the AMI. Give the bundle a name that indicates the processor architecture and the device on which it expects the bootable EBS volume to be attached. For example, boot-from-EBS-to-dev-sdp-32bit-20090624.
  5. If you opted to move the SSH authorized_keys out of the way in Step 3, restore SSH access to the instance:
    mv /mnt/authorized_keys /root/.ssh/authorized_keys

The boot AMI is now ready to be launched. If you are feeling brave you can terminate the instance. Otherwise, you can leave it running and use it to troubleshoot problems with the launch process – but hopefully this won’t be necessary.

How to Launch an Instance

Once the bootable EBS volume and boot AMI are set up, launch instances that boot from the EBS volume as follows:

  1. Launch an instance of the boot AMI.
  2. Attach the bootable EBS volume to the chosen device (/dev/sdp in the above instructions).

The boot AMI will wait until a volume is attached to the chosen device, pivot_root to the EBS volume, and then continue its boot sequence from the EBS volume.
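The two launch steps can be wrapped in a small script using the EC2 API tools. The sketch below assumes that ec2-run-instances prints a line beginning with INSTANCE whose second field is the instance ID, and that it is safest to wait for the instance to be reported as running before attaching the volume. The function name, the POLL_SECONDS variable, and the IDs in the usage line are placeholders of mine.

```shell
#!/bin/sh
# launch_from_ebs AMI_ID VOLUME_ID [DEVICE]
# Hypothetical wrapper: launch the boot AMI, then attach the bootable volume.
launch_from_ebs() {
    ami=$1
    volume=$2
    device=${3:-/dev/sdp}               # must match the device baked into the boot AMI

    # Assumption: the output contains a line "INSTANCE <instance-id> ...".
    instance=$(ec2-run-instances "$ami" | awk '$1 == "INSTANCE" { print $2; exit }')
    [ -n "$instance" ] || { echo "launch failed" >&2; return 1; }

    # Poll until the instance is reported as running.
    until ec2-describe-instances "$instance" | grep -q running; do
        sleep "${POLL_SECONDS:-10}"
    done

    # Attaching the volume is what lets the waiting boot AMI continue.
    ec2-attach-volume "$volume" -i "$instance" -d "$device"
}
```

Usage would look like `launch_from_ebs ami-11111111 vol-22222222`, with your own boot AMI and volume IDs substituted. Add the instance type, keypair, and security group options to the ec2-run-instances call as needed.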

Some troubleshooting tips:

  • To troubleshoot problems launching the instance you should look at the console output from the instance. The console output might not get updated consistently. If the console output is still empty a few minutes after launching the instance and attaching the EBS volume, try rebooting the instance, which will force the console to refresh.
  • If you need to attach the bootable EBS volume to another instance (not the boot AMI), attach it to /dev/sdp and mount it as follows:
    mkdir /ebs && mount /dev/sdp /ebs
  • The boot AMI includes a hook that allows you to add a program to be executed before it performs a pivot_root to the EBS drive. This avoids the need to re-bundle the boot AMI if you need to change the startup process. The hook looks for a file called /pre-pivot.sh on the EBS drive and executes the file if it can.
  • Booting from an EBS drive seems to take slightly longer than booting a regular EC2 instance. Don’t be surprised if it takes up to five minutes until you can get an SSH connection to the instance.
  • You don’t need to re-attach the EBS volume when you reboot the instance – it stays attached.
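When troubleshooting, it helps to have a mental model of what the replacement /sbin/init does. The script itself is downloaded rather than listed in this article, so what follows is only a rough sketch of the sequence described above (wait for the volume, run the /pre-pivot.sh hook, pivot_root, continue booting). It is not the actual script. Apart from DEVNO, /new-root, old-root, and /pre-pivot.sh, every name in it is an assumption, as are the ext3 mount type, the five-second retry interval, and the DRY_RUN switch, which I added so the sequence can be printed without touching the system.

```shell
#!/bin/sh
# Sketch of a pivot_root-style init. NOT the actual /sbin/init from this article.
DEVNO=240                  # mknod minor number for /dev/sdp
DEV=/dev/sdp               # attach device hardcoded into the boot AMI
NEW_ROOT=/new-root         # mount point created when building the boot AMI

# run: execute a command, or only print it when DRY_RUN is set (assumed helper).
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

pivot_to_ebs() {
    # Create the block device node (major 8 is the SCSI disk major) if missing.
    [ -b "$DEV" ] || run mknod "$DEV" b 8 "$DEVNO"

    # Mounting fails until the volume is attached, so keep retrying.
    until run mount -t ext3 "$DEV" "$NEW_ROOT"; do
        sleep 5
    done

    # Optional hook stored on the EBS volume, run before the pivot.
    if [ -x "$NEW_ROOT/pre-pivot.sh" ]; then
        run "$NEW_ROOT/pre-pivot.sh"
    fi

    # Swap roots; the previous root ends up under old-root on the EBS volume.
    run cd "$NEW_ROOT"
    run pivot_root . old-root

    # Hand control to the original init that lives on the EBS volume.
    run exec chroot . /sbin/init "$@"
}
```

Setting DRY_RUN=1 and calling pivot_to_ebs prints the commands in order instead of running them, which is a safe way to read through the sequence on an ordinary machine.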

Usage Tips

  • Create a file in the root of each bootable EBS volume labeling the purpose of the volume, such as /bootable-32bit-appserver. This helps when you mount the EBS drive to an instance as a “regular” volume: the label indicates what the purpose of the volume is, and helps you distinguish it from other attached EBS drives. This is good practice even for non-bootable EBS volumes. See below for a tip on how to track the purpose of EBS volumes without looking at their contents.
  • Once you get the boot AMI running properly with the bootable EBS volume, shut it down and take a snapshot of the volume before you make any other changes to it. This snapshot is your “master bootable volume” snapshot, containing only the minimum setup necessary to boot.
  • After creating a “master bootable volume” snapshot you can (and should!) launch the boot AMI again (remembering to attach the bootable EBS volume) and customize the instance any way you want: add your application stack, your database, your tools, anything else. These changes will persist on the bootable EBS volume even after the instance has terminated. This is the main motivation for the bootable EBS volume solution!
  • You can create multiple bootable EBS volumes from the “master bootable volume” snapshot and customize them each. See below for a tip on how to keep track of the purpose of each EBS volume.
  • The setup instructions above can be used for creating either a 32-bit or a 64-bit boot AMI and matching bootable EBS volume. Because you can’t run an AMI for one architecture on an EC2 instance of the other architecture, you’ll need to create two separate boot AMIs and two separate bootable EBS volumes if you plan to run on both 32-bit and 64-bit EC2 instances. And, because the bootable EBS volumes created by this procedure will contain a 32-bit or 64-bit specific linux stack, be sure to attach the corresponding bootable EBS volume for the boot AMI you launch.
  • If you work with multiple EBS volumes you will want to identify the purpose of each volume without attaching it to a running instance and looking at the label file you created. Unfortunately the EC2 API does not currently offer a way to tag EBS volumes. But the ElasticFox Firefox extension does – and I highly recommend it for this purpose. Note that the volume tags will only be visible in the browser that creates them, not on other machines. [See my article on Copying ElasticFox tags between browsers for a workaround.]

Cost Implications of Booting Instances from EBS

EBS costs money beyond what you pay for the disk drives that come as part of EC2 instances. There are four components of the EBS cost to consider:

  1. Allocated storage size ($0.10 per GB per month)
  2. I/O requests ($0.10 per million I/O requests)
  3. Snapshot storage (same as S3 storage costs; depends on the region)
  4. Snapshot transfer (same as S3 transfer costs; depends on the region)

The AWS Simple Monthly Calculator can help you estimate these costs. Here are some guidelines to help you figure out what numbers to put into these fields:

    • Component #1 is easy: just plug in the size of your bootable EBS volume.
    • Component #2 should be estimated based on the I/O usage of your existing application instance. You can use iostat to estimate this number as follows:
      iostat | awk -F" " 'BEGIN {x="0.0"} /^sd/ {x=x+$2} END {print x}'
      The result is the number of I/O transactions per second (the tps column, which is the second field of each device line). Multiply this figure by 2592000 (60 * 60 * 24 * 30, the number of seconds in a month) to get the number of I/O transactions per month, then divide by 1 million. Or, better yet, use this instead:
      iostat | awk -F" " 'BEGIN {x="0.0"} /^sd/ {x=x+$2} END {printf "%12.2f\n", x*2.592}'
      For my test environment, this figure comes to 29 (million I/O requests per month).
    • Component #3 can be estimated with the help of the guidelines at the bottom of the EBS Product Info page:

Snapshot storage is based on the amount of space your data consumes in Amazon S3. Because data is compressed before being saved to Amazon S3, and Amazon EBS does not save empty blocks, it is likely that the size of a snapshot will be considerably less than the size of your volume. For the first snapshot of a volume, Amazon EBS will save a full copy of your data to Amazon S3. However for each incremental snapshot, only the part of your Amazon EBS volume that has been changed will be saved to Amazon S3.

As a conservative estimate of snapshot storage size, I use the same size as the actual data on the EBS volume. I figure this covers the initial compressed snapshot plus a month’s worth of delta snapshots.

    • Component #4 can also be estimated from the guidelines further down the same page:

Volume data is broken up into chunks before being transferred to Amazon S3. While the size of the chunks could change through future optimizations, the number of PUTs required to save a particular snapshot to Amazon S3 can be estimated by dividing the size of the data that has changed since the last snapshot by 4MB. Conversely, when loading a snapshot from Amazon S3 into an Amazon EBS volume, the number of GET requests needed to fully load the volume can be estimated by dividing the full size of the snapshot by 4MB. You will also be charged for GETs and PUTs at normal Amazon S3 rates.

My own monthly estimate includes taking one full EBS volume snapshot and creating one EBS volume from a snapshot. I keep around 20GB of data on the EBS volume. So I divide 20480 (20 GB expressed in MB) by 4 to get 5120. This is the estimated number of PUTs per month. It is also the estimated number of GETs per month.

For my usage in our test environment, the cost of running instances that boot from a 50GB EBS volume with 20GB of data on it in the US region comes to approximately $10.96 per month.
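To show how the four components add up, here is a small sketch of the arithmetic. The EBS rates are the ones listed above. The S3 rates ($0.15 per GB per month for storage, $0.01 per 1,000 PUTs, $0.01 per 10,000 GETs) are what I am assuming for the US region, so check them against the current price list. The function name is made up.

```shell
#!/bin/sh
# ebs_monthly_cost VOLUME_GB IO_MILLIONS DATA_GB
# Hypothetical helper: estimate the monthly cost of one bootable EBS volume,
# assuming one full snapshot taken and one volume created from it per month.
ebs_monthly_cost() {
    awk -v vol="$1" -v io="$2" -v data="$3" 'BEGIN {
        storage   = vol * 0.10              # allocated storage, $ per GB-month
        requests  = io * 0.10               # $ per million I/O requests
        snapshot  = data * 0.15             # assumed S3 storage rate, $ per GB-month
        chunks    = data * 1024 / 4         # number of 4MB chunks in the data
        put_cost  = chunks / 1000 * 0.01    # assumed S3 PUT rate
        get_cost  = chunks / 10000 * 0.01   # assumed S3 GET rate
        printf "%.2f\n", storage + requests + snapshot + put_cost + get_cost
    }'
}
```

With my numbers (a 50GB volume, 29 million I/O requests, 20GB of data), `ebs_monthly_cost 50 29 20` prints 10.96.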

Is it financially worthwhile?

As long as your EBS cost estimate is less than the cost of running the instance for the hours it would sit unused, this solution saves you money. My test instance is an m1.large in the US region, which costs $0.40 per hour. If my instance would sit idle for at least 28 hours a month (10.96 / 0.40 = 27.4, rounded up), I save money by using a bootable EBS volume and terminating the instance when it is unused. As it happens, my test instance is unused more than 360 hours a month, so for me it is a no-brainer: I use bootable EBS volumes.
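The break-even point is simply the monthly EBS cost divided by the hourly instance price, rounded up to whole hours. A minimal sketch, with a function name of my own choosing:

```shell
#!/bin/sh
# breakeven_hours MONTHLY_EBS_COST HOURLY_INSTANCE_PRICE
# Hypothetical helper: idle hours per month at which the EBS cost pays for itself.
breakeven_hours() {
    awk -v cost="$1" -v rate="$2" 'BEGIN {
        hours = cost / rate
        whole = int(hours)
        if (whole < hours) whole++      # round up to the next whole hour
        print whole
    }'
}
```

Plugging in the figures above, `breakeven_hours 10.96 0.40` prints 28.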

EC2 instances that boot from an EBS volume offer the benefits of an always-on machine without the costs of paying for it when it is not in use. The implementation presented here was developed based on posts by the aforementioned forum users, and I thank them for their help.