Categories
Cloud Developer Tips

Scalability and HA Limitations of AWS Marketplace AMIs

Reading AWS’s recent announcement of the AWS Marketplace you would think that it provides a catalog of click-to-deploy, highly-available, scalable applications running on EC2. You’d be partially right: the applications available in the AWS Marketplace are deployable in only a few clicks. But highly-available and scalable services will be difficult to build using Marketplace images. Here’s why.

Essential Ingredients of HA and Scalability on AWS

AWS makes it easy to run scalable, HA applications via several features. Not all applications use all of these features, but it would be very difficult to provide scalable and highly available service without using at least one of these:

  • Elastic Load Balancing
  • Auto Scaling
  • Elastic Block Storage volumes

ELB and AutoScaling both enable horizontal scalability: spreading load and controlling deployment size via first-class-citizen tools integrated into the AWS environment. They also enable availability by providing an automated way to recover from the failure of individual instances. [Scalability and availability often move in lock-step; improving one usually improves the other.] EBS volumes provide improved data availability: data can be retrieved off of dying instances – and often are used in RAID configurations to improve write performance.

AWS Marketplace Limitations

The AWS Marketplace has limitations that cripple two of the above features, making highly available and scalable services much more difficult to provide.

Marketplace AMI instances cannot be added to an ELB

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB. This limitation no longer exists.

Try it. You’ll get this error message:

 Error: InvalidInstance: ElasticLoadBalancing does not support the paid AMI or supported AMI of instance i-10abf677.

There is no mention of this limitation in the relevant ELB documentation.

This constraint severely limits horizontal scalability for Marketplace AMIs. Without ELB it’s difficult to share web traffic to multiple identically-configured instances of these AMIs. The AWS Marketplace offers several categories of AMIs, including Application Stacks (RoR, LAMP, etc.) and Application Servers (JBoss, WebSphere, etc.), that are typically deployed behind an ELB – but that won’t work with these Marketplace AMIs.

Root EBS volumes of Marketplace AMI instances cannot be mounted on non-root devices

Because all Marketplace AMIs are EBS-backed, you might think that there is a quick path to recover data if the instance dies unexpectedly: simply attach the root EBS volume to another device on another instance and get the data from there. But don’t rely on that – it won’t work. Here is what happens when you try to mount the root EBS volume from an instance of a Marketplace AMI on an another instance:

Failed to attach EBS volume 'New-Mongo-ROOT-VOLUME' (/dev/sdj) to 'New-Mongo' due to: OperationNotPermitted: 'vol-98c642f7' with Marketplace codes may not be attached as a secondary device.

This limitation is described here in AWS documentation:

If a volume has an AWS Marketplace product code:

  • The volume can only be attached to the root device of a stopped instance.
  • You must be subscribed to the AWS Marketplace code that is on the volume.
  • The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.
  • AWS Marketplace product codes will be copied from the volume to the instance.

Closing a Licensing Loophole

Why did AWS place these constraints on using Marketplace-derived EBS volumes? To help Sellers keep control of the code they place into their AMI. Without the above limitations it’s simple for the purchaser of a Marketplace AMI to clone the root filesystem and create as many clones of that Marketplace-derived instance without necessarily being licensed to do so and without paying the premiums set by the Seller. It’s to close a licensing loophole.

AWS did a relatively thorough job of closing that hole. Here is a section of the current (25 April 2012) AWS overview of the EC2 EBS API and Command-Line Tools, with relevant Marketplace controls highlighted:

Command and API Action Description
ec2-create-volumeCreateVolume Creates a new Amazon EBS volume using the specified size or creates a new volume based on a previously created snapshot. Any AWS Marketplace product codes from the snapshot are propagated to the volume. For an overview of the AWS Marketplace, go to https://aws.amazon.com/marketplace/help/200900000. For details on how to use the AWS Marketplace, see AWS Marketplace.
ec2-attach-volumeAttachVolume Attaches the specified volume to a specified instance, exposing the volume using the specified device name. A volume can be attached to only a single instance at any time. The volume and instance must be in the same Availability Zone. The instance must be in the running or stoppedstate.

[Note] Note
If a volume has an AWS Marketplace product code:

  • The volume can only be attached to the root device of a stopped instance.
  • You must be subscribed to the AWS Marketplace code that is on the volume.
  • The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.
  • AWS Marketplace product codes will be copied from the volume to the instance.

For an overview of the AWS Marketplace, go to https://aws.amazon.com/marketplace/help/200900000. For details on how to use the AWS Marketplace, see AWS Marketplace.

ec2-detach-volumeDetachVolume Detaches the specified volume from the instance it’s attached to. This action does not delete the volume. The volume can be attached to another instance and will have the same data as when it was detached. If the root volume is detached from an instance with an AWS Marketplace product code, then the AWS Marketplace product codes from that volume will no longer be associated with the instance.
ec2-create-snapshotCreateSnapshot Creates a snapshot of the volume you specify. After the snapshot is created, you can use it to create volumes that contain exactly the same data as the original volume. When a snapshot is created, any AWS Marketplace product codes from the volume will be propagated to the snapshot.
ec2-modify-snapshot-attributeModifySnapshotAttribute Modifies permissions for a snapshot (i.e., who can create volumes from the snapshot). You can specify one or more AWS accounts, or specify all to make the snapshot public.

[Note] Note
Snapshots with AWS Marketplace product codes cannot be made public.

The constraints above are meant to maintain the AWS Marketplace product code, the mechanism AWS uses to identify resources (AMIs, snapshots, volumes, and instances) that require Marketplace licensing integration. Note that not all AMIs in the AWS Marketplace have a product code – for example, the Amazon Linux AMI does not have one. AMIs that do not require licensing control (such as Amazon Linux, and Ubuntu without support) do not have AWS Marketplace product codes – but the rest do.

A Hole

There remains a hole in this lockdown scheme. Any instance whose kernel allows booting from a volume based on its volume label can be manipulated into booting from a secondary EBS volume. This requires root privileges on the instance. I have successfully booted an instance of the MongoDB AMI in the AWS Marketplace from a secondary EBS volume created from the Amazon Linux AMI. Anyone exploiting this hole can circumvent the product code lockdown.

Plugging the Hole

Sellers want these licensing controls and lockdowns. Here’s how:

  • Disable the root account.
  • Disable sudo.
  • Prevent user-data from being executed. On the Amazon Linux AMI and Ubuntu AMIs, user-data beginning with a hashbang is executed as root during the startup sequence.

Unfortunately these mitigations result in a crippled instance. Users won’t be able to mount EBS volumes – which requires root access – so data can’t be stored on EBS volumes for better recoverability.

Alternatively, you could develop your AWS Marketplace solutions as SaaS applications. For many potential Sellers this would be a long-term effort.

I’m still looking for good ways to enable scalability and HA of Marketplace AMIs. I welcome your suggestions.

Update 27 April 2012: Amazon Web Services PR has contacted me to say they are actively working on a fix for the ELB limitations, and are also working on removing the limitation related to mounting Marketplace-derived EBS volumes on secondary devices. I’ll update this article when this happens. In the meantime, AWS said that users who want to recover data from Marketplace-derived EBS volumes should reach out to AWS Support for help.

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB.

Categories
Cloud Developer Tips

Storing AWS Credentials on an EBS Snapshot Securely

Thanks to reader Ewout and his comment on my article How to Keep Your AWS Credentials on an EC2 Instance Securely for suggesting an additional method of transferring credentials: via a snapshot. It’s similar to burning credentials into an AMI, but easier to do and less prone to accidental inclusion in the application’s AMI.

Read on for a discussion of how to implement this technique.

How to Store AWS Credentials on an EBS Snapshot

This is how to store a secret on an EBS snapshot. You do this only once, or whenever you need to change the secret.

We’re going to automate as much as possible to make it easy to do. Here’s the command that launches an instance with a newly created 1GB EBS volume, formats it, mounts it, and sets up the root user to be accessible via ssh and scp. The new EBS volume created will not be deleted when the instance is terminated.

$ ec2-run-instances -b /dev/sdf=:1:false -t m1.small -k \
my-keypair -g default ami-6743ae0e -d '#! /bin/bash
yes | mkfs.ext3 /dev/sdf
mkdir -m 000 /secretVol
mount -t ext3 -o noatime /dev/sdf /secretVol
cp /home/ubuntu/.ssh/authorized_keys /root/.ssh/'

We have set up the root user to be accessible via ssh and scp so we can store the secrets on the EBS volume as the root user by directly copying them to the volume as root. Here’s how we do that:

$ ls -l
total 24
-r--r--r-- 1 shlomo  shlomo  916 Jun 20  2010 cert-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
-r--------  1 shlomo  shlomo   90 Jun  1  2010 creds
-r-------- 1 shlomo  shlomo  926 Jun 20  2010 pk-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
$ scp -i /path/to/id_rsa-my-keypair * root@174.129.83.237:/secretVol/

Our secret is now on the EBS volume, visible only to the root user.

We’re almost done. Of course you want to test that your application can access the secret appropriately. Once you’ve done that you can terminate the instance – don’t worry, the volume will not be deleted due to the “:false” specification in our launch command.

$ ec2-terminate-instance $instance
$ ec2-describe-volumes
VOLUME	vol-7ce48a15	1		us-east-1b	available	2010-07-18T17:34:01+0000
VOLUME	vol-7ee48a17	15	snap-5e4bec36	us-east-1b	deleting	2010-07-18T17:34:02+0000

Note that the root EBS volume is being deleted but the new 1GB volume we created and stored the secret on is intact.

Now we’re ready for the final two steps:
Snapshot the volume with the secret:

$ ec2-create-snapshot $secretVolume
SNAPSHOT	snap-2ec73045	vol-7ce48a15	pending	2010-07-18T18:05:39+0000		540528830757	1

And, once the snapshot completes, delete the volume:

$ ec2-describe-snapshots -o self
SNAPSHOT	snap-2ec73045	vol-7ce48a15	completed	2010-07-18T18:05:40+0000	100%	540528830757	1
$ ec2-delete-volume $secretVolume
VOLUME	vol-7ce48a15
# save the snapshot ID
$ secretSnapshot=snap-2ec73045

Now you have a snapshot $secretSnapshot with your credentials stored on it.

How to Use Credentials Stored on an EBS Snapshot

Of course you can create a new volume from the snapshot, attach the volume to your instance, mount the volume to the filesystem, and access the secrets via the root user. But here’s a way to do all that at instance launch time:

$ ec2-run-instances un-instances -b /dev/sdf=$secretSnapshot -t m1.small -k \
my-keypair -g default ami-6743ae0e -d '#! /bin/bash
mkdir -m 000 /secretVol
mount -t ext3 -o noatime /dev/sdf /secretVol
# make sure it gets remounted if we reboot
echo "/dev/sdf /secretVol ext3 noatime 0 0" > /etc/fstab'

This one-liner uses the -b option of ec2-run-instances to specify a new volume be created from $secretSnapshot, attached to /dev/sdf, and this volume will be automatically deleted when the instance terminates. The user-data script sets up the filesystem mount point and mounts the volume there, also ensuring that the volume will be remounted if the instance reboots.
Check it out, a new volume was created for /dev/sdf:

$ ec2-describe-instances
RESERVATION	r-e4f2608f	540528830757	default
INSTANCE	i-155b857f	ami-6743ae0e			pending	my-keypair	0		m1.small	2010-07-19T15:51:13+0000	us-east-1b	aki-5f15f636	ari-d5709dbc	monitoring-disabled					ebs
BLOCKDEVICE	/dev/sda1	vol-8a721be3	2010-07-19T15:51:22.000Z
BLOCKDEVICE	/dev/sdf	vol-88721be1	2010-07-19T15:51:22.000Z

Let’s make sure the files are there. SSHing into the instance (as the ubuntu user) we then see:

$ ls -la /secretVol
ls: cannot open directory /secretVol: Permission denied
$ sudo ls -l /secretVol
total 28
-r--------  1 root root   916 2010-07-18 17:52 cert-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
-r--------  1 root root    90 2010-07-18 17:52 creds
dr--------  2 root   root   16384 2010-07-18 17:42 lost+found
-r--------  1 root root   926 2010-07-18 17:52 pk-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem

Your application running the instance (you’ll install it by adding to the user-data script, right?) will need root privileges to access those secrets.

Categories
Cloud Developer Tips

Creating Consistent Snapshots of a Live Instance with XFS on a Boot-from-EBS AMI

Eric Hammond has taught us how to create consistent snapshots of EBS volumes. Amazon has allowed us to use EBS snapshots as AMIs, providing a persistent root filesystem. Wouldn’t it be great if you could use both of these techniques together, to take a consistent snapshot of the root filesystem without stopping the instance? Read on for my instructions how to create an XFS-formatted boot-from-EBS AMI, allowing consistent live snapshots of the root filesystem to be created.

The technique presented below owes its success to the Canonical Ubuntu team, who created a kernel image that already contains XFS support. That’s why these instructions use the official Canonical Ubuntu 9.10 Karmic Koala AMI – because it has XFS support built in. There may be other AKIs out there with XFS support built in – if so, the technique should work with them, too.

How to Do It

The general steps are as follows:

  1. Run an instance and set it up the way you like.
  2. Create an XFS-formatted EBS volume.
  3. Copy the contents of the instance’s root filesystem to the EBS volume.
  4. Unmount the EBS volume, snapshot it, and register it as an AMI.
  5. Launch an instance of the new AMI.

More details on each of these steps follows.

1. Run an instance and set it up the way you like.

As mentioned above, I use the official Canonical Ubuntu 9.10 Karmic Koala AMI (currently ami-1515f67c for 32-bit architecture – see the table on Alestic.com for the most current Ubuntu AMI IDs).

ami=ami-1515f67c
security_groups=default
keypair=my-keypair
instance_type=m1.small
ec2-run-instances $ami -t $instance_type -g $security_groups -k $keypair

Wait until the ec2-describe-instances command shows the instance is running and then ssh into it:

ssh -i my-keypair ubuntu@ec2-1-2-3-4.amazonaws.com

Now that you’re in, set up the instance’s root filesystem the way you want. Don’t forget that you probably want to run

sudo apt-get update

to allow you to pull in the latest packages.

In our case we’ll want to install ec2-consistent-snapshot, as per Eric Hammond’s article:

codename=$(lsb_release -cs)
echo "deb http://ppa.launchpad.net/alestic/ppa/ubuntu $codename main" | sudo tee /etc/apt/sources.list.d/alestic-ppa.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys BE09C571
sudo apt-get update
sudo apt-get install -y ec2-consistent-snapshot
sudo PERL_MM_USE_DEFAULT=1 cpan Net::Amazon::EC2

2. Create an XFS-formatted EBS volume.

First, install the XFS tools:

sudo apt-get install -y xfsprogs

These utilities allow you to format filesystems using XFS and to freeze and unfreeze the XFS filesystem. They are not necessary in order to read from XFS filesystems, but we want these programs installed on the AMI we create because they are used in the process of creating a consistent snapshot.

Next, create an EBS volume in the availability zone your instance is running in. I use a 10GB volume, but you can use any size and grow it later using this technique. This command is run on your local machine:

ec2-create-volume --size 10 -z $zone

Wait until the ec2-describe-volumes command shows the volume is available and then attach it to the instance:

ec2-attach-volume $volume --instance $instance --device /dev/sdh

Back on the instance, format the volume with XFS:

sudo mkfs.xfs /dev/sdh
sudo mkdir -m 000 /vol
sudo mount -t xfs /dev/sdh /vol

Now you should have an XFS-formatted EBS volume, ready for you to copy the contents of the instance’s root filesystem.

3. Copy the contents of the instance’s root filesystem to the EBS volume.

Here’s the command to copy over the entire root filesystem, preserving soft-links, onto the mounted EBS volume – but ignoring the volume itself:

sudo rsync -avx --exclude /vol / /vol

My command reports that it copied about 444 MB to the EBS volume.

4. Unmount the EBS volume, snapshot it, and register it as an AMI.

You’re ready to create the AMI. On the instance do this:

sudo umount /vol

Now, back on your local machine, create the snapshot:

ec2-create-snapshot $volume

Once ec2-describe-snapshots shows the snapshot is 100% complete, you can register it as an AMI. The AKI and ARI values used here should match the AKI and ARI that the instance is running – in this case, they are the default Canonical AKI and ARI for this AMI. Note that I give a descriptive “name” and “description” for the new AMI – this will make your life easier as the number of AMIs you create grows. Another note: some AMIs (such as the Ubuntu 10.04 Lucid AMIs) do not have a ramdisk, so skip the --ramdisk $ramdisk arguments if you’ve used such an AMI.

kernel=aki-5f15f636
ramdisk=ari-0915f660
description="Ubuntu 9.10 Karmic formatted with XFS"
ami_name=ubuntu-9.10-32-bit-ami-1515f67c-xfs
ec2-register --snapshot $snapshot --kernel $kernel --ramdisk $ramdisk '--description=$description' --name=$ami_name --architecture i386 --root-device-name /dev/sda1 --block-device-mapping /dev/sda2=ephemeral0

This displays the newly registered AMI ID – let’s say it’s ami-00000000.

5. Launch an instance of the new AMI.

Here comes the moment of truth. Launch an instance of the newly registered AMI:

ami=ami-00000000
security_groups=default
keypair=my-keypair
instance_type=m1.small
ec2-run-instances $ami -t $instance_type -g $security_groups -k $keypair

Again, wait until ec2-describe-instances shows it is running and ssh into it:

ssh -i my-keypair ubuntu@ec2-5-6-7-8.amazonaws.com

Now, on the instance, you should be able to see that the root filesystem is XFS with the mount command. The output should contain:

/dev/sda1 on / type xfs (rw)
...

We did it! Let’s create a consistent snapshot of the root filesystem. Look back into the output of ec2-describe-instances to determine the volume ID of the root volume for the instance.

sudo ec2-consistent-snapshot --aws-access-key-id $aws_access_key_id --aws-secret-access-key $aws_secret_access_key --xfs-filesystem / $volumeID

The command should display the snapshot ID of the snapshot that was created.

Using ec2-consistent-snapshot and an XFS-formatted EBS AMI, you can create snapshots of the running instance without stopping it. Please comment below if you find this helpful, or with any other feedback.

Categories
Cloud Developer Tips

Cool Things You Can Do with Shared EBS Snapshots

I’ve been awaiting this feature for a long time: Shared EBS Snapshots. Here’s a brief intro to using the feature, and some cool things you can do with shared snapshots. I also offer predictions about things that will appear as this feature gains adoption among developers.

How to Share an EBS Snapshot

Really, it’s easy. The first thing you’ll need to know is the Account Number of the user with whom you want to share the snapshot. If you want to make the snapshot public then you don’t need this. The account number can be found in the Your Account > Account Activity page. It’s in small numbers in the top-right of the page (so small you may need to click on the image below to see it in full size):


The person with whom you want to share the snapshot (you are the sharer, they are the “sharee”?) should tell you this 12-digit number. Don’t worry, sharee, it’s not a secret.

Once you have the sharee’s account number you, the sharer, go into the AWS Management Console and choose the Snapshots item. Find the snapshot you want to share and right-click on it, choosing “Snapshot Permissions”. You’ll get the following dialog:

Fill in the sharee’s account number, without the separating dashes, into the dialog, and hit “Save”. It should only take a few seconds and… presto! The snapshot should be visible in the sharee’s AWS Management Console Snapshots page.

Cool Things You Can Do with Shared Snapshots

Update 27 September 2009: Before you share snapshots publicly, read Eric Hammond’s warning about the dangers of doing so.

Easily move data between development, testing, and production

You’ve been keeping separate AWS accounts for your production environment, your testing environment, and your development environment, right? Right? Well, in case you haven’t, you no longer have any excuse not to do so. You can now share your database, your HDFS volumes (if you use Cloudera’s Hadoop distribution with EBS support), and anything else of significant size between these separate accounts. No more “tar, gzip, split into < 5GB chunks, upload to S3” and “download from S3, concatenate, untar-gzip”. Your data is ready to go with the newly-created volume.

Share entire setups for troubleshooting and support

If you support a product that is deployed in EC2 you no longer need to jump through hoops to get access to your customer’s files when there’s a problem. Simply have them put the relevant files into an EBS volume, snapshot it, and share the snapshot with you.

Deliver your application in a more granular manner

Until today you delivered your application as an AMI – perhaps even a DevPay AMI – and you may not have given your customers root access. But, if your application used less than 100% of an instance’s CPU, the customer was stuck paying for an entire CPU. Now, you can distribute your applications as a shared snapshot instead, and your customers will be free to use the rest of the instance’s CPU. You’ll just need to build a way to manage access, only allowing authorized customers to see the snapshot.

Deliver you customer’s results in a more usable format

If you run a service that provides large amounts of data, you no longer need to use S3 to share the results. Until today you had to store the results in S3, and your customer needed to retrieve the results from S3 in order to use them. No longer: now you can provide a shared snapshot of the results, and the customer can access them via their filesystem more simply. “The shared snapshot is the new bucket.”

Mount a volume created from a shared snapshot at startup

In a previous article I explained how to automatically mount an EBS volume created from a snapshot during the instance’s startup sequence. I provided a script that gets the snapshot ID via the user-data and does all the rest automatically. Now you can also use snapshots that have been shared.

Update 25 September 2009: Share entire machines

Reader Robert Staveley (Tom) comments below about his use for shared snapshots: Sharing entire machines – boot code and everything – between development, testing, and production accounts. Using the technique to boot an instance from an EBS volume he points out that the entire bootable hard drive and all applications (even beyond 10GB) can be shared between these accounts.

Things to Expect in the Future

Shared snapshots are still a very new feature, but here are some things I expect to happen now that this is possible.

  • The AWS Management Console is the only UI that allows you to share a snapshot. ElasticFox will be adding this capability Real Soon Now, and I am sure others will as well.
  • Alternatives to AMIs. AMIs have many limitations, such as the 10GB maximum size, that can be circumvented using a technique I described to boot from an EBS volume. I expect to see OS distributions packaged as a shared EBS snapshot. These distributions could all share a common AMI containing just enough code to create a volume from the shared distribution snapshot, mount it, and boot from it. No more headaches bundling an AMI – just share a new bootable EBS snapshot.
    Update 3 December 2009: This prediction has come true, with AWS’s release of EBS-backed AMIs.
  • Payment gateway services for managing access to shared snapshots. Now that you’re distributing software as a shared snapshot you’ll need to manage access to the snapshot, limiting it to authorized customers. You might build that system yourself today, but soon we’ll see third-party services that do this for you.
    Update 19 April 2012: This prediction has come true, with AWS’s release of the AWS Marketplace.

Do you have other cool uses or predictions for shared snapshots? Please comment!

Categories
Cloud Developer Tips

Mount an EBS Volume Created from Snapshot at Startup

There are many posts about how to mount an EBS volume to your EC2 instance during the startup process. But requiring an instance to use a specific EBS volume has limitations that make the technique unsuitable for large-scale use. In this article I present a more flexible technique that uses an EBS snapshot instead.

Update December 2009: Amazon released native support for automatically mounting EBS volumes created from a snapshot, as part of supporting the boot-from-EBS feature. The techniques in this article are no longer necessary. But they’re cool anyway.


Limitations of Mounting an EBS Volume on Instance Startup

Mounting an EBS volume at startup is relatively straightforward (see the above-referenced posts for details). The main features of the procedure are:

  • The instance uses the EC2 API tools to attach the specified EBS volume. These tools, in turn, require Java and your EC2 credentials – your certificate and private key.
  • Ideally, the AMI contains a hook to allow the EBS volume ID to be specified dynamically at startup time, either as a parameter in the user-data or retrieved from S3 or SimpleDB.
  • The AMI should already contain (or its startup scripts should create) the appropriate references to the necessary locations on the mounted EBS volume. For example, if the volume is mounted at /vol, then /var/lib/mysql (the default MySQL data directory) might be soft-linked (ln -s) or mount --binded to /vol/var/lib/mysql. Alternatively, the applications can be configured to use the locations on the mounted volume directly.

There are many benefits to mounting an EBS volume at instance startup:

  • Avoid the need to burn a new AMI when the content on the instance’s disks changes.
  • Gain the redundancy provided by an EBS volume.
  • Gain the point-in-time backups provided by EBS snapshots.
  • Avoid the need to store the updated content into S3 before instance shutdown.
  • Avoid the need to copy and reconstitute the content from S3 to the instance during startup.
  • Avoid paying for an instance to be always-on.

But mounting an EBS volume at startup also has important limitations:

  • Instances must be launched in the same availability zone as the EBS volume. EBS volumes are availability-zone specific, and are only usable by instances running in the same availability zone. Large-scale deployments use instances in multiple availability zones to mitigate risk, so limiting the deployment to a single availability zone is not reasonable.
  • There is no way to create multiple instances that each have the same EBS volume attached. When you need multiple instances that each have the same data, one EBS volume will not do the trick.
  • As a corollary to the previous point, it is difficult to create Auto Scaling groups of AMIs that mount an EBS volume automatically because each instance needs its own EBS volume.
  • It is difficult to automate the startup of a replacement instance when the original instance still has the EBS volume attached. Large-scale deployments need to be able to handle failure automatically because instances will fail. Sometimes instances will fail in a not-nice way, leaving the EBS volume attached. Detaching the EBS volume may require manual intervention, which is something that should be avoided if at all possible for large-scale deployments.

These limitations make the technique of mounting an EBS volume at startup unsuitable for large-scale deployments.

The Alternative: Mount an EBS Volume Created from a Snapshot at Startup

Instead of specifying an EBS volume to mount at startup, we can specify an EBS snapshot. At startup the instance creates a new EBS volume from the given snapshot and attaches the new volume to itself. The basic startup flow looks like this:

  1. If there is a volume already attached at the target location, do nothing – this is a reboot. Otherwise, continue to step 2.
  2. Create a new EBS volume from the specified snapshot. This requires the following:
    • Java
    • The EC2 API tools
    • The EC2 account’s certificate and private key
    • The EBS snapshot ID
  3. Attach the newly-created EBS volume and mount it to the mount point.
  4. Restore any filesystem pointers, if necessary, to point to the proper locations beneath the EBS volume’s mount point.

Like the technique of mounting an EBS volume, this technique should ideally support specifying the snapshot ID dynamically at instance startup time, perhaps via the user-data or retrieved from S3 or SimpleDB.

Why Mount an EBS Volume Created from a Snapshot at Startup?

As outlined above, the procedure is simple and it offers the following benefits:

  • Instances need not be launched in the same availability zone as the EBS volume. However, instances are limited to using EBS snapshots that are in the same region (US or EU).
  • Instances no longer need to rely on a specific EBS volume being available.
  • Multiple instances can be launched easily, each of which will automatically configure itself with its own EBS volume made from the snapshot.
  • Costs can be reduced by allowing “duplicate” EBS volumes to be provisioned only when they are needed by an instance. “Duplicate” EBS volumes are created on demand, and can also (optionally) be deleted during instance termination. Previously, you needed to keep around as many EBS volumes as the maximum number of simultaneous instances you would use.
  • Large-scale deployments requiring content on an EBS volume are easy to build.

Here are some cool things that are made possible by this technique:

  • MySQL replication slave (or cluster member) launching can be made more efficient. By specifying a recent snapshot of the master database’s EBS volume, the new MySQL slave instance will begin its life already containing most of the data. This slave will demand fewer resources from the master instance and will take less time to catch-up to the master. If you do plan to use this technique for launching MySQL slaves, see Eric Hammond’s article on EBS snapshots of a MySQL slave database in EC2 for some sage words of advice.
  • Auto Scaling can launch instances that mount content stored in EBS at startup. If the auto-scaled instances all need to access common content that is stored in EBS, this technique allows you to duplicate that content onto each auto-scaled instance automatically. And, if the instance gets the snapshot ID from its user-data at startup, you can easily change the snapshot ID for auto-scaled instances by updating the launch configuration.

I am currently exploring how to combine this technique with the one discussed in my article about how to boot the entire instance from an EBS volume. Combining these approaches could provide the ability to “boot from a snapshot”, allowing you relate to bootable snapshots the same way you think about AMIs. Stay tuned to this blog for an article on this approach.

Caveats

Sounds great, huh? Despite these benefits, this technique can introduce a new problem: too many EBS volumes. As you may know, AWS limits the number of EBS volumes you can create to 20 (initially, and you can request a higher limit). This technique creates a new EBS volume each time an instance starts up, so your account will accumulate many EBS volumes. Plus, each volume will be almost indistinguishable from the others, mak
ing them difficult to track.

One potential way to distinguish the EBS volumes would be the ability to tag them via the API: Each instance would tag the volume upon creation, and these tags would be visible in the management interface to provide information about the origin of the volume. Unfortunately the EC2 API does not offer a way to tag EBS volumes. Until that feature is supported, use the ElasticFox Firefox extension to tag EBS volumes manually. I find it helpful to tag volumes with the creating instance’s ID and the instance’s “tag” security groups (see my article on using security groups to tag instances). ElasticFox displays the snapshot ID from which the volume was created and its creation timestamp, which are also useful to know.

As already hinted at, you will still need to think about what to do when the newly-created EBS volumes are no longer in use by the instance that created them. If you know you won’t need them, have a script to detach and delete the volume during instance shutdown (but not shutdown-before-reboot). Be aware that if an instance fails to terminate nicely the attached EBS volume may still exist and you will be charged for it.

In any case, make sure you keep track of your EBS volumes because the cost of keeping them around can add up quickly.

How to Mount an EBS Volume Created from a Snapshot on Startup

Now for the detailed instructions. Please note that the instructions below have been tested on Ubuntu 8.04, and should work on Debian or Ubuntu systems. If you are running a Red Hat-based system such as CentOS then some of the procedure may need to be adjusted accordingly.

There are four parts of getting set up:

  • Setting up the Original Instance with an EBS volume
  • Creating the EBS Snapshot
  • Preparing the AMI
  • Launching a New Instance

In the last step the new instance will create a new volume from the specified snapshot and mount it during startup.

Setting Up the Original Instance with an EBS volume

[Note: this section is based on the fine article about runing MySQL with EBS by Eric Hammond.]

Start out with an EC2 instance booted from an AMI that you like. I recommend one of the Alestic Ubuntu Hardy 8.04 Server AMIs. The instance will be assigned an instance ID (in this example i-11111111) and a public IP address (in this example 1.1.1.1).

ec2-run-instances -z us-east-1a --key MyKeypair ami-0772946e

ec2-describe-instances i-11111111

Once the ec2-describe-instances output shows that the instance is running, continue by creating an EBS volume. This command creates a 1 GB volume in the us-east-1a availability zone, which is the same zone in which the instance was launched. The volume will be assigned a volume ID (in this example vol-00000000).

ec2-create-volume -z us-east-1a -s 1
ec2-describe-volumes vol-0000000

Once the ec2-describe-volumes output shows that the volume is available, attach it to the instance:

ec2-attach-volume -d /dev/sdh -i i-11111111 vol-0000000

Next we can log into the instance and set it up. The following will install MySQL and the XFS filesystem drivers, and then mount and format the EBS volume. When prompted, specify a MySQL root password. If you are running a Canonical Ubuntu AMI you need to change the ssh username from root to ubuntu in these commands.

ssh -i id_rsa-MyKeypair root@1.1.1.1

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y xfsprogs mysql-server
sudo modprobe xfs
sudo mkfs.xfs /dev/sdh
sudo mount -t xfs -o noatime /dev/sdh /vol
sudo mkdir /vol
sudo mount /vol

The EBS volume is now attached and formatted and MySQL is installed, so now we configure MySQL to use the EBS volume for its data, configuration, and logs:

sudo /etc/init.d/mysql stop
# Due to a minor MySQL bug this may be necessary - does not hurt
sudo killall mysqld_safe
export EBS_MOUNT_DIR=/vol
export EBS_EXPORTS="/etc/mysql /var/lib/mysql /var/log/mysql"
for i in $EBS_EXPORTS
do
EBS_MOUNTED_EXPORT_DIR="$EBS_MOUNT_DIR""$i"
sudo mkdir -p `dirname "$EBS_MOUNTED_EXPORT_DIR"`
sudo mv $i `dirname "$EBS_MOUNTED_EXPORT_DIR"`
sudo mkdir $i
sudo mount --bind
"$EBS_MOUNTED_EXPORT_DIR" "$i"
done
sudo /etc/init.d/mysql start
# Needed later to hold our credentials for bundling an AMI
sudo -H mkdir ~/.ec2

Before we go on, we’ll make sure the EBS volume is being used by MySQL. The data directory on the EBS volume is /vol/var/lib/mysql so we should expect new databases to be created there.

mysql -u root -p -e create database db_on_ebs"
ls -l /vol/var/lib/mysql/

The listing should show that the new directory db_on_ebs was created. This proves that MySQL is using the EBS volume for its data store.

Creating the EBS Snapshot

All the above steps prepare the original instance and the EBS volume for being snapshotted. The following procedure can be used to snapshot the volume.

On the instance perform the following to stop MySQL and unmount the EBS volume:

sudo /etc/init.d/mysql stop
sudo umount /etc/mysql
sudo umount /var/log/mysql
sudo umount /var/lib/mysql
sudo umount /vol

Then, from your local machine, create a snapshot as follows. Remember the snapshot ID for later (snap-00000000 in this example).

ec2-create-snapshot vol-00000000

The snapshot is in progress and you can check its status with the ec2-describe-snapshots command.

Preparing the AMI

At this point in the procedure we have the following set up already:

  • an instance that uses an EBS volume for MySQL files.
  • an EBS volume attached to that instance, having the MySQL files on it.
  • an EBS snapshot of that volume.

Now we are ready to prepare the instance for becoming an AMI. This AMI, when launched, will be able to create a new EBS volume from the snapshot and mount it at startup time.

First, from your local machine, copy your credentials to the EC2 instance:

scp -i id_rsa-MyKeypair pk-whatever1234567890.pem cert-whatever1234567890.pem root@1.1.1.1:~/.ec2/

Back on the EC2 instance install Java (skipping the annoying interactive license agreement) and the EC2 API tools:

export DEBIAN_FRONTEND=noninteractive
echo sun-java6-jdk shared/accepted-sun-dlj-v1-1 select true | sudo /usr/bin/debconf-set-selections
echo sun-java6-jre shared/accepted-sun-dlj-v1-1 select true | sudo /usr/bin/debconf-set-selections
sudo -E apt-get install -y unzip sun-java6-jdk
sudo -H wget -O ~/ec2-api-tools.zip http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip && \
cd ~ && unzip ec2-api-tools.zip && ln -s ec2-api-tools-1.3-36506 ec2-api-tools

Note: Future versions of the EC2 API tools will have a different version number, and the above command will need to change accordingly.

Next, set up the script that does the create-volume-and-mount-it magic at startup. Download it from here with the following command:

sudo curl -Lo /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
https://sites.google.com/site/shlomosfiles/clouddevelopertips/create-ebs-vol-from-snapshot-and-mount?attredirects=0

The script has a number of items to customize:

  • The EC2 account credentials: Put a pointer to your private key and certificate file into the script in the appropriate place. If you followed the above instructions these will be in /root/.ec2. Make sure the credentials are located on the instance’s root partition in order to ensure the keys are bundled into the AMI.
  • The snapshot ID. This too can either be hard-coded into the script or, even better, provided as part of the user-data. It is controlled by the EBS_VOL_FROM_SNAPSHOT_ID setting. See below for an example of how to specify and override this value via the user-data.
  • The JAVA_HOME directory. This is the location of the Java installation. On most linux distributions this should point to /usr/lib/jvm/java-6-sun .
  • The EC2_HOME directory. This is the location where the EC2 API tools are installed. If you followed the procedure above this will be /root/ec2-api-tools .
  • The device attach point for the EBS volume. This is controlled by the EBS_ATTACH_DEVICE setting, and is /dev/sdh in these instructions.
  • The filesystem mount directory for the EBS volume. This is controlled by the EBS_MOUNT_DIR setting, and is /vol in these instructions.
  • The directories to be exported from the EBS volume. These are the directories that will be “mapped” to the root filesystem via mount --bind. These are specified in the EBS_EXPORTS setting.
  • If you are creating an AMI for the EU region, uncomment the line export EC2_URL=https://eu-west-1.ec2.amazonaws.com by removing the leading #.

Once you customize the script, set it up to run upon startup as follows:

sudo chmod +x /etc/init.d/create-ebs-vol-from-snapshot-and-mount
sudo update-rc.d create-ebs-vol-from-snapshot-and-mount start S 89 .

As mentioned above, if you do not want the newly-created EBS volume to persist after the instance terminates you can configure the script to be run on shutdown, allowing it to delete the volume. One way of doing this is to create the AMI with a shutdown hook already in place. To do this:

sudo ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

Alternatively, you can defer this decision to instance launch time, by passing in the above command via a user-data script – see below for more on this.

Remember: Running this script as part of the shutdown process as described above will delete the EBS volume. If you do not want this to happen automatically, don’t execute the above command. If you mistakenly ran the above command you can fix things as follows:

sudo rm /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

Next is a little cleanup before bundling:

# New instances need their own host keys generated at first boot
chmod +x ec2-ssh-host-key-gen
# New instances should not contain leftovers from this instance
sudo rm -f /root/.*hist*
sudo rm -f /var/log/*.gz
sudo find /var/log -name mysql -prune -o -type f -print | \
while read i; do sudo cp /dev/null $i; done

The instance is now ready to be bundled into an AMI, uploaded, and registered. The commands below show this process. For more about the considerations when bundling an AMI see the this article by Eric Hammond.

bucket=com.mybucket.images
prefix=ubuntu-hardy-32bit-attach-snapshot-at-startup-for-mysql-20090805
export AWS_USER_ID=MY-USER-ID
export AWS_ACCESS_KEY_ID=MY-ACCESS-KEY
export AWS_SECRET_ACCESS_KEY=MY-SECRET-ACCESS-KEY
arch=i386

sudo -E ec2-bundle-vol \
-r $arch \
-d /mnt \
-p $prefix \
-u $AWS_USER_ID \
-k ~/.ec2/pk-*.pem \
-c ~/.ec2/cert-*.pem \
-s 10240 \
-e /mnt,/tmp,/root/.ssh
ec2-upload-bundle \
-b $bucket \
-m /mnt/$prefix.manifest.xml \
-a $AWS_ACCESS_KEY_ID \
-s $AWS_SECRET_ACCESS_KEY

Once the bundle has uploaded successfully, register it from your local machine as follows:

bucket=com.mybucket.images
prefix=ubuntu-hardy-32bit-attach-snapshot-at-startup-for-mysql-20090805
ec2-register $bucket/$prefix.manifest.xml

The ec2-register command displays an AMI ID (ami-012345678 in this example).

We are finally ready to test it out!

Launching a New Instance

Now we are ready to launch a new instance that creates and mounts an EBS volume from the snapshot. The snapshot ID is configurable via the user-data payload specified at instance launch time. Here is an example user-data payload showing how to specify the snapshot ID:

EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000

Note that the format of the user-data payload is compatible with the Running User-Data Scripts technique – just make sure the first line of the user-data payload begins with a hashbang #! and that the EBS_VOL_FROM_SNAPSHOT_ID setting is located somewhere in the payload, at the beginning of a line.

Launch an instance of the AMI with the user-data specifying the snapshot ID, in a different availability zone. The instance will be assigned an instance ID (in this example i-22222222) and a public IP address (in this example 2.2.2.2).

ec2-run-instances -z us-east-1c --key MyKeypair \
-d "EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000" ami-012345678

ec2-describe-instances i-22222222

Once the ec2-describe-instances output shows that the instance is running, check for the new EBS volume that should have been created from the snapshot (in this example, vol-22222222) in the new availability zone.

ec2-describe-volumes

Finally, ssh into the instance and verify that it is now working from the new EBS volume:

ssh -i id_rsa-MyKeypair root@2.2.2.2

mysql -u root -p -e “show databases”

You should see the db_on_ebs database in the results. This demonstrates that the startup sequence successfully created a new EBS volume, attached and mounted it, and set itself up to use the MySQL data on the EBS volume.

Cleaning Up

Don’t forget to clean up the pieces of this procedure when you no longer need them:

# the original instance
ec2-terminate-instances i-11111111
# the original EBS volume
ec2-delete-volumes vol-00000000
# the instance that created a new volume from the snapshot
ec2-terminate instances i-22222222

If you set up the shutdown hook to delete the EBS volume then you can verify that this works by checking that the ec2-describe-volumes output no longer contains the new EBS volume. Otherwise, delete it manually:

# the new volume created from the snapshot
ec2-delete-volumes vol-22222222

And don’t forget to un-register the AMI and delete the files from S3 when you are done. These steps are not shown.

Making Changes to the Configuration

Now that you have a configuration using EBS snapshots which is easily scalable to any availability zone, how do you make changes to it?

Let’s say you want to add a web server to the AMI and your web server’s static content to the EBS volume. (I generally don’t recommend storing your web-layer data in the same place as your database storage, but this example serves as a useful illustration.) You would need to do the following:

  1. Launch an instance of the AMI specifying the snapshot ID in the user-data.
  2. Install the web server on the instance.
  3. Put your web server’s static content onto the instance (perhaps from S3) and test that the web server works.
  4. Stop the web server.
  5. Move the web server’s static content to the EBS volume.
  6. mount --bind” the EBS locations to the original directories without adding entries to /etc/fstab.
  7. Restart the web server and test that the web server still works.
  8. Edit the startup script, adding entries for the web server’s directories to EBS_EXPORTS.
  9. Stop the web server and unmount (umount) all the mount bind directories and the EBS volume.
  10. Remove the mount bind and /vol entries for the EBS exported directories from /etc/fstab.
  11. Perform the cleanup prior to bundling.
  12. Bundle and upload the new AMI.
  13. Create a new snapshot of the EBS volume.
  14. Change your deployment configurations to start using the new AMI and the new snapshot ID.

If you decide that you would like the automatically-created EBS volumes to be deleted when the instances terminate, you have two ways to do this:

  • Execute this command
    sudo ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
    /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

    and rebundle the AMI.
  • Pass the above command to the instance via a user-data script. The user-data could also specify the snapshot ID, and might look like this:

    #! /bin/bash
    EBS_VOL_FROM_SNAPSHOT_ID=snap-00000000
    ln -s /etc/init.d/create-ebs-vol-from-snapshot-and-mount \
    /etc/rc0.d/K32create-ebs-vol-from-snapshot-and-mount

The technique of mounting an EBS volume created from a snapshot at startup was born of necessity: I needed a way to allow many instances across availability zones to share the same content which lives on an EBS drive. This article shows how you can apply the technique to your deployments. If you also find this technique useful, please share it in the comments!

Thanks

Eric Hammond reviewed early drafts of this article and provided valuable feedback. Thanks!

Categories
Cloud Developer Tips

Copying ElasticFox Tags from One Browser to Another

The ElasticFox Firefox extension allows you to tag EC2 instances, EBS volumes, EBS snapshots, Elastic IPs, and AMIs. ElasticFox’s tags are stored locally within Firefox, so if you use ElasticFox from more than one browser your tags from one browser are not visible in any other browser. In this article I demonstrate how to copy tags from one ElasticFox installation to another.

Where Are ElasticFox Tags Stored?

My article about Tagging EC2 Instances Using Security Groups shows a brief walkthrough of using ElasticFox to tag instances. Under the hood, ElasticFox stores tags in the prefs.js file within your Firefox Profile directory. But instead of poking around in there directly, take a look at Firefox’s built-in interface to these preferences. It’s the about:config screen, and you get there by typing about:config into the address bar:

The Firefox about:config screen

Click “I’ll be careful, I promise!” Up comes the about:config screen, with all sorts of interesting-looking settings. We’re interested in the ElasticFox tags only, so type ec2ui.*tag in the Filter field, and the result looks like this:

Viola! The ElasticFox tags are stored here.

Viola! Those are the ElasticFox tags that I’ve set up.

Copying ElasticFox Tags to Another Browser Manually

You can copy ElasticFox tags from the about:config screen, via the clipboard, into a text file. First, open the about:config screen and use the Filter as described above to see the relevant settings. Create an empty text file and copy each setting (right-click on the entry and choose “Copy”) and paste it into the text file. Add each setting you want to transfer into the text file. You can email the text file to yourself (or put it on your USB key) and then use the text file to help you change the values of these settings in the target browser’s about:config screen.

Once copied setting looks like this:

ec2ui.eiptags.MyDrifts.us-east-1;({'75.101.15.8':'mydrifts.com'})

The setting’s name is ec2ui.eiptags.MyDrifts.us-east-1 and its value is ({'75.101.15.8':'mydrifts.com'}), separated from the name by a semicolon (;). For each setting you want to transfer to the target browser, select the value (after the semicolon) from the text file and copy it to the clipboard. Then, in the target browser’s about:config screen, right-click on the corresponding setting entry and choose “Modify”, and paste in the new value.

All that selecting and copying and pasting is annoying, and you might make a mistake when entering the values in the target browser. I feel your pain. There must be a better way.

Copying ElasticFox Tags to Another Browser with OPIE

OPIE is a Firefox extension that allows you to import and export preference settings. The good thing about it is that it can be used to export only some settings – and we only want to export the ElasticFox settings. Unfortunately, it does not support ElasticFox!

Please help lobby the OPIE developer to add ElasticFox support by posting in this forum thread – OPIE support would make copying ElasticFox tags (and indeed, entire ElasticFox setups) between browsers much easier.

Copying ElasticFox Tags to Another Browser with Shell Scripts

I’ve put together two shell scripts that can be used to directly export ElasticFox settings from the Firefox prefs.js file and to directly import them back in. These scripts really export and import all your ElasticFox settings, not just tags. Read on for directions on how to use these scripts, which work on Linux, Mac OS X, and Solaris, for Firefox 3.x and above.

A word of caution before you export and import via prefs.js directly: make sure Firefox is not running when you export or import. Firefox overwrites the prefs.js file upon exiting, and reads it upon starting up. In order to make sure all your ElasticFox tags (and all other settings, too) are in the export, Firefox must be closed before exporting. Likewise, in order to make sure Firefox picks up your imported settings you need to make sure it is closed before importing.

How to Export ElasticFox Settings

Here is a handy script to export ElasticFox preferences from Firefox. You can download it, or copy it from here and save it to your local machine:

#! /bin/bash

# the base name of the extension preferences
# that are to be imported
PREF_BASE=ec2ui
# ec2ui is the extension base name of ElasticFox

# backslash-escape special characters as you
# would in a regular expression
# the following command will only escape
# periods (".") but that should suffice
# for firefox extensions
ESCAPED_PREF_BASE=$(echo $PREF_BASE | sed -e "s/\./\\\\./g")

# export desired prefs
egrep "^user_pref\(\"$ESCAPED_PREF_BASE\." prefs.js

Once you save the script locally, make it executable:

chmod +x exportFirefoxPrefs.sh

The script expects to be run in the same directory as the prefs.js file, but it’s most convenient to keep this script in your home directory and create a soft-link there to the prefs.js file. On my machine, the command to make a soft-link to the prefs.js file is this:

ln -s /Users/shlomo/Library/Application\ Support/Firefox/Profiles/f0psntn6.default/prefs.js prefs.js

You’ll need to specify the location of your Firefox profile directory (instead of mine).

Remember to exit Firefox. Now you’re ready to run the export script.

./exportFirefoxPrefs.sh > savedPrefs

This will create a file called savedPrefs containing the ElasticFox settings. Move this file to the target browser’s machine, where we will import it.

How to Import ElasticFox Settings

Here is a script to import to import ElasticFox settings into Firefox. Download it, or copy it from here and save it to your local machine:

#! /bin/bash

WORKFILE=workfile.txt
OUTFILE=newPrefs.js
XFERFILE="$1"

# the base name of the extension preferences
# that are to be imported
PREF_BASE=ec2ui
# ec2ui is the extension base name of ElasticFox

if [ -z "$1" ]
then
  echo "Please provide the filename of the file containing the prefs to import."
else
# backslash-escape any  characters as you would
# in a regular expression
# this command only handles periods
# but that should suffice for firefox extensions
ESCAPED_PREF_BASE=$(echo $PREF_BASE | sed -e "s/\./\\\\./g")

# backup the existing prefs
cp prefs.js prefs.js.old

# get the file header
egrep -v "^user_pref\(" prefs.js > $OUTFILE

# add all other prefs
egrep "^user_pref\(" prefs.js | egrep -v "^user_pref\(\"$ESCAPED_PREF_BASE\." > $WORKFILE

# add desired prefs
cat $XFERFILE >> $WORKFILE

# sort all prefs into output
sort $WORKFILE >> $OUTFILE

# clean up
rm $WORKFILE

# uncomment this when you trust this script
# to replace your prefs automatically
#cp $OUTFILE prefs.js
#rm $OUTFILE
fi

After you save the script locally, make it executable:

chmod +x importFirefoxPrefs.sh

As with the export script above, the import script expects to be run in the same directory as the Firefox prefs.js file, so make a soft-link to the prefs.js file in the same directory as the script (see above).

Once you’ve made sure that Firefox is not running, you’re ready to run the import script:

./importFirefoxPrefs.sh savedPrefs

This will create a file called newPrefs.js in the same directory. newPrefs.js will contain the imported ElasticFox settings from the savedPrefs file, merged with the other already-existing Firefox settings from prefs.js. The script will also create a backup copy of prefs.js called prefs.js.old. All you need to do is to replace your Firefox’s prefs.js with newPrefs.js, as follows:

cp newPrefs.js prefs.js
rm newPrefs.js

Now, restart Firefox. It should contain the ElasticFox settings from the source browser. If anything doesn’t work, simply exit Firefox and restore the previous prefs.js:

cp prefs.js.old prefs.js

Once you are confident that the script works consistently, you can uncomment the last few lines of the script and let it automatically replace Firefox’s prefs.js file by itself – then you do not need to perform that step manually.

This method of copying ElasticFox settings using shell scripts is relatively easy, but not perfect. OPIE would be a friendlier way, and would support Windows as well (so please lobby the developer to support ElasticFox). If you’re looking for your ElasticFox tags to be available via the EC2 APIs, check out my article on using security groups as a way to tag instances (but unfortunately not Elastic IPs, EBS volumes and snapshots, or AMIs).

Categories
Cloud Developer Tips

Boot EC2 Instances from EBS

EBS offers the ability to have a “virtual disk drive” whose lifetime is independent of any EC2 instance. You can attach EBS drives to any running instance, and any changes made to the drive will persist even after the instance is terminated. You can also set up an EBS drive to be the root filesystem for an instance – giving you the benefits of an always-on instance but without paying for it when it’s not in use – and this article shows you how to do that. I also explore how to estimate the cost savings you can achieve using this solution.

I’m not the first person to think of this idea: credit goes to AWS forum users rickdane for starting the discussion and N. Martin and Troy Volin for posts in this AWS forum thread that describe most of the heavy-lifting – this solution is largely based on their work.

Update December 2009: With the introduction of native support for boot-from-EBS AMIs, the techniques in this article are largely unnecessary. But they’re still pretty cool.

Note: this article assumes you are comfortable in the linux shell and familiar with the EC2 AMI tools and EC2 API tools.

Why Boot EC2 Instances from EBS?

My company runs applications in EC2, and we test new application features in EC2 (after testing them locally) before we deploy them to our production environment. At first, each time we had a new feature to test out, we would construct a testing environment as follows:

  1. Launch a new instance of our base AMI (based on an Alestic Ubuntu Hardy Server AMI, which I highly recommend, and customized with our own basic application server stack).
  2. Make the necessary environmental changes (e.g. adding monitoring services) to the test instance.
  3. Deploy the the application (the one with new features we need to test) to the test instance.
  4. Deploy the test database from S3 to the test instance.
  5. Update the test database with any schema or data changes necessary.
  6. Test and debug the application.
  7. Save the updated test database to S3 for later use.
  8. Terminate the test instance.

This process worked great at first (even if it was a manual process). Our application is a WAR and a bunch of jar files, so they were easily uploaded to the test instance. We stored our test database in S3 as a gzipped MySQL mysqldump file, and it was easy to import into / export from MySQL. That is, until we got to the point where our test database was big enough for it to take a long time – over an hour – to reconstitute. At that point it became very annoying to bring up the test environment.

The last thing you want, as a developer, is a test environment that is difficult or annoying to set up. You want testing to be easy, quick, and inexpensive (otherwise you will start looking for shortcuts to avoid testing, which is not good for quality). Once our test environment began to require almost an hour just to set up, we realized it was no longer serving our needs. I began researching a setup that:

  • Is ready-to-go within a short time (less than five minutes)
  • Contains the most-recent environment (tools, application, and database) already
  • Only costs money when it is being used

Basically, we wanted the benefits of a server that is always-on but without paying for it when we weren’t using it.

That’s the “why”. Read on for the “what” and the “how”.

Ingredients and Tools

The solution consists of two pieces:

  • An AMI that can boot from an EBS drive (the “boot AMI”)
  • An EBS drive that contains the bootable linux stack and, later, anything else you add (the “bootable EBS volume”)

The main tool used to put the pieces together is the linux utility pivot_root. This program can only be run by the /sbin/init process (the startup process itself, pid 1), and it “swaps” the root filesystem out from under the OS and replaces it with a directory you specify. This will be the root directory of the EBS drive.

EBS volumes are attached to specific devices (/dev/sdb through /dev/sdp), and you’ll need to choose what attach device you want the boot AMI to use. We could, theoretically, build the boot AMI to read the attach device name from the user-data provided at launch time. But this would require bringing up networking in order to download the user-data to the instance, which is complicated. Instead of doing this we will hardcode the attach device into the boot AMI. You will need to remember what device you chose in order to know where to attach the EBS drive when you launch an instance of the boot AMI. A good way to remember the chosen attach device is to name the boot AMI accordingly, for example boot-from-EBS-to-dev-sdp-32bit-20090624.

As part of choosing the attach device you’ll need to know the mknod minor number associated with the device. mknod minor numbers begin at 0 (for /dev/sda) and progress in increments of 16 (so /dev/sdb is number 16, /dev/sdc is number 32, etc.) until /dev/sdp (which is number 240). mknod minor numbers are detailed in the Documentation/devices.txt file in the linux kernel source code. In the procedure below I use /dev/sdp (“p” for “pivot_root”), with mknod minor number 240.

The EC2 AMI tools are useful in preparing the bootable EBS drive: they can create an image file from an existing instance. These tools will be used to create a temporary image containing the bootable linux stack copied from the running instance, and this temporary image will then be copied to the EBS drive to make it bootable.

How to Set it Up

Here is an outline of the setup process:

  1. Set up a bootable EBS volume.
  2. Set up the boot AMI.

The detailed setup instructions follow.

Setting up a bootable EBS volume:

  1. Launch an instance of an AMI you like. I use the Alestic Ubuntu Hardy Server AMI.
  2. Create an EBS volume in the same availability zone as the instance you launched. This volume should be large enough to fit everything you plan to put on the root partition (later, not now). It can even exceed 10GB: unlike the “real” root filesystem on EC2 instances which is limited to 10GB, the bootable EBS volume is not limited to this size.
  3. Once the EBS volume is created, attach it to the instance on /dev/sdp.
  4. SSH into the instance as root and format the EBS volume:
    mkfs.ext3 /dev/sdp
  5. Mount the EBS volume to the instance’s filesystem:
    mkdir /ebs && mount /dev/sdp /ebs
  6. Make an image bundle containing the root filesystem:
    mkdir /mnt/prImage
    ec2-bundle-vol -c cert -k key -u user -e /ebs -r i386 -d /mnt/prImage

    These arguments are the same as you would use to bundle an AMI – please see the Developer Guide: Bundling a Unix or Linux AMI for details. The certificate and private key credentials should be copied to the instance in the /mnt partition so they don’t get bundled into the image and copied to the bootable EBS volume. If you’re creating a 64-bit image you’ll need to substitute -r x86_64 instead.
  7. Copy the contents of the image to the EBS drive:
    mount -o loop /mnt/prImage/image /mnt/img-mnt
    rsync -a /mnt/img-mnt/ /ebs/
    umount /mnt/img-mnt
  8. Prepare the EBS volume for being the target of pivot_root. First, edit /ebs/etc/fstab, changing the entry for / (the root filesystem) from /dev/sda1 to /dev/sdp. Next,
    mkdir /ebs/old-root
    rm /ebs/root/.ssh/authorized_keys
    umount /ebs
    rmdir /ebs
  9. Detach the EBS volume from your instance.

Your bootable EBS volume is now ready. You might want to take a snapshot of it. Don’t terminate the EC2 instance yet, because it will be used to create the boot AMI.

Setting up the boot AMI:

  1. Create the mount point where the bootable EBS will be mounted:
    mkdir /new-root
  2. Replace the original /sbin/init with a new one. First, rename the original:
    mv /sbin/init /sbin/init.old
    Then, copy this file into /sbin/init, as follows:
    curl -o /sbin/init -L https://sites.google.com/site/shlomosfiles/clouddevelopertips/init?attredirects=0
    As mentioned above, if you choose an attach device different than /dev/sdp you should edit the new /sbin/init file to assign DEVNO the corresponding mknod minor number.
    Finally, make it executable:
    chmod 755 /sbin/init
  3. Clean up the current SSH public key that was used to login to the instance:
    rm /root/.ssh/authorized_keys
    Not so fast! Once you perform this step you will no longer be able to SSH into the instance. Instead, you can move the SSH authorized_keys file out of the way, as follows:
    mv /root/.ssh/authorized_keys /mnt
  4. Bundle, upload, and register the AMI. Give the bundle a name that indicates the processor architecture and the device on which it expects the bootable EBS volume to be attached. For example, boot-from-EBS-to-dev-sdp-32bit-20090624.
  5. If you opted to move the SSH authorized_keys out of the way in Step 3, restore SSH access to the instance:
    mv /mnt/authorized_keys /root/.ssh/authorized_keys

The boot AMI is now ready to be launched. If you are feeling brave you can terminate the instance. Otherwise, you can leave it running and use it to troubleshoot problems with the launch process – but hopefully this won’t be necessary.

How to Launch an Instance

Once the bootable EBS volume and boot AMI are set up, launch instances that boot from the EBS volume as follows:

  1. Launch an instance of the boot AMI.
  2. Attach the bootable EBS volume to the chosen device (/dev/sdp in the above instructions).

The boot AMI will wait until a volume is attached to the chosen device, pivot_root to the EBS volume, and then continue its boot sequence from the EBS volume.

Some troubleshooting tips:

  • To troubleshoot problems launching the instance you should look at the console output from the instance. The console output might not get updated consistently. If the console output is still empty a few minutes after launching the instance and attaching the EBS volume, try rebooting the instance, which will force the console to refresh.
  • If you need to attach the bootable EBS volume to another instance (not the boot AMI), attach it to /dev/sdp and mount it as follows:
    mkdir /ebs && mount /dev/sdp /ebs
  • The boot AMI includes a hook that allows you to add a program to be executed before it performs a pivot_root to the EBS drive. This avoids the need to re-bundle the boot AMI if you need to change the startup process. The hook looks for a file called /pre-pivot.sh on the EBS drive and executes the file if it can.
  • Booting from an EBS drive seems to take slightly longer than booting a regular EC2 instance. Don’t be surprised if it takes up to five minutes until you can get an SSH connection to the instance.
  • You don’t need to re-attach the EBS volume when you reboot the instance – it stays attached.

Usage Tips

  • Create a file in the root of each bootable EBS volume labeling the purpose of the volume, such as /bootable-32bit-appserver. This helps when you mount the EBS drive to an instance as a “regular” volume: the label indicates what the purpose of the volume is, and helps you distinguish it from other attached EBS drives. This is good practice even for non-bootable EBS volumes. See below for a tip on how to track the purpose of EBS volumes without looking at their contents.
  • Once you get the boot AMI running properly with the bootable EBS volume, shut it down and take a snapshot of the volume before you make any other changes to it. This snapshot is your “master bootable volume” snapshot, containing only the minimum setup necessary to boot.
  • After creating a “master bootable volume” snapshot you can (and should!) launch the boot AMI again (remembering to attach the bootable EBS volume) and customize the instance any way you want: add your application stack, your database, your tools, anything else. These changes will persist on the bootable EBS volume even after the instance has terminated. This is the main motivation for the bootable EBS volume solution!
  • You can create multiple bootable EBS volumes from the “master bootable volume” snapshot and customize them each. See below for a tip on how to keep track of the purpose of each EBS volume.
  • The setup instructions above can be used for creating either a 32-bit or a 64-bit boot AMI and matching bootable EBS volume. Because you can’t run an AMI for one architecture on an EC2 instance of the other architecture, you’ll need to create two separate boot AMIs and two separate bootable EBS volumes if you plan to run on both 32-bit and 64-bit EC2 instances. And, because the bootable EBS volumes created by this procedure will contain a 32-bit or 64-bit specific linux stack, be sure to attach the corresponding bootable EBS volume for the boot AMI you launch.
  • If you work with multiple EBS volumes you will want to identify the purpose of each volume without attaching it to a running instance and looking at the label file you created. Unfortunately the EC2 API does not currently offer a way to tag EBS volumes. But the ElasticFox Firefox extension does – and I highly recommend it for this purpose. Note that the volume tags will only be visible in the browser that creates them, not on other machines. [See my article on Copying ElasticFox tags between browsers for a workaround.]

Cost Implications of Booting Instances from EBS

EBS costs money beyond what you pay for the disk drives that come as part of EC2 instances. There are four components of the EBS cost to consider:

  1. Allocated storage size ($0.10 per GB per month)
  2. I/O requests ($0.10 per million I/O requests)
  3. Snapshot storage (same as S3 storage costs; depends on the region)
  4. Snapshot transfer (same as S3 transfer costs; depends on the region)

The AWS Simple Monthly Calculator can help you estimate these costs. Here are some guidelines to help you figure out what numbers to put into these fields:

    • Component #1 is easy: just plug in the size of your bootable EBS volume.
    • Component #2 should be estimated based on the I/O usage of your existing application instance. You can use iostat to estimate this number as follows:
      iostat | awk -F" " 'BEGIN {x="0.0"} /^sd/ {x=x+$3} END {print x}'
      The result is the number of I/O transactions per second. Multiply this figure by 2592000 (60 * 60 * 24 * 30, the number of seconds in a month) to get the number of I/O transactions per month, then divide by 1 million. Or, better yet, use this instead:
      iostat | awk -F" " 'BEGIN {x="0.0"} /^sd/ {x=x+$3} END {printf "%12.2f\n", x*2.592}'
      For my test environment, this figure comes to 29 (million I/O requests per month).
    • Component #3 can be estimated with the help of the guidelines at the bottom of the EBS Product Info page:

Snapshot storage is based on the amount of space your data consumes in Amazon S3. Because data is compressed before being saved to Amazon S3, and Amazon EBS does not save empty blocks, it is likely that the size of a snapshot will be considerably less than the size of your volume. For the first snapshot of a volume, Amazon EBS will save a full copy of your data to Amazon S3. However for each incremental snapshot, only the part of your Amazon EBS volume that has been changed will be saved to Amazon S3.

As a conservative estimate of snapshot storage size, I use the same size as the actual data on the EBS volume. I figure this covers the initial compressed snapshot plus a month’s worth of delta snapshots.

    • Component #4 can also be estimated from the guidelines further down the same page:

Volume data is broken up into chunks before being transferred to Amazon S3. While the size of the chunks could change through future optimizations, the number of PUTs required to save a particular snapshot to Amazon S3 can be estimated by dividing the size of the data that has changed since the last snapshot by 4MB. Conversely, when loading a snapshot from Amazon S3 into and Amazon EBS volume, the number of GET requests needed to fully load the volume can be estimated by dividing the full size of the snapshot by 4MB. You will also be charged for GETs and PUTs at normal Amazon S3 rates.

My own monthly estimate includes taking one full EBS volume snapshot and creating one EBS volume from a snapshot. I keep around 20GB of data on the EBS volume. So I divide 20480 (20 GB expressed in MB) by 4 to get 5120. This is the estimated number of PUTs per month. It is also the estimated number of GETs per month.

For my usage in our test environment, the cost of running instances that boot from a 50GB EBS volume with 20GB of data on it in the US region comes to approximately $10.96 per month.

Is it financially worthwhile?

As long as your EBS cost estimate is less than the cost of running the instance for the hours it would sit unused, this solution saves you money. My test instance is an m1.large in the US region, which costs $0.40 per hour. If my instance would sit idle for at least (10.96/0.4=) 28 hours a month, I save money by using a bootable EBS volume and terminating the instance when it is unused. As it happens, my test instance is unused more than 360 hours a month, so for me it is a no-brainer: I use bootable EBS volumes.

EC2 instances that boot from an EBS volume offer the benefits of an always-on machine without the costs of paying for it when it is not in use. The implementation presented here was developed based on posts by the aforementioned forum users, and I thank them for their help.