Cloud Developer Tips

Scalability and HA Limitations of AWS Marketplace AMIs

Reading AWS’s recent announcement of the AWS Marketplace you would think that it provides a catalog of click-to-deploy, highly-available, scalable applications running on EC2. You’d be partially right: the applications available in the AWS Marketplace are deployable in only a few clicks. But highly-available and scalable services will be difficult to build using Marketplace images. Here’s why.

Essential Ingredients of HA and Scalability on AWS

AWS makes it easy to run scalable, HA applications via several features. Not all applications use all of these features, but it would be very difficult to provide scalable and highly available service without using at least one of these:

  • Elastic Load Balancing
  • Auto Scaling
  • Elastic Block Storage volumes

ELB and Auto Scaling both enable horizontal scalability: spreading load and controlling deployment size via first-class-citizen tools integrated into the AWS environment. They also enable availability by providing an automated way to recover from the failure of individual instances. [Scalability and availability often move in lock-step; improving one usually improves the other.] EBS volumes provide improved data availability: data can be retrieved off of dying instances – and EBS volumes are often used in RAID configurations to improve write performance.

AWS Marketplace Limitations

The AWS Marketplace has limitations that cripple two of the above features, making highly available and scalable services much more difficult to provide.

Marketplace AMI instances cannot be added to an ELB

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB. This limitation no longer exists.

Try it. You’ll get this error message:

 Error: InvalidInstance: ElasticLoadBalancing does not support the paid AMI or supported AMI of instance i-10abf677.

There is no mention of this limitation in the relevant ELB documentation.

This constraint severely limits horizontal scalability for Marketplace AMIs. Without ELB it’s difficult to share web traffic to multiple identically-configured instances of these AMIs. The AWS Marketplace offers several categories of AMIs, including Application Stacks (RoR, LAMP, etc.) and Application Servers (JBoss, WebSphere, etc.), that are typically deployed behind an ELB – but that won’t work with these Marketplace AMIs.
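For reference, the error shown above is what you get from a registration attempt like the following, using the ELB command line tools (the load balancer name is a placeholder; the instance ID is taken from the error message):

# try to register a Marketplace-derived instance with an existing ELB
elb-register-instances-with-lb myLoadBalancer --instances i-10abf677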

Root EBS volumes of Marketplace AMI instances cannot be mounted on non-root devices

Because all Marketplace AMIs are EBS-backed, you might think that there is a quick path to recover data if the instance dies unexpectedly: simply attach the root EBS volume to another device on another instance and get the data from there. But don’t rely on that – it won’t work. Here is what happens when you try to mount the root EBS volume from an instance of a Marketplace AMI on another instance:

Failed to attach EBS volume 'New-Mongo-ROOT-VOLUME' (/dev/sdj) to 'New-Mongo' due to: OperationNotPermitted: 'vol-98c642f7' with Marketplace codes may not be attached as a secondary device.
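The failed operation behind that message is an attach attempt along these lines (the volume ID and device come from the error above; the target instance ID is a placeholder):

# try to attach a Marketplace-derived root volume as a secondary device on another instance
ec2-attach-volume vol-98c642f7 -i i-aaaa1111 -d /dev/sdj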

This limitation is described here in AWS documentation:

If a volume has an AWS Marketplace product code:

  • The volume can only be attached to the root device of a stopped instance.
  • You must be subscribed to the AWS Marketplace code that is on the volume.
  • The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.
  • AWS Marketplace product codes will be copied from the volume to the instance.

Closing a Licensing Loophole

Why did AWS place these constraints on using Marketplace-derived EBS volumes? To help Sellers keep control of the code they place into their AMIs. Without the above limitations it would be simple for the purchaser of a Marketplace AMI to clone the root filesystem and create as many copies of that Marketplace-derived instance as desired, without necessarily being licensed to do so and without paying the premiums set by the Seller. These constraints close a licensing loophole.

AWS did a relatively thorough job of closing that hole. Here is a section of the current (25 April 2012) AWS overview of the EC2 EBS API and Command-Line Tools, with relevant Marketplace controls highlighted:

Command / API Action – Description

ec2-create-volume / CreateVolume – Creates a new Amazon EBS volume using the specified size or creates a new volume based on a previously created snapshot. Any AWS Marketplace product codes from the snapshot are propagated to the volume. For details on how to use the AWS Marketplace, see AWS Marketplace.

ec2-attach-volume / AttachVolume – Attaches the specified volume to a specified instance, exposing the volume using the specified device name. A volume can be attached to only a single instance at any time. The volume and instance must be in the same Availability Zone. The instance must be in the running or stopped state.

Note: If a volume has an AWS Marketplace product code:

  • The volume can only be attached to the root device of a stopped instance.
  • You must be subscribed to the AWS Marketplace code that is on the volume.
  • The configuration (instance type, operating system) of the instance must support that specific AWS Marketplace code. For example, you cannot take a volume from a Windows instance and attach it to a Linux instance.
  • AWS Marketplace product codes will be copied from the volume to the instance.

For details on how to use the AWS Marketplace, see AWS Marketplace.

ec2-detach-volume / DetachVolume – Detaches the specified volume from the instance it’s attached to. This action does not delete the volume. The volume can be attached to another instance and will have the same data as when it was detached. If the root volume is detached from an instance with an AWS Marketplace product code, then the AWS Marketplace product codes from that volume will no longer be associated with the instance.

ec2-create-snapshot / CreateSnapshot – Creates a snapshot of the volume you specify. After the snapshot is created, you can use it to create volumes that contain exactly the same data as the original volume. When a snapshot is created, any AWS Marketplace product codes from the volume will be propagated to the snapshot.

ec2-modify-snapshot-attribute / ModifySnapshotAttribute – Modifies permissions for a snapshot (i.e., who can create volumes from the snapshot). You can specify one or more AWS accounts, or specify all to make the snapshot public.

Note: Snapshots with AWS Marketplace product codes cannot be made public.

The constraints above are meant to preserve the AWS Marketplace product code, the mechanism AWS uses to identify resources (AMIs, snapshots, volumes, and instances) that require Marketplace licensing integration. Note that not every AMI in the AWS Marketplace has a product code: AMIs that do not require licensing control (such as the Amazon Linux AMI, or Ubuntu without support) do not have one, but the rest do.

A Hole

There remains a hole in this lockdown scheme. Any instance whose kernel allows booting from a volume based on its volume label can be manipulated into booting from a secondary EBS volume. This requires root privileges on the instance. I have successfully booted an instance of the MongoDB AMI in the AWS Marketplace from a secondary EBS volume created from the Amazon Linux AMI. Anyone exploiting this hole can circumvent the product code lockdown.
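Here is a minimal sketch of that manipulation, assuming root access on the Marketplace-derived instance, an Ubuntu-style image that boots whichever volume carries the uec-rootfs label, and a non-Marketplace volume already attached at /dev/sdj (device names are placeholders):

# demote the Marketplace root volume and promote the attached secondary volume
e2label /dev/sda1 old-uec-rootfs
e2label /dev/sdj uec-rootfs
# on the next boot the kernel picks up whichever volume is now labeled uec-rootfs
reboot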

Plugging the Hole

Sellers want these licensing controls and lockdowns. Here’s how a Seller could plug the hole:

  • Disable the root account.
  • Disable sudo.
  • Prevent user-data from being executed. On the Amazon Linux AMI and Ubuntu AMIs, user-data beginning with a hashbang is executed as root during the startup sequence.

Unfortunately these mitigations result in a crippled instance. Users won’t be able to mount EBS volumes – which requires root access – so data can’t be stored on EBS volumes for better recoverability.

Alternatively, you could develop your AWS Marketplace solutions as SaaS applications. For many potential Sellers this would be a long-term effort.

I’m still looking for good ways to enable scalability and HA of Marketplace AMIs. I welcome your suggestions.

Update 27 April 2012: Amazon Web Services PR has contacted me to say they are actively working on a fix for the ELB limitations, and are also working on removing the limitation related to mounting Marketplace-derived EBS volumes on secondary devices. I’ll update this article when this happens. In the meantime, AWS said that users who want to recover data from Marketplace-derived EBS volumes should reach out to AWS Support for help.

Update 17 May 2012: The Product Manager for AWS Marketplace informed me that AWS Marketplace instances are now capable of being used with ELB.

Cloud Developer Tips

Recapture Unused EC2 Minutes

How much time is “wasted” in the paid-for but unused portion of the hour when you terminate an instance? How can you recapture this time – which represents compute power – and put it to good use? After all, you’ve paid for it already. This article presents a technique for repurposing an instance after you’re “done” with it, until the current billing hour is up. It’s inspired by a tweet from DEVOPS_BORAT:

We have new startup CloudJanitor. We recycle old or unuse cloud instance. Need only your cloud account login!

To clarify, we’re talking about per-hour pricing in public cloud IaaS services, where partial hours consumed are billed as whole hours. AWS EC2 is the most prominent example of a cloud sporting this pricing policy (search for “partial”). In this pricing policy, terminating (or stopping) an instance after it’s been running for 121 minutes results in a usage charge for three hours, “wasting” an extra 59 minutes that you have paid for but not used.

What’s Involved

You might think it’s easy to repurpose an instance: just Stop it (if it’s EBS-backed), change its root volume to a new one, and Start the instance again. Not so fast: Stopping an EC2 instance immediately ends the current billing hour before you can use it all, and when you Start the instance again a new billing hour begins – so we can’t Stop the instance. We also can’t Terminate the instance – that would also immediately curtail the billing hour and prevent us from utilizing it. Instead, we’re going to reboot the instance, which does not affect the billing.

We’ll need an EBS volume that has a bootable distro on it – let’s call this the “beneficiary” volume, because it’s going to benefit from the extra time on the clock. The beneficiary volume should have the same distro as the “normal” root volume has. [Actually, to be more precise, it need only have a distro that works with the same kernel that the instance is currently running.] I’ve tested this technique with Ubuntu 10.04 Lucid and 10.10 Maverick.

One of the great things about the Ubuntu images is how easy it is to play this root volume switcheroo: these distros boot from any volume that has the label uec-rootfs. To change the root volume we’ll change the volume labels, so a different volume is used as the root filesystem upon reboot.

It’s very important to disassociate the instance from all external hooks, such as Auto-Scaling Triggers and Elastic Load Balancers, before you repurpose it. Otherwise the beneficiary workload will influence those no-longer-relevant systems. However, this may not be possible if you use hooks that cannot be de-coupled from the instance, such as a CloudWatch Dimension of ImageId, InstanceId, or InstanceType.

The network I/O incurred during the recaptured time may be subject to additional charges. In EC2, only communications between instances in the same availability zone, or between EC2 and S3 in the same region, are free of charge.

You’ll need to make sure the beneficiary workload only accepts communications on ports that are open in the normal instance’s security groups. It’s not possible to add or remove security groups while an instance is running. You also wouldn’t want to be modifying the security groups dynamically because that will influence all instances in those security groups – and you may have other instances that are still performing their normal workload.

The really cool thing about this technique is that it can be used on both EBS-backed and instance-store instances. However, you’ll need to prepare separate beneficiary volumes (or snapshots) for 32-bit and 64-bit instances.

How to Do it

There are three stages in repurposing an instance:

  1. Preparing the beneficiary volume (or snapshot).
  2. Preparing the normal workload image.
  3. Actually repurposing the instance.

Stages 1 and 2 are only performed once. Stage 3 is performed for every instance you want to repurpose.

Preparing the beneficiary snapshot

First we’re going to prepare the beneficiary snapshot. Beginning with a pristine Ubuntu 10.10 Maverick EBS-based instance (at the time of publishing this article that’s ami-ccf405a5 for 32-bit instances), let’s create a clone of the root filesystem:

ec2-run-instances ami-ccf405a5 -k my-keypair -t m1.small -g default

ec2-describe-instances $instanceId #use the instanceId outputted from the previous command

Wait for the instance to be “running”. Once it is, identify the volumeId of the root volume – it will be indicated in the ec2-describe-instances output, the one attached to device /dev/sda1.
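If you’re scripting this step, one way to pull the root volume’s ID out of that output is to filter the BLOCKDEVICE lines; a sketch, assuming $instanceId holds the instance ID from the launch step:

# extract the volumeId of the volume attached at /dev/sda1
volumeId=$(ec2-describe-instances $instanceId | awk '$1 == "BLOCKDEVICE" && $2 == "/dev/sda1" {print $3}')
echo $volumeId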

At this point you have a running Ubuntu 10.10 instance. For real-world usage you’ll want to customize this instance by installing the beneficiary workload and arranging for it to automatically start up on boot. (I recommend Folding@home as a worthy beneficiary project.)

Now we create the beneficiary snapshot:

ec2-create-snapshot $volumeId #use the volumeId from the previous command

And now we have the beneficiary snapshot.

Preparing the normal workload image

Begin with the same base AMI that you used for the beneficiary snapshot. Launch it and customize it to contain your normal workload stuff. You’ll also need to put in a custom script that will perform the repurposing. Here’s what that script will do:

  1. Determine how much time is left on the clock in the current billing hour. If it’s not enough time to prepare and to reboot into the beneficiary volume, just force ourselves to shut down.
  2. Disassociate any external hooks the instance might participate in: remove it from ELBs, force it to fail any Auto-Scaling health checks, and make sure it’s not handling “normal” workloads anymore.
  3. Attach the beneficiary volume to the instance.
  4. Change the volume labels so the beneficiary volume will become the root filesystem at the next reboot.
  5. Edit the startup scripts on the beneficiary volume to start a self-destruct timer.
  6. Reboot.

The following script performs steps 1, 4, 5, and 6, and clearly indicates where you should perform steps 2 and 3.

#! /bin/bash
# reboot into the attached EBS volume on the specified device, but terminate
# before this billing hour is complete.
# requires the first argument to be the device on which the EBS volume is attached

device=$1                        # e.g. /dev/sdp
mountPoint=/mnt/beneficiaryVol   # any unused mount point will do
safetyMarginMinutes=1 # set this to how long it takes to attach and reboot

# make sure we have at least "safetyMargin" minutes left this hour
# (the wget below needs a URL for a resource whose Last-Modified time marks the
# instance launch; the downloaded file's timestamp drives the arithmetic that follows)
t=/tmp/billing-hour-check.$$
if wget -q -O $t ; then
	# add 60 seconds artificially as a safety margin
	let runningSecs=$(( `date +%s` - `date -r $t +%s` ))+60
	rm -f $t
	let runningSecsThisHour=$runningSecs%3600
	let runningMinsThisHour=$runningSecsThisHour/60
	let leftMins=60-$runningMinsThisHour-$safetyMarginMinutes
	# start shutdown one minute earlier than actually required
	let shutdownDelayMins=$leftMins-1
	if [[ $shutdownDelayMins -lt 2 || $shutdownDelayMins -gt 59 ]]; then
		echo "Shutting down now."
		shutdown -h now
		exit 0
	fi
else
	# can't tell how much time is left this hour, so play it safe and shut down
	echo "Shutting down now."
	shutdown -h now
	exit 0
fi

## here is where you would disassociate this instance from ELBs,
# force it to fail AutoScaling health checks, and otherwise make sure
# it does not participate in "normal" activities.

## here is where you would attach the beneficiary volume to $device
# ec2-create-volume --snapshot snap-00000000 -z this_availability_zone
# don't forget to wait until the volume is "available"

# ec2-attach-volume . . . and don't forget to wait until the volume is "attached"

## (optionally) force the beneficiary volume to be deleted when this instance terminates:
# ec2-modify-instance-attribute --block-device-mapping "$device=::true" this_instance_id

## get the beneficiary volume ready to be rebooted into
# change the filesystem labels
e2label /dev/sda1 old-uec-rootfs
e2label $device uec-rootfs
# mount the beneficiary volume
mkdir -m 000 $mountPoint
mount $device $mountPoint
# install the self-destruct timer
sed -i -e "s/^exit 0$/shutdown -h +$shutdownDelayMins\nexit 0/" $mountPoint/etc/rc.local
# neutralize the self-destruct for subsequent boots
sed -i -e "s#^exit 0#chmod -x /etc/rc.local\nexit 0#" $mountPoint/etc/rc.local
# get out
umount $mountPoint
rmdir $mountPoint

# do the deed
shutdown -r now
exit 0

Save this script into the instance you’re preparing for the normal workload (perhaps, as the root user, into /root/) and chmod it to 744.

Now, make your application detect when its normal workload is completed – this exact method will be very specific to your application. Add in a hook there to invoke this script as the root user, passing it the device on which the beneficiary volume will be attached. For example, the following command will cause the instance to repurpose itself to a volume attached on /dev/sdp:

sudo /root/ /dev/sdp

Once all this is set up, use the usual EC2 AMI creation methods to create your normal workload image (either as an instance-store AMI or as an EBS-backed AMI).

Actually repurposing the instance

Now that everything is prepared, this is the easy part. Your normal workload image can be launched. When it is finished, the repurposing script will be invoked and the instance will be rebooted into the beneficiary volume. The repurposed instance will self-destruct before the billing hour is complete.

You can force this repurposing to happen by explicitly invoking the command at an SSH prompt on the instance:

sudo /root/ /dev/sdp

Notice that you will be immediately kicked out of your SSH session – either the instance will reboot or the instance will terminate itself because there isn’t enough time left in the current billable hour. If it’s just a reboot (which happens when there is significant time left in the current billing hour) then be aware: the SSH host key will most likely be different on the repurposed instance than it was originally, and you may need to clean up your local ~/.ssh/known_hosts file, removing the entry for the instance, before you can SSH in again.

Cloud Developer Tips

Using AWS Route 53 to Keep Track of EC2 Instances

This article is a guest post by Guy Rosen, CEO of Onavo and author of the Jack of All Clouds blog. Guy was one of the first people to produce hard numbers on cloud adoption for site hosting, and he continues to publish regular updates to this research in his State of the Cloud series. These days he runs his startup Onavo which uses the cloud to offer smartphone users a way to slash overpriced data roaming costs.

In this article, Guy provides another technique to track changes to your dynamic cloud services automatically, possible now that AWS has released Route 53, its DNS service. Take it away, Guy.

While one of the greatest things about EC2 is the way you can spin up, stop and start instances to your heart’s desire, things get sticky when it comes to actually connecting to an instance. When an instance boots (or comes up after being in the Stopped state), Amazon assigns a pair of unique IPs (and DNS names) that you can use to connect: a private IP used when connecting from another machine in EC2, and a public IP used when connecting from the outside. The thing is, when you start and stop dozens of machines daily you lose track of these constantly changing IPs. How many of you have found, like me, that each time you want to connect to a machine (or hook up a pair of machines that need to communicate with each other, such as a web and database server) you find yourself going back to your EC2 console to copy and paste the IP?

This morning I got fed up with this, and since Amazon launched their new Route 53 service I figured the time was ripe to make things right. Here’s what I came up with: a (really) small script that takes your EC2 instance list and plugs it into DNS. You can then refer to your machines not by their IP but by their instance ID (which is preserved across stops and starts of EBS-backed instances) or by a user-readable tag you assign to a machine (such as “webserver”).

Here’s what you do:

  1. Sign up to Amazon Route 53.
  2. Download and install cli53 from (follow the instructions to download the latest Boto and dnspython)
  3. Set up a domain/subdomain you want to use for the mapping (e.g.,
    1. Set it up on Route53 using cli53:
      ./ create
    2. Use your domain provider’s interface to set Amazon’s DNS servers (reported in the response to the create command)
    3. Run the following script (replace any details and paths with your own):

      #!/bin/tcsh -f
      set root=`dirname $0`
      setenv EC2_HOME /usr/local/ec2-api-tools
      setenv EC2_CERT $root/ec2_x509_cert.pem
      setenv EC2_PRIVATE_KEY $root/ec2_x509_private.pem
      setenv AWS_ACCESS_KEY_ID myawsaccesskeyid
      setenv AWS_SECRET_ACCESS_KEY mysecretaccesskey

      $EC2_HOME/bin/ec2-describe-instances | \
      perl -ne '/^INSTANCE\s+(i-\S+).*?(\S+\.amazonaws\.com)/ \
      and do { $dns = $2; print "$1 $dns\n" }; /^TAG.+\sShortName\s+(\S+)/ \
      and print "$1 $dns\n"' | \
      perl -ane 'print "$F[0] CNAME $F[1] --replace\n"' | \
      xargs -n 4 $root/cli53/ \
      rrcreate -x 60

Voila! You now have DNS names, based on instance IDs, that point to your instances. To make things more helpful, if you add a tag called ShortName to your instances it will be picked up, letting you create friendlier, human-readable names. The script creates CNAME records, which means that you will automatically get internal EC2 IPs when querying inside EC2 and public IPs from the outside.
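If you want those friendlier names, here’s a sketch of adding the tag with the EC2 API tools (the instance ID and the name are placeholders):

# tag an instance with a ShortName so the script publishes a friendly DNS entry for it
ec2-create-tags i-12345678 --tag ShortName=webserver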

Put this script somewhere, run it in a cron – and you’ll have an auto-updating DNS zone for your EC2 servers.

Short disclaimer: the script above is a horrendous one-liner that roughly works and makes many assumptions; it works for me, but no guarantees.

Cloud Developer Tips

S3 Reduced Redundancy Storage with Simple Notification Service: What, Why, and When

AWS recently added support for receiving Simple Notification Service notifications when S3 loses a Reduced Redundancy Storage S3 object. This raises a number of questions:

  • What the heck does that even mean?
  • Why would I want to do that?
  • Under what conditions does it make financial sense to do that?

Let’s take a look at these questions, and we’ll also do a bit of brainstorming (please participate!) to design a service that puts it all together.

What is S3 Reduced Redundancy Storage?

Standard objects stored in S3 have “eleven nines” of durability annually. This means 99.999999999% of your objects stored in S3 will still be there after one year. On average, you will need to store 100,000,000,000 – that’s one hundred billion – objects in standard S3 storage before you should expect one of them to disappear over a year’s time. Pretty great.

Reduced Redundancy Storage (RRS) is a different class of S3 storage that, in effect, has a lower durability: 99.99% annually. On average, you will need to store only 10,000 objects in RRS S3 before you should expect one of them to disappear over a year’s time. Not quite as great, but still more than 400 times better than a traditional hard drive.

When an RRS object is lost S3 will return an HTTP 405 response code, and your application is supposed to be built to understand that and take the appropriate action: most likely regenerate the object from its source objects, which have been stored elsewhere more reliably – probably in standard eleven-nines S3. It’s less expensive for AWS to provide a lower durability class of service, and therefore RRS storage is priced accordingly: it’s about 2/3 the cost of standard S3 storage.

RRS is great for derived objects – for example, image thumbnails. The source object – the full-quality image or video – can be used to recreate the derived object – the thumbnail – without losing any information. All it costs to create the derived object is time and CPU power. And that’s most likely why you’re creating the derived objects and storing them in S3: to act as a cache so the app server does not need to spend time and CPU power recreating them for every request. Using S3 RRS as a cache will save you 1/3 of your storage costs for the derived objects, but  you’ll need to occasionally recreate a derived object in your application.

How Do You Handle Objects Stored in RRS?

If you serve the derived objects to clients directly from S3 – as many web apps do with their images – your clients will occasionally get a HTTP 405 response code (about once a year for every 10,000 RRS objects stored). The more objects you store the higher the likelihood of a client’s browser encountering a HTTP 405 error – and most browsers show ugly messages when they get a 405 error. So your application should do some checking.

To get your application to check for a lost object you can do the following: Send S3 an HTTP HEAD request for the object before giving the client its URL. If the object exists then the HEAD request will succeed. If the object is lost the HEAD request will return a 405 error. Once you’re sure the object is in S3 (either the HEAD request succeeded, or you recreated the derived object and stored it again in S3), give the object’s URL to the client.
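As a concrete sketch, for a publicly readable object the HEAD check can be done with curl (the bucket and key are placeholders); a 200 means the object is intact, while a 405 means RRS has lost it:

# HEAD a derived object and capture only the HTTP status code
status=$(curl --silent --head --output /dev/null --write-out "%{http_code}" \
echo $status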

All that HEAD checking is a lot of overhead: each S3 RRS URL needs to be checked every time it’s served. You can add a cache of the URL of objects you’ve checked recently and skip those. This will cut down on the overhead and reduce your S3 bill – remember that each HEAD request costs 1/10,000 of a cent – but it’s still a bunch of unnecessary work because most of the time you check its HEAD the object will still be there.

Using Simple Notification Service with RRS

Wouldn’t it be great if you could be notified when S3 RRS loses an object?

You can. AWS’s announcement introduces a way to receive notification – via Simple Notification Service, SNS – when S3 RRS detects that an object has been lost. This means you no longer need your application to check for 405s before serving objects. Instead you can have your application listen for SNS notifications (either via HTTP or via email or via SQS) and proactively process them to restore lost objects.

Okay, it’s not really true that your application no longer needs to check for lost objects. The latency between the actual loss of an object and the time you recreate and replace it is still nonzero, and during that time you probably want your application to behave nicely.

[An aside: I do wonder what the expected latency is between the object’s loss and the SNS notification. I’ve asked on the Forums and in a comment to Jeff Barr’s blog post – I’ll update this article when I have an answer.]

When Does it Make Financial Sense to Use S3 RRS?

While you save on storage costs for using S3 RRS you still need to devote resources to recreating any lost objects. How can you decide when it makes sense to go with RRS despite the need to recreate lost objects?

There are a number of factors that influence the cost of recreating lost derived objects:

  • Bandwidth to get the source object from S3 and return the derived object to S3. If you perform the processing inside the same EC2 region as the S3 region you’re using then this cost is zero.
  • CPU to perform the transformation of the source object into the derived object.
  • S3 requests for GETting the source object and PUTting the derived object.

I’ve prepared a spreadsheet analyzing these costs for various different numbers of objects, sizes of objects, and CPU-hours required for each derived object.

For 100,000 source objects of average 5MB size stored in Standard S3, each of which creates 5 derived objects of average 500KB size stored in RRS and requiring 1 second of CPU time to recreate, the savings in choosing RRS is $12.50 per month. Accounting for the cost of recreating lost derived objects reduces that savings to $12.37.

For the same types of objects but requiring 15 minutes of CPU time to recreate each derived object the net savings overall is $12.28. Still very close to the entire savings generated by using RRS.

For up to about 500,000 source objects it doesn’t pay to launch a dedicated m1.small instance just for the sake of recreating lost RRS objects. An m1.small costs $61.20 per month, which is approximately the same as the net savings from 500,000 source objects of average 5MB size with 5 derived objects each of average size 500KB. At this level of usage, if you have spare capacity on an existing instance then it would make financial sense to run the recreating process there.

For larger objects the savings is also almost the entire amount saved by using RRS, and the amounts saved are larger than the cost of a single m1.small so it already pays to launch your own instance for the processing.

For larger numbers of objects the savings is also almost the entire amount saved by using RRS.

As far down as you go in the spreadsheet, and as much as you may play with the numbers, it makes financial sense to use RRS and have a mechanism to recreate derived objects.

Which leads us to the brainstorming.

Why Should I Worry About Lost Objects?

Let’s face it, nobody wants to operate a service that is not core to their business. Most likely, creating the derived objects from the source object is not your business core competency. Creating thumbnails and still frame video captures is commodity stuff.

So let’s imagine a service that does the transformation, storage in S3, and maintenance of RRS derived objects for you so you don’t have to.

You’d drop off your source object in your bucket in S3. Then you’d send an SQS message to the service containing the new source object’s key and a list of the transformations you want applied. As Jeff Barr suggests in his blog, the service would process the message and create derived objects (stored in RRS) whose keys (the name) would be composed of the source object’s name and the name of the transformation applied. You’d know how to construct the name of every derived object, so you would know how to access them. The service would subscribe to the RRS SNS notifications and recreate the derived objects when they are lost.
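For illustration, the key convention might look something like this (purely a sketch; the actual naming scheme would be up to the service):

# hypothetical naming: derived key = source key + transformation name
sourceKey="images/vacation-photo.jpg"
transformation="thumbnail-200x200"
derivedKey="${sourceKey}.${transformation}.jpg"
echo $derivedKey   # images/vacation-photo.jpg.thumbnail-200x200.jpg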

This service would need a way for clients to discover the supported file types and the supported transformations for each file type.

As we pointed out above, there is a lot of potential financial savings in using RRS, so such a service has plenty of margin to price itself profitably, below the cost of standard S3 storage.

What else would such a service need? Please comment.

If you build such a service, please cut me in for 30% for giving you the idea. Or, at least acknowledge me in your blog.

Cloud Developer Tips

Storing AWS Credentials on an EBS Snapshot Securely

Thanks to reader Ewout and his comment on my article How to Keep Your AWS Credentials on an EC2 Instance Securely for suggesting an additional method of transferring credentials: via a snapshot. It’s similar to burning credentials into an AMI, but easier to do and less prone to accidental inclusion in the application’s AMI.

Read on for a discussion of how to implement this technique.

How to Store AWS Credentials on an EBS Snapshot

This is how to store a secret on an EBS snapshot. You do this only once, or whenever you need to change the secret.

We’re going to automate as much as possible to make it easy to do. Here’s the command that launches an instance with a newly created 1GB EBS volume, formats it, mounts it, and sets up the root user to be accessible via ssh and scp. The new EBS volume created will not be deleted when the instance is terminated.

$ ec2-run-instances -b /dev/sdf=:1:false -t m1.small -k \
my-keypair -g default ami-6743ae0e -d '#! /bin/bash
yes | mkfs.ext3 /dev/sdf
mkdir -m 000 /secretVol
mount -t ext3 -o noatime /dev/sdf /secretVol
cp /home/ubuntu/.ssh/authorized_keys /root/.ssh/'

We have set up the root user to be accessible via ssh and scp so we can store the secrets on the EBS volume as the root user by directly copying them to the volume as root. Here’s how we do that:

$ ls -l
total 24
-r--r--r-- 1 shlomo  shlomo  916 Jun 20  2010 cert-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
-r--------  1 shlomo  shlomo   90 Jun  1  2010 creds
-r-------- 1 shlomo  shlomo  926 Jun 20  2010 pk-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
$ scp -i /path/to/id_rsa-my-keypair * root@

Our secret is now on the EBS volume, visible only to the root user.

We’re almost done. Of course you want to test that your application can access the secret appropriately. Once you’ve done that you can terminate the instance – don’t worry, the volume will not be deleted due to the “:false” specification in our launch command.

$ ec2-terminate-instances $instance
$ ec2-describe-volumes
VOLUME	vol-7ce48a15	1		us-east-1b	available	2010-07-18T17:34:01+0000
VOLUME	vol-7ee48a17	15	snap-5e4bec36	us-east-1b	deleting	2010-07-18T17:34:02+0000

Note that the root EBS volume is being deleted but the new 1GB volume we created and stored the secret on is intact.

Now we’re ready for the final two steps:
Snapshot the volume with the secret:

$ ec2-create-snapshot $secretVolume
SNAPSHOT	snap-2ec73045	vol-7ce48a15	pending	2010-07-18T18:05:39+0000		540528830757	1

And, once the snapshot completes, delete the volume:

$ ec2-describe-snapshots -o self
SNAPSHOT	snap-2ec73045	vol-7ce48a15	completed	2010-07-18T18:05:40+0000	100%	540528830757	1
$ ec2-delete-volume $secretVolume
VOLUME	vol-7ce48a15
# save the snapshot ID
$ secretSnapshot=snap-2ec73045

Now you have a snapshot $secretSnapshot with your credentials stored on it.

How to Use Credentials Stored on an EBS Snapshot

Of course you can create a new volume from the snapshot, attach the volume to your instance, mount the volume to the filesystem, and access the secrets via the root user. But here’s a way to do all that at instance launch time:

$ ec2-run-instances -b /dev/sdf=$secretSnapshot -t m1.small -k \
my-keypair -g default ami-6743ae0e -d '#! /bin/bash
mkdir -m 000 /secretVol
mount -t ext3 -o noatime /dev/sdf /secretVol
# make sure it gets remounted if we reboot
echo "/dev/sdf /secretVol ext3 noatime 0 0" >> /etc/fstab'

This one-liner uses the -b option of ec2-run-instances to specify a new volume be created from $secretSnapshot, attached to /dev/sdf, and this volume will be automatically deleted when the instance terminates. The user-data script sets up the filesystem mount point and mounts the volume there, also ensuring that the volume will be remounted if the instance reboots.
Check it out, a new volume was created for /dev/sdf:

$ ec2-describe-instances
RESERVATION	r-e4f2608f	540528830757	default
INSTANCE	i-155b857f	ami-6743ae0e			pending	my-keypair	0		m1.small	2010-07-19T15:51:13+0000	us-east-1b	aki-5f15f636	ari-d5709dbc	monitoring-disabled					ebs
BLOCKDEVICE	/dev/sda1	vol-8a721be3	2010-07-19T15:51:22.000Z
BLOCKDEVICE	/dev/sdf	vol-88721be1	2010-07-19T15:51:22.000Z

Let’s make sure the files are there. SSHing into the instance (as the ubuntu user) we then see:

$ ls -la /secretVol
ls: cannot open directory /secretVol: Permission denied
$ sudo ls -l /secretVol
total 28
-r--------  1 root root   916 2010-07-18 17:52 cert-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem
-r--------  1 root root    90 2010-07-18 17:52 creds
dr--------  2 root   root   16384 2010-07-18 17:42 lost+found
-r--------  1 root root   926 2010-07-18 17:52 pk-NT63JNE4VSDEMH6VHLHBGHWV3DRFDECP.pem

Your application running on the instance (you’ll install it by adding to the user-data script, right?) will need root privileges to access those secrets.

Cloud Developer Tips

Track Changes to your Dynamic Cloud Services Automatically

Dynamic infrastructure can be a pain to accommodate in applications. How do you keep track of the set of web servers in your dynamically scaling web farm? How do your apps keep up with which server is currently running what service? How can applications be written so they don’t need to care if a service gets moved to a different machine? There are a number of techniques available, and I’m happy to share implementation code for one that I’ve found useful.

One thing common to all these techniques: they all allow the application code to refer to services by name instead of IP address. This makes sense because the whole point is not to care about the IP address running the service. Every one of these techniques offers a way to translate the name of the service into an IP address behind the scenes, without your application knowing about it. Where the techniques differ is in how they provide this indirection.

Note that there are four usage scenarios that we might want to support:

  1. Service inside the cloud, client inside the cloud
  2. Service inside the cloud, client outside the cloud
  3. Service outside the cloud, client inside the cloud
  4. Service outside the cloud, client outside the cloud

Let’s take a look at a few techniques to provide loose coupling between dynamically movable services and their IP addresses, and see how they can support these usage scenarios.

Dynamic DNS

Dynamic DNS is the classic way of handling dynamically assigned roles: DNS entries on a DNS server are updated via an API (usually HTTP/S) when a server claims a given role. The DNS entry is updated to point to the IP address of the server claiming that role. For example, your DNS may have a record. When the production deployment’s master database starts up it can register itself with the DNS provider to claim the dns record, pointing that DNS entry to its own IP address. Any client of the database can use the host name to refer to the master database, and as long as the machine that last claimed that DNS entry is still alive, it will work.

When running your service within EC2, Dynamic DNS servers running outside EC2 will see the source IP address for the Dynamic DNS registration request as the public IP address of the instance. So if your Dynamic DNS is hosted outside EC2 you can’t easily register the internal IP addresses. Often you want to register the internal IP address because from within the same EC2 region it costs less to use the private IP address than the public IP addresses. One way to use Dynamic DNS with private IPs is to build your own Dynamic DNS service within EC2 and set up all your application instances to use that DNS server for your domain’s DNS lookups. When instances register with that EC2-based DNS server, the Dynamic DNS service will detect the source of the registration request as being the internal IP address for the instance, and it will assign that internal IP address to the DNS record.

Another way to use Dynamic DNS with internal IP addresses is to use DNS services such as DNSMadeEasy whose API allows you to specify the IP address of the server in the registration request. You can use the EC2 instance metadata to discover your instance’s internal IP address via the URL

Here’s how Dynamic DNS fares in each of the above usage scenarios:

Scenario 1: Service in the cloud, client inside the cloud: Only if you run your own DNS inside EC2 or use a special DNS service that supports specifying the internal IP address.
Scenario 2: Service in the cloud, client outside the cloud: Can use public Dynamic DNS providers.
Scenario 3: Service outside the cloud, client inside the cloud: Can use public Dynamic DNS providers.
Scenario 4: Service outside the cloud, client outside the cloud: Can use public Dynamic DNS providers.

Update window: Changes are available immediately to all DNS servers that respect the zero TTL on the Dynamic DNS server (guaranteed only for Scenario 1). DNS propagation delay penalty may still apply because not all DNS servers between the client and your Dynamic DNS service necessarily respect TTLs properly.

Pros: Easy to integrate into existing scripts (when public IP addresses are sufficient).

Cons: Running your own DNS (to support private IP addresses) is not trivial, and introduces a single point of failure.

Bottom line: Dynamic DNS is useful when both the service and the clients are in the cloud; and for other usage scenarios if a DNS propagation delay is acceptable.

Elastic IP Addresses

In AWS you can have an Elastic IP address: an IP address that can be associated with any instance within a given region. It’s very useful when you want to move your service to a different instance (perhaps because the old one died?) without changing DNS and waiting for those changes to propagate across the internet to your clients. You can put code into the startup sequence of your instances that associates the desired Elastic IP address, making this approach very scriptable. For added flexibility you can write those scripts to accept configurable input (via settings in the user-data or some data stored in S3 or SimpleDB) that specifies which Elastic IP address to associate with the instance.
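Here’s a minimal sketch of such a startup step using the EC2 API tools (the Elastic IP is a placeholder, and the instance is assumed to already have credentials for the API tools configured):

# have the instance claim a pre-allocated Elastic IP when it boots
instanceId=$(curl --silent
ec2-associate-address -i $instanceId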

A cool feature of Elastic IP addresses: if clients use the DNS name of the IP address (“”) instead of the numeric IP address you can have extra flexibility: clients within EC2 will get routed via the internal IP address to the service while clients outside EC2 will get routed via the public IP address. This seamlessly minimizes your bandwidth cost. To take advantage of this you can put a CNAME entry in your domain’s DNS records.

Summary of Elastic IP addresses:

Scenario 1: Service in the cloud, client inside the cloud: Trivial, client should use Elastic IP’s DNS name (or set up a CNAME).
Scenario 2: Service in the cloud, client outside the cloud: Trivial, client should use Elastic IP’s DNS name (or set up a CNAME).
Scenario 3: Service outside the cloud, client inside the cloud: Elastic IPs do not help here.
Scenario 4: Service outside the cloud, client outside the cloud: Elastic IPs do not help here.

Update window: Changes are available in under a minute.

Pros: Requires minimal setup, easy to script.

Cons: No support for running the service outside the cloud.

Bottom line: Elastic IPs are useful when the service is inside the cloud and an approximately one minute update window is acceptable.

Generating Hosts Files

Before the OS queries DNS for the IP address of a hostname it checks in the hosts file. If you control the OS of the client you can generate the hosts file with the entries you need. If you don’t control the OS of the client then this technique won’t help.

There are three important ingredients to get this to work:

  1. A central repository that stores the current name-to-IP address mappings.
  2. A method to update the repository when mappings are updated.
  3. A method to regenerate the hosts file on each client, running on a regular schedule.

The central repository can be S3 or SimpleDB, or a database, or security group tags. If you’re concerned about storing your AWS access credentials on each client (and if these clients are web servers then they may not need your AWS credentials at all) then the database is a natural fit (and web servers probably already talk to the database anyway).

If your service is inside the cloud and you want to support clients both inside and outside the cloud you’ll need to maintain two separate repository tables – one containing the internal IP addresses of the services (for use generating the hosts file of clients inside the cloud) and the other containing the public IP addresses of the services (for use generating the hosts file of clients outside the cloud).

Summary of Generating Hosts Files:

Scenario 1: Service in the cloud, client inside the cloud: Only if you control the client’s OS, and register the service’s internal IP address.
Scenario 2: Service in the cloud, client outside the cloud: Only if you control the client’s OS, and register the service’s public IP address.
Scenario 3: Service outside the cloud, client inside the cloud: Only if you control the client’s OS.
Scenario 4: Service outside the cloud, client outside the cloud: Only if you control the client’s OS.

Update Window: Controllable via the frequency with which you regenerate the hosts file. Can be as short as a few seconds.

Pros: Works on any client whose OS you control, whether inside or outside the cloud, and with services either inside or outside the cloud. And, assuming your application already uses a database, this technique adds no additional single points of failure.

Cons: Requires you to control the client’s OS.

Bottom line: Good for all scenarios where the client’s OS is under your control and you need refresh times of a few seconds.

A Closer Look at Generating Hosts Files

Here is an implementation of this technique using a database as the repository, using Java wrapped in a shell script to regenerate the hosts file, and using Java code to perform the updates. This implementation was inspired by the work of Edward M. Goldberg of myCloudWatcher.

Creating the Repository

Here is the command to create the necessary database (“Hosts”) and table (“hosts”):

mysql -h dbHostname -u dbUsername -pDBPassword -e \
'CREATE DATABASE IF NOT EXISTS Hosts; \
USE Hosts; \
CREATE TABLE IF NOT EXISTS \`hosts\` ( \
\`record\` TEXT \
); \
INSERT INTO \`hosts\` VALUES ("   localhost   localhost.localdomain");'

Notice that we pre-populate the repository with an entry for “localhost”. This is necessary because the process that updates the hosts file will completely overwrite the old one, and that’s where the localhost entry is supposed to live. Removing the localhost entry could wreak havoc on networking services – so we preserve it by ensuring a localhost entry is in the repository.

Updating the Repository

To claim a certain role, the hostname – in this example “webserver1” – and the claiming machine’s IP address are registered in the repository. Here’s the one-liner:

mysql -h dbHostname -u dbUsername -pDBPassword -e \
'DELETE FROM Hosts.\`hosts\` WHERE record LIKE "% webserver1"; \
INSERT INTO Hosts.\`hosts\` (\`record\`) VALUES ("   webserver1");'

The registration process can be performed on the client itself or by an outside agent. Make sure you substitute the real host name and the correct IP address.

On an EC2 instance you can get the private and public IP addresses of the instance via the instance metadata URLs. For example:

$ privateIp=$(curl --silent
$ echo $privateIp
$ publicIp=$(curl --silent
$ echo $publicIp

Regenerating the Hosts File

The final piece is recreating the hosts file based on the contents of the database table. Notice how the table records are already in the correct format for a hosts file. It would be simple to dump the output of the entire table to the hosts file:

mysql -h dbHostname -u dbUsername -pDBPassword --silent --column-names=0 -e \
'SELECT \`record\` FROM Hosts.\`hosts\`' | uniq > /etc/hosts  # This is simple and wrong

But it would also be wrong to do that! Every so often the database connection might fail and you’d be left with a hosts file that was completely borked – and that would prevent the client from properly resolving the hostnames of your services. It’s safer to only overwrite the hosts file if the SQL query actually returns results. Here’s some Java code that does that:

// requires java.sql.*,*, java.util.HashSet, and the MySQL Connector/J driver on the classpath
Connection conn = DriverManager.getConnection("jdbc:mysql://" + dbHostname + "/?user=" +
	dbUsername + "&password=" + dbPassword);
String outputFileName = "/etc/hosts";
Statement stmt = conn.createStatement();
ResultSet res = stmt.executeQuery("SELECT record FROM Hosts.hosts");
HashSet<String> uniqueMe = new HashSet<String>();
PrintStream out = System.out;
// only replace the hosts file if the query actually returned records
if (res.isBeforeFirst()) {
	out = new PrintStream(outputFileName);
	while ( {
		String record = res.getString(1);
		// skip duplicate records
		if (uniqueMe.add(record)) {
			out.println(record);
		}
	}
	out.close();
}
This code uses the MySQL Connector/J JDBC driver. It makes sure only to overwrite the hosts file if there were actual records returned from the database query.
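A wrapper script might look something like the following; the jar locations, class name, and argument passing are assumptions, so adjust them to however you build and install the Java program. Save it, for example, as /usr/local/bin/

#!/bin/bash
# regenerate /etc/hosts by running the Java program above
# (jar paths, class name, and arguments are placeholders)
java -cp /usr/local/lib/mysql-connector-java.jar:/usr/local/lib/regenerate-hosts.jar \
	RegenerateHostsFile dbHostname dbUsername dbPassword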

Scheduling the Regeneration

Now that you have a script that regenerates that hosts file (you did wrap that Java program into a script, right?) you need to place that script on each client and schedule a cron job to run it regularly. Via cron you can run it as often as every minute if you want – it adds a negligible amount of load to the database server so feel free – but if you need more frequent updates you’ll need to write your own driver to call the regeneration script more frequently.
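For example, a crontab entry along these lines (using the hypothetical script path from the wrapper above) refreshes the hosts file every minute:

# /etc/crontab entry: regenerate the hosts file every minute, as root
* * * * * root /usr/local/bin/ > /dev/null 2>&1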

If you find this technique helpful – or have any questions about it – I’d be happy to hear from you in the comments.

Update December 2010: Guy Rosen guest-authored this article on using AWS’s DNS service Route 53 to track instances.

Cloud Developer Tips

Elastic Load Balancing with Sticky Sessions

At long last, the most oft-requested feature for EC2’s Elastic Load Balancer is here: session affinity, also known as “sticky sessions”. What is session affinity? Why is this feature in such high demand? How can it be used with existing applications? Let’s take a look at these questions. But first, let’s explore what a session is – then we’ll cover why we want it to be sticky, and what ELB’s sticky session limitations are. [To skip directly to an explanation of how to use ELB sticky sessions, go toward the bottom of the article.]

What is a Session?

A session is a way to get your application involved in a long-lasting conversation with a particular client. Without a session, a conversation between your application and a client would look like something straight out of the movie Memento. It would go like this:

Life Without Sessions

Client: Hi, I’d like to see /products/awesomeDoohickey.html

Application: I don’t know who you are. Please go here to login first: /login

Client: OK, I’d like to see /login

Application: Here it is: “…”

Client: Thanks. Here’s the filled in login form.

Application: Thanks for logging in. Where do you want to go?

Client: I’d like to see /products/awesomeDoohickey.html

Application: I don’t know who you are. Please go here to login first: /login

Client: >Sigh< OK, I’d like to see /login

Application: Happily! Here it is: “…”

Client: Here’s the filled in login form.

Application: Thanks for logging in. Where do you want to go?

Client: Show me /products/awesomeDoohickey.html already!

Application: I don’t know who you are. Please go here to login first: /login

Client: *$#%& this!

The application can’t remember who the client is – it has no context to process each request as part of a conversation. The client gets so frustrated he starts thinking he’s living in an Adam Sandler movie.

On a technical level: Each HTTP request-response pair between the client and application happens (most often) on a different TCP connection. This is especially true when a load balancer sits between the client and the application. So the application can’t use the TCP connection as a way to remember the conversational context. And, HTTP itself is stateless: any request can be sent at any time, in any sequence, regardless of the preceding requests. Sure, the application may demand a particular pattern of interaction – like logging in before accessing certain resources – but that application-level state is enforced by the application, not by HTTP. So HTTP cannot be relied on to maintain conversational context between the client and the application.

There are two ways to solve this problem of forgetting the context. The first is for the client to remind the application of the context every time he requests something: “My name is Henry Whatsisface, I have these items in my shopping cart (…), I got here via this affiliate (…), yada yada yada… and I’d like to see /products/awesomeDoohickey.html”. No sane client would ever agree to interact with an application that needed to be sent the entire context at every stage of the conversation. It’s burdensome for the client, it’s difficult to maintain for the application, and it’s expensive (in bandwidth) for both of them. Besides, the application usually maintains the conversational state, not the client. So it’s wrong to require the client to send the entire conversation context along with each request.

The accepted solution is to have the application remember the context by creating an associated memento. This memento is given to the client and returned to the application on subsequent requests. Upon receiving the memento the application looks for the associated context, and – voila – discovers it. Thus, the conversation is preserved.

One way of providing a memento is to put it into the URL, but that looks really ugly (think of links with a long ;jsessionid=… value tacked onto the end).

More commonly, mementos are provided via cookies, which all browsers these days support. Cookies are placed within the HTTP request so they can be discovered by the application even if a load balancer intervenes.

Here’s what that conversation looks like with cookies:

Life With Sessions, Take 1

Client: Hi, I’d like to see /products/awesomeDoohickey.html

Application: I don’t know who you are. Please go here to login first: /login

Client: OK, I’d like to see /login

Application: Here it is: “…”

Client: Thanks. Here’s the filled in login form.

Application: Thanks for logging in. Here’s a cookie. Where do you want to go?

Client: I’d like to see /products/awesomeDoohickey.html and here’s my cookie.

Application: I know you – I’d recognize that cookie anywhere! Great, here’s that page: “…”

Client: I’d like to buy 5000 units. Here’s my cookie.

Much improved, yes?

A side point: most modern applications will provide a cookie earlier in the conversation. This allows the following more optimal conversation:

Life With Sessions, Take 2

Client: Hi, I’d like to see /products/awesomeDoohickey.html

Application: I don’t know who you are. Here’s a cookie. Take this login page and fill it out: “…”

Client: OK. Here’s the filled in login form. And here’s my cookie.

Application: I know you – I’d recognize that cookie anywhere! Thanks for logging in. I recall you wanted to see /products/awesomeDoohickey.html. Here it is: “…”

Client: I’d like to buy 5000 units. Here’s my cookie.

That’s about as optimized a conversation as you can have. Cookies make it possible.

What is Session Affinity (Sticky Sessions)? Why is it in High Demand?

When you only have one application server talking to your clients life is easy: all the session contexts can be stored in that application server’s memory for fast retrieval. But in the world of highly available and scalable applications there’s likely to be more than one application server fulfilling requests, behind a load balancer. The load balancer routes the first request to an application server, who stores the session context in its own memory and gives the client back a cookie. The next request from the same client will contain the cookie – and, if the same application server gets the request again, the application will rediscover the session context. But what happens if that client’s next request instead gets routed to a different application server? That application server will not have the session context in its memory – even though the request contains the cookie, the application can’t discover the context.

If you’re willing to modify your application you can overcome this problem. You can store the session context in a shared location, visible to all application servers: the database or memcached, for example. All application servers will then be able to lookup the cookie in the central, shared location and discover the context. Until now, this was the approach you needed to take in order to retain the session context behind an Elastic Load Balancer.

But not all applications can be modified in this way. And not all developers want to modify existing applications. Instead of modifying the application, you need the load balancer to route the same client to the same application server. Once the client’s request has been routed to the correct application server, that application server can lookup the session cookie in its own memory and recover the conversational context.

That’s what sticky sessions are: the load balancer routing the same client to the same application server. And that’s why they’re so important: If the load balancer supports sticky sessions then you don’t need to modify your application to remember client session context.

How to Use ELB with Sticky Sessions with Existing Applications

The key to managing ELB sticky sessions is the duration of the stickiness: how long the client should consistently be routed to the same back-end instance. Too short, and the session context will be lost, forcing the client to log in again. Too long, and the load balancer will not be able to distribute requests equally across the application servers.

Controlling the ELB Stickiness Duration

ELB supports two ways of managing the stickiness duration: either by specifying the duration explicitly, or by indicating that the stickiness expiration should follow the expiration of the application server’s own session cookie.

If your application server has an existing session cookie already, the simplest way to get stickiness is to configure your ELB to use the existing application cookie for determining the stickiness duration. PHP applications usually have a session cookie called PHPSESSID. Java applications usually have a session cookie called JSESSIONID. The expiration of these cookies is controlled by your application, and the stickiness expiration can be set to match as follows. Assuming your load balancer is called myLoadBalancer and it has an HTTP listener on port 80:

elb-create-app-cookie-stickiness-policy myLoadBalancer --cookie-name PHPSESSID --policy-name followPHPPolicy
elb-set-lb-policies-of-listener myLoadBalancer --lb-port 80 --policy-names followPHPPolicy

The above commands create a stickiness policy that says “make the session stickiness last as long as the cookie PHPSESSID does” and set the load balancer to use that stickiness policy. Behind the scenes, the ELB’s session cookie will have the same lifetime as the PHPSESSID cookie provided by your application.

If your application does not have its own session cookie already, set your own stickiness duration for the load balancer, as follows:

elb-create-lb-cookie-stickiness-policy myLoadBalancer --policy-name fifteenMinutesPolicy --expiration-period 900
elb-set-lb-policies-of-listener myLoadBalancer --lb-port 80 --policy-names fifteenMinutesPolicy

These commands create a stickiness policy that says “make the session stickiness last for fifteen minutes” and set the load balancer to use that stickiness policy. Behind the scenes, the ELB’s session cookie will have a lifetime of fifteen minutes.
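
If you’re using the newer unified AWS CLI rather than the old ELB API tools (assuming you have it installed and configured with your credentials), the equivalent commands for these two policies should look roughly like this, and you can list the resulting policies to verify:

aws elb create-app-cookie-stickiness-policy --load-balancer-name myLoadBalancer --policy-name followPHPPolicy --cookie-name PHPSESSID
aws elb create-lb-cookie-stickiness-policy --load-balancer-name myLoadBalancer --policy-name fifteenMinutesPolicy --cookie-expiration-period 900
aws elb set-load-balancer-policies-of-listener --load-balancer-name myLoadBalancer --load-balancer-port 80 --policy-names fifteenMinutesPolicy
aws elb describe-load-balancer-policies --load-balancer-name myLoadBalancer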

What Can’t ELB Sticky Sessions Do?

Life is not all roses with ELB’s sticky session support. Here are some things it can’t do.

Stickiness with HTTPS

Update October 2010: ELB now supports SSL termination, and it can provide sticky sessions over HTTPS as well.

Remember how sticky sessions are typically provided via cookies? The cookie is inserted into the HTTP request by the client’s browser, and any server or load balancer that can read the request can recover the cookie. This works great for plain old HTTP-based communications.

With HTTPS connections the entire communications stream is encrypted. Only servers that have the proper decryption credentials can decipher the stream and discover the cookies. If the load balancer has the server’s SSL certificate then it can decrypt the stream. Because it does not have your application’s SSL certificate (and there’s no way to give it your certificate), ELB does not support HTTPS communications. If you need to support sticky sessions and HTTPS in EC2 then you can’t use ELB today. You need to use HAProxy, aiCache, or another product that provides load balancing with session affinity and SSL termination.

Scaling-down-proof stickiness

What happens when you add an application server to, or remove one from, the load balancer? Depending on the stickiness implementation, the load balancer may or may not be able to route requests to the same application servers as it did before the scaling event (caused, for example, by an AutoScaling trigger).

When scaling up (adding more application servers) ELB maintains stickiness of existing sessions. Only new connections will be forwarded to the newly-added application servers.

When scaling down (removing application servers), you should expect some of your clients to lose their sessions and need to log in again. This is because some of the stored sessions were on an application server that is no longer servicing requests.

If you really want your sessions to persist even through scaling-down events, you need to go back to basics: your application will need to store the sessions independently, as it did before sticky sessions were supported. In this case, sticky session support can provide an added optimization, allowing you to cache the session locally on each application server and only retrieve it from the central session store (the DB?) if it’s not in the local cache. Such cache misses would happen when application servers are removed from the load balancing pool, but otherwise would not impact performance. Note that this technique can be used equally well with ELB and with other load balancers.

With the introduction of sticky sessions for ELB, you – the application developer – can avoid modifying your application in order to retain session context behind a load balancer. The technical term for this is “a good thing”. Sticky sessions are, despite their limitations, a very welcome addition to ELB’s features.

Thanks to Jeff Barr of Amazon Web Services for providing feedback on drafts of this article.

Cloud Developer Tips

EC2 Reserved Instance Availability Zone Problem? No Problem.

You may know that Amazon Web Services Reserved Instances have some gotchas. One of these gotchas is that the availability zone in which the reservation is purchased cannot be changed. So if you need to use an instance in a different availability zone than your reservation (e.g. if you hit InsufficientCapacity errors*), you’re out of luck – and you end up paying the on-demand price.

Today at CloudConnect I chatted with Jeremy Edberg of Reddit and Joe Arnold of Cloudscaling, and we had an insight: there’s a workaround. Read on for more details.

* AWS says this shouldn’t happen, but I’ve seen it happen.

The Availability Zone Cha-Cha

EC2 availability zones are different for each customer: my us-east-1a may be your us-east-1c. This makes it confusing when you discuss specific availability zones with other users. Eric Hammond published a technique to discover the correspondence between availability zones across accounts, but there has been limited practical use for this technique.

What if you could share reserved instances across accounts? What if you could do so without regard to the availability zone? Wouldn’t this help circumvent insufficient capacity errors?

Guess what: you can. Consolidated Billing allows you to set up a master account that consolidates the billing of many sub-accounts. If you use Consolidated Billing then your reservations from one sub-account’s us-east-1a are usable for another “sister” account’s us-east-1a. Here’s the relevant quote from the AWS documentation:

Bob receives the cost benefit from Susan’s Reserved Instances only if he launches his instances in the Availability Zone where Susan purchased her Reserved Instances. For example, if Susan specified us-east-1a when she purchased her Reserved Instances, Bob must specify us-east-1a when he launches his instances in order to get the cost benefit on his Consolidated Bill. However, the actual locations of Availability Zones are independent from one account to another. For example, the us-east-1a Availability Zone for Bob’s account might be in a different location than for Susan’s account.

So, here’s the idea:

  1. Set up two accounts in AWS. You might want to use the same credit card for both of them to make your life easier.
  2. Use Eric Hammond’s technique to determine whether or not us-east-1a on the two accounts match up. If they do: make a new account. You’re looking for two accounts whose us-east-1a do not match.
  3. Repeat step 2, creating new accounts and making sure that us-east-1a does not match any of the other accounts. Do this until you have four accounts, all with different physical availability zones behind us-east-1a.
  4. Set up Consolidated Billing for those accounts.
  5. Launch an on-demand instance in one account’s us-east-1a. It doesn’t matter which, because the reserved instance pricing will apply.
  6. If that original instance has a problem and you need to launch another one in a different availability zone, just choose one of the other accounts. Launch the new instance in that account’s us-east-1a availability zone (see the example command after this list). Reserved instance pricing will apply to the new instance as soon as you terminate the original.
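
For reference, launching into a specific availability zone with the EC2 command-line API tools looks something like this (the AMI ID and keypair name are placeholders – substitute your own; -z names the availability zone and -t the instance type):

ec2-run-instances ami-xxxxxxxx -t m1.small -z us-east-1a -k my-keypair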

One caveat: I haven’t actually tried this. Please let me know if this helps you.

Cloud Developer Tips

Use ELB to Serve Multiple SSL Domains on One EC2 Instance

This is one of the coolest uses of Amazon’s ELB I’ve seen yet. Check out James Elwood’s article.

You may know that you can’t serve more than one SSL-enabled domain on a single EC2 instance. Okay, you can, but only via a wildcard certificate (limited) or a multi-domain certificate (hard to maintain). So you really can’t do it properly. Serving multiple SSL domains is one of the main use cases behind the popular request to support multiple IP addresses per instance.

Why can’t you do it “normally”?

The reason why it doesn’t work is this: The HTTPS protocol encrypts the HTTP request, including the Host: header within. This header identifies what actual domain is being requested – and therefore what SSL certificate to use to authenticate the request. But without knowing what domain is being requested, there’s no way to choose the correct SSL certificate! So a web server listening on a single IP address can effectively use only one SSL certificate.
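
To make this concrete, here is what a plain HTTP request looks like on the wire (the hostname is made up). Over HTTPS, all of it – the Host: header included – travels encrypted:

GET /index.html HTTP/1.1
Host: www.example-shop.com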

If you have multiple IP addresses then you can serve different SSL domains from different IP addresses. The VirtualHost directive in Apache (or similar mechanisms in other web servers) can look at the target IP address in the TCP packets – not in the HTTP Host: header – to figure out which IP address is being requested, and therefore which domain’s SSL certificate to use.
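
For the curious, here’s a rough sketch of the Apache configuration this describes, assuming a server that actually has two IP addresses (the addresses, hostnames and certificate paths are all illustrative):

<VirtualHost 10.0.0.1:443>
    ServerName www.example-one.com
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/example-one.crt
    SSLCertificateKeyFile /etc/ssl/private/example-one.key
</VirtualHost>

<VirtualHost 10.0.0.2:443>
    ServerName www.example-two.com
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/example-two.crt
    SSLCertificateKeyFile /etc/ssl/private/example-two.key
</VirtualHost>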

But without multiple IP addresses on an EC2 instance, you’re stuck serving only a single SSL-enabled domain from each EC2 instance.

How can you?

Really, read James’ article. He explains it very nicely.

How much does it cost?

Much less than two EC2 instances, that’s for sure. According to the EC2 pricing charts, ELB costs:

  • $0.025 per Elastic Load Balancer-hour (or partial hour) ($0.028 in us-west-1 and eu-west-1)
  • $0.008 per GB of data processed by an Elastic Load Balancer

The smallest per-hour cost you can get in EC2 is for the m1.small instance, at $0.085 ($0.095 in us-west-1 and eu-west-1).

The ELB costs $0.025 per hour versus $0.085 per hour for a second m1.small instance, so the ELB-for-multiple-SSL-sites trick saves you roughly 70% of the cost of running a separate instance (before data processing charges).

Thanks, James!

Cloud Developer Tips, The Business of IT

How to Work with Contractors on AWS EC2 Projects

Recently I answered a question on the EC2 forums about how to give third parties access to EC2 instances. I noticed there’s not a lot of info out there about how to work with contractors, consultants, or even internal groups to whom you want to grant access to your AWS account. Here’s how.

First, a Caveat

Please be very selective when you choose a contractor. You want to make sure you choose a candidate who can actually do the work you need – and unfortunately, not everyone who advertises as such can really deliver the goods. Reuven Cohen’s post about choosing a contractor/consultant for cloud projects examines six key factors to consider:

  1. Experience: experience solving real world problems is probably more important than anything else.
  2. Code: someone who can produce running code is often more useful than someone who just makes recommendations for others to follow.
  3. Community Engagement: discussion boards are a great way to gauge experience, and provide insight into the capabilities of the candidate.
  4. Blogs & Whitepapers: another good way to determine a candidate’s insight and capabilities.
  5. Interview: ask the candidate questions to gauge their qualifications.
  6. References: do your homework and make sure the candidate really did what s/he claims to have done.

Reuven’s post goes into more detail. It’s highly recommended for anyone considering using a third-party for cloud projects.

What’s Your Skill Level?

The best way to allow a contractor access to your resources depends on your level of familiarity with the EC2 environment and with systems administration in general.

If you know your way around the EC2 toolset and you’re comfortable managing SSH keypairs, then you probably already know how to allow third-party access safely. This article is not meant for you. (Sorry!)

If you don’t know your way around the EC2 toolset – specifically the command-line API tools and the AWS Management Console or the ElasticFox Firefox Extension – then you will be better off allowing the contractor to launch and configure the EC2 resources for you. The next section is for you.

Giving EC2 Access to a Third Party

[An aside: It sounds strange, doesn’t it? “Third party”. Did I miss two parties already? Was there beer? Really, though, it makes sense. A third party is someone who is not you (you’re the first party) and not Amazon (they’re the counterparty, or the second party). An outside contractor is a third party.]

Let’s say you want a contractor to launch some EC2 instances for you and to set them up with specific software running on them. You also want them to set up automated EBS snapshots and other processes that will use the EC2 API.

What you should give the contractor

Give the contractor your Access Key ID and your Secret Access Key, which you should get from the Security Credentials page of your AWS account.

The Access Key ID is not a secret – but the Secret Access Key is, so make sure you transfer it securely. Don’t send it over email! Use a private DropBox or other secure method.

Don’t give out the email address and password that allows you to log into the AWS Management Console. You don’t want anyone but you to be able to change the billing information or to sign you up for new services. Or to order merchandise from Amazon.com using your account (!).

What the contractor will do

Using ElasticFox and your Access Key ID and Secret Access Key the contractor will be able to launch EC2 instances and make all the necessary configuration changes on your account. Plus they’ll be able to put these credentials in place for automated scripts to make EC2 API calls on your behalf – like to take an EBS snapshot. [There are some rare exceptions which will require your X.509 Certificates and the use of the command-line API tools.]

For example, here’s what the contractor will do to set up a Linux instance (a command-line sketch of these steps follows the list):

  1. Install ElasticFox and put in your access credentials, allowing him access to your account.
  2. Set up a security group allowing him to access the instance.
  3. Create a keypair, saving the private key to his machine (to give to you later).
  4. Choose an appropriate AMI from among the many available. (I recommend the Alestic Ubuntu AMIs).
  5. Launch an instance of the chosen AMI, in the security group, using the keypair.
  6. Once the instance is launched, he’ll SSH into it and set it up, using the instance’s public IP address, the private key half of the keypair (from step 3), and the user name (most likely “root”).
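
If your contractor prefers the command-line API tools to ElasticFox, the same steps might look roughly like this (the group name, AMI ID, and source network range are placeholders):

# create a security group and open SSH from the contractor's network
ec2-add-group webServers -d "Web servers"
ec2-authorize webServers -P tcp -p 22 -s 203.0.113.0/24
# create a keypair; save the private-key portion of the output as contractor-keypair.pem
ec2-add-keypair contractor-keypair
# launch an instance of the chosen AMI in that group, with that keypair
ec2-run-instances ami-xxxxxxxx -t m1.small -g webServers -k contractor-keypair
# once it's running, connect using the public DNS name shown by ec2-describe-instances
ssh -i contractor-keypair.pem root@ec2-xx-xx-xx-xx.compute-1.amazonaws.com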

The contractor can also set up some code to take EBS snapshots – and the code will require your credentials.
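
A sketch of what such a snapshot job might look like, using the newer unified AWS CLI (which reads your access key credentials from its configuration) in a cron entry – the volume ID is a placeholder:

# /etc/cron.d/ebs-snapshot: snapshot the data volume every night at 02:00
0 2 * * * root aws ec2 create-snapshot --volume-id vol-xxxxxxxx --description "nightly backup"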

What deliverables to expect from the contractor

When he’s done, the contractor will give you a few things. These should include:

  • the IDs of the instances, their IP addresses, and a description of their roles.
  • the names of any load balancers, auto scaling groups, etc. created.
  • the private key he created in step 3 and the login name (usually “root”). Make sure you get this via a secure communications method – it allows privileged access to the instances.

Make sure you also get a thorough explanation of how to change the credentials used by any code requiring them. In fact, you should insist that this must be easy for you to do.

Plus, ask your contractor to set up the Security Groups so you will have the authorization you need to access your EC2 deployment from your location.
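
As a sketch (the group name and network range are placeholders for your own), authorizing SSH access from your office network would look something like:

ec2-authorize webServers -P tcp -p 22 -s 198.51.100.0/24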

And, of course, before you release the contractor you should verify that everything works as expected.

What to do when the contractor’s engagement is over

When your contractor no longer needs access to your EC2 account you should create new access key credentials (see the “Create a new Access Key” link on the Security Credentials page mentioned above).

But don’t disable the old credentials just yet. First, update any code the contractor installed to use the new credentials and test it.

Once you’re sure the new credentials are working, disable the credentials given to the contractor (the “Make Inactive” link).

The above guidelines also apply to working with internal groups within your organization. You might not need to revoke their credentials, depending on their role – but you should follow the suggestions above so you can if you need to.