How much time is “wasted” in the paid-for but unused portion of the hour when you terminate an instance? How can you recapture this time – which represents compute power – and put it to good use? After all, you’ve paid for it already. This article presents a technique for repurposing an instance after you’re “done” with it, until the current billing hour is up. It’s inspired by a tweet from DEVOPS_BORAT:
To clarify, we’re talking about per-hour pricing in public cloud IaaS services, where partial hours consumed are billed as whole hours. AWS EC2 is the most prominent example of a cloud sporting this pricing policy (search for “partial”). In this pricing policy, terminating (or stopping) an instance after it’s been running for 121 minutes results in a usage charge for three hours, “wasting” an extra 59 minutes that you have paid for but not used.
What’s Involved
You might think it’s easy to repurpose an instance: just Stop it (if it’s EBS-backed), change its root volume to a new one, and Start the instance again. Not so fast: Stopping an EC2 instance immediately ends the current billing hour before you can use it all, and when you Start the instance again a new billing hour begins – so we can’t Stop the instance. We also can’t Terminate the instance – that would also immediately curtail the billing hour and prevent us from utilizing it. Instead, we’re going to reboot the instance, which does not affect the billing.
We’ll need an EBS volume that has a bootable distro on it – let’s call this the “beneficiary” volume, because it’s going to benefit from the extra time on the clock. The beneficiary volume should have the same distro as the “normal” root volume has. [Actually, to be more precise, it need only have a distro that works with the same kernel that the instance is currently running.] I’ve tested this technique with Ubuntu 10.04 Lucid and 10.10 Maverick.
One of the great things about the Ubuntu images is how easy it is to play this root volume switcheroo: these distros boot from any volume that has the label uec-rootfs
. To change the root volume we’ll change the volume labels, so a different volume is used as the root filesystem upon reboot.
It’s very important to disassociate the instance from all external hooks, such as Auto-Scaling Triggers and Elastic Load Balancers before you repurpose it. Otherwise the beneficiary workload will influence those no-longer-relevant systems. However, this may not be possible if you use hooks that cannot be de-coupled from the instance, such as a CloudWatch Dimension of ImageId
, InstanceId
, or InstanceType
.
The network I/O incurred during the recaptured time may be subject to additional charges. In EC2, only communications between instances in the same availability zone, or between EC2 and S3 in the same region, are free of charge.
You’ll need to make sure the beneficiary workload only accepts communications on ports that are open in the normal instance’s security groups. It’s not possible to add or remove security groups while an instance is running. You also wouldn’t want to be modifying the security groups dynamically because that will influence all instances in those security groups – and you may have other instances that are still performing their normal workload.
The really cool thing about this technique is that it can be used on both EBS-backed and instance-store instances. However, you’ll need to prepare separate beneficiary volumes (or snapshots) for 32-bit and 64-bit instances.
How to Do it
There are three stages in repurposing an instance:
- Preparing the beneficiary volume (or snapshot).
- Preparing the normal workload image.
- Actually repurposing the instance.
Stages 1 and 2 are only performed once. Stage 3 is performed for every instance you want to repurpose.
Preparing the beneficiary snapshot
First we’re going to prepare the beneficiary snapshot. Beginning with a pristine Ubuntu 10.10 Maverick EBS-based instance (at the time of publishing this article that’s ami-ccf405a5
for 32-bit instances), let’s create a clone of the root filesystem:
ec2-run-instances ami-ccf405a5 -k my-keypair -t m1.small -g default
ec2-describe-instances $instanceId #use the instanceId outputted from the previous command
Wait for the instance to be “running”. Once it is, identify the volumeId of the root volume – it will be indicated in the ec2-describe-instances output, the one attached to device /dev/sda1
.
At this point you have a running Ubuntu 10.10 instance. For real-world usage you’ll want to customize this instance by installing the beneficiary workload and arranging for it to automatically start up on boot. (I recommend Folding@home as a worthy beneficiary project.)
Now we create the beneficiary snapshot:
ec2-create-snapshot $volumeId #use the volumeId from the previous command
And now we have the beneficiary snapshot.
Preparing the normal workload image
Begin with the same base AMI that you used for the beneficiary snapshot. Launch it and customize it to contain your normal workload stuff. You’ll also need to put in a custom script that will perform the repurposing. Here’s what that script will do:
- Determine how much time is left on the clock in the current billing hour. If it’s not enough time to prepare and to reboot into the beneficiary volume, just force ourselves to shut down.
- Disassociate any external hooks the instance might participate in: remove it from ELBs, force it to fail any Auto-Scaling health checks, and make sure it’s not handling “normal” workloads anymore.
- Attach the beneficiary volume to the instance.
- Change the volume labels so the beneficiary volume will become the root filesystem at the next reboot.
- Edit the startup scripts on the beneficiary volume to start a self-destruct timer.
- Reboot.
The following script performs steps 1, 4, 5, and 6, and clearly indicates where you should perform steps 2 and 3.
#! /bin/bash
# reboot into the attached EBS volume on the specified device, but terminate
# before this billing hour is complete.
# requires the first argument to be the device on which the EBS volume is attached
device=$1
safetyMarginMinutes=1 # set this to how long it takes to attach and reboot
# make sure we have at least "safetyMargin" minutes left this hour
t=/tmp/ec2.running.seconds.$$
if wget -q -O $t http://169.254.169.254/latest/meta-data/local-ipv4 ; then
# add 60 seconds artificially as a safety margin
let runningSecs=$(( `date +%s` - `date -r $t +%s` ))+60
rm -f $t
let runningSecsThisHour=$runningSecs%3600
let runningMinsThisHour=$runningSecsThisHour/60
let leftMins=60-$runningMinsThisHour-$safetyMarginMinutes
# start shutdown one minute earlier than actually required
let shutdownDelayMins=$leftMins-1
if [[ $shutdownDelayMins < 2 || $shutdownDelayMins > 59 ]]; then
echo "Shutting down now."
shutdown -h now
exit 0
fi
fi
## here is where you would disassociate this instance from ELBs,
# force it to fail AutoScaling health checks, and otherwise make sure
# it does not participate in "normal" activities.
## here is where you would attach the beneficiary volume to $device
# ec2-create-volume --snapshot snap-00000000 -z this_availability_zone
# dont forget to wait until the volume is "available"
# ec2-attach-volume . . . and don't forget to wait until the volume is "attached"
## (optionally) force the beneficiary volume to be deleted when this instance terminates:
# ec2-modify-instance-attribute --block-device-mapping '$device=::true' this_instance_id
## get the beneficiary volume ready to be rebooted into
# change the filesystem labels
e2label /dev/sda1 old-uec-rootfs
e2label $device uec-rootfs
# mount the beneficiary volume
mountPoint=/tmp/mountPoint.$$
mkdir -m 000 $mountPoint
mount $device $mountPoint
# install the self-destruct timer
sed -i -e "s/^exit 0$/shutdown -h +$shutdownDelayMins\nexit 0/" \
$mountPoint/etc/rc.local
# neutralize the self-destruct for subsequent boots
sed -i -e "s#^exit 0#chmod -x /etc/rc.local\nexit 0#" $mountPoint/etc/rc.local
# get out
umount $mountPoint
rmdir $mountPoint
# do the deed
shutdown -r now
exit 0
Save this script into the instance you’re preparing for the normal workload (perhaps, as the root user, into /root/repurpose-instance.sh
) and chmod
it to 744.
Now, make your application detect when its normal workload is completed – this exact method will be very specific to your application. Add in a hook there to invoke this script as the root user, passing it the device on which the beneficiary volume will be attached. For example, the following command will cause the instance to repurpose itself to a volume attached on /dev/sdp:
sudo /root/repurpose-instance.sh /dev/sdp
Once all this is set up, use the usual EC2 AMI creation methods to create your normal workload image (either as an instance-store AMI or as an EBS-backed AMI).
Actually repurposing the instance
Now that everything is prepared, this is the easy part. Your normal workload image can be launched. When it is finished, the repurposing script will be invoked and the instance will be rebooted into the beneficiary volume. The repurposed instance will self-destruct before the billing hour is complete.
You can force this repurposing to happen by explicitly invoking the command at an SSH prompt on the instance:
sudo /root/repurpose-instance.sh /dev/sdp
Notice that you will be immediately kicked out of your SSH session – either the instance will reboot or the instance will terminate itself because there isn’t enough time left in the current billable hour. If it’s just a reboot (which happens when there is significant time left in the current billing hour) then be aware: the SSH host key will most likely be different on the repurposed instance than it was originally, and you may need to clean up your local ~/.ssh/known_hosts
file, removing the entry for the instance, before you can SSH in again.