Cloud Developer Tips

AWS Auto-Scaling and ELB with Reliable Root Domain Handling

Update May 2011: Now that AWS Route 53 can be used to allow an ELB to host a domain zone apex, the technique described here is no longer necessary. Cool, but not necessary.

Someone really has to implement this. I’ve had this draft sitting around ever since AWS announced support for improved CloudWatch alerts and AutoScaling policies (August 2010), but I haven’t yet turned it into a clear set of commands to follow. If you do, please comment.


You want an auto-scaled, load-balanced pool of web servers to host your site at its root domain. Unfortunately it’s not so simple, because AWS Elastic Load Balancer can’t be used to host a domain apex (AKA a root domain). One of the longest threads on the AWS Developer Forum discusses this limitation: because ELB utilizes DNS CNAMEs, which are not legal for root domain entries, ELB does not support root domains.

An often-suggested workaround is to use an instance with an Elastic IP address to host the root domain, via standard static DNS, with the web server redirecting all root domain requests to the subdomain (www) served by the ELB. There are four drawbacks to this approach:

  1. The instance with the Elastic IP address is liable to be terminated by auto-scaling, leaving requests to the root domain unanswered.
  2. The instance with the Elastic IP address might fail unexpectedly, again leaving requests to the root domain unanswered.
  3. Even when traffic is very low, we need at least two instances running: the one handling the root domain outside the auto-scaled ELB group (due to issue #1) and the one inside the auto-scaled ELB group (to handle the actual traffic hitting the ELB-managed subdomain).
  4. The redirect adds additional latency to requests hitting the root domain.

While we can’t do anything about the fourth issue, what follows is a technique to handle the first three issues.

The Idea

The idea is built on these principles:

  • The instance with the Elastic IP is outside the auto-scaled group so it will not be terminated by auto-scaling.
  • The instance with the Elastic IP is managed using AWS tools to ensure the root domain service is automatically recovered if the instance dies unexpectedly.
  • The auto-scaling group can scale back to zero size, so only a single instance is required to serve low traffic volumes.

How do we put these together?

Here’s how:

  1. Create an AMI for your web server. The AMI will need some special boot-time hooks, which are described below. The web server should be set up to redirect root domain traffic to the subdomain that you’ll want to associate with the ELB, and to serve the subdomain normally.
  2. Create an ELB for the site’s subdomain with a meaningful Health Check (e.g. a URL that exercises representative areas of the application).
  3. Create an AutoScaling group with min=1 and max=1 instances of that AMI. This AutoScaling group will benefit from the default health checks that such groups have, and if EC2 reports the instance is degraded it will be replaced. The LaunchConfiguration for this AutoScaling group should specify user-data that indicates this instance is the “root domain” instance. Upon booting, the instance will notice this flag in the user data, associate the Elastic IP address with itself, and add itself to the ELB.
    Note: At this point, we have a reliably-hosted single instance hosting the root domain and the subdomain.
  4. Create a second AutoScaling group (the “ELB AutoScaling group”) that uses the same AMI, with min=0 instances – the max can be anything you want – and set it up to use the ELB’s Health Check. The LaunchConfiguration for this group should not contain the above-mentioned special flag – these are not root domain instances.
  5. Create an Alarm that looks at the CPUUtilization across all instances of the AMI, and connect it to the “scale up” and “scale down” Policies for the ELB AutoScaling group.

That is the basic idea. The result will be:

  • The root domain is hosted on an instance that redirects to the ELB subdomain. This instance is managed by a standalone Auto Scaling group that will replace the instance if it becomes degraded. This instance is also a member of the ELB, so it serves the subdomain traffic as well.
  • A second AutoScaling group manages the “overflow” traffic, measured by the CPUUtilization of all the running instances of the AMI.


Here are the missing pieces:

  1. A script that can be run as a boot-time hook that checks the user-data for a special flag. When this flag is detected, the script associates the root domain’s Elastic IP address (which should be specified in the user-data) and adds the instance to the ELB (whose name is also specified in the user-data). This will likely require AWS credentials to be placed on the instance, perhaps in the user-data itself (be sure you understand the security implications of this), as well as a library such as boto or the AWS SDK to perform the AWS API calls.
  2. The explicit step-by-step instructions for carrying out steps 1 through 5 above using the relevant AWS command-line tools.
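
As a starting point for the first missing piece, here is a minimal sketch of the decision logic only. It assumes the user-data is a set of KEY=VALUE lines; the key names ROOT_DOMAIN_INSTANCE, ELASTIC_IP and ELB_NAME are invented for this illustration. On a real instance the text would be read from the metadata service at http://169.254.169.254/latest/user-data, and each returned action would be carried out with boto or the AWS SDK. Those calls are deliberately left out here.

```python
def parse_user_data(text):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def boot_actions(settings):
    """Return the actions the boot hook should perform, in order."""
    # Hypothetical flag name: only the root-domain instance does anything special.
    if settings.get("ROOT_DOMAIN_INSTANCE", "").lower() != "true":
        return []
    missing = [k for k in ("ELASTIC_IP", "ELB_NAME") if not settings.get(k)]
    if missing:
        raise ValueError("root-domain instance is missing: " + ", ".join(missing))
    return [
        ("associate_elastic_ip", settings["ELASTIC_IP"]),
        ("register_with_elb", settings["ELB_NAME"]),
    ]


# Hypothetical user-data for the root-domain LaunchConfiguration.
example = """\
# root-domain instance
ROOT_DOMAIN_INSTANCE=true
ELASTIC_IP=203.0.113.10
ELB_NAME=my-site-elb
"""
actions = boot_actions(parse_user_data(example))
print(actions)
```

Instances launched by the second group carry no flag, so boot_actions returns an empty list and they boot as ordinary web servers.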

Do you have these missing pieces? If so, please comment.

Use ELB to Serve Multiple SSL Domains on One EC2 Instance

This is one of the coolest uses of Amazon’s ELB I’ve seen yet. Check out James Elwood’s article.

You may know that you can’t serve more than one SSL-enabled domain on a single EC2 instance. Okay, you can but only via a wildcard certificate (limited) or a multi-domain certificate (hard to maintain). So you really can’t do it properly. Serving multiple SSL domains is one of the main use cases behind the popular request to support multiple IP addresses per instance.

Why can’t you do it “normally”?

The reason why it doesn’t work is this: The HTTPS protocol encrypts the HTTP request, including the Host: header within. This header identifies what actual domain is being requested – and therefore what SSL certificate to use to authenticate the request. But without knowing what domain is being requested, there’s no way to choose the correct SSL certificate! So web servers can only use one SSL certificate.

If you have multiple IP addresses then you can serve different SSL domains from different IP addresses. The VirtualHost directive in Apache (or similar mechanisms in other web servers) can look at the target IP address in the TCP packets – not in the HTTP Host: header – to figure out which IP address is being requested, and therefore which domain’s SSL certificate to use.

But without multiple IP addresses on an EC2 instance, you’re stuck serving only a single SSL-enabled domain from each EC2 instance.
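
The IP-based selection described above amounts to a lookup keyed on the address the connection arrived on, made before any HTTP is decrypted. The sketch below only illustrates that idea; the addresses and certificate paths are invented, and a real web server does this inside its own SSL configuration.

```python
# Hypothetical mapping: one local IP address per SSL-enabled domain.
CERT_BY_IP = {
    "192.0.2.10": "/etc/ssl/certs/site-one.pem",
    "192.0.2.11": "/etc/ssl/certs/site-two.pem",
}


def choose_certificate(local_ip, cert_by_ip=CERT_BY_IP):
    """Pick a certificate using only the destination IP of the TCP connection."""
    try:
        return cert_by_ip[local_ip]
    except KeyError:
        raise LookupError("no certificate configured for " + local_ip)


print(choose_certificate("192.0.2.11"))
```

With a single IP address the table has a single row, which is why one instance ends up with one SSL domain.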

How can you?

Really, read James’ article. He explains it very nicely.

How much does it cost?

Much less than two EC2 instances, that’s for sure. According to the EC2 pricing charts, ELB costs:

  • $0.025 per Elastic Load Balancer-hour (or partial hour) ($0.028 in us-west-1 and eu-west-1)
  • $0.008 per GB of data processed by an Elastic Load Balancer

The smallest per-hour cost you can get in EC2 is for the m1.small instance, at $0.085 ($0.095 in us-west-1 and eu-west-1).

Using the ELB-for-multiple-SSL-sites trick saves you roughly 70% of the hourly cost of running a separate instance for each additional domain, before data-processing charges.
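
As a rough check on the prices quoted above (the lower, non-west figures), the saving from adding an ELB instead of another m1.small can be worked out directly. The gb_per_hour parameter is an assumption added here to show how the per-GB charge eats into the saving.

```python
ELB_HOURLY = 0.025        # $ per ELB-hour, as quoted above
ELB_PER_GB = 0.008        # $ per GB processed by the ELB
M1_SMALL_HOURLY = 0.085   # $ per m1.small instance-hour


def hourly_saving_fraction(gb_per_hour=0.0):
    """Fraction of an extra instance's hourly cost saved by using an ELB instead."""
    elb_cost = ELB_HOURLY + ELB_PER_GB * gb_per_hour
    return (M1_SMALL_HOURLY - elb_cost) / M1_SMALL_HOURLY


print(round(hourly_saving_fraction() * 100, 1))     # no data charges: about 70.6
print(round(hourly_saving_fraction(1.0) * 100, 1))  # with 1 GB/hour through the ELB
```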

Thanks, James!

Solving Common ELB Problems with a Sanity Test

Help! My ELB isn’t serving files!
Whoa! My back-end instances work but not the ELB!
Hey! I can’t get the ELB to work!

These are among the most common Elastic Load Balancer problems raised on the Amazon EC2 Discussion Forums. Inspired by Eric Hammond’s indispensable article Solving “I can’t connect to my server on Amazon EC2”, here is a helpful guide to debugging these common ELB issues, as well as a utility to perform sanity tests on your own ELBs.

Questions to Answer

You’re trying to figure out what’s wrong and you need to know where to start looking. Or, you’re posting your problem on the AWS forums and you want help as quickly as possible. The best way to help yourself or to get help quickly is to examine the basic facts of your situation. Here are some questions to answer for yourself and in your forum post:

  1. What is the output of elb-describe-lbs elbName --show-xml ? This gives the basic details of the ELB, which are critical to diagnosing any problem. If you are posting to the forums and want to keep the DNS name of the ELB private then obscure it in the output. One reason to obscure the DNS name is to prevent readers from accessing your ELB-based service. However, this precaution does not add any security because the DNS information is public, and – presumably – you are using a DNS CNAME entry to integrate the ELB into your domain’s DNS.
  2. What is the output of elb-describe-instance-health elbName ? This provides crucial information about the health of the instances.
  3. What resource are you trying to access via the ELB, what tool are you using to access it, and from what location? The resource will likely be a URL of the form http://ELB-DNS-Name/index.html or maybe https://ELB-DNS-Name/index.html, or it might be “I’m running a POP server on port 1234”. The tool you’re using to access it is most likely a browser or HTTP client (Firefox, or wget), or possibly “Microsoft Outlook version 5.4”. The location is either “my local machine” or “an EC2 instance”. Also, can you access the same resource when you connect directly to a back-end instance via its public IP address or host name from a client outside EC2? A public-facing URL of this kind uses the instance’s public DNS name or IP address as the host. And, can you access the same resource when you connect directly to a back-end instance via its private IP address or host name from another instance within EC2? Such a URL looks like this: http://domU-12-31-34-00-69-B9.compute-1.internal/index.html .
  4. Can you access the health check resource directly via the ELB DNS name, and via the back-end instance’s public IP address, and via the back-end instance’s private IP address? If your health check is configured with target=HTTP:8080/check.html then try to access http://ELB-DNS-Name:8080/check.html (which is via the ELB), the same port and path on the instance’s public host name (which is via the instance’s public IP address), and http://domU-12-31-34-00-69-B9.compute-1.internal:8080/check.html (which is via the instance’s private IP address, and only accessible from within EC2).
  5. What are the security groups and availability zones for each instance in the ELB? This is visible in the output of ec2-describe-instances i-11111111 i-22222222 ... As above, you might want to obscure the public and private DNS names of these instances in the output.
  6. Can all the back-end instances receive traffic on the instance ports of the ELB listeners and the health check? This can be checked from the output of ec2-describe-group groupName1 groupName2 ... for all the groups shown in question 5’s ec2-describe-instances command.
  7. Do logs on your back-end instances show any connections from ELB?
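
For question 4, the three URLs to try can be derived mechanically from the health check target string. This is a small sketch under the assumption that the target has the PROTOCOL:PORT/PATH shape shown above; the host names passed in are placeholders, and PUBLIC-HOST in particular is invented for the example.

```python
def health_check_urls(target, elb_dns, public_host, private_host):
    """Turn a target such as 'HTTP:8080/check.html' into the three URLs to try."""
    protocol, _, rest = target.partition(":")
    if protocol.upper() != "HTTP":
        raise ValueError("only HTTP targets map to a URL: " + target)
    port, slash, path = rest.partition("/")
    if not port.isdigit() or not slash:
        raise ValueError("expected PROTOCOL:PORT/PATH, got: " + target)
    return ["http://%s:%s/%s" % (host, port, path)
            for host in (elb_dns, public_host, private_host)]


urls = health_check_urls(
    "HTTP:8080/check.html",
    "ELB-DNS-Name",                                # placeholder, as in the text
    "PUBLIC-HOST",                                 # hypothetical placeholder
    "domU-12-31-34-00-69-B9.compute-1.internal",   # private name from the text
)
for url in urls:
    print(url)
```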

Common ELB Problems

Okay, now that you know what information is important to diagnosing the problem, here is a look at some of the common gotchas, how to detect them, and how to fix them. These descriptions refer to the above questions by number. Common problems and solutions include:

  • Security groups on back-end instances don’t allow access to the instance ports and health check port. Back-end instances must have all ports on which they receive traffic from the ELB (#1) open to the world (CIDR 0.0.0.0/0) in one of their associated security groups (#6). Fix this by changing the permissions on the security groups associated with the instances. Note: this fix takes effect within a few seconds and does not require launching new instances or rebooting existing instances.
  • Back-end instances are not healthy (InService). When an instance fails the health check (#1) it is marked as OutOfService (#2) and the ELB does not route traffic to it anymore. To fix this you need to determine why the ELB cannot access the health check resource. Note: there is currently a bug in ELB where instances initially are marked as InService when added to the ELB, until they fail the health check. So you’ll want to make sure you’ve given ELB enough time to detect a failed health check. Update August 2010: AWS folks say that bug has been fixed.
  • An availability zone is enabled on the ELB but has no healthy back-end instances. If you have an availability zone enabled for your ELB (#1) but no healthy instances in that availability zone (#5 and #2), you’ll get 503 Service Unavailable or other errors. Fix this by adding an instance in that availability zone to the ELB or disabling that availability zone for the ELB.
  • You cannot see a requested resource (#3) or the health check URL (#4) using the ELB DNS name. In this case, check that the URL exists on the back-end instances and look at the back-end instance’s logs (#7) to see if the ELB forwarded your connection or not. If you can see the requested resource using the public address of a back-end instance then check the instance’s security groups (#6) to see that they grant access to the instance’s port.
  • The health check port is not the same as listener target port (#1). While this does not necessarily indicate a problem, for most ELBs the health check should use the same port as one of the listeners. Setting up your ELB to have a health check performed on a different port than the load-balanced traffic is perfectly valid, but you likely want the health check to use the same path that the load-balanced traffic takes to reach your app (and also to exercise a representative set of features used by your app).

I will update this article with new common issues as they appear.

An ELB Sanity Test Utility

If you have your thinking cap on you’ll notice that detecting the first three of the common ELB problems can be automated. Here is an ELB sanity test utility for Linux which automates these tests. Save it or download it as follows:

curl -o elb-sanity-test.tar.gz -L

Next, unpack it:

tar xzf elb-sanity-test.tar.gz
cd elb-sanity-test

Next, set up the utility with your credentials. Edit the elb-sanity-test script file, setting AWS_CREDENTIAL_FILE to point to a file containing your AWS credentials in the following format:

AWSAccessKeyId=<your AWS access key ID>
AWSSecretKey=<your AWS secret key>


The above is the same format that can be used to specify your AWS credentials for the ELB API Tools (see the README.TXT and credential-file-path.template file in the ELB API Tools bundle).

To run the ELB sanity test:

cd elb-sanity-test
./elb-sanity-test

Here is sample output showing an ELB that passes the sanity test:

$ ./elb-sanity-test
JUnit version 4.5
Test: all instances have their Security Groups defined to allow access to the ELB listener port
Load Balancer: someLB
ELB someLB has a listener that uses instance-port 8080 and instance i-360ef05e has that TCP port open to the world.
ELB someLB has a listener that uses instance-port 8081 and instance i-360ef05e has that TCP port open to the world.
Test: all ELBs have a HealthCheck on a port that the listener directs traffic to
Load Balancer: someLB
ELB someLB has a configured HealthCheck on listener port 8080
Test: all ELBs have InService instances in each configured availability zone
Load Balancer: someLB
ELB someLB has InService instances in each configured availability zone
Time: 5.22
Tests run: 3,  Failures: 0

The elb-sanity-test utility performs the following sanity tests on every ELB defined in your account:

  • All instances have their security groups defined to allow access to the ELB listener port.
  • All ELBs have a health check on a port that the listener directs traffic to.
  • All ELBs have healthy instances in each configured availability zone.
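
To make the three checks concrete, here is a rough sketch of their logic over plain dictionaries. This is not the utility’s code (which is Java, using JUnit and Typica); the dictionary layout and function names are assumptions made for the illustration, and a real version would fill these structures from the ELB and EC2 APIs.

```python
def check_security_groups(elb, open_ports_by_instance):
    """Report instances that do not open one of the listeners' instance-ports."""
    problems = []
    for port in elb["instance_ports"]:
        for instance in elb["instances"]:
            if port not in open_ports_by_instance.get(instance, set()):
                problems.append("%s: port %d is not open" % (instance, port))
    return problems


def check_health_check_port(elb):
    """Report a health check on a port that no listener directs traffic to."""
    port = elb["health_check_port"]
    if port in elb["instance_ports"]:
        return []
    return ["health check port %d is not a listener instance-port" % port]


def check_zones(elb, state_by_instance, zone_by_instance):
    """Report enabled availability zones that have no InService instance."""
    healthy_zones = {zone_by_instance[i] for i in elb["instances"]
                     if state_by_instance.get(i) == "InService"}
    return ["zone %s has no InService instance" % zone
            for zone in elb["zones"] if zone not in healthy_zones]


# Data shaped after the sample output above; the second zone is hypothetical.
some_lb = {
    "name": "someLB",
    "instance_ports": [8080, 8081],
    "health_check_port": 8080,
    "zones": ["us-east-1a", "us-east-1b"],
    "instances": ["i-360ef05e"],
}
open_ports = {"i-360ef05e": {8080, 8081}}
states = {"i-360ef05e": "InService"}
zones = {"i-360ef05e": "us-east-1a"}

print(check_security_groups(some_lb, open_ports))
print(check_health_check_port(some_lb))
print(check_zones(some_lb, states, zones))
```

In this example the first two checks pass and the third flags us-east-1b, the enabled zone with no healthy instance.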

If a sanity test fails the utility shows a very verbose error message explaining what is wrong.

Some notes about the elb-sanity-test bundle:

  • The utility is written in Java, which is also required for the ELB tools. If you can run the ELB API Tools, you already have all the prerequisites to run this sanity test.
  • The bundle includes source code and is licensed under the Apache License, Version 2.0.
  • The bundle includes all dependency jars necessary to run the script. It uses the JUnit framework and the Typica library.

I would be happy to re-bundle the utility to include a .bat or .cmd file to make it easy to run the script on Windows. If you write one, please add it in the comments and I’ll include it.

Getting Further Help

If you still have an ELB issue after trying the above advice and the elb-sanity-test utility, please post in the AWS EC2 forum. Questions about the elb-sanity-test utility specifically or about this article are welcome in the comments below.

Update 15 September 2009: Ylastic integrated my elb-sanity-test script into their EC2 management dashboard.

Update 11 October 2009: elb-sanity-test has been released as part of the open-source ec2-elb-tests project hosted on Google Code. And, if you use this utility, please subscribe to the ec2-elb-tests Google Group.

Update 21 July 2014: The project has moved to be hosted on GitHub.


Elastic Load Balancer: An Elasticity Gotcha

If you use AWS’s Elastic Load Balancer to allow your EC2 application to scale, like I do, then you’ll want to know about this gotcha recently reported in the AWS forums. By all appearances, it looks like something that should be fixed by Amazon. Until it is, you can reduce (but not eliminate) your exposure to this problem by keeping a small TTL for your ELB’s DNS CNAME entry. Read on for details.

The Gotcha

As your ELB-balanced application experiences an increasing load, some of the traffic received by your back-end instances may be traffic that does not belong to your application. And, after your application experiences a sustained heavy load and then traffic subsides, some of your application’s traffic may be lost or misdirected to other EC2 instances that are not yours.

Update March 2010: It appears AWS has changed the behavior of ELB so this is no longer a likely issue. See below for more details.

Why it Happens

In my article about how ELB works, I describe how ELB resolves its DNS name to a pool of IP addresses, and that this pool increases and decreases in size according to the load placed on the service. Each of the IP addresses in the pool is a “virtual appliance”, acting as a load balancer to distribute the connections among your back-end instances. This gives ELB two levels of elasticity: the pool of virtual appliance IP addresses, and your pool of back-end instances.

Before they are assigned to a specific ELB, the virtual appliance IP addresses are available for use by any ELB, waiting in a global pool. When an ELB needs to increase its pool of virtual appliances due to load, it gets a new IP address from the global pool and begins resolving the ELB DNS name to that IP address in addition to the ones it already uses. And when an ELB notices decreasing load, it releases one of its virtual appliance IP addresses back to the global pool, and no longer returns that IP address when resolving the ELB DNS name. According to testing performed by AWS forum user wizardofcrowds, ELB scales up under sustained load by increasing its pool of IP addresses at the rate of one additional address every 5 minutes. And, ELB scales down by relinquishing an IP address at the rate of one every 2 hours. Thus it is possible that a single ELB virtual appliance IP address can be in service to a number of different ELBs over the course of a few hours.

The problem is that DNS resolution is cached at many layers across the internet. When the ELB scales up and gets a new virtual appliance IP address from the global pool, some client somewhere might still be using that IP address as the resolution of a different ELB’s DNS name. This other ELB might not even belong to you. A few hours ago, another ELB with a different DNS name returned that IP address from a DNS lookup. Now, that IP address is serving your ELB. But some client somewhere may still be using that IP address to attempt to reach an application that is not yours.

The flip side occurs when the ELB scales down and releases a virtual appliance IP address back to the global pool. Some client somewhere might continue resolving your ELB’s DNS name to the now-relinquished IP address. When the address is returned to the pool, that client’s attempts to connect to your service will fail. If that same virtual appliance IP is then put into service for another ELB, then the client working with the cached but no-longer-current DNS resolution for your ELB DNS name will be directed to the other ELB virtual appliance, and then onward to back-end instances that are not yours.

So your application served by ELB may receive traffic destined for other ELBs during increasing load, and may experience lost traffic during decreasing load.

What is the Solution?

Fundamentally, this issue is caused by badly-configured DNS implementations. Some DNS servers (including those of some major ISPs) ignore the TTL (“time to live”) setting of the original DNS record, and thus end up resolving DNS names to an expired IP address. Some DNS clients (browsers such as IE7, and Java programs by default) also ignore DNS TTLs, causing the same problem. Short of fixing all the misconfigured DNS servers and patching all the IE and Java VMs, however, the issue cannot be solved. So a workaround is the best we can hope for.

You, the EC2 user, can minimize the risk that a well-behaved client will experience this issue. Set up your DNS CNAME entries for the ELB to have a small TTL – 120 seconds is good. This will help for clients whose DNS honors the TTL, but not for clients that ignore TTLs or for clients using DNS servers that ignore TTLs.
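
The difference between a client that honors the TTL and one that ignores it can be shown with a toy cache. This is only a model: the lookup function stands in for a real DNS query, the name and addresses are invented, and time is passed in explicitly so the example is deterministic.

```python
class DnsCache:
    """Toy resolver cache that either honors or ignores record TTLs."""

    def __init__(self, honor_ttl=True):
        self.honor_ttl = honor_ttl
        self._entries = {}  # name -> (ip, expires_at)

    def resolve(self, name, now, lookup):
        """lookup(name) -> (ip, ttl_seconds) stands in for a real DNS query."""
        cached = self._entries.get(name)
        if cached and (not self.honor_ttl or now < cached[1]):
            return cached[0]
        ip, ttl = lookup(name)
        self._entries[name] = (ip, now + ttl)
        return ip


# The address the (hypothetical) ELB name currently resolves to.
current = {"ip": "192.0.2.1"}


def lookup(name):
    return current["ip"], 120  # 120-second TTL, as suggested above


well_behaved = DnsCache(honor_ttl=True)
stubborn = DnsCache(honor_ttl=False)
well_behaved.resolve("my-elb.example", 0, lookup)
stubborn.resolve("my-elb.example", 0, lookup)

current["ip"] = "192.0.2.2"  # the ELB scaled down and released the old address

later_good = well_behaved.resolve("my-elb.example", 121, lookup)
later_bad = stubborn.resolve("my-elb.example", 121, lookup)
print(later_good, later_bad)
```

After the TTL expires the well-behaved cache picks up the new address, while the cache that ignores TTLs keeps sending traffic to the released one. A short TTL only helps the first kind of client.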

Amazon can work around the problem on their end. When an ELB needs to scale up and use a new virtual appliance IP address, that address could remain “reserved” for its use for a longer time. When the ELB scales down and releases the virtual appliance IP address, that address would not be reused by another ELB until the reservation period has expired. This would prevent “recent” ELB virtual appliance IP addresses from being reused by other ELBs, and reduce the risk of misdirecting traffic.

Update March 2010: SanD@AWS has shared that ELB IP addresses will continue to direct traffic to the ELB for one hour after being withdrawn from that ELB’s DNS pool. Hooray!

It should be noted that DNS caching and TTLs influence all load balancing solutions that rely on DNS (such as round-robin DNS), so this issue is not unique to ELB. Caching DNS entries is a good thing for the internet in general, but not all implementations honor the TTL of the cached DNS records. Services relying on DNS for scalability need to be designed with this in mind.

EC2 Instance Belonging to Multiple ELBs

I discovered an interesting feature of Amazon EC2 Elastic Load Balancing today: you can add an EC2 instance to more than one ELB virtual appliance. Below I demonstrate the steps I took to set up two ELBs each containing the same instance, and afterward I explain how this technique can be used to deliver different classes of service.

How to Set Up One Instance Belonging to Multiple ELBs

I set this up using the standard EC2 API and EC2 ELB command-line tools. First I launched an instance. I used my favorite Alestic image, the Ubuntu 8.04 Hardy Server AMI:

ec2-run-instances ami-0772946e -k my-keypair -t m1.small -z us-east-1a -n 1

My default security group allows access to port 80, but if yours doesn’t you will need to either change it (for this experiment) or use a security group that does allow port 80. Once my instance was running I sshed in (you should use your own instance’s public DNS name here):

ssh -i /Users/shlomo/.ssh/id_rsa-my-keypair

In the ssh session, I installed the Apache 2 web server:

apt-get update && apt-get install -y apache2

Then I tested it out from my local machine:

wget -q -O -

The result, HTML containing “It works!”, means Apache is accessible on the instance.

Next, I set up two load balancers. On my local machine I did this:

elb-create-lb lbOne --availability-zones us-east-1a --listener "protocol=http,lb-port=80,instance-port=80"



elb-create-lb lbTwo --availability-zones us-east-1a --listener "protocol=http,lb-port=80,instance-port=80"



Once the load balancers were created, I added the instance to both as follows:

elb-register-instances-with-lb lbOne --instances i-0fdcda66


INSTANCE-ID i-0fdcda66

elb-register-instances-with-lb lbTwo --instances i-0fdcda66


INSTANCE-ID i-0fdcda66

Okay, EC2 accepted that API call. Next, I tested that it actually works, directing HTTP traffic from both ELBs to that one instance:

wget -q -O -
wget -q -O -

Both commands produce It works!, the same as when we accessed the instance directly. This shows that multiple ELBs can direct traffic to a single instance.

What Can This Be Used For?

Here are some of the things you might consider doing with this technique.

Creating different tiers of service

Using two different ELBs that both contain (some of) the same instance(s) can be useful to create different classes of service without using dedicated instances for the lower service class. For example, you might offer a “best-effort” service with no SLA to non-paying customers, and offer an SLA with performance guarantees to paying customers. Here are example requirements:

  • Paid requests must be serviced within a certain maximum time.
  • Non-paid requests must not require dedicated instances.
  • Non-paid requests must not utilize more than a certain number (n) of available instances.

To build a system that fulfills these requirements you can set up two ELBs, one for each class of service (and direct traffic appropriately). The paying-tier ELB contains n instances at first, and the free-tier ELB contains those very same n instances. You can then set up Auto Scaling for the paying-tier ELB, allowing it to scale based on average CPU or average latency across the paying-tier ELB. This would allow the paying-tier service to scale up and down with demand, but always leaving the original n instances in the paying-tier ELB pool, and always limiting the free-tier to using those n instances. And, all this is done without requiring separate instances for the free-tier ELB.

(“Huh?” you say. “Won’t Auto Scaling terminate the original n instances that were in the paying-tier ELB pool?” No. Auto Scaling does not consider instances that were already in an ELB when the Auto Scaling group was created. As soon as I find this documented I’ll add the source here. In the meantime, trust me, it’s true.)

Splitting traffic across multiple destination ports

The ELB forwarding specification is called a listener. A listener is the combination of source-port, protocol type (HTTP or TCP), and the destination port: it describes what is forwarded and to where. In the above demonstration I created two ELBs that both have the same listener, forwarding HTTP/80 to port 80 on the instance. But the ELBs can also be configured to forward HTTP/80 traffic to different destination ports. Or, the ELBs can be configured to forward traffic from different source ports to the same destination port. Or from different source ports to different destination ports. Here’s a diagram depicting the various possible combinations of listeners:

[Figure: Different combinations of ELB listeners that might make sense together]

As the table above shows, there are only a few combinations of listeners. Two listeners that are the same (1 -> 1, 1 -> 1 and 1 -> 2, 1 -> 2) are redundant (more on this below), possibly only making sense as explained above to provide different classes of service using the same instances. Two listeners forwarding the same source port to different destination ports (1 -> 1, 1 -> 2) can only be achieved with two ELBs. Two listeners that “merge” different source ports into the same destination port (1 -> 1, 2 -> 1 and 1 -> 2, 2 -> 2) can be done with a single ELB. So can two listeners that “swap” source ports and destination ports between them (1 -> 2, 2 -> 1). So can two listeners that map different source ports to different destination ports (1 -> 1, 2 -> 2).

Did you notice it? I said “can be done with a single ELB”. Did you know you can have multiple listeners on a single ELB? It’s true. If we wanted to forward a number of different source ports all to the same destination port we could accomplish that without creating multiple ELBs. Instead we could create an ELB with multiple listeners. Likewise, if we wanted to forward different source ports to different destination ports we could do that with multiple listeners on a single ELB – for example, using the same ELB with two listeners to handle HTTP/80 and TCP/443 (HTTPS). Likewise, swapping source and target ports with two listeners can be done with a single ELB. What you can’t do with a single ELB is split traffic from a single source port to multiple destination ports. Try it – the EC2 API will reject an attempt to build a load balancer with multiple listeners sharing the same source port:

elb-create-lb lbOne --availability-zones us-east-1a --listener "protocol=http,lb-port=80,instance-port=80" --listener "protocol=http,lb-port=80,instance-port=8080"


elb-create-lb: Malformed input-Malformed service URL: Reason: -
LoadBalancerName --availability-zones value[,value...] --listener
"protocol=value, lb-port=value, instance-port=value" [ --listener
"protocol=value, lb-port=value, instance-port=value" ...]
[General Options]
For more information and a full list of options, run "elb-create-lb --help"

Granted, the error message misidentifies the problem, but it will not work.
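
The rule at work here is simply that no two listeners on one ELB may share a source port (lb-port). A small sketch of that check, with listeners written as (lb_port, instance_port) pairs; the function names are made up for this illustration.

```python
from collections import Counter


def conflicting_lb_ports(listeners):
    """Return the lb-ports that appear in more than one listener."""
    counts = Counter(lb_port for lb_port, _instance_port in listeners)
    return sorted(port for port, n in counts.items() if n > 1)


def fits_single_elb(listeners):
    """True if the listeners can all live on one ELB."""
    return not conflicting_lb_ports(listeners)


print(fits_single_elb([(80, 80), (443, 80)]))     # merge: different source ports
print(fits_single_elb([(80, 8080), (8080, 80)]))  # swap
print(fits_single_elb([(80, 80), (80, 8080)]))    # split: same source port twice
```

Only the last combination fails, which is the case that needs two ELBs.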

Where does that leave us? The only potentially useful case for using two separate ELBs for multiple ports, with the same instance(s) behind the scenes, is where you want to split traffic from a single source port across different destination ports on the instance(s). Practically speaking, this might make sense as part of implementing different classes of service using the same instances. In any case, it would require differentiating the traffic by address (to direct it to the appropriate ELB), so the fact that the actual destination is on a different port is only mildly interesting.

Redundancy – Not.

Putting the same instance behind multiple load balancers is a technique used with hardware load balancers to provide redundancy: in case one load balancer fails, the other is already in place and able to take over. However, EC2 Elastic Load Balancers are not dedicated hardware, and they may not fail independently of each other. I say “may not” because there is not much public information about how ELB is implemented, so nobody who is willing to talk knows how independent ELBs are of each other. Nevertheless, there seem to be no clear redundancy benefits from placing one EC2 instance into multiple ELBs.

With Auto Scaling – Maybe, but why?

This technique might be able to circumvent a limitation inherent in Auto Scaling groups: an instance may only belong to a single Auto Scaling group at a time. Because Auto Scaling groups can manage ELBs, you could theoretically add the same instance to both ELBs, and then create two Auto Scaling groups, one managing each ELB. Your instance will be in two Auto Scaling groups.

Unfortunately you gain nothing from this: as mentioned above, Auto Scaling groups ignore instances that already existed in an ELB when the Auto Scaling group was created. So, it appears pointless to use this technique to circumvent the Auto Scaling limitation.

[The Auto Scaling limitation makes sense: if an instance is in more than one Auto Scaling group, a scaling activity in one group could decide to terminate that shared instance. This would cause the other Auto Scaling group to launch a new instance immediately to replace the shared instance that was terminated. This “churn” is prevented by forbidding an instance from being in more than one Auto Scaling group.]

In short, ELB’s interesting ability to let an instance belong to more than one ELB at once has limited practical applicability: it is mainly useful for implementing different service classes using the same instances. If you encounter any other useful scenarios for this technique, please share them.

Update 24 December 2009: You can use multiple ELBs with the same instances to provide multiple HTTPS sites on the same instance.


The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it

Update March 2012: Amazon published their own official guide to ELB’s architecture and building and testing applications that use it. The official material is very consistent with the presentation offered here, more than two and a half years prior.

Elastic Load Balancing is a long-anticipated AWS feature that aims to ease the deployment of highly-scalable web applications. Let’s take a look at how it achieves elasticity, based on experience and based on the information available in the AWS forums (mainly this thread). The goal is to understand how to design and test ELB deployments properly.

ELB: Two Levels of Elasticity

ELB is a distributed system. An ELB virtual appliance does not have a single public IP address. Instead, when you create an ELB appliance, you are given a DNS name such as Amazon encourages you to set up a DNS CNAME entry pointing (say) to the ELB-supplied DNS name.

Why does Amazon use a DNS name? Why not provide an IP address? Why encourage using a CNAME? Why can’t we use an Elastic IP for an ELB? In order to understand these issues, let’s take a look at what happens when clients interact with the ELB.

Here is the step-by-step flow of what happens when a client requests a URL served by your application:

  1. The client looks in DNS for the resolution of your web server’s name, Because you have set up your DNS to have a CNAME alias pointing to the ELB name, DNS responds with the name
  2. The client looks in DNS for the resolution of the name This DNS entry is controlled by Amazon since it is under the domain. Amazon’s DNS server returns an IP address, say
  3. The client opens a connection with the machine at the provided IP address The machine at this address is really an ELB virtual appliance.
  4. The ELB virtual appliance at address passes through the communications from the client to one of the EC2 instances in the load balancing pool. At this point the client is communicating with one of your EC2 application instances.

As you can see, there are two levels of elasticity in the above protocol. The first scalable point is in Step 2, when Amazon’s DNS resolves the ELB name to an actual IP address. In this step, Amazon can vary the actual IP addresses served to clients in order to distribute traffic among multiple ELB machines. The second point of scalability is in Step 4, where the ELB machine actually passes the client communications through to one of the EC2 instances in the ELB pool. By varying the size of this pool you can control the scalability of the application.
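The four-step flow and its two levels of elasticity can be sketched as a toy simulation. This is only an illustration under stated assumptions: every host name, IP address, and instance ID below is a hypothetical placeholder, no real DNS is queried, and plain round-robin stands in for whatever ELB actually does in Step 4.

```python
import itertools
import random

# All names and addresses here are hypothetical placeholders.
CNAME_RECORDS = {"www.example.com": "my-elb.elb.example.net"}
ELB_A_RECORDS = {"my-elb.elb.example.net": ["192.0.2.10", "192.0.2.11"]}
BACKENDS = ["i-app-1", "i-app-2", "i-app-3"]


def resolve_cname(name):
    """Step 1: your own DNS answers with the ELB-supplied name."""
    return CNAME_RECORDS[name]


def resolve_elb_name(elb_name, rng):
    """Step 2 (first level): the ELB DNS picks one ELB machine address."""
    return rng.choice(ELB_A_RECORDS[elb_name])


def make_elb_machine(backends):
    """Steps 3-4 (second level): an ELB machine hands each connection
    to one back-end instance (simplified here to plain round-robin)."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


def route_request(name, machines, rng):
    """Follow one client request through all four steps."""
    elb_name = resolve_cname(name)
    elb_ip = resolve_elb_name(elb_name, rng)
    backend = machines[elb_ip]()
    return elb_ip, backend


rng = random.Random(0)
machines = {ip: make_elb_machine(BACKENDS)
            for ip in ELB_A_RECORDS["my-elb.elb.example.net"]}
routes = [route_request("www.example.com", machines, rng) for _ in range(6)]
for elb_ip, backend in routes:
    print(elb_ip, "->", backend)
```

Growing the list in ELB_A_RECORDS models the first level of elasticity; growing BACKENDS models the second.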

Both levels of scalability, Step 2 and Step 4, are necessary in order to load-balance very high traffic loads. The Step 4 scalability allows your application to exceed the maximum connections per second capacity of a single EC2 instance: connections are distributed among a pool of application instances, each instance handling only some of the connections. Step 2 scalability allows the application to exceed the maximum inbound network traffic capacity of a single network connection: an ELB machine’s network connection can only handle a certain rate of inbound network traffic. Step 2 allows the network traffic from all clients to be distributed among a pool of ELB machines, each appliance handling only some fraction of the network traffic.

If you only had Step 4 scalability (which is what you have if you run your own software load balancer on an EC2 instance) then the maximum capacity of your application is still limited by the inbound network traffic capacity of the front-end load balancer: no matter how many back-end application serving instances you add to the pool, the front-end load balancer will still present a network bottleneck. This bottleneck is eliminated by the addition of Step 2: the ability to use more than one load balancer’s inbound network connection.
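The bottleneck argument can be made concrete with a little arithmetic. The sketch below is a deliberate simplification: it measures front-end and back-end capacity in one abstract unit (the text distinguishes inbound network traffic from connections per second), and all the numbers are made up for illustration.

```python
def max_throughput(lb_count, lb_capacity, instance_count, instance_capacity):
    """Sustainable load is the smaller of front-end and back-end capacity."""
    front_end = lb_count * lb_capacity
    back_end = instance_count * instance_capacity
    return min(front_end, back_end)


# Hypothetical capacities in an abstract "load units" measure.
LB_CAPACITY = 1000
INSTANCE_CAPACITY = 300

# One load balancer: adding instances stops helping once the
# front end saturates.
single_lb = [max_throughput(1, LB_CAPACITY, n, INSTANCE_CAPACITY)
             for n in (2, 4, 8)]

# Three load balancers: the same instance counts are now the limit.
scaled_lb = [max_throughput(3, LB_CAPACITY, n, INSTANCE_CAPACITY)
             for n in (2, 4, 8)]

print(single_lb)  # [600, 1000, 1000]
print(scaled_lb)  # [600, 1200, 2400]
```

With a single front end, going from four to eight instances buys nothing; with three front ends, capacity keeps growing with the pool.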

[By the way, Step 2 can be replicated to a limited degree by using Round-Robin DNS to serve a pool of IP addresses, each of which is a load balancer. With such a setup you could have multiple load-balanced clusters of EC2 instances, each cluster sitting behind its own software load balancer on an EC2 instance. But Round Robin DNS has its own limitations (such as the inability to take into account the load on each load-balanced unit, and the difficulty of dynamically adjusting the pool of IP addresses to distribute), from which ELB does not suffer.]

Behind the scenes of Step 2, Amazon maps an ELB DNS name to a pool of IP addresses. Initially, this pool is small (see below for more details on the size of the ELB IP address pool). As the traffic to the application and to the ELB IP addresses in the pool increases, Amazon adds more IP addresses to the pool. Remember, these are IP addresses of ELB machines, not your application instances. This is why Amazon wants you to use a CNAME alias for the front-end of the ELB appliance: Amazon can vary the ELB IP address served in response to the DNS lookup of the ELB DNS name.

It is technically possible to implement an equivalent Step 2 scalability feature without relying on DNS CNAMEs to provide “delayed binding” to an ELB IP address. However, doing so requires implementing many features that DNS already provides, such as cached lookups and backup lookup servers. I expect that Amazon will implement something along these lines when it removes the limitation that ELBs must use CNAMEs – to allow, for example, an Elastic IP to be associated with an ELB. Now that would be cool.

How ELB Distributes Traffic

As explained above, ELB uses two pools: the pool of IP addresses of ELB virtual appliances (to which the ELB DNS name resolves), and the pool of your application instances (to which the ELBs pass through the client connection). How is traffic distributed among these pools?

ELB IP Address Distribution

The pool of ELB IP addresses initially contains one IP address. More precisely, this pool initially consists of one ELB machine per availability zone that your ELB is configured to serve. This can be inferred from this page in the ELB documentation, which states that “Incoming traffic is load balanced equally across all Availability Zones enabled for your load balancer”. I posit that this behavior is implemented by having each ELB machine serve the EC2 instances within a single availability zone. Then, multiple availability zones are supported by the ELB having in its pool at least one ELB machine per enabled availability zone. Update April 2010: An availability zone that has no healthy instances will not receive any traffic. Previously, the traffic would be evenly divided across all enabled availability zones regardless of instance health, possibly resulting in 404 errors. Recent upgrades to ELB no longer exhibit this weakness.

Back to the scalability of the ELB machine pool. According to AWS folks in the forums, this pool is grown in response to increased traffic reaching the ELB IP addresses already in the pool. No precise numbers are provided, but a stream of gradually-increasing traffic over the course of a few hours should cause ELB to grow the pool of IP addresses behind the ELB DNS name. ELB grows the pool proactively in order to stay ahead of increasing traffic loads.

How does ELB decide which IP address to serve to a given client? ELB varies the chosen address “from time to time”. No more specifics are given. However, see below for more information on making sure you are actually using the full pool of available ELB IP addresses when you test ELB.

Back-End Instance Connection Distribution

Each ELB machine can pass through client connections to any of the EC2 instances in the ELB pool within a single availability zone. According to user reports in other forum posts, clients from a single IP address will tend to be connected to the same back-end instance. According to AWS, ELB does round-robin among the least-busy back-end instances, keeping track of approximately how many connections (or requests) are active at each instance but without monitoring CPU or anything else on the instances. It is likely that earlier versions of ELB exhibited some stickiness, but today the only way to ensure stickiness is to use the Sticky Sessions feature.
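To illustrate what “round-robin among the least-busy back-end instances” could mean, here is a minimal sketch. It is one plausible reading of that description, not ELB’s actual implementation, which is not public; the class and method names are hypothetical.

```python
class LeastBusyRoundRobin:
    """Rotate through the back-ends that currently have the fewest
    active connections. Only connection counts are tracked, never CPU."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end is required")
        self.order = list(backends)
        self.active = {backend: 0 for backend in self.order}
        self.next_index = 0

    def acquire(self):
        """Pick a back-end for a new connection and count it as active."""
        fewest = min(self.active.values())
        size = len(self.order)
        # Scan from the rotation pointer so ties are broken round-robin.
        for offset in range(size):
            index = (self.next_index + offset) % size
            candidate = self.order[index]
            if self.active[candidate] == fewest:
                self.next_index = (index + 1) % size
                self.active[candidate] += 1
                return candidate

    def release(self, backend):
        """Record that a connection to this back-end has finished."""
        self.active[backend] -= 1


balancer = LeastBusyRoundRobin(["a", "b", "c"])
first_three = [balancer.acquire() for _ in range(3)]  # one connection each
balancer.release("b")            # "b" is now the only idle back-end
after_release = balancer.acquire()
```

In this model a back-end that finishes its work early is chosen next even if it is not next in the rotation, which is the difference from plain round-robin.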

How much variety is necessary in order to cause the connections to be fairly distributed among your back-end instances? AWS says that “a dozen clients per configured availability zone should more than do the trick”. Additionally, in order for the full range of ELB machine IP addresses to be utilized, “make sure each [client] refreshes their DNS resolution results every few minutes.”
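A test client can enforce the “refresh every few minutes” advice itself instead of trusting its runtime’s DNS cache. The sketch below is an assumption-laden illustration: the class name is hypothetical, the 120-second default is just one reading of “every few minutes”, and the operating system or an upstream resolver may still cache answers underneath socket.getaddrinfo.

```python
import socket
import time


class RefreshingResolver:
    """Cache a host name's addresses, but never for longer than
    max_age_seconds, so newly added ELB addresses get picked up."""

    def __init__(self, hostname, max_age_seconds=120,
                 lookup=None, clock=time.monotonic):
        self.hostname = hostname
        self.max_age_seconds = max_age_seconds
        # lookup and clock are injectable so the class can be tested offline.
        self.lookup = lookup or self._system_lookup
        self.clock = clock
        self._addresses = []
        self._fetched_at = None

    @staticmethod
    def _system_lookup(hostname):
        """Resolve through the system resolver and de-duplicate results."""
        infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    def addresses(self):
        """Return cached addresses, re-resolving once they are too old."""
        now = self.clock()
        expired = (self._fetched_at is None
                   or now - self._fetched_at >= self.max_age_seconds)
        if expired:
            self._addresses = list(self.lookup(self.hostname))
            self._fetched_at = now
        return self._addresses
```

A client would call addresses() before opening each new connection and pick one of the returned addresses, so that a lookup happens at most once per refresh interval.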

How to Test ELB

Let’s synthesize all the above into guidelines for testing ELB:

  1. Test clients should use the ELB DNS name, ideally via a CNAME alias in your domain. Make sure to perform a DNS lookup for the ELB DNS name every few minutes. If you are using Sun’s Java VM you will need to change the system property to be different than the default value of -1, which causes DNS lookups to be cached until the JVM exits. 120 (seconds) is a good value for this property for ELB test clients. If you’re using IE 7 clients, which cache DNS lookups for 30 minutes by default, see the Microsoft-provided workaround to set that cache timeout to a much lower value.
    Update 14 August 2010: If your test clients cache DNS lookups (or use a DNS provider that does this) beyond the defined TTL, traffic may be misdirected during ramp-up or ramp-down. See my article for a detailed explanation.
  2. One test client equals one public IP address. ELB machines seem to route all traffic from a single IP address to the same back-end instance, so if you run more than one test client process behind a single public IP address, ELB regards these as a single client.
  3. Use 12 test clients for every availability zone you have enabled on the ELB. These test clients do not need to be in different availability zones – they do not even need to be in EC2 (although it is quite attractive to use EC2 instances for test clients). If you have configured your ELB to balance among two availability zones then you should use 24 test clients.
  4. Each test client should gradually ramp up its load over the course of a few hours. Each client can begin at (for example) one connection per second, and increase its rate of connections-per-second every X minutes until the load reaches your desired target after a few hours.
  5. The mix of requests and connections that each test client performs should represent a real-world traffic profile for your application. Paul@AWS recommends the following:

    In general, I’d suggest that you collect historical data or estimate a traffic profile and how it progresses over a typical (or perhaps extreme, but still real-world) day for your scenario. Then, add enough headroom to the numbers to make you feel comfortable and confident that, if ELB can handle the test gracefully, it can handle your actual workload.

    The elements of your traffic profile that you may want to consider in your test strategy may include, for example:

    • connections / second
    • concurrent connections
    • requests / second
    • concurrent requests
    • mix of request sizes
    • mix of response sizes

Use the above guidelines as a starting point for setting up a testing environment that exercises your application behind an ELB. This testing environment should validate that the ELB, and your application instances, can handle the desired load.
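The client-count and ramp-up guidelines reduce to simple arithmetic, sketched below. The function names are hypothetical, the linear ramp is an assumption (the guidelines only ask for a gradual increase), and the 15-minute step and the target rate are placeholder values to replace with numbers from your own traffic profile.

```python
def client_count(enabled_zones, clients_per_zone=12):
    """Number of test clients, each behind its own public IP address."""
    return enabled_zones * clients_per_zone


def ramp_schedule(start_rate, target_rate, total_minutes, step_minutes):
    """Return (minute, connections_per_second) pairs for one test client,
    rising linearly from start_rate to target_rate."""
    steps = total_minutes // step_minutes
    if steps < 1:
        raise ValueError("total_minutes must cover at least one step")
    increment = (target_rate - start_rate) / steps
    return [(i * step_minutes, start_rate + i * increment)
            for i in range(steps + 1)]


# Two availability zones enabled on the ELB.
clients = client_count(2)

# Each client ramps from 1 to 49 connections per second over three hours,
# raising its rate every 15 minutes (placeholder numbers).
schedule = ramp_schedule(start_rate=1, target_rate=49,
                         total_minutes=180, step_minutes=15)

print(clients)       # 24
print(schedule[0])   # (0, 1.0)
print(schedule[-1])  # (180, 49.0)
```

Multiplying the final per-client rate by the client count gives the aggregate load the ELB sees at the end of the ramp, which should match the headroom-adjusted peak of your traffic profile.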

A thorough understanding of how ELB works can go a long way to helping you make the best use of ELB in your architecture and toward properly testing your ELB deployment. I hope this article helps you design and test your ELB deployments.