Cloud Developer Tips

The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it

Update March 2012: Amazon published their own official guide to ELB’s architecture and building and testing applications that use it. The official material is very consistent with the presentation offered here, more than two and a half years prior.

Elastic Load Balancing is a long-anticipated AWS feature that aims to ease the deployment of highly-scalable web applications. Let’s take a look at how it achieves elasticity, based on experience and based on the information available in the AWS forums (mainly this thread). The goal is to understand how to design and test ELB deployments properly.

ELB: Two Levels of Elasticity

ELB is a distributed system. An ELB virtual appliance does not have a single public IP address. Instead, when you create an ELB appliance, you are given a DNS name. Amazon encourages you to set up a DNS CNAME entry in your own domain that points to the ELB-supplied DNS name.

Why does Amazon use a DNS name? Why not provide an IP address? Why encourage using a CNAME? Why can’t we use an Elastic IP for an ELB? In order to understand these issues, let’s take a look at what happens when clients interact with the ELB.

Here is the step-by-step flow of what happens when a client requests a URL served by your application:

  1. The client looks in DNS for the resolution of your web server’s name. Because you have set up your DNS to have a CNAME alias pointing to the ELB name, DNS responds with the ELB name.
  2. The client looks in DNS for the resolution of the ELB name. This DNS entry is controlled by Amazon since it is under Amazon’s domain. Amazon’s DNS server returns an IP address.
  3. The client opens a connection with the machine at the provided IP address. The machine at this address is really an ELB virtual appliance.
  4. The ELB virtual appliance at that address passes through the communications from the client to one of the EC2 instances in the load balancing pool. At this point the client is communicating with one of your EC2 application instances.
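
The first two steps can be sketched as a toy lookup. This is only a simulation with in-memory tables, not a real DNS client: the host names, the ELB name and the addresses below are all hypothetical (the addresses come from the documentation-only 192.0.2.0/24 range), and the rotation through the pool is an assumption made to keep the example deterministic.

```python
# Hypothetical DNS data: every name and address here is made up for illustration.
CNAME_RECORDS = {
    "www.example.com": "my-elb-1234.elb.example.net",
}
A_RECORDS = {
    # The ELB name maps to a pool of ELB machine addresses.
    "my-elb-1234.elb.example.net": ["192.0.2.10", "192.0.2.11"],
}


def resolve(name, lookup_count=0):
    """Follow CNAME aliases, then return (final_name, ip) for one lookup.

    lookup_count stands in for "time": successive lookups may be handed a
    different address from the pool (simple rotation is assumed here).
    """
    seen = set()
    while name in CNAME_RECORDS:      # Step 1: your DNS answers with the ELB name
        if name in seen:
            raise ValueError("CNAME loop at " + name)
        seen.add(name)
        name = CNAME_RECORDS[name]
    pool = A_RECORDS[name]            # Step 2: the ELB name resolves to one pool address
    return name, pool[lookup_count % len(pool)]


if __name__ == "__main__":
    for attempt in range(3):
        print(resolve("www.example.com", attempt))
```

Steps 3 and 4 (the connection to the ELB machine and the pass-through to a back-end instance) happen after this lookup and are not modeled here.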

As you can see, there are two levels of elasticity in the above protocol. The first scalable point is in Step 2, when Amazon’s DNS resolves the ELB name to an actual IP address. In this step, Amazon can vary the actual IP addresses served to clients in order to distribute traffic among multiple ELB machines. The second point of scalability is in Step 4, where the ELB machine actually passes the client communications through to one of the EC2 instances in the ELB pool. By varying the size of this pool you can control the scalability of the application.

Both levels of scalability, Step 2 and Step 4, are necessary in order to load-balance very high traffic loads. The Step 4 scalability allows your application to exceed the maximum connections per second capacity of a single EC2 instance: connections are distributed among a pool of application instances, each instance handling only some of the connections. Step 2 scalability allows the application to exceed the maximum inbound network traffic capacity of a single network connection: an ELB machine’s network connection can only handle a certain rate of inbound network traffic. Step 2 allows the network traffic from all clients to be distributed among a pool of ELB machines, each appliance handling only some fraction of the network traffic.

If you only had Step 4 scalability (which is what you have if you run your own software load balancer on an EC2 instance) then the maximum capacity of your application is still limited by the inbound network traffic capacity of the front-end load balancer: no matter how many back-end application serving instances you add to the pool, the front-end load balancer will still present a network bottleneck. This bottleneck is eliminated by the addition of Step 2: the ability to use more than one load balancer’s inbound network connection.
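
A back-of-the-envelope model makes the bottleneck argument concrete. This is a sketch under simplifying assumptions: capacities are in arbitrary units, they add up linearly, and the figures in the usage lines are invented rather than measured.

```python
def max_throughput(lb_count, lb_capacity, instance_count, instance_capacity):
    """Return the sustainable load of a two-tier setup, in the caller's units.

    The front end (load balancers) and the back end (application instances)
    each have a total capacity; the smaller of the two limits the system.
    """
    front_end = lb_count * lb_capacity
    back_end = instance_count * instance_capacity
    return min(front_end, back_end)


if __name__ == "__main__":
    # One load balancer: adding back-end instances stops helping at 100 units.
    print(max_throughput(1, 100, 10, 50))
    print(max_throughput(1, 100, 40, 50))
    # Several load balancers (Step 2 scalability) raise the ceiling.
    print(max_throughput(4, 100, 40, 50))
```

With a single front-end machine the result stays pinned at that machine's capacity no matter how large the pool grows; only adding front-end machines moves it.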

[By the way, Step 2 can be replicated to a limited degree by using Round-Robin DNS to serve a pool of IP addresses, each of which is a load balancer. With such a setup you could have multiple load-balanced clusters of EC2 instances, each cluster sitting behind its own software load balancer on an EC2 instance. But Round Robin DNS has its own limitations (such as the inability to take into account the load on each load-balanced unit, and the difficulty of dynamically adjusting the pool of IP addresses to distribute), from which ELB does not suffer.]

Behind the scenes of Step 2, Amazon maps an ELB DNS name to a pool of IP addresses. Initially, this pool is small (see below for more details on the size of the ELB IP address pool). As the traffic to the application and to the ELB IP addresses in the pool increases, Amazon adds more IP addresses to the pool. Remember, these are IP addresses of ELB machines, not your application instances. This is why Amazon wants you to use a CNAME alias for the front-end of the ELB appliance: Amazon can vary the ELB IP address served in response to the DNS lookup of the ELB DNS name.

It is technically possible to implement an equivalent Step 2 scalability feature without relying on DNS CNAMEs to provide “delayed binding” to an ELB IP address. However, doing so requires implementing many features that DNS already provides, such as cached lookups and backup lookup servers. I expect that Amazon will implement something along these lines when it removes the limitation that ELBs must use CNAMEs – to allow, for example, an Elastic IP to be associated with an ELB. Now that would be cool.

How ELB Distributes Traffic

As explained above, ELB uses two pools: the pool of IP addresses of ELB virtual appliances (to which the ELB DNS name resolves), and the pool of your application instances (to which the ELBs pass through the client connection). How is traffic distributed among these pools?

ELB IP Address Distribution

The pool of ELB IP addresses initially contains one IP address. More precisely, this pool initially consists of one ELB machine per availability zone that your ELB is configured to serve. This can be inferred from this page in the ELB documentation, which states that “Incoming traffic is load balanced equally across all Availability Zones enabled for your load balancer”. I posit that this behavior is implemented by having each ELB machine serve the EC2 instances within a single availability zone. Then, multiple availability zones are supported by the ELB having in its pool at least one ELB machine per enabled availability zone. Update April 2010: An availability zone that has no healthy instances will not receive any traffic. Previously, the traffic would be evenly divided across all enabled availability zones regardless of instance health, possibly resulting in 404 errors. Recent upgrades to ELB no longer exhibit this weakness.

Back to the scalability of the ELB machine pool. According to AWS folks in the forums, this pool is grown in response to increased traffic reaching the ELB IP addresses already in the pool. No precise numbers are provided, but a stream of gradually-increasing traffic over the course of a few hours should cause ELB to grow the pool of IP addresses behind the ELB DNS name. ELB grows the pool proactively in order to stay ahead of increasing traffic loads.

How does ELB decide which IP address to serve to a given client? ELB varies the chosen address “from time to time”. No more specifics are given. However, see below for more information on making sure you are actually using the full pool of available ELB IP addresses when you test ELB.

Back-End Instance Connection Distribution

Each ELB machine can pass through client connections to any of the EC2 instances in the ELB pool within a single availability zone. According to user reports in other forum posts, clients from a single IP address will tend to be connected to the same back-end instance. According to AWS, ELB does round-robin among the least-busy back-end instances, keeping track of approximately how many connections (or requests) are active at each instance but without monitoring CPU or anything else on the instances. It is likely that earlier versions of ELB exhibited some stickiness, but today the only way to ensure stickiness is to use the Sticky Sessions feature.
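
To illustrate what “round-robin among the least-busy back-end instances” could look like, here is a minimal sketch. It is not ELB's actual implementation, which is not public; the class name, the method names and the tie-breaking rule are all assumptions made for the example.

```python
class LeastBusyBalancer:
    """Pick the instance with the fewest active connections.

    Ties are broken round-robin, so equally idle instances take turns.
    Only connection counts are tracked; CPU and other metrics are ignored.
    """

    def __init__(self, instances):
        self._order = list(instances)
        self.active = {name: 0 for name in self._order}
        self._next = 0  # round-robin cursor into self._order

    def acquire(self):
        """Choose an instance for a new connection and count it as active."""
        least = min(self.active.values())
        count = len(self._order)
        for offset in range(count):
            index = (self._next + offset) % count
            name = self._order[index]
            if self.active[name] == least:
                self._next = (index + 1) % count
                self.active[name] += 1
                return name

    def release(self, name):
        """Mark one connection to the given instance as finished."""
        self.active[name] -= 1


if __name__ == "__main__":
    balancer = LeastBusyBalancer(["a", "b", "c"])
    print([balancer.acquire() for _ in range(3)])
    balancer.release("b")
    print(balancer.acquire())
```

In this sketch a client's source IP plays no role in the choice, which matches the AWS description above rather than the older reports of stickiness.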

How much variety is necessary in order to cause the connections to be fairly distributed among your back-end instances? AWS says that “a dozen clients per configured availability zone should more than do the trick”. Additionally, in order for the full range of ELB machine IP addresses to be utilized, “make sure each [client] refreshes their DNS resolution results every few minutes.”

How to Test ELB

Let’s synthesize all the above into guidelines for testing ELB:

  1. Test clients should use the ELB DNS name, ideally via a CNAME alias in your domain. Make sure to perform a DNS lookup for the ELB DNS name every few minutes. If you are using Sun’s Java VM you will need to change the JVM’s DNS cache TTL property from its default value of -1, which causes DNS lookups to be cached until the JVM exits. 120 (seconds) is a good value for this property for ELB test clients. If you’re using IE 7 clients, which cache DNS lookups for 30 minutes by default, see the Microsoft-provided workaround to set that cache timeout to a much lower value.
    Update 14 August 2008: If your test clients cache DNS lookups (or use a DNS provider that does this) beyond the defined TTL, traffic may be misdirected during ramp-up or ramp-down. See my article for a detailed explanation.
  2. One test client equals one public IP address. ELB machines seem to route all traffic from a single IP address to the same back-end instance, so if you run more than one test client process behind a single public IP address, ELB regards these as a single client.
  3. Use 12 test clients for every availability zone you have enabled on the ELB. These test clients do not need to be in different availability zones – they do not even need to be in EC2 (although it is quite attractive to use EC2 instances for test clients). If you have configured your ELB to balance among two availability zones then you should use 24 test clients.
  4. Each test client should gradually ramp up its load over the course of a few hours. Each client can begin at (for example) one connection per second, and increase its rate of connections-per-second every X minutes until the load reaches your desired target after a few hours.
  5. The mix of requests and connections that each test client performs should represent a real-world traffic profile for your application. Paul@AWS recommends the following:

    In general, I’d suggest that you collect historical data or estimate a traffic profile and how it progresses over a typical (or perhaps extreme, but still real-world) day for your scenario. Then, add enough headroom to the numbers to make you feel comfortable and confident that, if ELB can handle the test gracefully, it can handle your actual workload.

    The elements of your traffic profile that you may want to consider in your test strategy may include, for example:

    • connections / second
    • concurrent connections
    • requests / second
    • concurrent requests
    • mix of request sizes
    • mix of response sizes
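
Guidelines 3 and 4 reduce to simple arithmetic, sketched below. The figure of 12 clients per zone comes from the AWS quote above; the linear ramp, the function names and the example rates are assumptions for illustration, since the guideline only says to increase the rate every X minutes.

```python
CLIENTS_PER_ZONE = 12  # "a dozen clients per configured availability zone"


def client_count(enabled_zones):
    """Number of test clients (distinct public IPs) for the enabled zones."""
    return CLIENTS_PER_ZONE * enabled_zones


def ramp_schedule(start_rate, target_rate, duration_minutes, step_minutes):
    """Return [(minute, connections_per_second), ...] rising linearly.

    The rate is raised every step_minutes until target_rate is reached
    at duration_minutes. A linear ramp is an assumption of this sketch.
    """
    steps = duration_minutes // step_minutes
    if steps < 1:
        raise ValueError("duration must cover at least one step")
    increment = (target_rate - start_rate) / steps
    return [(i * step_minutes, start_rate + i * increment)
            for i in range(steps + 1)]


if __name__ == "__main__":
    print(client_count(2))
    for minute, rate in ramp_schedule(1, 101, 120, 30):
        print(minute, rate)
```

Each test client would follow its own copy of the schedule, so the total load on the ELB at any minute is the per-client rate multiplied by the number of clients.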

Use the above guidelines as a starting point for setting up a testing environment that exercises your application behind an ELB. This testing environment should validate that the ELB, and your application instances, can handle the desired load.
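
For test clients written in Python, the "look up the ELB DNS name every few minutes" guideline can be sketched as a small wrapper that re-resolves once its cached answer is older than a chosen TTL. The class name and its parameters are hypothetical; `socket.gethostbyname_ex` is the standard-library call used for the real lookup, and the 120-second default mirrors the value suggested above for JVM clients.

```python
import socket
import time


class RefreshingResolver:
    """Cache a host's addresses and look them up again after ttl_seconds."""

    def __init__(self, hostname, ttl_seconds=120, resolve=None,
                 clock=time.monotonic):
        self.hostname = hostname
        self.ttl_seconds = ttl_seconds
        # resolve and clock are injectable so the class can be tested offline.
        self._resolve = resolve or self._system_resolve
        self._clock = clock
        self._addresses = []
        self._expires_at = None

    @staticmethod
    def _system_resolve(hostname):
        # gethostbyname_ex returns (canonical_name, alias_list, ip_list).
        return socket.gethostbyname_ex(hostname)[2]

    def addresses(self):
        """Return the current address list, refreshing it if it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._addresses = list(self._resolve(self.hostname))
            self._expires_at = now + self.ttl_seconds
        return self._addresses
```

Note that the operating system or an upstream DNS server may keep its own cache, so a short TTL in the client is necessary but may not be sufficient to see every change to the ELB address pool.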

A thorough understanding of how ELB works can go a long way to helping you make the best use of ELB in your architecture and toward properly testing your ELB deployment. I hope this article helps you design and test your ELB deployments.

103 replies on “The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it”

Very informational post about Elastic Load Balancing.

Specific Setup Question:
When I'm trying to use this load balancer with a medium-sized Sharepoint farm (2 web servers, 1 sql server), the end-user is not able to be authenticated by the domain controller. If only one of the web servers is connected to the load balancer, then the user is authenticated and it doesn't matter which server is the one connected. But as soon as the second server is added, authentication fails.

Does this have anything to do with using a CNAME vs a regular A entry?


I don't believe your issue has anything to do with DNS CNAME or A records. If it did, the problem would happen even when only one back-end server is in service.

I'm not too familiar with Sharepoint, so I'm sorry I can't help on that.

A lookup table for client IP addresses in the ELB might explain why a client connection could appear to have an affinity/stickiness to a backend server. The ELB might only incur the expense of determining to which server to route a request, if it had not recently seen the connecting IP address. This would allow higher throughput for the ELB.

It's a reasonable shortcut if all backend servers process requests equally quickly and connections/session, when averaged, last roughly the same amount of time on each server. You might get interesting results by mixing small and extra large instances behind the same ELB.

Hi Shlomo,

Thank you for this wonderful article!!! I am a newbie to ELB. I would like to clarify certain things with you.

1. According to amazon each ELB will have a DNS name instead of an elastic ip which we should configure by adding a CNAME entry. could you please explain how to make it technically possible of attaching an elastic ip to ELB?

2. Regarding the step 2 scalability, once the inbound network traffic capacity has been reached, will a new load balancer be automatically created? if so will those instances that were registered with the first load balancer be automatically registered with the new one?

3. could you let me know any kind of testing tool that will check HOW MANY REQUESTS can an instance handle at a time? ie at what point will the load balancer share the load from one instance to the next one?

Thanks in advance!!!!!




There is no way to associate an Elastic IP with an ELB today.

Regarding question #2 about inbound network capacity and the back-end instances: An ELB has a pool of back-end instances. The ELB automatically scales itself to handle the incoming traffic and distribute it to the back-end instances. Once you have created an ELB and assigned it a pool of back-end instances, you do not have to create any more ELBs for those back-end instances – the ELB will automatically scale by itself to handle the incoming traffic. This does not involve a new ELB, just some new behind-the-scenes stuff inside Amazon. The behind-the-scenes stuff is what I described in this article.

Regarding question #3: There is no general answer to "how many requests can an instance handle at a time". This depends on the type of instance (because each instance type has different CPU and network characteristics) and on the applications that you are running on the instance. You need to test this yourself to find the answer for your application and instance type. There are many many tools available to help you measure this. Google is your friend.

Regarding "share the load from one instance to the next": ELB spreads out the incoming requests approximately evenly among the back-end instances, not in the "cascading" style that you describe. The only thing you need to consider is how many back-end instances to give to the load balancer in order to handle the incoming load. One approach is to simply do the math (max desired traffic / max traffic per instance = # back-end instances to put into the ELB). The more cost-effective approach is to use Auto Scaling to automatically launch (and terminate) back-end instances for the ELB according to demand. For more details about Auto Scaling please consult the Auto Scaling documentation.
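
The "do the math" approach is a single division, rounded up because you cannot launch a fraction of an instance. A minimal sketch, with made-up traffic figures in the usage line:

```python
import math


def instances_needed(max_desired_traffic, max_traffic_per_instance):
    """Back-end instances required to carry the desired peak traffic.

    Both arguments must use the same unit (for example requests per second).
    """
    if max_traffic_per_instance <= 0:
        raise ValueError("per-instance capacity must be positive")
    return math.ceil(max_desired_traffic / max_traffic_per_instance)


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 req/s peak, 600 req/s per instance.
    print(instances_needed(10000, 600))
```

The per-instance figure has to come from your own load tests, as noted above, and it is worth adding headroom on top of the result.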

Does anyone have the problem that a domain name root cannot be a CNAME (e.g., to the load balancer)?

so, how do you get to proxy the AWS loadbalancer which requires a CNAME? Point to an IP address and redirect that traffic to WWW ?

I'm no network engineer, but 1 static IP representing a single instance taking millions of hits in a few seconds, even if only redirects, sounds like a bottleneck!

Is there something I'm missing?


The only solutions available today for using ELB with the root domain are workarounds.

You can spread the traffic hitting the root domain (for redirection to the www name) by using round-robin DNS pointing to multiple machines (each of which performs the redirect).

There is another intriguing workaround suggested by M. David Peterson here:
I haven't tried it, though.

hello Shlomo,
A very informative and well written post, I am very thankful for that.

I just need to know you views on the following

We are trying to achieve something of this order… 1 Million Requests per sec (1 KB size per request) == 1 gByte/sec.
* Can Amazon allow such a limit, or more specifically, is it possible to achieve that with step 2 (of course with a proper back end on step 4)?
* Can this be achieved with a single availability zone?
* Also we want to check on the concurrent connection limits. What is the limit of concurrency that Amazon supports?
* Can step 2 help us scale that as well.

Hoping to hear your comments on this.. will surely be of much help to me.

with regards



1 GB/sec is not a lot and it is definitely possible to do with ELB. I've seen ELB handle more. I don't know what the limit is – if there is one, because there might not be a limit on the ELB side. The limit might be the number of back-end instances you can launch.

There's no problem with putting all the back-end instances in the same availability zone.

Concurrent connections are also not really limited by the ELB, but by the back-end instances.

If your traffic patterns match those that ELB was designed for (i.e. gradually ramping-up) then it should be able to scale to handle that traffic.

Excellent blog, very informative.

I however, being an EC2 newbie, would like to ask a quick question regarding bandwidth.

Your explanation on how requests are handed off to instances (ie the client is eventually connected directly with an instance behind the ELB) suggests that max bandwidth should be increased when increasing the number of instances behind the ELB. Does this sound correct?

What concerns me is that I've read several forum posts etc that seem to suggest this is not the case. And that bandwidth is limited to the bandwidth of the ELB itself and the number of instances behind is irrelevant. This doesn't sound right to me, hopefully they're wrong.

Bandwidth is going to be very important for me, as my app will have a very heavy streaming element. So I am trying to determine whether or not I can increase bandwidth by simply increasing the number of instances I have behind my ELB.


I did not mean to imply that connections are "handed off" to back-end instances – they're not, the traffic is "passed through" to the back-end instances via the ELB. Please let me know where I might clarify the article.

The theoretical bandwidth of ELB is unlimited – it will keep on scaling as long as the traffic keeps ramping up. So the overall bandwidth that your system will be able to handle is a direct result of the number of instances you put behind the ELB. As I described above, you'd need to test ELB carefully to make sure you're actually reproducing the conditions under which ELB was designed to scale. It's possible (and I've suggested as much in the forum) that the forum posts describing bandwidth limitations did not test the ELB properly and therefore hit a "faux" limit.

Unlike ELB, software load balancers running on an EC2 instance (and hardware load balancers in a data center) cannot scale beyond the bandwidth of the network connection feeding into them.

Very informative article. Thank you so much. I am confused by the terms “handed off” and “passed through”. Can you please shed some light on them? What I understood is that the request passes through the ELB, but the connection is established only between the client and the back-end EC2 instances, not with the ELB. Am I right? Please reply; I lack knowledge in this area.

"4.The ELB virtual appliance at address passes through the communications from the client to one of the EC2 instances in the load balancing pool. At this point the client is connected with one of your EC2 application instances"

This is the part which I originally took to mean connections are handed off. But on reading it again after your reply, it makes a little more sense.

The main thing is that it sounds like as long as I configure things correctly, the ELB will suit my requirements. Thanks again for the article!


Thanks for pointing that out – I've edited it to say "at this point the client is communicating with one of your EC2 application instances". I hope that makes it clearer.

Hello Shlomo,
This might be a unrelated question to this post, but since you are a EC2 expert, hope you can shed light on this.

I dont know is it only my observation or in general people have seen this.

I was configuring some clusters on EC2 and most importantly was putting up a typical mysql master-slave replication based cluster. As I was doing it, I saw that the instances created were given private IPs from different subnets. When I did some basic network tests including a ping round-trip test, I saw the network was fluctuating a lot and at times was miserable. This also reflected on the performance of the cluster, because the instances were made to communicate over private IPs.

So wanted to know:
1. I guess there is no way you can ask for private IPs to be from the same subnet (except on VPC).
2. What I am seeing here, is it a normal phenomena on the Amazon?
3. If it is normal, how does one mitigate these issues. Cause with these basic network issues, performance at the backend is not at all good.

Hoping that you will shed some light on this.


You ask interesting questions.

1. There is no way to request a specific subnet for your instance's private IP.

2. It's normal for traffic between instances to experience fluctuation, especially when measuring ping time: The EC2 network de-prioritizes ICMP communications (as per the comment from Cindy@AWS in this thread). Note that the SLA does not guarantee a minimum bandwidth or network latency.

3. First we need to determine if network latency is improved or not between instances on the same subnet. I did a little experiment and launched 20 instances in a single request. The private IPs I was assigned were scattered across three different subnets.

We can run tests to determine the average network latency & speed across instances on the same and on different subnets. If these tests indicate that network is better within the same subnet then you can launch four or five times as many instances as you need, and terminate those that are not in the same subnet.
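
The "launch extra instances and keep those that share a subnet" idea can be sketched with the standard-library `ipaddress` module. The /24 prefix length is purely an assumption for illustration, since the actual subnet size used inside EC2 is not stated here, and the sample addresses are invented.

```python
import ipaddress
from collections import defaultdict


def largest_subnet_group(private_ips, prefix_length=24):
    """Group private IPs by subnet and return (subnet, members) for the
    subnet holding the most instances.

    prefix_length=24 is an assumed subnet size, not a documented EC2 value.
    """
    groups = defaultdict(list)
    for ip in private_ips:
        # strict=False lets us derive the network from a host address.
        network = ipaddress.ip_network(f"{ip}/{prefix_length}", strict=False)
        groups[network].append(ip)
    network, members = max(groups.items(), key=lambda item: len(item[1]))
    return str(network), members


if __name__ == "__main__":
    launched = ["10.0.1.5", "10.0.2.7", "10.0.1.9", "10.0.3.1"]
    print(largest_subnet_group(launched))
```

Instances outside the returned group would be the candidates to terminate, but only if measurements first show that a shared subnet actually improves latency.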

Please share the results of any tests you do to determine the common subnet's effect on network latency.

Hi shlomo,
I know this is not the right place to ask this question, but maybe you can help me. I am unable to set up load balancing the way I need. I need one load balancer; behind it, one instance will run an Apache web server that forwards the client request to the JBoss application server on another instance, which in turn performs any DB operations on another MySQL instance. So the request path would be: client -> ELB -> apache -> jboss -> mysqlDB
Now the problem is that I am not able to communicate from the Apache instance to the JBoss instance, or from the JBoss instance to the MySQL instance. How do I resolve this? Please help!!! Thanks.

first of all this is a great post.
I am trying to achieve the following scenario:
a) Create a ELB
b) add two instances to it.
c) Map CNAME alias pointing for

Now I also have another subdomain called
How will I map this subdomain to use my ELB cluster ?


You can map more than one subdomain to use the ELB’s DNS name. Just decide what subdomains you want the ELB to answer, and make a CNAME mapping for each one.

The CNAME mapping is done with your domain’s DNS provider. There should be some sort of DNS management screen that allows you to create a CNAME alias.

Hi Shlomo,

Great post, and lots of useful info.

What seems problematic to me in step 2 is that it requires another DNS lookup (this might be very fast, but it’s another overhead nonetheless). Your thoughts ?

Another thing I think is missing from the ELB is the option to load balance based on location (as far as I can tell). What’s the point in having servers in different locations if you can’t tell your LB to divert first to the closest location. (you might have a user coming from a room inside the AWS data center being routed across the world…)
i.e. Have the ELB choose the closest location, and if it’s not available, go on as usual.
As far as I can tell you need a GSLB enabled DNS service which then will require you to add a separate ELB for each zone you wish to be in….
Can you think of any better way to achieve this ?



In practice clients cache DNS lookups (sometimes for too long, as mentioned in the article above) so DNS doesn’t have a significant effect on the traffic.

I hope that AWS introduces a managed DNS solution. This would allow them to offer GSLB (or better yet, a BGP-based “global IP”) as well as a number of other convenient services.


Hello, Shlomo. Given your expertise I wanted to ask your opinion on something.

We’re looking at running a two-tier app within EC2. The first tier will be using Varnish to cache and serve pages generated by the second tier. It turns out that m1.small is half the cost of c1.medium, but c1.medium has five times the computing power (1 compute unit vs. 2 x 2.5 compute units). My question is: in the Varnish tier do you think it makes sense to use c1.medium instances–or, for the same cost, to run twice as many m1.small instances? (The same question asked a different way: do you know if network bandwidth correlates to compute units?)

Both m1.small and c1.medium are described by Amazon as having “medium” I/O. One blog entry says:

“We could easily deliver 500 to 600 requests per second with a small instance and have the box idle around 95% (uncompressed content).

It turns out we are limited by bandwidth and not by CPU.

The limit was always ~35 MB/s.”

If he was able to hit a bandwidth limit before hitting a CPU limit on an m1.small I wonder if that means the extra compute power of a c1.medium would be wasted in a Varnish tier.

@Michael Lenaghan,

For workloads that are bandwidth-limited you’re better off with more network capacity than with more CPUs. When comparing m1.smalls vs. c1.mediums, which have the same network capacity, an m1.small is recommended here.

Hope this helps.

Shlomo, I think you’ll find this interesting:

They tested the impact of virtualization on m1.small and c1.medium instance types. The impact of processor scheduling is visible in the m1.small instances–in more ways than one. (That doesn’t imply that m1.small is the wrong choice, just that the correct answer to the question is a little more complicated.)

@Michael Lenaghan,

Thanks for the link to that paper – a very interesting read!

Two m1.small instances will have better network performance than a single c1.medium instance – Figures 3 and 4 in the paper show this. For network-intensive applications m1.smalls are more economical.

Hi Shlomo,

As of now is there any work around for capturing client ip address through HTTPS traffic, other than the work around you had mentioned in the AWS Forum

I have a server that is configured to handle both HTTP and HTTPS traffic. I have set the X-Forwarded-For in apache.conf and am capturing the client ip address for http traffic without any issues.

So is it like as of now we don’t have any solution to capture client ip through HTTPS traffic when using ELB?


I don’t know of any other way to get the client’s IP address via HTTPS when using ELB other than the workaround I propose in that thread. Perhaps other readers know of something?

Hi Shlomo,

great post!

I’m trying to load test a website using jmeter and I’ve configured elb as member of eu-west-1a only. I’ve also configured an autoscaling trigger, in order to increase the number of instances if the load is above a threshold. I’m running jmeter from a large instance in the same availability zone. What I notice is that, once the additional instance is up and running, it doesn’t receive any traffic, while the old one stays overloaded. Am I doing anything wrong in the config? AppCookieStickinessPolicies and LBCookieStickinessPolicies are both unset. Any thoughts?
Thanks for the great post



You should double-check that the AutoScaling group is connected to the ELB. As per ChrisK@AWS in this thread:

AutoScaling will not add instances to the ELB if you do not specify the ELB name in the AutoScalingGroup. Without the ELB, AutoScaling will just launch the instance and you can add the instance to your ELB whenever the instance is ready. Note however that, when AutoScaling terminates an instance, without the ELB configured, AutoScaling will not be able to remove the instance from ELB.


the trigger adds the instance to the elb automatically. I’ve tried to disable the keepalive flag in the Jmeter script, but still I don’t see the traffic equally distributed among the instances.


Did you start JMeter’s JVM with the property set appropriately?
Are you hitting the ELB’s DNS name in JMeter?
What happens if you hit the ELB’s DNS name via a browser and constantly refresh the page – does the traffic get distributed?


first thank for the help. I’m running jmeter with the following param, and the script is configured in order to use an HTTP Request Defaults pointing to the lb dns name.
How do I check from the browser which webserver responds to the request? Shall I just check IIS logs and see if they all contain requests from my browser?
Thanks again

one thing that is not clear to me is if there is a case where an ELB can become unavailable. What happens when whatever piece of equipment within amazon that is forwarding the traffic dies? Do they create a new ELB on another device or does it just go away like a regular instance? I’m sort of inferring that it reappears, but I’d like to know for sure…


Failure modes for ELB are not documented anywhere, and there’s no SLA for ELB. Sometimes there are service interruptions (rarely, actually) but the ELB does “come back to life” with the same endpoint name.

Thanks for an excellent article.

For us, the major limitation of this service is the requirement of a cname DNS entry to map to the load balancer. Since Amazon has not commented on when they ever plan to allow an Elastic IP to be mapped, this is really holding us back from moving to EC2.

The problem is our web sites all operate without the “www” prefix, and this would not be a popular change to make. Redirects would put our search rankings etc at risk, and it’s just an ugly solution.

I was toying with the idea of writing a service that periodically polls the DNS CNAME address of the load balancer, and dynamically updates a proper A record with the returned IP address. The public could then access the site via this A address as normal.

This would partially solve the problem for us, giving us failover, and balancing processing load over multiple servers, but of course would retain a single load balancer as a potential throughput bottleneck. However this may well be acceptable for us, as the failure tolerance is our main requirement.

Can you think of any other potential issues with this approach? Obviously the service would have to run on multiple servers to ensure the A record stays up to date, but how often does it actually change?

Is there any way to retrieve a list of all the IP’s currently in service for a given ELB? (either via DNS or the API).


There’s no concern about a single ELB being a bandwidth bottleneck. Because multiple load balancing machines are used (represented by the variety of IP addresses returned by the DNS lookup) ELB spreads the traffic out according to the load at the time.

As described in the article, ELB’s pool of addresses varies with load. If your load changes significantly, odds are you’ll see a change to the pool of addresses.

The approach you describe has been suggested before, but I’ve yet to see anyone try it. Theoretically it should work. Please share your experiences if you do try it!

Ah yes what I meant is that by polling the cname and updating an A record periodically, I’ll really only be sending people to a single IP address from the pool that *may* have been allocated to the ELB.

From the sounds of it, my updater service is likely to get the same IP from the cname each request. And even if it does get a new IP, I can’t really guarantee that previous ones would still be available.

That is, unless there’s a way I can retrieve all of the IP’s currently allocated to the ELB, but it doesn’t seem like there is.

Will let you know how I go though!


Ah, I see. You are correct that only using one of the returned IP addresses will interfere with scalability.

If you use 12 clients per enabled ELB availability zone to poll the ELB’s DNS entry then you should be able to get the complete current list of assigned IP addresses.
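As a rough illustration of the polling idea, here is a minimal Python sketch that repeats a DNS lookup and keeps the union of every address it has seen. The function names are hypothetical, and the sketch assumes each lookup can return a fresh answer; repeated lookups from a single host may simply hit a local cache, which is why several independent clients are suggested above.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def collect_addresses(
    hostname: str,
    lookups: int,
    resolver: Callable[[str], Iterable[str]] = resolve_ipv4,
) -> Set[str]:
    """Run `lookups` resolutions and return the union of all addresses seen."""
    seen: Set[str] = set()
    for _ in range(lookups):
        seen.update(resolver(hostname))
    return seen
```

The resolver is passed in as a parameter so that the collection logic can be exercised with a fake resolver, or pointed at different DNS servers, without touching the rest of the code.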

I ended up doing as I suggested … a script that runs every minute, looks up the DNS for the load balancer cname, and then updates the A records using Dynamic DNS.

The A records have a short 60 second TTL, just like the original CNAME. I use DNSMadeEasy for the Dynamic DNS.

It’s working very well! I have to create a fixed number of A records to be used in the round robin, so I just created 20 and then spread the currently allocated IP’s across them.

eg. if there was just and,
IP1 –
IP2 –
IP3 –
… etc
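The original code for this was C#; the following is only a small Python sketch of the "spread the current IPs across a fixed number of A records" step. The record labels IP1, IP2 and so on follow the example above, while the function name and the wrap-around assignment are assumptions about how the spreading could be done.

```python
from typing import Dict, List


def assign_slots(ips: List[str], slot_count: int = 20) -> Dict[str, str]:
    """Map a fixed set of A-record slots onto the currently allocated IPs.

    When there are fewer IPs than slots, the IPs repeat in order so that
    every slot always points at a live address.
    """
    if not ips:
        raise ValueError("need at least one IP address")
    ordered = sorted(set(ips))  # stable order, duplicates removed
    return {f"IP{i + 1}": ordered[i % len(ordered)] for i in range(slot_count)}
```

Sorting keeps the assignment stable from one run to the next, so a minute-by-minute updater only needs to push records whose value actually changed.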

If anyone wants the code (c# .net) I’d be happy to share.

I’m still considering HAProxy, but mainly because our web site is tied in with a TV show, and the ELBs just don’t scale up quickly enough for that instant demand peak.

Excellent post! I am new to ELB and have a question about session stickiness. I have a 3-tier app that’s configured like this: client -> ELB1 -> web server -> ELB2 -> tomcat. I have the same stack set up in 2 availability zones. Implementing session stickiness means that a client request should (A) be sent to the correct availability zone and (B) be sent to the correct tomcat instance. (B) can be addressed by enabling application-based session stickiness on ELB2. How can I address (A)? In other words, how do I make sure that a given client request always goes to one of the web servers in the same availability zone?

Thanks in advance


You can use an LB session cookie, as I describe in my article, on the front-end ELB. This cookie will be added by ELB and will cause the same client to always be sent to the same web server:

elb-create-lb-cookie-stickiness-policy myFrontEndLoadBalancer --policy-name fifteenMinutesPolicy --expiration-period 900
elb-set-lb-policies-of-listener myFrontEndLoadBalancer --lb-port 80 --policy-names fifteenMinutesPolicy

Hi Shlomo,

1. What if we need to handle traffic spikes? How fast does ELB adapt to sudden traffic spikes?
2. If we have an ELB per region, how can we make users from Europe get to the ELB in Europe and those from the US to the ELB in the US?
3. What are the downsides of skipping ELB and instead setting up a very strong instance running a good software load balancer, with another instance as a fail-safe for HA? If one instance cannot handle the network load, we can run several and use DNS round-robin between them.



1. What if we need to handle traffic spikes? How fast does ELB adapt to sudden traffic spikes?

ELB makes no guarantee about the speed of its reaction to a spike. In my experience it’s good for most “non-steep” traffic events. I would not recommend ELB for, say, balancing in front of a web farm that opens the floodgates to traffic all at once – for example to conduct a contest, promotion, or time-triggered marketing event.

2. If we have an ELB per region, how can we make users from Europe get to the ELB in Europe and those from the US to the ELB in the US?

This is called Global Server Load Balancing (GSLB) and AWS doesn’t offer this service yet, but other DNS providers do. AWS has many ingredients necessary to offer this service in the future – Route 53 and CloudFront are, together, probably 90% of the puzzle.

3. What are the downsides of skipping ELB and instead setting up a very strong instance running a good software load balancer, with another instance as a fail-safe for HA? If one instance cannot handle the network load, we can run several and use DNS round-robin between them.

Some advantages of do-it-yourself the way you describe over ELB are as follows:
– It’s easier to diagnose problems.
– It’s easier to fix problems.
– It can load balance in front of non-cloud nodes as well.
– The scaling strategy can be integrated with the application.
Some disadvantages of do-it-yourself are:
– It increases the risk of lost traffic, as DNS lookups are cached and fall out of date with the roster of currently-active load balancing instances.
– It requires monitoring network traffic to identify when scaling out is necessary.
– It’s complex.

If you’re thinking about implementing something like this yourself, check out HA DNS services (such as ) that will ping your server every so often and change the DNS entry if there’s a problem.

Thanks Shlomo, I am currently evaluating EC2 Enterprise product from How would you rate that?

@Pratyush Pushkar

It’s not really a “solution” but I see several people in that thread misclassifying the problem. It’s not a hard 60-second timeout on all connections. It’s just a timeout on idle connections. So file transfers, streaming and other cases where you actually need long connections work fine.

The real problem cases, as far as I can tell, are:
1. A poorly-written app that does so much processing inside the request/response cycle that it takes >60 seconds to calculate a response. I’d imagine we’ve all been guilty of this for “good-enough” reporting from time to time, but it’s something to be avoided in production anyway and is solved pretty easily with job queues and AJAX polling. I <3 Celery for this in Python:
2. Long-polling comet solutions. With the 60-second idle timeout, you have to change your application to poll more often than every 60 seconds. For some apps, polling every 5 minutes would be better, but with ELB you have to bring that down to something less than 60 seconds. Definitely not optimal.

I'd love to hear about a use case other than comet where you need HTTP connections open and idle for more than 60 seconds, though. I'm sure they exist somewhere, and if it's an absolute requirement that you can't tweak, then it seems Shlomo is correct: software load balancers are the solution in EC2. Why do you need more than 60 seconds, out of curiosity?

@Wes Winham

The use case we are encountering is this: we have an application that gets its data from a third-party service. For a select few users with a large amount of data, retrieving the data from the third-party service takes anywhere between 45 and 75 seconds, and therefore more often than not the HTTP connection made to the third-party service remains idle until we get the data back.

Can you tell me if there is an alternative solution for such a use case? Also, we were in the middle of the release when we encountered this issue and therefore revamping the code completely may not be an effective solution. Please let me know your thoughts or help me understanding if I am still misreading the problem.

@Pratyush Pushkar,

You’ll need to use your own software load balancing solution, such as HAProxy or aicache. They’re not elastic, but you can carefully choose an instance size for the load balancer that supports the expected traffic.


Thanks for your feedback. The problem there would be managing the failover for the load-balancer instance per se. Any suggestions regarding how to handle that?

@Pratyush Pushkar,

There are DNS-based solutions that do a health check on the primary IP address and failover to the secondary IP address. Each service differs in the frequency of its health checks.

Alternatively, you can keep a spare instance running the software load balancer. Use rsync on a cron job to periodically sync the load balancer config file from the live instance to the hot spare – this will allow you to update only the live load balancer when instances join and leave the load balanced cluster. And, create a test script on the spare, run by a cron job every minute or so, to attempt to resolve a URL from the live instance. This script, when that URL fails, should use the EC2 API to remap the Elastic IP from the live instance to itself. Be sure to stop the periodic test script after failover happens. For added robustness the hot spare should be in a different availability zone than the live load balancer instance.
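The test script on the hot spare could look something like this minimal Python sketch. It is only an outline under assumptions: the function names, the retry count, and the timeout are hypothetical, and the actual Elastic IP remapping is left as a callback you would implement with whichever EC2 API client or command-line tool you use.

```python
import urllib.error
import urllib.request
from typing import Callable


def live_is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the live load balancer answers the test URL successfully."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures and HTTP error codes.
        return False


def run_check(
    probe: Callable[[], bool],
    remap_elastic_ip: Callable[[], None],
    attempts: int = 3,
) -> bool:
    """Probe up to `attempts` times; fail over only if every probe fails.

    Returns True when failover was triggered, so the caller can disable
    the periodic check afterwards.
    """
    for _ in range(attempts):
        if probe():
            return False
    remap_elastic_ip()
    return True
```

Requiring several failed probes before remapping is a design choice to avoid failing over on a single dropped request; the return value is there so the cron wrapper knows when to stop running the check.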

The “right” solution depends on your desired availability profile and your budget.

@Pratyush Pushkar

Let’s see if I understand your use case correctly. A user makes a request for a page from your web application. Upon receiving the request, your application queries a 3rd-party service for data, waits 45-75 seconds for that data to be retrieved, and then returns the results to the user.

If I’m understanding what you’re saying, your request to the 3rd-party service isn’t going through ELB, so it’s not that request that’s being timed out. The issue is that it’s taking you more than 60 seconds to generate a response to the user, and that request is what is timing out.

If that’s the case, the best way to address this would be with a job queue. There are lots of options there, depending on your stack (Celery, DelayedJob, Resque). When your application receives the request, it should immediately return a page to the user and fire off a job to go get the 3rd-party data. The user’s page then polls via AJAX to get the status of that job (possibly displaying a progress bar), and to retrieve the results upon completion. This has the benefit of A) Giving instant feedback to your user B) Not tying up two HTTP requests for the duration C) Not holding one of your expensive web workers (Apache, Nginx, Tomcat, whatever) for the entire request
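To make the pattern concrete, here is a minimal in-process Python sketch of "submit a job, return immediately, poll for status". It uses a thread pool from the standard library as a stand-in for a real job queue such as Celery, and the class and method names are hypothetical. A production setup would keep job state outside the web process.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobQueue:
    """Tiny stand-in for a job queue: run work in the background, poll by id."""

    def __init__(self, workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Start the job and return its id right away (the request handler's part)."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> Dict[str, Any]:
        """What a polling endpoint would return for this job."""
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "pending"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def wait(self, job_id: str, timeout: float = 10.0) -> None:
        """Block until the job finishes; useful in tests, a web app would poll."""
        self._jobs[job_id].exception(timeout=timeout)
```

In a web application, the first request would call `submit` and return the job id in the page, and each subsequent poll would call `status`, so no single HTTP request stays open for the 45-75 seconds the third-party call takes.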

This of course requires changing your application, but it’s a good change to make for reasons beyond the ELB constraint. There are also really good libraries to make this fairly pain-free.

@Wes Winham
The issue is that the client application in this case is a mobile application, and implementing a job queue for a mobile application may be very tricky. Any ideas?

@Pratyush Pushkar

The intended client for your application shouldn’t matter. The job queue lives on the server. In fact, for a mobile application, quick user feedback is even *more* important because network connections are so much less reliable.

In the case of a mobile application, you might want to avoid AJAX and go for a simple page that reloads every few seconds until the job is complete. When the job is finished the page just reloads another that displays the results, instead of reloading itself.

This kind of thing can be achieved completely server-side with no javascript.
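For example, the server could render the waiting page along these lines. This is a small Python sketch, assuming a hypothetical `/results/<job id>` URL for the finished job and a five-second reload interval; neither comes from the discussion above.

```python
from html import escape


def render_job_page(job_id: str, done: bool, refresh_seconds: int = 5) -> str:
    """Return a page that reloads itself while pending, or jumps to the results."""
    if done:
        # Hypothetical results URL; the browser follows it immediately.
        target = f"/results/{escape(job_id, quote=True)}"
        refresh = f'<meta http-equiv="refresh" content="0; url={target}">'
        body = "Finished. Loading your results."
    else:
        # No URL given, so the browser reloads this same page after the delay.
        refresh = f'<meta http-equiv="refresh" content="{refresh_seconds}">'
        body = "Still working. This page reloads automatically."
    return f"<html><head>{refresh}</head><body><p>{body}</p></body></html>"
```

Each reload is a short request of its own, so nothing sits idle long enough to run into the 60-second idle timeout.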

Thanks, it is really very helpful.
Right now we are trying to figure out the scheduling policy of ELB. We do not have our own domain name, so can we point an instance domain name to ELB?
Or set an alias name for the instance, then point the alias name to ELB? We tried, but failed.

What I want to say is: we often point a 3rd-level domain name (must be a CNAME, like to ELB.

Right now, because we do not have our own domain name, the only domain name we have is the one we get by running an instance. It is a 4th-level domain name.

I have tried pointing the 4th-level domain name (the domain name of the instance) to ELB, but failed.
I also tried setting a 5th-level domain name (CNAME), pointing to ELB. We failed again.

So could you give me some advice? Thanks a lot!


Setting up a CNAME alias requires a DNS domain that you control.

Even though most use cases call for it, you don’t need to use a DNS CNAME alias in order to use ELB. You can send your traffic directly to the ELB’s DNS name, e.g. So if that is your ELB’s DNS name then you can send web requests to your ELB at . Most people set up a CNAME alias in order to provide a nicer-looking URL and an additional point of control.


is there a way to monitor the number of connections and all other Load Balancer statistics like you get from HAProxy? Any UI frontend?



ELBs publish metrics to CloudWatch. The useful ones are Latency and RequestCount.

In the AWS Management Console you can see graphs of these metrics.
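If you would rather pull the numbers programmatically, the query could be assembled roughly as in the Python sketch below. It only builds the request parameters; the helper name is hypothetical, and it assumes you pass the result to a CloudWatch client's `get_metric_statistics` call (for instance via boto3, which is not mentioned in the article).

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def elb_metric_query(
    load_balancer_name: str,
    metric_name: str,
    statistic: str,
    minutes: int = 60,
    period_seconds: int = 60,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build CloudWatch GetMetricStatistics parameters for one ELB metric."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ELB",
        "MetricName": metric_name,  # e.g. "Latency" or "RequestCount"
        "Dimensions": [{"Name": "LoadBalancerName", "Value": load_balancer_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],  # e.g. "Average" for Latency, "Sum" for RequestCount
    }
```

With a client in hand, usage would be along the lines of `client.get_metric_statistics(**elb_metric_query("myFrontEndLoadBalancer", "RequestCount", "Sum"))`.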


A very informative post about the ELB.

I am new to the cloud and am trying to load test a high-traffic application using JMeter.

I have one ELB routing the traffic to 4 JBoss instances (m1.xlarge) and 3 MySQL shards.

I ran a 7500 Vusers test using 1 JMeter master and 8 JMeter slave instances sitting in the cloud, and I could reach up to X req/sec with Y ms as the response time. There was no contention at the JBoss and DB layers.

Just to find the breaking point, I tried running the same test with 10K Vusers and increased the JMeter slaves to 12; however, the test still achieved the same X req/sec with Y ms as the response time, and there was no contention at the JBoss and DB layers this time either.

I am not sure whether I am hitting the ceiling of ELB inbound traffic or whether this could be due to some other reason.

I would highly appreciate your suggestions on this.

Thanks in advance.


Let’s look at those numbers first.
Y is the response time. This number should stay constant across your two test cases, which are identical in all ways except for number of requests generated.
X is the req/sec served through the LB. When this number peaks it can mean one of two things:
1. You’ve hit the maximum throughput of the ELB.
2. You’ve hit the maximum throughput of the back-end instances.
The way to tell the difference is by looking at the back-end instances. If they are maxing out on CPU or network input/output (or some other observable constraint) then the bottleneck is in the back-end instances. If the back-end instances are not stressed on any axis then the bottleneck is in the LB.
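The decision rule above can be written down as a tiny Python sketch. The function name, the metric names, and the 90% saturation threshold are all assumptions for illustration; pick whatever utilization figures you can actually observe on your back-end instances.

```python
from typing import Dict


def locate_bottleneck(backend_utilization: Dict[str, float], saturation: float = 0.9) -> str:
    """Classify where throughput peaked, given back-end utilization as fractions (0.0-1.0)."""
    stressed = sorted(
        name for name, value in backend_utilization.items() if value >= saturation
    )
    if stressed:
        # At least one observable constraint on the back end is maxed out.
        return "back-end (" + ", ".join(stressed) + ")"
    # Nothing on the back end is stressed, so the limit is in front of it.
    return "load balancer"
```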

In your case it sounds like you hit the maximum throughput of the LB. Your ELB would need to scale itself in order to handle more throughput.

However, as mentioned in the article, your test case does not really stress the scalability of ELB properly. The test case should gradually ramp up the req/sec over time in order for the LB to be able to scale itself to meet the load. So the X req/sec you observed is a false reading.
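A gradual ramp can be planned with something as simple as the following Python sketch, which produces evenly spaced request-rate targets for successive test stages. The function name and the linear shape of the ramp are assumptions; how long to hold each stage depends on how quickly the LB scales, which is not documented.

```python
from typing import List


def ramp_schedule(start_rps: int, target_rps: int, steps: int) -> List[int]:
    """Return `steps` request-rate targets rising linearly from start to target."""
    if steps < 2:
        raise ValueError("need at least two steps to form a ramp")
    span = target_rps - start_rps
    # Integer arithmetic keeps the first and last entries exactly on the endpoints.
    return [start_rps + span * i // (steps - 1) for i in range(steps)]
```

For instance, `ramp_schedule(100, 1000, 4)` gives `[100, 400, 700, 1000]`, and each value would be the target rate for one stage of the test.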

If you want to check the maximum performance of a single back-end instance, run your JMeter test cases against a single back-end instance.

thanks shlomo for your quick response.

It looks to me that we are hitting the max throughput at ELB as we do not see any contention from the back end servers…

We are gradually increasing the requests/second over a 10-minute duration.
Any idea how to check the max throughput we can achieve from ELB?

Thanks a ton.


It’s likely that the tests are not causing the ELB to scale. This might be due to the DNS lookups on the JMeter hosts being improperly cached. Try launching the JMeter processes with the -D option to adjust the property as discussed in the article above.

I have not yet seen a ceiling for LB throughput in all the tests I’ve run and reviewed.

ec2 ip address from the change in the Will for the consol?
given ip once a day, for instance want to change.


If I understand correctly, you’re asking: “Is there a way to change the IP address of an instance from the AWS Management Console? And can you change the IP address of an instance whenever you want, for example once a day?”

Please correct me if I misunderstood your question.

Using the AWS Management Console (or the API, or the command-line tools) you can change the public IP address of an instance by associating and removing an Elastic IP. When you associate an Elastic IP, the original public IP address is released. When you remove an Elastic IP, a new public IP address is provided. Usually this new address is different from the original address (the address the instance had before the Elastic IP was associated), but sometimes it is the same.

To do this on a regular schedule you could write a script to call the appropriate command-line tools or API methods, and use standard scheduling tools (e.g. cron).


Awesome post!!

I’ve been using EC2 for only a few months now, and it keeps amazing me how simple things are when it’s used correctly. I just implemented ELB in front of our EC2 servers and it works perfectly. Thank you for your great post!!!

One question though: from what I understand so far, you can reach the load balancer only from the “outside”, and I would also like to use ELB to distribute internal requests between my instances within the cloud. Is it possible to create an EC2 load balancer that works without first going outside the cloud?
I know it will work this way too, but it will cost more and take more time than if I could do it all internally.

once again, thank you for a great post!



I’m glad you find this post helpful.

It sure would be great to use ELB as an internal component without incurring the extra bandwidth charge. Unfortunately that is not an option today. ELB is designed to be accessed via public IP addresses (which incur the internet bandwidth charge) because (presumably) the ELB system needs the ability to remap those public IP addresses to different virtual LB machines when one of them fails – exactly the same way you would remap an Elastic IP to recover from instance failure.

I guess it might be technically possible for AWS to deliver an ELB that is accessed via private IP addresses, for “internal-to-EC2” use, but it would not be as resilient in the face of failure as “regular” ELB.

“2. The client looks in DNS for the resolution of the name This DNS entry is controlled by Amazon since it is under the domain. Amazon’s DNS server returns an IP address, say”

Would this reply be cached by the client’s ISP? So, next time would be returned for from DNS cache without asking amazon?

Awesome article! However, I have one correction to a point which I think is critical for understanding how the ELB appliance scales.

You said:

By the way, Step 2 can be replicated to a limited degree by using Round-Robin DNS to serve a pool of IP addresses, each of which is a load balancer. With such a setup you could have multiple load-balanced clusters of EC2 instances, each cluster sitting behind its own software load balancer on an EC2 instance. But Round Robin DNS has its own limitations (such as the inability to take into account the load on each load-balanced unit, and the difficulty of dynamically adjusting the pool of IP addresses to distribute), from which ELB does not suffer.

Unfortunately, Round-Robin DNS is exactly what AWS uses to balance load across the ELB instances. They say so in passing on this page: and I’ve confirmed in my own use.

My understanding from one of their consultants is that this is just using the standard R53 Round-Robin capabilities, so it does not take the load on a particular ELB node into account at all. Even if it did, it would still be subject to the same problems RR DNS faces with respect to DNS caching and large numbers of clients sitting behind a single cache all hitting the same IP.

This means that in the pathological case of a single client generating all of your traffic, the ELB is effectively restricted to a single node at a time, and offers no scale for incoming connections.
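The effect can be illustrated with a toy Python model. It assumes plain round-robin DNS in which each resolver cache receives the next IP in rotation and every client behind that cache uses that one IP for the whole TTL window; the function name and this simplification are mine, not a description of how AWS actually implements it.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def load_per_node(clients_per_cache: List[int], node_ips: List[str]) -> Dict[str, int]:
    """Count clients landing on each node during one TTL window.

    Each DNS cache gets the next IP in rotation, and all clients behind
    that cache connect to that single IP.
    """
    rotation = cycle(node_ips)
    load = Counter({ip: 0 for ip in node_ips})
    for clients in clients_per_cache:
        load[next(rotation)] += clients
    return dict(load)
```

With one cache fronting 1000 clients and three nodes, all 1000 land on a single node; with three caches of equal size, the load is spread evenly.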


Thanks for posting this. ELB used to work as described in this article. Since then, AWS has released more documentation on ELB internals, and they keep it up to date with changes, which is a welcome development.

Excellent information on ELB.

I am having a DNS resolution issue with an internet-facing ELB URL.

Resolving “publicnlb” takes a long time on the first lookup and again after the TTL expires at the ISP level, so our clients say the application responds very slowly.

Since it does not have static IPs, I can’t add it in Route 53. For now I have added an alias for publicnlb with

Even so, it is not resolving and I am not getting a timely response.

Please give me any suggestions for resolving this issue.

I am facing an issue in AWS where my web-layer EC2 instances try to connect to the app ELBs. When the instances look up the CNAME, they at times try to connect to old IP addresses that were previously mapped to the app ELBs, and the connections time out. I am not sure whether any caching is happening on the DNS lookup.

To confirm this, I used dig on the app ELB name to get the IP addresses that the ELB’s A record is pointing to.


I’m very new to EC2/ELB and have a question related to the load balancer changing IP addresses. My current domain has a CNAME record pointed to my ELB name, which seems to be working okay in my development environment. My question is this: if the IP address of the ELB changes, will my clients have connection issues if they haven’t received the new IP address due to DNS TTL? Or will the old IP address still work? If this question doesn’t make sense I can try to reword it; again, I’m new to this stuff. I have 1 ELB and 1 EC2 instance attached to the ELB.
