Month: January 2011

Lessons Learned from Using Multiple Cloud APIs

Post author By Shlomo Swidler
Post date January 27, 2011

Adrian Cole, author of the jClouds library, has an excellent writeup of the trajectory the library’s development followed as it added support for more cloud providers’ APIs.

[Update August 2011: Blogger.com has removed Adrian’s blog so that link no longer works.]

Some important takeaways for application developers:

Your unit tests are as valuable as your code. The tests ensure the code works to spec, and they should be used as frequently as possible during development.
Make your code easy to test, with sensible defaults that require no external dependencies: e.g. don’t require internet connectivity.
Cloud limitations, both general (such as eventual consistency) and cloud-specific (such as a limit on the number of buckets per S3 account), will require careful consideration in your code (and tests).
Some things can only really be tested against a live cloud service. As Adrian points out, the only way to test that an instance launched with the desired customizations is to ssh in to that instance and explore it from the inside. This is not testable using an offline stub emulator.
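The second takeaway, sensible defaults that need no external dependencies, might look something like the sketch below. All names here (InMemoryBlobStore, ReportArchiver) are hypothetical and are not taken from jClouds or Adrian's post, and the bucket cap of 100 is an assumed value standing in for a provider-side limit.

```python
class InMemoryBlobStore:
    """Offline stand-in for a cloud blob store; keeps everything in a dict."""

    def __init__(self, max_buckets=100):
        # Mimics a provider-side cap on buckets; 100 is an assumed value.
        self.max_buckets = max_buckets
        self._buckets = {}

    def create_bucket(self, name):
        if name not in self._buckets and len(self._buckets) >= self.max_buckets:
            raise RuntimeError("bucket limit reached")
        self._buckets.setdefault(name, {})

    def put(self, bucket, key, data):
        self._buckets[bucket][key] = data

    def get(self, bucket, key):
        return self._buckets[bucket][key]


class ReportArchiver:
    """Hypothetical application code. Defaults to the offline store."""

    def __init__(self, store=None):
        # No store given: fall back to the in-memory one, so unit tests
        # run without credentials or internet connectivity.
        self.store = store if store is not None else InMemoryBlobStore()

    def archive(self, report_id, text):
        self.store.create_bucket("reports")
        self.store.put("reports", report_id, text)
        return self.store.get("reports", report_id)
```

A test can then call ReportArchiver().archive("r1", "hello") with no setup at all, and a separate test can build InMemoryBlobStore(max_buckets=1) to exercise the limit-handling path. Behavior that only a live service exhibits still has to be tested against the live service.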

But the key lesson developers can learn is: Whenever possible, use an existing library to interface with your clouds. As Adrian’s post makes patently clear, a lot of effort goes into ensuring the library works properly with the various supported APIs, and you can only benefit by leveraging those accomplishments.

On the other side of the fence, API developers can also learn from Adrian’s article. As William Vambenepe recently commented:

Rather than spending hours obsessing about the finer points of your API, spend the time writing love letters to [boto author] Mitch and Adrian so they support you in their libraries.

In fact, Adrian’s blog can be viewed as a TODO list for API creators who want to encourage adoption.

API authors should also refer to Steve Loughran’s Cloud Tools Manifesto for more great ideas on how to make life easy for developers.

Tags apis, devops, jclouds

Cloud Developer Tips

AWS Auto-Scaling and ELB with Reliable Root Domain Handling

Post author By Shlomo Swidler
Post date January 24, 2011

Update May 2011: Now that AWS Route 53 can be used to allow an ELB to host a domain zone apex, the technique described here is no longer necessary. Cool, but not necessary.

Someone really has to implement this. I’ve had this draft sitting around ever since AWS announced support for improved CloudWatch alerts and AutoScaling policies (August 2010), but I haven’t yet turned it into a clear set of commands to follow. If you do, please comment.

Background

You want an auto-scaled, load-balanced pool of web servers to host your site at example.com. Unfortunately it’s not so simple, because AWS Elastic Load Balancer can’t be used to host a domain apex (AKA a root domain). One of the longest threads on the AWS Developer Forum discusses this limitation: because ELB utilizes DNS CNAMEs, which are not legal for root domain entries, ELB does not support root domains.
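To make the CNAME restriction concrete, here is a minimal sketch of the DNS rule involved: a name that carries a CNAME may not carry any other record type, and the zone apex always carries SOA and NS records. The validate_zone function and the ELB hostname in the usage note are hypothetical illustrations, not part of any AWS tool.

```python
def validate_zone(apex, records):
    """Return a list of problems found in (name, type, value) records.

    Checks one rule only: a name with a CNAME record must have no other
    record types. The apex needs SOA and NS, so a CNAME there always fails.
    """
    types_by_name = {}
    for name, rtype, _value in records:
        types_by_name.setdefault(name, set()).add(rtype)

    problems = []
    for name, types in sorted(types_by_name.items()):
        if "CNAME" in types and len(types) > 1:
            where = "zone apex" if name == apex else "name"
            others = ", ".join(sorted(types - {"CNAME"}))
            problems.append("%s %s mixes CNAME with %s" % (where, name, others))
    return problems
```

Given SOA and NS records for example.com plus a CNAME from example.com to a placeholder such as my-elb-1234567890.us-east-1.elb.amazonaws.com, the function reports the apex as invalid, while a CNAME on www.example.com alone passes.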

An often-suggested workaround is to use an instance with an Elastic IP address to host the root domain, via standard static DNS, with the web server redirecting all root domain requests to the subdomain (www) served by the ELB. There are four drawbacks to this approach:

1. The instance with the Elastic IP address is liable to be terminated by auto-scaling, leaving requests to the root domain unanswered.
2. The instance with the Elastic IP address might fail unnaturally, again leaving requests to the root domain unanswered.
3. Even when traffic is very low, we need at least two instances running: the one handling the root domain outside the auto-scaled ELB group (due to issue #1) and the one inside the auto-scaled ELB group (to handle the actual traffic hitting the ELB-managed subdomain).
4. The redirect adds additional latency to requests hitting the root domain.

While we can’t do anything about the fourth issue, what follows is a technique to handle the first three issues.
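The redirect at the heart of the workaround is simple. Here is a minimal sketch as a Python WSGI application, assuming the root domain is example.com and the ELB-served subdomain is www.example.com. In practice this would more likely be a rewrite rule in the web server's own configuration; the function names are hypothetical.

```python
def redirect_target(host, path, root="example.com", sub="www.example.com"):
    """Return the URL to redirect to, or None to serve the request normally."""
    hostname = host.split(":")[0].lower()  # drop any port, ignore case
    if hostname == root:
        return "http://%s%s" % (sub, path)
    return None


def app(environ, start_response):
    """WSGI entry point: redirect root-domain requests, serve the rest."""
    target = redirect_target(environ.get("HTTP_HOST", ""),
                             environ.get("PATH_INFO", "/"))
    if target:
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"served by the subdomain\n"]
```

Every request to the root domain pays for this extra round trip, which is the latency cost named in the last drawback.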

The Idea

The idea is built on these principles:

The instance with the Elastic IP is outside the auto-scaled group so it will not be terminated by auto-scaling.
The instance with the Elastic IP is managed using AWS tools to ensure the root domain service is automatically recovered if the instance dies unexpectedly.
The auto-scaling group can scale back to zero size, so only a single instance is required to serve low traffic volumes.

How do we put these together?

Here’s how:

1. Create an AMI for your web server. The AMI will need some special boot-time hooks, which are described below. The web server should be set up to redirect root domain traffic to the subdomain that you’ll want to associate with the ELB, and to serve the subdomain normally.
2. Create an ELB for the site’s subdomain with a meaningful Health Check (e.g. a URL that exercises representative areas of the application).
3. Create an AutoScaling group with min=1 and max=1 instances of that AMI. This AutoScaling group will benefit from the default health checks that such groups have, and if EC2 reports the instance is degraded it will be replaced. The LaunchConfiguration for this AutoScaling group should specify user-data that indicates this instance is the “root domain” instance. Upon booting, the instance will notice this flag in the user-data, associate the Elastic IP address with itself, and add itself to the ELB.
Note: At this point, we have a single, reliably hosted instance serving both the root domain and the subdomain.
4. Create a second AutoScaling group (the “ELB AutoScaling group”) that uses the same AMI, with min=0 instances (the max can be anything you want) and set it up to use the ELB’s Health Check. The LaunchConfiguration for this group should not contain the abovementioned special flag, because these are not root domain instances.
5. Create an Alarm that looks at the CPUUtilization across all instances of the AMI, and connect it to the “scale up” and “scale down” Policies for the ELB AutoScaling group.
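As a rough sketch, the two AutoScaling groups described in the steps above can be written down as plain data. This deliberately makes no real AWS API calls. The group names, the user-data keys (root_domain, elastic_ip, elb_name) and the overflow maximum of 4 are all assumptions chosen for illustration.

```python
def build_plan(ami_id, elb_name, elastic_ip, overflow_max=4):
    """Describe the two groups as dicts; nothing here talks to AWS."""
    root_group = {
        "name": "root-domain-group",       # hypothetical name
        "ami": ami_id,
        "min_size": 1,                     # exactly one root-domain instance
        "max_size": 1,
        "health_check": "EC2",             # the group's default health check
        # The flag the boot-time hook looks for, plus what it needs to act.
        "user_data": {"root_domain": True,
                      "elastic_ip": elastic_ip,
                      "elb_name": elb_name},
    }
    overflow_group = {
        "name": "elb-overflow-group",      # hypothetical name
        "ami": ami_id,
        "min_size": 0,                     # may scale back to zero
        "max_size": overflow_max,
        "health_check": "ELB",             # use the ELB's Health Check
        "load_balancers": [elb_name],
        "user_data": {},                   # no flag: not root-domain instances
    }
    return [root_group, overflow_group]
```

The only differences between the two groups are the size bounds, the health check type, and the presence of the flag in the user-data. Both launch the same AMI.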

That is the basic idea. The result will be:

The root domain is hosted on an instance that redirects to the ELB subdomain. This instance is managed by a standalone Auto Scaling group that will replace the instance if it becomes degraded. This instance is also a member of the ELB, so it serves the subdomain traffic as well.
A second AutoScaling group manages the “overflow” traffic, measured by the CPUUtilization of all the running instances of the AMI.
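The scaling decision itself can be illustrated with a small sketch. The thresholds of 70% and 30% are assumptions chosen for the example, and in a real setup the averaging and comparison are done by the CloudWatch Alarm rather than by your own code.

```python
def scaling_action(cpu_by_instance, high=70.0, low=30.0):
    """Pick the policy an alarm on average CPUUtilization would trigger.

    cpu_by_instance maps instance id -> CPU percent, and includes the
    root-domain instance because it serves subdomain traffic too.
    """
    if not cpu_by_instance:
        return "none"
    average = sum(cpu_by_instance.values()) / float(len(cpu_by_instance))
    if average > high:
        return "scale up"
    if average < low:
        return "scale down"
    return "none"
```

With only the root-domain instance running at low load, the answer is "scale down", and the overflow group stays at zero instances.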

TODO

Here are the missing pieces:

A script that can be run as a boot-time hook that checks the user-data for a special flag. When this flag is detected, the script associates the root domain’s Elastic IP address (which should be specified in the user-data) and adds the instance to the ELB (whose name is also specified in the user-data). This will likely require AWS credentials to be placed on the instance, perhaps in the user-data itself (be sure you understand the security implications of this), as well as a library such as boto or the AWS SDK to perform the AWS API calls.
The explicit step-by-step instructions for carrying out steps 1 through 5 above using the relevant AWS command-line tools.
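As a starting point for the first missing piece, here is a sketch of the boot-time hook's decision logic. It assumes the user-data is a JSON object with the hypothetical keys root_domain, elastic_ip and elb_name, and that the instance metadata service answers plain GET requests at the URLs shown. The two AWS operations are passed in as functions, so this is not a complete script: you would still need to supply real implementations using boto or the AWS SDK.

```python
import json
from urllib.request import urlopen

USER_DATA_URL = "http://169.254.169.254/latest/user-data"
INSTANCE_ID_URL = "http://169.254.169.254/latest/meta-data/instance-id"


def fetch(url, timeout=2):
    """Read a value from the EC2 instance metadata service."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def run_boot_hook(user_data_text, instance_id, associate_eip, register_with_elb):
    """Apply the root-domain setup if the user-data flag is present.

    associate_eip(ip, instance_id) and register_with_elb(name, instance_id)
    are placeholders for real SDK calls, injected so the logic can be
    tested offline. Returns True if this is the root-domain instance.
    """
    try:
        config = json.loads(user_data_text or "{}")
    except ValueError:
        return False  # user-data is not in the assumed JSON format
    if not isinstance(config, dict) or not config.get("root_domain"):
        return False  # an ordinary overflow instance: nothing to do
    associate_eip(config["elastic_ip"], instance_id)
    register_with_elb(config["elb_name"], instance_id)
    return True
```

On a real instance the hook would be invoked as run_boot_hook(fetch(USER_DATA_URL), fetch(INSTANCE_ID_URL), ...) with the two SDK-backed functions filled in.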

Do you have these missing pieces? If so, please comment.

Tags auto scaling, aws, dns, elastic load balancing, elb

Cloud Developer Tips

Using Elastic Beanstalk via command-line on a Mac? Keep that OS X Install DVD handy

Post author By Shlomo Swidler
Post date January 19, 2011

The title pretty much says it all.

Elastic Beanstalk is the new service from Amazon Web Services offering you easier deployment of Java WAR files. More languages and platforms are expected to be supported in the future.

Most people will use the service via the convenient web console, but if you want to automate things you’ll either end up using the command-line tools (CLI tools) or the API in the Java SDK (until their other SDKs add Beanstalk support).

But, if you’re running on a Mac, you’ll have a problem running the command-line tools:

Shlomos-MacBook-Pro:ec2 shlomo$ elastic-beanstalk-describe-applications
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- json (LoadError)
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require'
    from /Users/shlomo/ec2/elasticbeanstalk/bin/../lib/aws/client/awsqueryhandler.rb:2
    from /Users/shlomo/ec2/elasticbeanstalk/bin/../lib/aws/client/awsquery.rb:4:in `require'
    from /Users/shlomo/ec2/elasticbeanstalk/bin/../lib/aws/client/awsquery.rb:4
    from /Users/shlomo/ec2/elasticbeanstalk/bin/../lib/aws/elasticbeanstalk.rb:19:in `require'
    from /Users/shlomo/ec2/elasticbeanstalk/bin/../lib/aws/elasticbeanstalk.rb:19
    from /Users/shlomo/ec2/elasticbeanstalk/bin/setup.rb:18:in `require'
    from /Users/shlomo/ec2/elasticbeanstalk/bin/setup.rb:18
    from /Users/shlomo/ec2/elasticbeanstalk/bin/elastic-beanstalk-describe-applications:18:in `require'
    from /Users/shlomo/ec2/elasticbeanstalk/bin/elastic-beanstalk-describe-applications:18

If you remember from the README (which you read, of course 😉), there was some vague mention of this:

If you're using Ruby 1.8, you will have to install the JSON gem: gem install json

OK, let’s try that:

Shlomos-MacBook-Pro:ec2 shlomo$ sudo gem install json
Building native extensions. This could take a while...
ERROR: Error installing json:
ERROR: Failed to build gem native extension.

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb
mkmf.rb can't find header files for ruby at /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/ruby.h

Gem files will remain installed in /Library/Ruby/Gems/1.8/gems/json-1.4.6 for inspection.
Results logged to /Library/Ruby/Gems/1.8/gems/json-1.4.6/ext/json/ext/generator/gem_make.out

The first complaint is about the ruby tools not being in the path. Let’s fix that and try again:

Shlomos-MacBook-Pro:ec2 shlomo$ export PATH=$PATH:/Users/shlomo/.gem/ruby/1.8/bin
Shlomos-MacBook-Pro:ec2 shlomo$ sudo gem install json
Building native extensions. This could take a while...
ERROR: Error installing json:
ERROR: Failed to build gem native extension.

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb
mkmf.rb can't find header files for ruby at /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/ruby.h

Gem files will remain installed in /Library/Ruby/Gems/1.8/gems/json-1.4.6 for inspection.
Results logged to /Library/Ruby/Gems/1.8/gems/json-1.4.6/ext/json/ext/generator/gem_make.out

Uh oh, no dice. What now? StackOverflow to the rescue:

The ruby headers don’t come installed with the base ruby install with Mac OS X. These can be found on Mac OS X Install Disc 2 by installing the XCode Tools.
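If you want to know in advance whether the native build can work, a quick check for the header file named in the error message will tell you. This is a minimal sketch: the path is copied from the mkmf.rb error above, so it only applies to the system Ruby 1.8 on this version of OS X, and the function name is hypothetical.

```python
import os

# Path taken from the mkmf.rb error message above; specific to the
# system Ruby 1.8 framework on this machine.
RUBY_HEADER = ("/System/Library/Frameworks/Ruby.framework/Versions/1.8"
               "/usr/lib/ruby/ruby.h")


def can_build_native_gems(header_path=RUBY_HEADER):
    """True if the Ruby header that mkmf.rb looks for is present."""
    return os.path.isfile(header_path)


if __name__ == "__main__":
    if can_build_native_gems():
        print("ruby.h found: 'gem install json' should be able to build")
    else:
        print("ruby.h missing: install the XCode Tools first")
```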

If you’re like me and you don’t carry around the OS X Install DVD wherever you go, you’re stuck.
Any readers in Seoul with an OS X 10.6 Install DVD?

Update 20 Jan 2011: I’ve gotten some comments that made me realize the title really didn’t say it all. Some clarifications are in order.

Some people pointed out that I should just download the XCode .dmg DVD image – all 3.4 GB of it. Unfortunately that wasn’t applicable for me at the time: I was connected via 3G, tethered to my Android phone. I’ve never tried to download 3.4 GB on that connection, and I don’t plan to try today: it would be expensive.

See the comment below by Beltran (who works for Bitnami) for a great solution.

Tags aws, beanstalk, cli, osx