Categories
Cloud Developer Tips

Track Changes to your Dynamic Cloud Services Automatically

Dynamic infrastructure can be a pain to accommodate in applications. How do you keep track of the set of web servers in your dynamically scaling web farm? How do your apps keep up with which server is currently running what service? How can applications be written so they don’t need to care if a service gets moved to a different machine? There are a number of techniques available, and I’m happy to share implementation code for one that I’ve found useful.

One thing common to all these techniques: they all allow the application code to refer to services by name instead of IP address. This makes sense because the whole point is not to care about the IP address running the service. Every one of these techniques offers a way to translate the name of the service into an IP address behind the scenes, without your application knowing about it. Where the techniques differ is in how they provide this indirection.

Note that there are four usage scenarios that we might want to support:

  1. Service inside the cloud, client inside the cloud
  2. Service inside the cloud, client outside the cloud
  3. Service outside the cloud, client inside the cloud
  4. Service outside the cloud, client outside the cloud

Let’s take a look at a few techniques to provide loose coupling between dynamically movable services and their IP addresses, and see how they can support these usage scenarios.

Dynamic DNS

Dynamic DNS is the classic way of handling dynamically assigned roles: DNS entries on a DNS server are updated via an API (usually HTTP/S) when a server claims a given role. The DNS entry is updated to point to the IP address of the server claiming that role. For example, your DNS may have a production-master-db.example.com record. When the production deployment’s master database starts up it can register itself with the DNS provider to claim the production-master-db.example.com dns record, pointing that DNS entry to its own IP address. Any client of the database can use the host name production-db-master.example.com to refer to the master database, and as long as the machine that last claimed that DNS entry is still alive, it will work.

When running your service within EC2, Dynamic DNS servers running outside EC2 will see the source IP address for the Dynamic DNS registration request as the public IP address of the instance. So if your Dynamic DNS is hosted outside EC2 you can’t easily register the internal IP addresses. Often you want to register the internal IP address because from within the same EC2 region it costs less to use the private IP address than the public IP addresses. One way to use Dynamic DNS with private IPs is to build your own Dynamic DNS service within EC2 and set up all your application instances to use that DNS server for your domain’s DNS lookups. When instances register with that EC2-based DNS server, the Dynamic DNS service will detect the source of the registration request as being the internal IP address for the instance, and it will assign that internal IP address to the DNS record.

Another way to use Dynamic DNS with internal IP addresses is to use DNS services such as DNSMadeEasy whose API allows you to specify the IP address of the server in the registration request. You can use the EC2 instance metadata to discover your instance’s internal IP address via the URL http://169.254.169.254/latest/meta-data/local-ipv4 .

Here’s how Dynamic DNS fares in each of the above usage scenarios:

Scenario 1: Service in the cloud, client inside the cloud: Only if you run your own DNS inside EC2 or use a special DNS service that supports specifying the internal IP address.
Scenario 2: Service in the cloud, client outside the cloud: Can use public Dynamic DNS providers.
Scenario 3: Service outside the cloud, client inside the cloud: Can use public Dynamic DNS providers.
Scenario 4: Service outside the cloud, client outside the cloud: Can use public Dynamic DNS providers.

Update window: Changes are available immediately to all DNS servers that respect the zero TTL on the Dynamic DNS server (guaranteed only for Scenario 1). DNS propagation delay penalty may still apply because not all DNS servers between the client and your Dynamic DNS service necessarily respect TTLs properly.

Pros: For public IP addresses only, easy to integrate into existing scripts.

Cons: Running your own DNS (to support private IP addresses) is not trivial, and introduces a single point of failure.

Bottom line: Dynamic DNS is useful when both the service and the clients are in the cloud; and for other usage scenarios if a DNS propagation delay is acceptable.

Elastic IP Addresses

In AWS you can have an Elastic IP address: an IP address that can be associated with any instance within a given region. It’s very useful when you want to move your service to a different instance (perhaps because the old one died?) without changing DNS and waiting for those changes to propagate across the internet to your clients. You can put code into the startup sequence of your instances that associates the desired Elastic IP address, making this approach very scriptable. For added flexibility you can write those scripts to accept configurable input (via settings in the user-data or some data stored in S3 or SimpleDB) that specifies which Elastic IP address to associate with the instance.

A cool feature of Elastic IP addresses: if clients use the DNS name of the IP address (“ec2-1-2-3-4.compute-1.amazonaws.com”) instead of the numeric IP address you can have extra flexibility: clients within EC2 will get routed via the internal IP address to the service while clients outside EC2 will get routed via the public IP address. This seamlessly minimizes your bandwidth cost. To take advantage of this you can put a CNAME entry in your domain’s DNS records.

Summary of Elastic IP addresses:

Scenario 1: Service in the cloud, client inside the cloud: Trivial, client should use Elastic IP’s DNS name (or set up a CNAME).
Scenario 2: Service in the cloud, client outside the cloud: Trivial, client should use Elastic IP’s DNS name (or set up a CNAME).
Scenario 3: Service outside the cloud, client inside the cloud: Elastic IPs do not help here.
Scenario 4: Service outside the cloud, client outside the cloud: Elastic IPs do not help here.

Update window: Changes are available in under a minute.

Pros: Requires minimal setup, easy to script.

Cons: No support for running the service outside the cloud.

Bottom line: Elastic IPs are useful when the service is inside the cloud and an approximately one minute update window is acceptable.

Generating Hosts Files

Before the OS queries DNS for the IP address of a hostname it checks in the hosts file. If you control the OS of the client you can generate the hosts file with the entries you need. If you don’t control the OS of the client then this technique won’t help.

There are three important ingredients to get this to work:

  1. A central repository that stores the current name-to-IP address mappings.
  2. A method to update the repository when mappings are updated.
  3. A method to regenerate the hosts file on each client, running on a regular schedule.

The central repository can be S3 or SimpleDB, or a database, or security group tags . If you’re concerned about storing your AWS access credentials on each client (and if these clients are web servers then they may not need your AWS credentials at all) then the database is a natural fit (and web servers probably already talk to the database anyway).

If your service is inside the cloud and you want to support clients both inside and outside the cloud you’ll need to maintain two separate repository tables – one containing the internal IP addresses of the services (for use generating the hosts file of clients inside the cloud) and the other containing the public IP addresses of the services (for use generating the hosts file of clients outside the cloud).

Summary of Generating Hosts Files:

Scenario 1: Service in the cloud, client inside the cloud: Only if you control the client’s OS, and register the service’s internal IP address.
Scenario 2: Service in the cloud, client outside the cloud: Only if you control the client’s OS, and register the service’s public IP address.
Scenario 3: Service outside the cloud, client inside the cloud: Only if you control the client’s OS.
Scenario 4: Service outside the cloud, client outside the cloud: Only if you control the client’s OS.

Update Window: Controllable via the frequency with which you regenerate the hosts file. Can be as short as a few seconds.

Pros: Works on any client whose OS you control, whether inside or outside the cloud, and with services either inside or outside the cloud. And, assuming your application already uses a database, this technique adds no additional single points of failure.

Cons: Requires you to control the client’s OS.

Bottom line: Good for all scenarios where the client’s OS is under your control and you need refresh times of a few seconds.

A Closer Look at Generating Hosts Files

Here is an implementation of this technique using a database as the repository, using Java wrapped in a shell script to regenerate the hosts file, and using Java code to perform the updates. This implementation was inspired by the work of Edward M. Goldberg of myCloudWatcher.

Creating the Repository

Here is the command to create the necessary database (“Hosts”) and table (“hosts”):

mysql -h dbHostname -u dbUsername -pDBPassword -e \
'CREATE DATABASE IF NOT EXISTS Hosts; \
USE Hosts; \
DROP TABLE IF EXISTS \`hosts\`; \
CREATE TABLE \`hosts\` ( \
\`record\` TEXT \
) DEFAULT CHARSET=latin1; \
INSERT INTO \`hosts\` VALUES ("127.0.0.1   localhost   localhost.localdomain");'

Notice that we pre-populate the repository with an entry for “localhost”. This is necessary because the process that updates the hosts file will completely overwrite the old one, and that’s where the localhost entry is supposed to live. Removing the localhost entry could wreak havoc on networking services – so we preserve it by ensuring a localhost entry is in the repository.

Updating the Repository

To claim a certain role (identified by a hostname – in this example “webserver1” – with an IP address 1.2.3.4) it is registered in the repository. Here’s the one-liner:

mysql -h dbHostname -u dbUsername -pDBPassword -e \
'DELETE FROM Hosts.\`hosts\` WHERE record LIKE "% webserver1"; \
INSERT INTO Hosts.\`hosts\` (\`record\`) VALUES ("1.2.3.4   webserver1");'

The registration process can be performed on the client itself or by an outside agent. Make sure you substitute the real host name and the correct IP address.

On an EC2 instance you can get the private and public IP addresses of the instance via the instance metadata URLs. For example:

$ privateIp=$(curl --silent http://169.254.169.254/latest/meta-data/local-ipv4)
$ echo $privateIp
10.209.206.223
$ publicIp=$(curl --silent http://169.254.169.254/latest/meta-data/public-ipv4)
$ echo $publicIp
75.101.198.120

Regenerating the Hosts File

The final piece is recreating the hosts file based on the contents of the database table. Notice how the table records are already in the correct format for a hosts file. It would be simple to dump the output of the entire table to the hosts file:

mysql -h dbHostname -u dbUsername -pDBPassword --silent --column-names=0 -e \
'SELECT \`record\` FROM Hosts.\`hosts\`' | uniq > /etc/hosts  # This is simple and wrong

But it would also be wrong to do that! Every so often the database connection might fail and you’d be left with a hosts file that was completely borked – and that would prevent the client from properly resolving the hostnames of your services. It’s safer to only overwrite the hosts file if the SQL query actually returns results. Here’s some Java code that does that:

Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn = DriverManager.getConnection("jdbc:mysql://" + dbHostname + "/?user=" +
	dbUsername + "&password=" + dbPassword);
String outputFileName = "/etc/hosts";
Statement stmt = conn.createStatement();
ResultSet res = stmt.executeQuery("SELECT record FROM Hosts.host");
HashSet<String> uniqueMe = new HashSet<String>();
PrintStream out = System.out;
if (res.isBeforeFirst()) {
	out = new PrintStream(outputFileName);
}
while (res.next()) {
	String record = res.getString(1);
	if (uniqueMe.add(record)) {
		out.println(record);
	}
}
out.println();
out.close();
res.close();
stmt.close();

This code uses the MySQL Connector/J JDBC driver. It makes sure only to overwrite the hosts file if there were actual records returned from the database query.

Scheduling the Regeneration

Now that you have a script that regenerates that hosts file (you did wrap that Java program into a script, right?) you need to place that script on each client and schedule a cron job to run it regularly. Via cron you can run it as often as every minute if you want – it adds a negligible amount of load to the database server so feel free – but if you need more frequent updates you’ll need to write your own driver to call the regeneration script more frequently.

If you find this technique helpful – or have any questions about it – I’d be happy to hear from you in the comments.

Update December 2010: Guy Rosen guest-authored this article on using AWS’s DNS service Route 53 to track instances.

Categories
Cloud Developer Tips

Alternatives to Elastic IPs for EC2 Name Resolution

How can you handle DNS lookups in EC2 without going crazy each time a resource’s IP address changes? One solution is to use an Elastic IP, a stable IP address that can be remapped to different instances, but Elastic IPs are not appropriate for all situations. This article explores the various methods of managing name resolution with EC2 instances.

Features of Different Name Resolution Methods

Before diving into the methods themselves let’s take a look at the factors to consider when evaluating methods of managing name resolution. Here are the factors:

  • Updatable in code. You will want to write code to make changes to the name resolution settings automatically, in response to infrastructure events (e.g. launching a new server).
  • Propagation delay. It can take some time for changes to name resolution settings to propagate (especially with DNS). A solution should offer some degree of assurance that changes will propagate within a known and reasonable period of time. [Note that some clients (e.g. the IE browser or the Java rutime) by default ignore the DNS TTL, artificially increasing the propagation delay for DNS-based methods.]
  • Compatible with DNS. If your service will be accessed by a web browser or other client that you do not control, your name resolution method will need to be compatible with DNS. Otherwise clients will not be able to resolve your hostnames properly.
  • Ease of implementation. Some solutions, while technically sufficient, are difficult to implement.
  • Public / Private IP addresses. Whether the solution can serve public and/or private IP addresses. If your clients are inside the same EC2 region then you want their lookups to resolve to the private IP address. Clients outside the same EC2 region should be served the public IP address.
  • Supply. Is there any practical limitation on the number of name resolution entries?
  • Cost. How much it costs to implement, including costs for idle resources and updating settings.

Methods of Name Resolution

As mentioned above, there are a number of different methods to manage name resolution. These are:

  • Traditional DNS.
  • Dynamically update the /etc/hosts file on the various application hosts. The /etc/hosts file on linux (like the C:\Windows\System32\Drivers\etc\hosts file on Windows) contains host-name-to-IP-address mappings that are checked before DNS is consulted, allowing it to override DNS. The file can be updated via pull (initiated by the host) or push (initiated by an external agent).
  • Store the mappings in S3 or SimpleDB. Clients must use the S3 or SimpleDB APIs for name resolution.
  • Use a dynamic DNS provider.
  • Run your own traditional DNS servers for your domain. Clients must be able to see these DNS servers.
  • Run your own dynamic DNS servers for your domain. Clients must be able to see these DNS servers.
  • Elastic IPs. The AWS pricing model discourages (though not strongly enough, I believe) Elastic IPs from being left unused, so you should use them for instances hosting services that are always on, such as your web server or your Facebook application. You should set up a DNS entry pointing the host names to the Elastic IPs, and then any remapping of the Elastic IP to a different instance happens via the EC2 API without requiring any change to DNS.

Here is a table (click on it to see it full size) showing how each of these name resolution methods stack up against each other:

Notes:

  • Dynamically updating /etc/hosts can be used to store either the public IP or the private IP but not both for the same client. You can use one /etc/hosts file for your clients inside the same EC2 region which contains the private IPs, and a different but corresponding /etc/hosts file for your clients outside the EC2 region (or outside EC2 completely) which contains the public IPs. The propagation delay is governed by the frequency with which you update the /etc/hosts file on each client. You can minimize this delay by increasing the frequency of updates. This technique is described in detail in an article by Tim Dysinger.
  • Similarly, the two “run your own DNS” methods (Your Own DNS for your Domain, Your Own Dynamic DNS for your Domain) can be used to resolve to either the public IP address or the private IP address, but not both for the same client. You should set up your clients inside EC2 to utilize the DNS service inside EC2, and the domain should be configured to point to the DNS service running outside EC2 so that clients outside EC2 will see the public IPs. Note that clients running inside EC2 whose DNS resolution you do not control (for example, another EC2 user’s client) will be referred to the public IPs. Jeff Roberts offers some great practical suggestions for running your own DNS inside EC2.

This table demonstrates the following:

  • Elastic IPs are the best choice when you need only a limited number of resolvable names and you will use them constantly. If you use their corresponding DNS name then they intelligently resolve to the public IP when looked up from the internet and to the private IP when looked up from within EC2.
  • If you need an unlimited number of resolvable names within EC2 then you should run your own dynamic DNS within EC2.
  • Methods that are incompatible with DNS should only be used with clients you control.

As we can see, Dynamic DNS (especially running your own) has one distinct advantage over using Elastic IPs: unlimited supply at no cost when unused.

When Running Your Own Dynamic DNS is Better than Elastic IPs

One application for running your own Dynamic DNS is a testing environment that includes large clusters of EC2 instances, for example database cluster or application nodes, connected to web layer instance(s). These cluster instances will only be visible to the front-end web tier, so they do not need a publicly resolvable IP address. And your testing environment is not likely to be running all the time. Elastic IPs would work here (presuming you needed only 5 or you could convince AWS to increase your Elastic IP limit to meet your needs), but would cost money when unused. A more economical solution might be to use your own Dynamic DNS within EC2 for these instances. If you have spare capacity on an existing instance then you can put the Dynamic DNS service there – otherwise you will need another instance, making the cost less attractive. In any case you’ll need the instance hosting the Dynamic DNS to have an Elastic IP to allow failover without affecting the clients. And you’ll need a script to dynamically configure the /etc/resolv.conf on your EC2 clients to point to the private IP address of the Dynamic DNS instance by looking up its Elastic IP’s DNS name.

Let’s compare the monthly costs of using Elastic IPs with the costs of running your own dynamic DNS for a testing environment such as the above. The cost reflects the following ingredients and assumptions:

  • The number of hours over the month that allocated addresses (DNS entries) are not associated with a live instance, in total for all allocated addresses. If you have DNS entries / addresses and leave them unmapped for 10 hours each then you have 100 unmapped hours.
  • The number of changes to the DNS mappings made that month.
  • The fractional cost of running an instance just to serve the dynamic DNS. If you have spare capacity on an existing instance then this is the instance cost multiplied by the fraction of the capacity that the dynamic DNS service uses. If you need to spin up a dedicated instance for the dynamic DNS service then this is the entire cost of that instance.
  • Pricing for Elastic IPs: free when in use. 1 cent per hour unused. First 100 remaps per month free, 10 cents per remap afterward.

It should be obvious that using dynamic DNS for this testing environment will be economical when

FractionalDNSInstanceCost < NumUnmappedHours * 0.01 + MAX(NumMappingChanges – 100, 0) * 0.1

For simplicity’s sake this can be rewritten in clearer terms:

FractionalDNSInstanceCost < NumInstances * ( NumHoursClusterUnused * 0.01 + MAX(NumTimesClusterIsLaunched – 100, 0) * 0.1)

Right about now I’m wishing Excel had better 3-D graphing capabilities. Here’s something helpful to visualize this:

The chart shows the monthly cost of running clusters of different sizes according to how many times the cluster is launched. The color “bands” show the areas in which the monthly cost lies, depending on how many hours the cluster remains unused. For a given number of times launched (i.e. for a given vertical line), the “bottom” point of each band is the cost when the cluster is unused zero hours (i.e. always on), and the “top” point is the cost when the cluster is unused for 500 hours (about 20 days).

The dominant factors are, first, the number of instances in the cluster and, second, the number of times the cluster will be launched. A cluster of 100 instances costs $10 each time it is launched beyond the first 100, (plus $1 for each hour unused). For large cluster sizes, the more times you launch, the higher the cost of using Elastic IPs will be and the more attractive the run-your-own dynamic DNS option becomes.