S3 has an “eventual consistency” model, which presents certain limitations on how S3 can be used. Today, Amazon released an improvement called “read-after-write-consistency” in the EU and US-west regions (it’s there, hidden at the bottom of the blog post). Here’s an explanation of what this is, and why it’s cool.
What is Eventual Consistency?
Consistency is a key concept in data storage: it describes when changes committed to a system are visible to all participants. Classic transactional databases employ various levels of consistency, but the gold standard is that after a transaction commits the changes are guaranteed to be visible to all participants. A change committed at millisecond 1 is guaranteed to be available to all views of the system – all queries – immediately thereafter.
Eventual consistency relaxes the rules a bit, allowing a time lag between the point the data is committed to storage and the point where it is visible to all others. A change committed at millisecond 1 might be visible to all immediately. It might not be visible to all until millisecond 500. It might not even be visible to all until millisecond 1000. But, eventually it will be visible to all clients. Eventual consistency is a key engineering tradeoff employed in building distributed systems.
One issue with eventual consistency is that there’s no theoretical limit to how long you need to wait until all clients see the committed data. A delay must be employed (either explicitly or implicitly) to ensure the changes will be visible to all clients.
Practically speaking, I’ve observed that changes committed to S3 become visible to all clients in under 2 seconds. If your distributed system reads data shortly after it was written to eventually consistent storage (such as S3), you’ll experience higher latency as a result of the compensating delays.
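One way to picture such a compensating delay is a polling loop with backoff. The sketch below is only an illustration: `wait_until_visible` and the `fetch` callable are hypothetical names, not part of any AWS SDK, and `fetch` is assumed to return None while the object is not yet visible.

```python
import time


def wait_until_visible(fetch, key, timeout=5.0, initial_delay=0.05,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch(key) until it returns a value, doubling the pause each try.

    `fetch` is a hypothetical callable supplied by the caller; it must
    return None while the object is not yet visible to this client.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        value = fetch(key)
        if value is not None:
            return value
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"{key!r} not visible after {timeout} seconds")
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay *= 2
```

Every pass through the loop is latency added to the reader, which is exactly the cost that a stronger consistency guarantee removes.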
What is Read-After-Write Consistency?
Read-after-write consistency tightens things up a bit, guaranteeing immediate visibility of new data to all clients. With read-after-write consistency, a newly created object or file or table row will immediately be visible, without any delays.
Note that read-after-write is not complete consistency: there’s also read-after-update and read-after-delete. Read-after-update consistency would allow edits to an existing file or changes to an already-existing object or updates of an existing table row to be immediately visible to all clients. That’s not the same thing as read-after-write, which is only for new data. Read-after-delete would guarantee that reading a deleted object or file or table row will fail for all clients, immediately. That, too, is different from read-after-write, which only relates to the creation of data.
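The distinction can be made concrete with a toy model. This is a sketch under loud assumptions: `ToyStore` is a made-up class with one authoritative copy and one lagging replica, and it says nothing about how S3 is actually built. It only mimics the visible behavior: creations are seen immediately, while overwrites and deletes are seen eventually.

```python
class ToyStore:
    """Toy model: one authoritative copy plus one replica that may lag."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def put(self, key, value):
        is_new = key not in self.primary
        self.primary[key] = value
        if is_new:
            # Read-after-write: a creation reaches the replica at once.
            self.replica[key] = value
        # An overwrite reaches the replica only when sync() runs.

    def delete(self, key):
        # Deletes are only eventually consistent in this model.
        self.primary.pop(key, None)

    def sync(self):
        # "Eventually": the replica catches up with the primary.
        self.replica = dict(self.primary)

    def read(self, key):
        # Clients read from the possibly stale replica.
        return self.replica.get(key)


store = ToyStore()
store.put("report", "v1")
print(store.read("report"))  # the new object is visible immediately
store.put("report", "v2")
print(store.read("report"))  # the overwrite is not visible yet
store.sync()
print(store.read("report"))  # now the overwrite is visible
```

In this model a reader never needs a delay after a creation, but still needs one after an update or a delete.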
Why is Read-After-Write Consistency Useful?
Read-after-write consistency allows you to build distributed systems with less latency. As touched on above, without read-after-write consistency you’ll need to incorporate some kind of delay to ensure that the data you just wrote will be visible to the other parts of your system.
But no longer. If you use S3 in the US-west or EU regions (or other regions supporting read-after-write consistency), your systems need not wait for the data to become available.
Update March 2011: As more S3 regions come online they seem to be getting the same features as US-West. So far the AP-Singapore and AP-Tokyo regions also support Read-After-Write consistency. US Standard does not.
Update June 2012: As pointed out in the comments below, more S3 regions now support read-after-write consistency: US-West Oregon, SA-Sao Paulo, and AP-Tokyo. It’s not easy keeping up with the pace of AWS’s updates!
Why Only in the AWS US-west and EU Regions, and Not in the US Standard Region?
Read-after-write consistency for AWS S3 was initially available only in the US-west and EU regions, not the US-Standard region. I asked Jeff Barr of AWS blogging fame why, and his answer makes a lot of sense:
This is a feature for EU and US-West. US Standard is bi-coastal and doesn’t have read-after-write consistency.
Aha! I had forgotten about the way Amazon defines its S3 regions. US-Standard has servers on both the east and west coasts (remember, this is S3 not EC2) in the same logical “region”. The engineering challenges in providing read-after-write consistency in a smaller geographical area are greatly magnified when that area is expanded. The fundamental physical limitation is the speed of light: light takes at least 16 milliseconds to cross the US coast-to-coast (that’s in a vacuum – it takes at least four times as long over the internet due to the latency introduced by routers and switches along the way).
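The arithmetic behind those numbers is just distance divided by speed. The sketch below assumes a path length of about 4,800 km, a figure picked here because it reproduces the 16 ms estimate (a straight line between the coasts is somewhat shorter), and treats the "four times as long" factor as a plain multiplier. The function name is made up for illustration.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in a vacuum


def one_way_delay_ms(distance_km, slowdown=1.0):
    """One-way propagation delay in milliseconds.

    `slowdown` is an assumed multiplier for real networks; 1.0 means
    light in a vacuum travelling in a straight line.
    """
    return distance_km / SPEED_OF_LIGHT_KM_PER_S * 1000.0 * slowdown


ASSUMED_PATH_KM = 4800  # assumption chosen to match the 16 ms figure

vacuum_ms = one_way_delay_ms(ASSUMED_PATH_KM)
internet_ms = one_way_delay_ms(ASSUMED_PATH_KM, slowdown=4.0)
print(f"{vacuum_ms:.1f} ms in a vacuum, {internet_ms:.1f} ms at 4x")
```

Any protocol that must confirm a write on both coasts before acknowledging it pays at least this delay, and usually a full round trip, on every request.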
If you use S3 and want to take advantage of the read-after-write consistency, make sure you understand the cost implications: some other regions have higher storage and bandwidth costs than the US-Standard region.
Next Up: SQS Improvements?
Some vague theorizing:
It’s been suggested that AWS Simple Queue Service leverages S3 under the hood. The improved S3 consistency model can be used to provide better consistency for SQS as well. Is this in the works? Jeff Barr, any comment? 🙂
22 replies on “Read-After-Write Consistency in Amazon S3”
Ah ha, thank you for solving the mystery. (And for introducing me to the word bi-coastal!)
It looks like Read-After-Write Consistency has been added to a few more regions (http://aws.amazon.com/s3/faqs/):
Q: What data consistency model does Amazon S3 employ?
Amazon S3 buckets in the US West (Northern California), EU (Ireland), Asia Pacific (Singapore), and Asia Pacific (Tokyo) Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES. Amazon S3 buckets in the US Standard Region provide eventual consistency.
@Oliver Coleman,
Thanks for pointing that out; I have updated the article.
I guess Amazon makes no guarantees about bucket lists? I.e., how long before we should expect a new or updated object to appear in a bucket list, and how long before a deleted object is no longer returned (including an in-progress paged bucket list)?
@Jamshid,
Correct. No guarantees are given about list consistency. However, “paged” bucket lists are not really “in-progress” in S3 – they are composed of separate, independent requests. Each request returns a list of up to 1000 objects and an indication whether the data has been truncated (“IsTruncated”). Subsequent requests can use the data in the last response (the last key returned) as the “Marker” from which to begin listing results. See the docs for more details. But there is no inherently consistent view maintained by the service.
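A sketch of that marker-based loop, to show why there is no single consistent snapshot: each page is a separate request. The names here are hypothetical. `list_page` stands in for whatever client call you use, and its return shape (a dict with "Keys" and "IsTruncated") is a simplification of the real response.

```python
def list_all_keys(list_page):
    """Collect every key by following markers across independent requests.

    `list_page(marker)` is a hypothetical callable returning a dict with
    "Keys" (a sorted list of key names) and "IsTruncated" (a bool).
    """
    keys = []
    marker = None
    while True:
        page = list_page(marker)
        keys.extend(page["Keys"])
        if not page["IsTruncated"] or not page["Keys"]:
            return keys
        # The last key of this page is the marker for the next request.
        marker = page["Keys"][-1]


def make_fake_bucket(all_keys, page_size=1000):
    """In-memory stand-in for a bucket, used only to exercise the loop."""
    ordered = sorted(all_keys)

    def list_page(marker):
        remaining = [k for k in ordered if marker is None or k > marker]
        return {
            "Keys": remaining[:page_size],
            "IsTruncated": len(remaining) > page_size,
        }

    return list_page


names = [f"obj-{i:04d}" for i in range(2500)]
print(len(list_all_keys(make_fake_bucket(names))))
```

Because each call is independent, an object created or deleted between two pages may or may not show up in the combined result.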
Good article. But requires some updates.
Amazon S3 now has a new region in Oregon – US West (Oregon), which provides read-after-write consistency. Importantly, it is also priced the same as US Standard region.
So, if you need strong read-after-write consistency, and if you prefer to have your data stored in the west coast, you should go ahead and use the US West (Oregon) region.
@StoragePro,
Thanks – I’ve updated the article to reflect the new regions.
[…] If you are making many writes to S3, you’ll want to become familiar with the concept of eventual consistency in the Amazon S3 system and how ‘write’ transactions are logged. For the case of simple […]
Seems like the term “read-after-create” would have eliminated a lot of confusion.
[…] larger than 8.6. What we do know that at least in the US Standard region, AWS stores the object bi-coastally, so indeed the odds of a natural disaster simultaneously wiping out everything must indeed be quite […]
[…] For more information on eventual consistency, Shlomo Swidler has a great post on it. […]
Great explanation ! Thank you !
Good Explanation on read-after-write and eventual consistency. Thanks!
Nice article!!! After reading this I can easily understand eventual consistency and read-after-write consistency.
[…] http://shlomoswidler.com/2009/12/read-after-write-consistency-in-amazon.html […]
Apparently – all the regions are now read-after-write consistency enabled for new objects.
Read-after-write Consistency: Amazon S3 now supports read-after-write consistency for new objects added to Amazon S3 in US Standard region. Prior to this announcement, all regions except US Standard supported read-after-write consistency for new objects uploaded to Amazon S3. With this enhancement, Amazon S3 now supports read-after-write consistency in all regions for new objects added to Amazon S3. Read-after-write consistency allows you to retrieve objects immediately after creation in Amazon S3.
https://aws.amazon.com/about-aws/whats-new/2015/08/amazon-s3-introduces-new-usability-enhancements/
@Naveen Vijay,
Thanks, that’s important to know.
[…] post http://shlomoswidler.com/2009/12/read-after-write-consistenc… has a quote from Jeff Barr at AWS indicating that us-east-1 is bicoastal, which is also why its […]
[…] 02:16:07 AM Xueman Li: It’s not unique to S3 02:16:34 AM chenjian_test: http://shlomoswidler.com/2009/12/read-after-write-consistency-in-amazon.html 02:16:37 AM chenjian_test: this link 02:17:51 AM Xueman Li: […]
According to this article “read-after-write consistency guarantees immediate visibility of new data to all clients. With read-after-write consistency, a newly created object or file or table row will immediately be visible, without any delays.”
So can I say it means creating new object is consistent?
According to AWS documentation: ”
Amazon S3 Data Consistency Model:
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
”
https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
That doesn’t fit your definition. Unfortunately, I cannot find AWS’s own definition of read-after-write consistency.
Could you please help me clarify?
@Peter Nagy,
The definitions agree. The docs point out a corner case in which read-after-write consistency is not guaranteed: if you request a key (with a HEAD or GET) before the object exists and then create it, a read issued right after the creation may still report that the object does not exist. The time window between the request and the creation would have to be very short.