amazon-web-services amazon-s3 cloud-storage

How many 9s of durability does an S3 object have when replicated across 'n' Regions/Buckets?


The S3 documentation states that the durability of an S3 object is 99.999999999% (11 nines) over a given year. How many 9s of durability does an object have if it is replicated/copied across 'n' regions/buckets?


Solution

  • This question started me wondering... how do you put a number like this on durability? How does S3 come up with 11 9's of durability, and why is the durability of the old Reduced Redundancy Storage (RRS) class apparently so much lower, at only 99.99% (4 9's), even though it's still stored in 2 AZs rather than 3?

    The answer appears to lie in the statistical odds implied by the annual failure rate (AFR) of each individual storage entity. That entity might be a hard drive, but commodity hard drives have a statistically higher failure rate (perhaps as high as 4% AFR), so it might instead be a RAID array or other cluster technology in which each independent storage entity has a 1% AFR. I'll refer to this entity as a "storage device" for simplicity. My intention is not to claim that S3 uses n hard drives to store objects; that is almost certainly an oversimplification, and I have no insight into the inner workings of S3.

    Let's briefly assume, for illustration purposes, that the AFR of a storage device in a well-maintained fleet is 1%. Obviously, this assumes the physical drives are removed from service before they reach an excessive age; otherwise they would of course all fail, eventually.

    Running with the assumption that the likelihood of losing a storage device in a given year is 1/100, the odds of it surviving the year are 99%. We can then call the device's contents 99% durable, annually.

    Suppose we have the same data stored on two such devices, and the system is designed such that the failure of both devices is unlikely to have any common cause (e.g., not only are they not in the same cabinet or on the same power supply, they're not even in the same building). We can then treat concurrent failures as statistically independent, and determine the likelihood of losing both devices concurrently (resulting in the loss of the contents) by multiplying the probabilities together: 0.01 × 0.01 = 0.0001, or 0.01%. Thus, with the same content on both devices, the odds against losing both of them improve to 99.99%.
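
    As a rough sanity check on the multiplication rule, here is a minimal Python sketch that simulates two independent devices, each with the illustrative 1% AFR assumed above, and counts how often both fail in the same simulated year. The function name, trial count, and seed are my own choices for illustration and say nothing about how S3 actually works.

```python
import random


def simulate_joint_loss(afr: float, copies: int, trials: int, seed: int = 0) -> float:
    """Estimate the chance that every copy's device fails in the same year.

    Each device fails independently with probability `afr` (an assumption).
    """
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        # The data is lost only if all copies fail; all() stops at the first survivor.
        if all(rng.random() < afr for _ in range(copies)):
            losses += 1
    return losses / trials


estimate = simulate_joint_loss(afr=0.01, copies=2, trials=1_000_000)
predicted = 0.01 ** 2
print(f"simulated: {estimate:.6f}  predicted: {predicted:.6f}")
```

    With a million trials, only about 100 double failures are expected, so the estimate should land near the predicted 0.0001 but will wobble somewhat from seed to seed.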

    We can extrapolate this out to a number of storage devices:

    Devices  P(all fail in a year)  Durability
    1        0.010000000000         99%
    2        0.000100000000         99.99%
    3        0.000001000000         99.9999%
    4        0.000000010000         99.999999%
    5        0.000000000100         99.99999999%
    6        0.000000000001         99.9999999999%
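
    The table can be reproduced with a few lines of Python. This is only a sketch under the same assumptions (independent devices, an illustrative 1% AFR); Decimal is used so the tiny probabilities are not blurred by floating-point rounding, and the helper names are mine.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # enough digits to keep these small probabilities exact

AFR = Decimal("0.01")  # assumed annual failure rate per device (illustrative only)


def loss_probability(devices: int, afr: Decimal = AFR) -> Decimal:
    """Probability that all `devices` independent devices fail in the same year."""
    return afr ** devices


def durability_percent(devices: int, afr: Decimal = AFR) -> str:
    """Durability (1 - loss probability) as a percentage, without trailing zeros."""
    percent = (1 - loss_probability(devices, afr)) * 100
    return format(percent.normalize(), "f")


def count_nines(loss: Decimal) -> int:
    """Number of leading 9s in the durability figure 1 - loss."""
    return math.floor(-loss.log10())


for devices in range(1, 7):
    loss = loss_probability(devices)
    print(f"{devices}  {loss:.12f}  {durability_percent(devices)}%  ({count_nines(loss)} nines)")
```

    Changing AFR to Decimal("0.04") shows how much the result depends on the assumed per-device failure rate.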
    

    Curiously, we arrive at numbers very similar to the published specs of S3, which we know stores objects redundantly across 3 Availability Zones. If we assume "redundantly" means two storage devices in each of these zones, that makes 6 devices, and we arrive very close to 11 9's of durability (the table actually gives slightly more, at 12 9's).

    Reduced Redundancy Storage stores objects replicated fewer times and in only 2 Availability Zones, and we find that the statistical failure rate of 2 devices does predict a durability of 99.99%.

    All of this is to try to establish what "durability" really means with regard to stored objects, and it certainly seems to refer to the odds against every copy of the object being lost.

    By extension, replicating an object to a second AWS region means we need to multiply the infinitesimally small odds together, which increases the statistical durability by an additional ~11 9's (to about 22 9's), because the failure of 12 independent storage devices in 6 Availability Zones across 2 different regions should be entirely uncorrelated, and so unlikely as to be effectively impossible. By the same reasoning, and still assuming independence, each further region would add roughly another 11 9's, or about 11 × n 9's for n regions.
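
    To put the original 'n' regions question into numbers, the sketch below treats each region as an independent copy with the published 11 9's and multiplies the loss probabilities. Independence between regions and the helper names are assumptions for illustration, not something AWS publishes.

```python
from fractions import Fraction

# Published S3 figure: 11 nines, i.e. a 1-in-10^11 annual chance of losing an object.
PER_REGION_LOSS = Fraction(1, 10**11)


def combined_loss(regions: int, per_region_loss: Fraction = PER_REGION_LOSS) -> Fraction:
    """Annual probability of losing the copies in every region, assuming independence."""
    return per_region_loss ** regions


def count_nines(loss: Fraction) -> int:
    """Number of leading 9s in the durability figure 1 - loss."""
    if loss <= 0:
        raise ValueError("loss probability must be positive")
    nines = 0
    # Keep adding a nine while the loss is still no larger than 10^-(nines + 1).
    while loss * 10 ** (nines + 1) <= 1:
        nines += 1
    return nines


for regions in range(1, 4):
    print(f"{regions} region(s): about {count_nines(combined_loss(regions))} nines")
```

    Under these assumptions the loop reports 11, 22, and 33 nines for one, two, and three regions, which is simply 11 × n.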

    The problem, of course, is that at these small numbers, something else going wrong, unrelated to pure durability -- like an administrative error, a malicious event, or even a defect in S3 -- would seem to be far more likely by comparison... but replication across regions may help guard against these things as well. Object versioning is also an excellent feature for helping prevent data loss, since an accidental overwrite or delete leaves the earlier version of the object recoverable.