amazon-web-services encryption amazon-kms

How does envelope encryption work in AWS KMS?

I have recently been reading about AWS KMS, including envelope encryption. The source I used for envelope encryption (freeCodeCamp) seems to consider four levels of envelope encryption (data encryption key, key encryption key, KMS master key and root KMS master key), as shown in the figure below:

On the other hand, what I have read in the AWS documentation only seems to consider two levels (the KMS key, also called the key encryption key, and the data encryption key), as shown in the figure below:

Am I missing something here? Is this disparity only apparent? Thanks in advance for your answers!

Solution

What you see here are just different levels of abstraction.

The upper diagram shows the whole chain of keys involved in creating data keys for encryption/decryption. It peeks under the hood of the KeyId abstraction.

The docs are quite extensive as cryptography is famously littered with pitfalls and edge cases, but here's my not entirely accurate but useful enough mental model of the process:

Through heavily audited and secure random magic, hardware appliances (HSMs) can come up with keys that almost never leave the appliance (replication being the exception). Besides creating these keys, HSMs can also encrypt and decrypt data with them, as well as create more keys.

KMS exposes these HSMs to you in a limited way. It allows you to manage these keys through the KMS API and creates an abstraction called a master key (there are different kinds...) identified by a KeyId. This master key is not necessarily a single key ("key material" is the term AWS uses in the docs).

Rather, there are multiple keys under the hood, but they are not directly exposed to the user. The underlying keys can be rotated according to a schedule. Effectively, the master key is versioned, but only the most recent version is used to encrypt new data; older versions are only used to decrypt data.
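To make the versioning idea concrete, here is a toy model in Python. It is only a sketch of the mental model above, not how KMS is implemented: the class `ToyVersionedKey`, the record layout and the XOR "cipher" are all made up for illustration, and XOR is not secure.

```python
import os


def _xor(key: bytes, data: bytes) -> bytes:
    # Stand-in for a real cipher so the example has no dependencies. NOT secure.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ToyVersionedKey:
    """Toy model: one KeyId fronting several versions of key material."""

    def __init__(self, key_id: str):
        self.key_id = key_id
        self.versions = [os.urandom(32)]  # version 0

    def rotate(self) -> None:
        # Old versions are kept so that existing data stays decryptable.
        self.versions.append(os.urandom(32))

    def encrypt(self, data: bytes) -> dict:
        # New data is always encrypted under the most recent version.
        version = len(self.versions) - 1
        return {"version": version, "blob": _xor(self.versions[version], data)}

    def decrypt(self, record: dict) -> bytes:
        # The version is read from metadata stored with the ciphertext.
        return _xor(self.versions[record["version"]], record["blob"])
```

The caller only ever holds the `key_id` and the returned record; which version of the key material was used is resolved from the record's metadata at decryption time.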

The other keys in the chain in the diagram are pretty much just implementation details that you don't need to know unless you plan to audit KMS, in which case my mental model won't be enough. In practice, the lower diagram is much more important.

The lower diagram shows what KMS looks like from a client that uses KMS to manage encryption keys to encrypt data in another system. To use KMS in a system, you only need to know the identifier of a KMS Key (KeyId) and have the appropriate permissions.

The flow looks like this for encryption:

1. You make a GenerateDataKey API call to KMS and pass in the KeyId of the key you wish to use. KMS will then return (among other things) a plaintext data key and an encrypted data key.
2. You use the plaintext data key to encrypt your data locally and store the encrypted data alongside the encrypted data key. Next, you delete the plaintext data key from memory.
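The encryption flow can be sketched in Python roughly like this. Treat it as a minimal sketch under some assumptions: `kms` is expected to behave like `boto3.client("kms")`, local encryption uses AES-256-GCM from the third-party `cryptography` package, and the function names and the returned dict layout (`encrypted_data_key`, `ciphertext`) are my own choices, not anything KMS prescribes.

```python
import os


def aes_gcm_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt locally with AES-GCM; returns nonce + ciphertext."""
    # Assumption: the third-party 'cryptography' package is installed.
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    nonce = os.urandom(12)  # 96-bit nonce, must never repeat for the same key
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)


def envelope_encrypt(kms, key_id: str, plaintext: bytes, encrypt=aes_gcm_encrypt) -> dict:
    # Step 1: ask KMS for a fresh data key under the given KMS key.
    resp = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    encrypted_data_key = resp["CiphertextBlob"]

    # Step 2: encrypt locally with the plaintext data key.
    ciphertext = encrypt(resp["Plaintext"], plaintext)

    # Drop our reference to the plaintext key. Python cannot guarantee
    # that the bytes are actually wiped from memory.
    del resp

    # Store the encrypted data key alongside the encrypted data.
    return {"encrypted_data_key": encrypted_data_key, "ciphertext": ciphertext}
```

With real AWS credentials, a call would look like `envelope_encrypt(boto3.client("kms"), "alias/my-key", b"secret")`, where `alias/my-key` is a placeholder for your own KeyId.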

For decryption the flow is as follows:

1. You read the encrypted data key from the encrypted object and make a call to the Decrypt API, passing in the encrypted data key. Assuming you have permission, KMS will return the plaintext data key [1][2].
2. You use the plaintext data key to decrypt your encrypted data.
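The decryption flow can be sketched the same way. Again, this is a sketch under assumptions: `kms` is expected to behave like `boto3.client("kms")`, the stored object is assumed to be a dict with the hypothetical fields `encrypted_data_key` and `ciphertext`, and the ciphertext is assumed to be a 12-byte nonce followed by AES-GCM output produced with the third-party `cryptography` package.

```python
def aes_gcm_decrypt(key: bytes, blob: bytes) -> bytes:
    """Decrypt a blob laid out as 12-byte nonce + AES-GCM ciphertext."""
    # Assumption: the third-party 'cryptography' package is installed.
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)


def envelope_decrypt(kms, envelope: dict, key_id=None, decrypt=aes_gcm_decrypt) -> bytes:
    # Step 1: have KMS decrypt the encrypted data key stored with the object.
    params = {"CiphertextBlob": envelope["encrypted_data_key"]}
    if key_id is not None:
        # Optional for symmetric KMS keys, but recommended as a best practice.
        params["KeyId"] = key_id
    data_key = kms.decrypt(**params)["Plaintext"]

    # Step 2: decrypt the data locally with the plaintext data key.
    try:
        return decrypt(data_key, envelope["ciphertext"])
    finally:
        # Drops our reference only; Python cannot guarantee a memory wipe.
        del data_key
```

Note that the client never handles the KMS key itself; it only sends the encrypted data key to KMS and gets the plaintext data key back.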

From the client's perspective, all you need to know about the keys that KMS uses is the KeyId. That is the default case; there are more advanced features, such as encryption contexts.

[1] For symmetric encryption, KMS can infer the correct KeyId from the metadata, but as a best practice, you should send the KeyId as well.

[2] AWS infers the correct key material for decryption from metadata in the encrypted blob.