
Why does JWT use Base64url encoding instead of percent-encoding (also known as URL encoding)?


I recently read the JWT specification and saw that it advocates the use of Base64url for encoding JWT attributes. However, I am unsure as to why Base64url is chosen over Percent-Encoding, especially when the latter has native support in numerous programming languages like PHP, JavaScript, and .NET.

Could it be a case of misunderstanding the specific uses of these two encoding methods on my part?

Edit: As I understand it, JWTs are sometimes transferred via URLs, which makes me question the need for Base64url.


Solution

  • I believe I figured it out myself.


    Percent-encoding should not be confused with Base64url. Although Base64url can also encode form inputs or other text in a URL-safe format, it serves a different purpose: it encodes arbitrary binary data of any kind, not just strings.

    Consider JSON Web Tokens (JWTs) as an illustrative example. These tokens, which may be transferred via URLs, consist of three parts separated by dots. The first two, the header and the payload, are JSON objects (that is, strings). Each is Base64url-encoded (percent-encoding would work equally well for this step, since both are text), and the two encoded parts are then signed, for example with an HMAC built on a hash function such as SHA-256, to form the final part: the signature.
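The three-part structure can be sketched in Python as follows. This assumes the HS256 algorithm (HMAC with SHA-256); the function name `make_jwt`, the payload, and the secret are placeholders for illustration, and a real application should use a maintained JWT library rather than this sketch.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode bytes and drop the '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(payload: dict, secret: bytes) -> str:
    """Build an HS256-signed token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    # Parts 1 and 2: compact JSON text, Base64url-encoded.
    segments = [
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(segments)
    # Part 3: the HMAC output is 32 raw bytes, not text.
    signature = hmac.new(
        secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + b64url(signature)


token = make_jwt({"sub": "1234567890"}, b"demo-secret")
print(token)
```

Note that the signature is computed over the already encoded header and payload, and is itself Base64url-encoded before being appended.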

    Why is this relevant to our current discussion?

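    Because the output of a hash operation is raw binary data (arbitrary bytes rather than text), it needs an encoding such as Base64url to be carried safely, and the JWT specification states this explicitly. Percent-encoding is a poor fit for this task: common library functions such as JavaScript's encodeURIComponent expect text, and each escaped byte grows to three characters, so in practice we would first translate the hash output into Base64 or a hexadecimal representation anyway.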
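As a rough size comparison, the sketch below encodes the same 32 raw bytes three ways. The input string is arbitrary. Python's `urllib.parse.quote_from_bytes` happens to accept bytes directly, which many other languages' URL-encoding helpers do not, so treat this as an illustration of output size rather than of portability.

```python
import base64
import hashlib
from urllib.parse import quote_from_bytes, unquote_to_bytes

# 32 raw bytes, standing in for a signature or hash output.
digest = hashlib.sha256(b"example signing input").digest()

as_b64url = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
as_hex = digest.hex()
# Every byte outside the unreserved set becomes a three-character %XX escape.
as_percent = quote_from_bytes(digest, safe="")

print(len(digest), len(as_b64url), len(as_hex), len(as_percent))
```

Base64url always yields 43 characters for 32 bytes and hexadecimal always yields 64, while the percent-encoded length depends on the byte values and can reach 96.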

    This likely explains the use of Base64url for the header and payload as well, to maintain uniformity.

    You can read more about the hash step in this Okta blog post.