Search code examples
python-3.xcryptographysha256hmachashlib

Python hashlib SHA256 with starting value


I want to process a HMAC* length extension attack for a university task. Therefore I have both, a HMAC* and the corresponding message, provided and want to attach another arbitrary message and recalculate the HMAC without having the key. Regarding our lecture, this is possible and a very common attack scenario.

My problem is rather implementation based: To drive this attack, I need to replace the default SHA256 starting values (h0 to h7) with the existing HMAC* I already have. As I do not have the key, just pushing in the orginal data will not be possible.

Is there any way except reimplementing SHA256 that would allow me to replace these starting values in python3?

Clarification

I have a valid HMAC* h given. Furthermore, there is the a message m that has been used (together with a secret key k) to generate h. (h = SHA256(k || m)).

My task: I need to find a way to derivate another HMAC* h' without knowing k on the basis of m. It turned out, that the new message is m' = m + pad(k||m) + a with a randomly chosen a.

Further clarification

*: With "HMAC" I do not refer to the standard of RFC 2014. HMAC in general "is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key." (Wikipedia.org/HMAC). In this case, the HMAC is calculated as h = SHA256(k || m) where k is the secret key, || is the concatenation and m is the message.


Solution

  • First of all, the SHA256 context includes multiple parameters. In this solution, two of these are relevant: The state which somehow represents the "progress" of the SHA256 algorithm. The final state is actually the SHA256 hash sum. And the seconds parameter is the overall message length in bits, that will be set at the end of the padding.

    Furthermore, SHA256 always uses padding what means, another sequence of bytes p is implicitly added to the input data before calculating the final hash and depends on the actual input value. So lets say SHA256(x) = mySHA256(x || p(x)) assuming that mySHA256 is not using padding.

    When the given HMAC h has been generated using h = SHA256(k || m) = mySHA256(k || m || p) where k was the secret key and m was the message, h represented the final state of the SHA256 context. Additionally, we have an implicit padding p that depends on k || m. Hereby, p is not rather dependend on len(k) and not k itself, what means that we can calculate p without knowing the key but it's length.

    As my target will only accept my modified message m' = m + a when I deliver a correct HMAC h' = SHA256(k || m'), I need to focus on that point now. By knowing the original HMAC h, I can set the state of the SHA256 context corresponding to h. As I know the message m as well, and I know that the overall message length in bits is (len(k) + len(m) + len(p)) * 8, my overall message length is just depending on len(k) (not k!) because len(p) only depends on len(k) and len(m). I will iterate through a range of len(k), like 1 - 64. In each iteration step, I can just insert my value for len(k). So it is possible to set the overall message length (the second parameter of my SHA256 context), too.

    When iterating through all key lengths, there will be one value that represents the length of the key that has actually been used. In that case, I have a SHA256 context that exactly equals the context of the original calculation. We can now add our arbitrary data a to the hash calculation and create another HMAC h' that does depend on the key k without knowing it. h' = SHA256(k || m || p || a)

    But now, we have to ensure that this HMAC h' equal to that one, the target calculates using our message m'. Therefore, we add our padding p to the end of original message m followed by our arbitrary message a. Finally we have m' = m || p || a.

    As the target knows the secret key in order to validate the input data, it can easily calculate SHA256(k || m') = SHA256(k || m || p || a)* and oooops! Indeed that is the same hash sum as our HMAC h' that we calculated without knowing the secret key k

    Result: We can not add a fully arbitrary message, but a message that is fully arbitrary after the padding. As the padding is mostly filled with Null-Bytes, that can disturb our attack, but that depends on each case. In my case, the Null-Bytes were ignored and I just had one artifact from the overall message length displayed before my inserted message.