Search code examples
phpcsecurityphp-internalstiming-attack

Why is order of arguments in PHP's hash_equals() function important?


PHP 5.6 introduced hash_equals() function for safe comparison of password hashes and prevention of timing attacks. Its signature is:

bool hash_equals(string $known_string, string $user_string)

As described in the documentation, $known_string and $user_string must be of the same length for the function to effectively prevent timing attacks (otherwise, false is returned immediately, leaking length of known string).

Further, the docs say:

It is important to provide the user-supplied string as the second parameter, rather than the first.

It seems unintuitive to me that the function is not symmetric in its arguments.

The question is:

  • Why is it important that the user string is supplied last?

Here's an excerpt from the function's source code:

PHP_FUNCTION(hash_equals)
{
    /* ... */

    if (Z_STRLEN_P(known_zval) != Z_STRLEN_P(user_zval)) {
        RETURN_FALSE;
    }

    /* ... */

    /* This is security sensitive code. Do not optimize this for speed. */
    for (j = 0; j < Z_STRLEN_P(known_zval); j++) {
        result |= known_str[j] ^ user_str[j];
    }

    RETURN_BOOL(0 == result);
}

As for me, the implementation is completely symmetrical regarding the two arguments. The only operation which could make any difference is the XOR operator.

  • Is it possible that the XOR operator executes in non-constant time, depending on argument values? May its execution time depend on order of arguments (e.g. if the 1st argument is zero)?

  • Or is this note from PHP's docs a "reservation" for changes of implementation in future versions?


Edit

As Morpfh stated, the initial proposal implementation was different:

PHP_FUNCTION(hash_compare)
{
    /* ... */

    /**
     * If known_string has a length of 0 we set the length to 1,
     * this will cause us to compare all bytes of userString with the null byte which fails
     */
    mod_len = MAX(known_len, 1);

    /* This is security sensitive code. Do not optimize this for speed. */
    result = known_len - user_len;
    for (j = 0; j < user_len; j++) {
        result |= known_str[j % mod_len] ^ user_str[j];
    }

    RETURN_BOOL(0 == result);
}

As you see, the draft implementation attempted to handle hashes of different lengths, and it processed the arguments asymmetrically. Maybe this draft implementation is not the first one.

Summarizing: the note in the docs about arguments' order seems to be a leftover from draft implementation.


Solution

  • Update:

    See comment from Rouven Weßling, (below this answer).


    This is more speculation then an answer, but perhaps you get something out of it.


    One guess, as you mention, is that it is for probable backward compatibility if the function undergo a future change, for what ever reason, to (1) not return false on equal length; thus being vulnerable to leak length information – or (2) other algorithms/checks where one need to know which is which - or (n) ...


    A likely candidate is that it is a leftover from the proposal to implementation:

    As such, from Proposal one have:

    Users have to be mindful, as it is important that the user supplied string (or a hash of that string) is used as the the second parameter not the first.

    This has been present since proposal creation:

    Which can be linked to References, for example:

    where one do not return on equal length, but loop useLen.