Search code examples
c++performanceoptimizationcomparisonbranchless

compare chars for equality without branching


On a previous question where I wanted to optimize this function:

static
lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 )
{
    const size_t len1 = s1.size(), len2 = s2.size();
    std::vector<unsigned int> col( len2+1 ), prevCol( len2+1 );

    const size_t prevColSize = prevCol.size();
    for( unsigned int i = 0; i < prevColSize; i++ )
        prevCol[i] = i;

    for( unsigned int i = 0, j; i < len1; ++i )
    {
        col[0] = i+1;
        const char s1i = s1[i];
        for( j = 0; j < len2; ++j )
        {
            const auto minPrev = 1 + std::min( col[j], prevCol[1 + j] );
            col[j+1] = std::min( minPrev, prevCol[j] + (  s1i == s2[j] ? 0 : 1   ) );

        }
        col.swap( prevCol );
    }
    return prevCol[len2];
}

An user commented that I could replace s1i == s2[j] ? 0 : 1 with ((s1i - s2[j]) & 0x80) >> 7 to prevent a conditional jump. The trick was wrong and the user deleted his comment, but I am wondering whether there could actually be a way to do that.


Solution

  • Assuming that the code

    s1i == s2[j] ? 0 : 1
    

    does give you a branching operation, which you really want to avoid, you could simply attempt the following:

    !(s1i == s2[j])
    

    This should give the same effect, and may help the compiler remove the branching. Or, you could reverse the logic and write

    s1i != s2[j]
    

    As always with this type of optimizations, there is never a guarantee that this will actually achieve the results you hope for. Optimizers do very many clever things and trying to predict how they will react to your tricks is very often difficult. So, even in the best case, all you can do is attempt different solutions and compare the resulting binary code.