c++ boost hash type-conversion

Checking the size of hashes in C++


As one would do with a blockchain, I want to check whether a hash satisfies a difficulty requirement, i.e. whether its numeric value falls below a target. This is fairly easy in Python, but I am having some difficulty implementing the same check in C++. To be clear about what I am after, here is the Python implementation:

difficulty = 25
hash = "0000004fbbc4261dc666d31d4718566b7e11770c2414e1b48c9e37e380e8e0f0"
print(int(hash, 16) < 2 ** (256 - difficulty))

The main problem I'm having is with these numbers - it is difficult to deal with such large numbers in C++ (2 ** 256, for example). This is solved with the boost/multiprecision library:

boost::multiprecision::cpp_int x = boost::multiprecision::pow(boost::multiprecision::cpp_int(2), 256);

However, I cannot seem to find a way to convert my hash into a numeric value for comparison. Here is a generic example of what I am trying to do:

int main() {
      string hash = "0000004fbbc4261dc666d31d4718566b7e11770c2414e1b48c9e37e380e8e0f0";
      unsigned difficulty = 256 - 25; // exponent of the target, i.e. 2^231
      cpp_int requirement = boost::multiprecision::pow(cpp_int(2), difficulty);

      // Something to convert hash into a number for comparison (converted_hash)

      if (converted_hash < requirement) {
           cout << "True" << endl;
      }
      return 0;
}

The hash is either being received from my web server or from a local python script, in which case the hash is read into the C++ program via fstream. Either way, it will be a string upon arrival.

Since I am already integrating python into this project, I am not entirely opposed to simply using the Python version of this algorithm; however, sometimes taking the easier path prevents you from learning, so unless this is a really cumbersome task, I would like to try to accomplish it in C++.


Solution

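  • Your basic need is to count how many zero bits precede the first non-zero bit. For a 256-bit hash, the value is below 2 ** (256 - difficulty) exactly when its top `difficulty` bits are all zero, so the check becomes count_zeros(hash) >= difficulty. This has nothing to do with multi-precision arithmetic; it can be reformulated as a simple counting problem: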

    // takes hexadecimal ASCII [0-9a-fA-F]
    inline int count_zeros(char ch) {
        if (ch < '1') return 4;
        if (ch < '2') return 3;
        if (ch < '4') return 2;
        if (ch < '8') return 1;
        return 0; // see ASCII table, [a-zA-Z] are all greater than '8'
    }
    
    int count_zeros(const std::string& hash) {
        int sum = 0;
        for (char ch : hash) {
            int zeros = count_zeros(ch);
            sum += zeros;
            if (zeros < 4)
                break;
        }
        return sum;
    }
    

    A fun optimization is to notice that the loop above has two termination conditions: reaching the end of the string, and hitting a digit with fewer than four leading zeros. We can fold them into one by returning 0 for any character less than '0'. That covers the null terminator at the end of c_str(), and it also stops on invalid input:

    // takes hexadecimal [0-9a-fA-F]
    inline int count_zeros(char ch) {
        if (ch < '0') return 0; // change 1
        if (ch < '1') return 4;
        if (ch < '2') return 3;
        if (ch < '4') return 2;
        if (ch < '8') return 1;
        return 0; // see ASCII table, [a-zA-Z] are all greater than '8'
    }
    
    int count_zeros(const std::string& hash) {
        int sum = 0;
        for (const char* it = hash.c_str(); ; ++it) { // change 2
            int zeros = count_zeros(*it);
            sum += zeros;
            if (zeros < 4)
                break;
        }
        return sum;
    }
    

    This produces smaller code when compiled with g++ -Os.
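
To tie the counting approach back to the difficulty check in the question, here is a minimal sketch. It repeats the answer's count_zeros functions so that it compiles on its own. meets_difficulty is a hypothetical helper, not part of the original answer, and it assumes the hash is a full-width 64-digit hex string (256 bits) with its leading zeros intact.

```cpp
#include <cassert>
#include <string>

// Leading zero bits contributed by one hexadecimal ASCII digit [0-9a-fA-F].
// Same logic as count_zeros(char) in the answer above.
inline int count_zeros(char ch) {
    if (ch < '0') return 0; // null terminator or invalid input
    if (ch < '1') return 4; // '0'        -> 0000
    if (ch < '2') return 3; // '1'        -> 0001
    if (ch < '4') return 2; // '2'..'3'   -> 001x
    if (ch < '8') return 1; // '4'..'7'   -> 01xx
    return 0;               // '8', '9' and hex letters -> 1xxx
}

// Total number of zero bits before the first non-zero bit of the hash.
int count_zeros(const std::string& hash) {
    int sum = 0;
    for (char ch : hash) {
        int zeros = count_zeros(ch);
        sum += zeros;
        if (zeros < 4) // this digit contains the first 1 bit
            break;
    }
    return sum;
}

// Hypothetical helper (not in the original answer).
// Assumption: hash is exactly 64 hex digits, so it represents a 256-bit value.
// Under that assumption this matches the Python check
//     int(hash, 16) < 2 ** (256 - difficulty)
bool meets_difficulty(const std::string& hash, int difficulty) {
    assert(difficulty >= 0 && difficulty <= 256);
    return count_zeros(hash) >= difficulty;
}
```

With the hash from the question, the six leading '0' digits contribute 24 zero bits and the following '4' (binary 0100) contributes one more, so count_zeros returns 25. meets_difficulty(hash, 25) is therefore true, while meets_difficulty(hash, 26) is false. If the sender ever strips leading zeros from the hex string, the assumption above no longer holds and the string would need to be padded back to 64 digits first.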