
Bitcoin block and transaction regex


I want to be able to determine if a given hash represents a block, block height or transaction.

I know block heights look like ^[0-9]+$ and that ^0$ is valid, since it's the genesis block. I believe a block hash is in base58 with a length of 64 and usually starts with a 0, and that a transaction hash is in base58 with a length of 64.

I ended up with these regexes:

  • block height: ^(0|[1-9][0-9]*)$
  • block hash: ^0+[BASE58]{63}$
  • transaction hash: ^[BASE58]{64}$

Yet I spotted some transactions with a leading 0, so I guess the Bitcoin protocol does not guarantee that only block hashes start with 0s. I also run a local regtest (fake) network with difficulty=1, and the block hashes there do not all start with 0s.

Is there any reliable way, most probably using regex, to differentiate a block hash from a transaction hash?


Solution

  • There are two main problems you will need to tackle. The first, and easier, one is the format of a transaction or block hash. These are double SHA-256 hashes of the serialized transaction or of the block header, meaning that they are just 32-byte arrays, commonly shown to users in hexadecimal (not base58). The regular expression to check validity for both is therefore simply

    ^[a-fA-F0-9]{64}$
    

    As you noted, block hashes on the Bitcoin main network (though not in other cryptocurrencies such as Litecoin, and not on a low-difficulty regtest network) will have leading zeroes: at least 8 of them, due to the minimum difficulty. For mainnet you could therefore also use the following:

    ^[0]{8}[a-fA-F0-9]{56}$
    

    Keep in mind, though, that transactions may also fall into that category, since a transaction hash will occasionally have leading zeroes too (in expectation, one in every 16^8 = 4294967296 transactions will have such a hash).
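    As a rough illustration of the format check, here is a minimal Python sketch that combines the block height regex from the question with the two hash regexes above. The function name `classify` and its return labels are made up for this example, and the leading-zero rule assumes mainnet; it can only tell you what an input might be, not what it is.

```python
import re

# Patterns taken from the question and the answer above.
HEIGHT_RE = re.compile(r"^(0|[1-9][0-9]*)$")
HASH_RE = re.compile(r"^[a-fA-F0-9]{64}$")
MAINNET_BLOCK_RE = re.compile(r"^0{8}[a-fA-F0-9]{56}$")


def classify(value: str) -> str:
    """Guess what `value` could be, based on its format only.

    Hypothetical helper: the labels are illustrative, and the
    8-leading-zeroes rule is an assumption that only holds on mainnet.
    """
    # Check hashes first: a 64-character hex string made only of digits
    # would otherwise also match the height pattern.
    if MAINNET_BLOCK_RE.fullmatch(value):
        # Almost certainly a block, but a transaction hash can
        # occasionally have 8 leading zeroes too.
        return "block_or_tx_hash"
    if HASH_RE.fullmatch(value):
        # Too few leading zeroes for a mainnet block hash.
        return "tx_hash_candidate"
    if HEIGHT_RE.fullmatch(value):
        return "height"
    return "invalid"


if __name__ == "__main__":
    genesis_block = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"
    genesis_coinbase_tx = "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
    for sample in ("0", "840000", genesis_block, genesis_coinbase_tx, "007"):
        print(sample, "->", classify(sample))
```

    `fullmatch` is used instead of `match` so that an input with a trailing newline is rejected rather than accepted by `$`.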

    The second, much harder, problem is to check whether the hash actually corresponds to a transaction or block. Think of it like this: while there are many valid e-mail addresses, only very few of them will actually correspond to real users. To perform this check you'd either need a complete copy of the blockchain in which to look for the matching item, or another data structure, e.g., an index or a Bloom filter, to check that such an item actually exists.
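
    To make the existence check concrete, here is a minimal sketch of the index idea in Python. It assumes you have already extracted the block hashes and transaction hashes from your own copy of the chain; the class `HashIndex` and its methods are hypothetical names for this example, and a plain in-memory dictionary stands in for whatever on-disk index or Bloom filter you would use at real scale.

```python
from typing import Dict, Optional


class HashIndex:
    """Toy in-memory index mapping a known hash to its kind.

    Hypothetical helper for illustration: a real deployment would
    populate it from a full copy of the blockchain.
    """

    def __init__(self) -> None:
        self._kinds = {}  # type: Dict[str, str]

    def add_block(self, block_hash: str) -> None:
        # Hex is case-insensitive, so normalise to lowercase.
        self._kinds[block_hash.lower()] = "block"

    def add_transaction(self, tx_hash: str) -> None:
        self._kinds[tx_hash.lower()] = "transaction"

    def lookup(self, value: str) -> Optional[str]:
        """Return "block", "transaction", or None if the hash is unknown."""
        return self._kinds.get(value.lower())


if __name__ == "__main__":
    index = HashIndex()
    index.add_block("000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f")
    index.add_transaction("4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")

    print(index.lookup("000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"))
    print(index.lookup("f" * 64))  # well-formed, but not in the index
```

    Unlike the regex, a lookup of this kind answers the original question directly: a well-formed hash that is in neither set is simply unknown, however many leading zeroes it has.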