Search code examples
searchdata-structureshashtableethereumpatricia-trie

Merkle Patricia Tree (Ethereum) and Hashtable, which one has faster search speed?


Goal:

I would like to implement a function which has a sequence of inputs "X1,...,Xn" and outputs an ordered list "Xp,..,Xq" where all the elements are distinct but ordered.

Requirement:

  • For every Xi in the sequence "X1,...,Xn", it is a 256 bits long string.
  • The sequence of inputs "X1,...,Xn" may have the same elements, which means that there may exist two elements Xi and Xj to satisfy Xi=Xj.
  • For the same elements in sequence of "X1,...,Xn", we only output one element in the ordered list.
  • The speed of function should be as fast as possible. And it does not matter that how much storage volume is used in function.
  • The size of sequence "X1,...,Xn" is n, and n is a number that no more than 10,000.

My idea:

  • I use an Array to store the sequence which is initially empty.

  • When inputting Xi, first I search the Hashtable to judge if Xi is already in the above array. If yes, just dropping it. And if not, add Xi to Hashtable and Array.

  • If having inputting all the element of the sequence "X1,...,Xn", I sort the array and output it.

Question:

  • And with Merkle Patricia Tree (Ethereum) and Hashtable, which one should I choose?
  • For Merkle Patricia Tree (Ethereum) and Hashtable, which one has faster search speed?
  • Or is there a better data structure to satisfy this function?

Solution

  • If you want the fastest look up nothing can beat the hashtable but hashtable is not good for ordering. merkle patricia trie allows us to verify data integrity in large datasets. "Patricia" stands for "Practical Algorithm To Retrieve Information Coded In Alphanumeric". Since blockchains include sensitive financial transactions, "merkle trees" are heavily used in blockchains. In your question you are worried about data integrity because inputs are in order and each input in sequence might be same or might contain similar elements. That sounds like perfect use case for merkle-patricia tree