I'm writing a recursive encoder in crystal-lang for the Ethereum RLP standard.
What I have to do is take any incoming data blob that is to be encoded and determine its type. For now, I can ignore invalid types for the sake of simplicity.
Valid types for RLP encoding would be binary data (Bytes), strings (String), or lists of strings (Array(String)). So far, so good: I wrote three methods, one for each of the three data types:
module Rlp
  # rlp-encodes binary data
  def self.encode(b : Bytes)
    return "binary #{typeof(b)} #{b}"
  end

  # rlp-encodes lists data
  def self.encode(l : Array(String))
    return "listsy #{typeof(l)} #{l}"
  end

  # rlp-encodes string data
  def self.encode(s : String)
    return "strngy #{typeof(s)} #{s}"
  end
end
Now, however, there is some depth here because arrays can be nested, so the encoder has to be recursive. Given an Array(String), it adds a prefix and encodes each String with Rlp.encode(s : String). For a nested array, the logic would be to call .encode as often as required to encode everything. However, I can't wrap my head around how to determine the type at compile time.
For example:
79 | Rlp.encode([["a", "b", "c"], ["cow", "dog", "cat"]])
^-----
Error: no overload matches 'Rlp.encode' with type Array(Array(String))
Right, because I don't have any self.encode(l : Array(Array(String))) implemented, and I cannot know the nesting depth in advance to implement all possible cases.
I tried to implement a less strict wrapper method, self.encode(data), that does not specify a data type. However, I can do basically nothing with data because the compiler infers data types based on my usage, i.e.:
# rlp-encodes _any_ data
def self.encode(data)
  if typeof(data) == Int32
    return Bytes.new data
  elsif typeof(data) == Char
    return String.new data
  end
end
Passing 32 of type Int32 would cause:
45 | return String.new data
Error: no overload matches 'String.new' with type Int32
This happens even though that branch would never run with data of type Int32. I'm not sure how to proceed here. Is there any way to be smarter about the types used while staying agnostic at compile time?
Ideally, I would just accept any data as input and handle the different cases myself.
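A note on the error above: typeof(data) is resolved at compile time, but the surrounding if is an ordinary runtime branch, so the compiler still type-checks both bodies for every type the method is instantiated with. A minimal sketch of one way around that, assuming a hypothetical helper named Rlp.describe that is not part of the original code: case/when on a type (which is is_a? under the hood) narrows the type of data inside each branch, so each body is only checked against the type it actually handles.

```crystal
module Rlp
  # Hypothetical helper (not in the original code): accepts any type and
  # branches with `case ... when`, which narrows `data` like `is_a?` does.
  def self.describe(data)
    case data
    when Int32
      # Inside this branch the compiler treats `data` as Int32.
      "integer #{data + 1}"
    when Char
      # Inside this branch `data` is a Char; Int32 input never gets here.
      "char with codepoint #{data.ord}"
    else
      "unsupported type"
    end
  end
end

puts Rlp.describe(32)  # integer 33
puts Rlp.describe('a') # char with codepoint 97
```

This keeps a single untyped entry point, at the cost of listing every supported type in one method. Separate overloads, as in the original three encode methods, express the same dispatch without the case.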
If you declare Rlp.encode for Array as follows, without restricting the element type, the compiler will take care of instantiating that method for the different types of arrays.
module Rlp
  # rlp-encodes lists data
  def self.encode(l : Array)
    return "listsy #{typeof(l)} #{l}"
  end
end
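To make the recursion explicit, here is a minimal sketch that assumes the unrestricted Array overload calls Rlp.encode on each element. The output strings are placeholders that only show which overload ran; they are not real RLP bytes, and the "binary(...)", "string(...)" and "list(...)" labels are made up for illustration.

```crystal
module Rlp
  # Placeholder for binary data: shows the bytes as hex, not real RLP.
  def self.encode(b : Bytes)
    "binary(#{b.hexstring})"
  end

  # Placeholder for string data.
  def self.encode(s : String)
    "string(#{s})"
  end

  # Matches Array(String), Array(Array(String)), and so on. Each element
  # is dispatched to whichever overload fits its type, so nesting depth
  # does not have to be known in advance.
  def self.encode(l : Array)
    "list(#{l.map { |item| Rlp.encode(item) }.join(", ")})"
  end
end

puts Rlp.encode([["a", "b", "c"], ["cow", "dog", "cat"]])
# list(list(string(a), string(b), string(c)), list(string(cow), string(dog), string(cat)))
```

The compiler instantiates the Array overload once per concrete array type it sees, so Array(Array(String)) recurses into Array(String), which in turn reaches the String overload.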