What I'm doing: I'm writing a small interpreter system that can parse a file, turn it into a sequence of operations, and then feed thousands of data sets into that sequence to extract some final value from each. A compiled interpreter consists of a list of pure functions that take two arguments: a data set, and an execution context. Each function returns the modified execution context:
type ('data, 'context) interpreter = ('data -> 'context -> 'context) list
The compiler is essentially a tokenizer with a final token-to-instruction mapping step that uses a map description defined as follows:
type ('data, 'context) map = (string * ('data -> 'context -> 'context)) list
Typical interpreter usage looks like this:
let pocket_calc =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  Interpreter.parse map "path/to/file.txt"

let new_context = Interpreter.run pocket_calc data old_context
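To make the setup above concrete, here is a minimal sketch of what such a module could look like. The question does not show the real Interpreter implementation, so everything here is an assumption: compile is a hypothetical stand-in for parse that takes a token list directly instead of a file path, and the context is a plain int rather than an object.

```ocaml
(* Hypothetical stand-in for the Interpreter module described above. *)
module Interpreter = struct
  type ('data, 'context) interpreter = ('data -> 'context -> 'context) list
  type ('data, 'context) map = (string * ('data -> 'context -> 'context)) list

  (* Assumption: tokens are passed in directly rather than read from a file.
     Unknown tokens produce a warning on stderr and are skipped. *)
  let compile (map : ('d, 'c) map) (tokens : string list) : ('d, 'c) interpreter =
    List.filter_map
      (fun tok ->
        match List.assoc_opt tok map with
        | Some f -> Some f
        | None -> prerr_endline ("warning: unknown token " ^ tok); None)
      tokens

  (* Thread the context through every instruction, left to right. *)
  let run (interp : ('d, 'c) interpreter) (data : 'd) (ctx : 'c) : 'c =
    List.fold_left (fun c f -> f data c) ctx interp
end

(* Usage with int data and an int context. *)
let calc =
  Interpreter.compile
    [ "add", (fun d c -> c + d); "mul", (fun d c -> c * d) ]
    [ "add"; "mul"; "add" ]

(* Starting from 1 with data 3: ((1 + 3) * 3) + 3 *)
let result = Interpreter.run calc 3 1
```

Because the instructions here use int arithmetic, the types of calc are fully determined and the genericity problem does not show up in this sketch.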
The problem: I'd like my pocket_calc interpreter to work with any class that supports add, sub and mul methods, and the corresponding data type (could be integers for one context class and floating-point numbers for another).
However, pocket_calc is defined as the result of a function application and not as a function, so the type system does not make its type generic: the first time it's used, the 'data and 'context types are bound to the types of whatever data and context I first provide, and the interpreter becomes forever incompatible with any other data and context types.
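The behaviour described above can be reproduced in a few lines. This is only a sketch: run_all is a hypothetical helper standing in for Interpreter.run, and the empty instruction list is there purely to keep the example small.

```ocaml
(* Hypothetical helper standing in for Interpreter.run. *)
let run_all fs d c = List.fold_left (fun c f -> f d c) c fs

(* Defined by a partial application: not a syntactic function, so the
   type variables are weak and get fixed by the first use. *)
let weak_runner = run_all []

(* Eta-expanded: a syntactic function, so it stays polymorphic. *)
let poly_runner d c = run_all [] d c

(* The first use pins weak_runner to int data and an int context. *)
let r1 = weak_runner 1 2
(* A later use at another type, such as weak_runner 1.0 2.0,
   would be rejected by the type checker. *)

(* poly_runner can be used at several unrelated types. *)
let r2 = poly_runner 1 2
let r3 = poly_runner "x" 2.5
```

With an empty instruction list the context comes back unchanged, which is enough to show the typing difference between the two definitions.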
A viable solution is to eta-expand the definition of the interpreter to allow its type parameters to be generic:
let pocket_calc data context =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  let interpreter = Interpreter.parse map "path/to/file.txt" in
  Interpreter.run interpreter data context
However, this solution is unacceptable for several reasons:
- It re-compiles the interpreter every time it's called, which significantly degrades performance. Even the mapping step (turning a token list into an interpreter using the map list) causes a noticeable slowdown.
- My design relies on all interpreters being loaded at initialization time, because the compiler issues warnings whenever a token in the loaded file does not match a line in the map list, and I want to see all those warnings when the software launches (not when individual interpreters are eventually run).
- I sometimes want to reuse a given map list in several interpreters, whether on its own or by prepending additional instructions (for instance, "div").
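For the last point, reusing a map list is just list sharing. A small sketch, assuming int data and an int context instead of the object-based c # add d style (base_map and with_div are hypothetical names):

```ocaml
(* Base instruction map, shared by several interpreters. *)
let base_map =
  [ "add", (fun d c -> c + d);
    "sub", (fun d c -> c - d);
    "mul", (fun d c -> c * d) ]

(* Extended map: prepends "div" without copying or changing base_map. *)
let with_div = ("div", (fun d c -> c / d)) :: base_map

(* Look up an instruction and apply it to data 2 and context 10. *)
let quotient = (List.assoc "div" with_div) 2 10
```

Since lists are immutable, base_map remains usable on its own after with_div is built.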
The questions: is there any way to make the type parametric other than eta-expansion? Maybe some clever trick involving module signatures or inheritance? If that's impossible, is there any way to alleviate the three issues I have mentioned above in order to make eta-expansion an acceptable solution? Thank you!
A viable solution is to eta-expand the definition of the interpreter to allow its type parameters to be generic:
let pocket_calc data context =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  let interpreter = Interpreter.parse map "path/to/file.txt" in
  Interpreter.run interpreter data context
However, this solution is unacceptable for several reasons:
- It re-compiles the interpreter every time it's called, which significantly degrades performance. Even the mapping step (turning a token list into an interpreter using the map list) causes a noticeable slowdown.
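It recompiles the interpreter every time because you are doing it wrong: the call to Interpreter.parse sits inside the function body, so it runs on every call. The proper form is more like the following, where parsing happens once and only the final run is wrapped in a fun (and technically, if the partial application of Interpreter.run to interpreter can do some computation, you should move that out of the fun too).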
let pocket_calc =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  let interpreter = Interpreter.parse map "path/to/file.txt" in
  fun data context -> Interpreter.run interpreter data context
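The difference between the two forms can be checked with a small, self-contained sketch. The parse and run functions below are hypothetical stubs (parse takes a token list rather than a file path and bumps a counter), and the context is a plain int rather than an object:

```ocaml
(* Hypothetical stub: counts how many times compilation happens. *)
let parse_count = ref 0

let parse map tokens =
  incr parse_count;
  List.filter_map (fun tok -> List.assoc_opt tok map) tokens

let run interp data ctx = List.fold_left (fun c f -> f data c) ctx interp

let map =
  [ "add", (fun d c -> c + d);
    "sub", (fun d c -> c - d);
    "mul", (fun d c -> c * d) ]

(* Naive eta-expansion: parse runs again on every call. *)
let slow_calc data ctx =
  let interp = parse map [ "add"; "mul" ] in
  run interp data ctx

(* Hoisted form: parse runs once, when fast_calc is defined. *)
let fast_calc =
  let interp = parse map [ "add"; "mul" ] in
  fun data ctx -> run interp data ctx
```

After these definitions parse has run exactly once (for fast_calc), and calling fast_calc any number of times does not run it again, whereas each call to slow_calc does. Note that the hoisted definition is still not a syntactic function, so as far as I can tell OCaml's value restriction would leave its type variables weak; the sketch sidesteps that by using ints throughout.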