I am still trying to understand the value restriction in OCaml and I was reading through Wright's paper. And in it states (fun x -> x) (fun y -> y)
is not a syntactic value while it is also stating lambda expression should be a value. I am a bit confused here, isn't id id
in its essence also a lambda expression? What really counts as a syntactic value in OCaml?
I also tried it in utop
and found these:
utop # let x = let x = (fun y -> y) (fun z -> z) in x ;;
val x : '_a -> '_a = <fun>
Here id id
is not a value and it cannot escape the value restriction but
utop # let x a = let x = (fun y -> y) a in x ;;
val x : 'a -> 'a = <fun>
Here id a
seems to be treated as a value.
They are all function application, what's the difference?
So, there are two concepts involved here: let-polymoprhism and value restriction. Let-polymorphism disallows type generalization for all values that are not let-bound. Or, without using a double negation, it allows a value to be polymorphic only if it is introduced with the let-binding. This is an over-approximation, i.e., it may disallow valid programs (there are false positive), but it will never allow an invalid program (it will preserve the soundness).
The value restriction is another over-approximation, that is needed to preserve the soundness of imperative programs. It disallows polymorphism for non-syntactic values. OCaml uses a more precise version of this over-approximation that is called a relaxed value restriction (that actually allows certain non-syntactic values to be polymorphic).
But let me first explain what is a syntactic value:
Informally, a syntactic value is an expression that can be evaluated without doing any computation, e.g., consider the following let binding:
let f = g x
Here f
is not a syntactic value because in order to get the value you need to compute expression g x
. But, in the following,
let f x = g x
the value f
is syntactic, it would be more obvious, if we will remove the sugar:
let f = fun x -> g x
Now it is obvious, that f
is syntactic as it is bound to a lambda-expression.
The value is called syntactic because it is defined directly in the program (in the syntax). Basically, it is a constant value that can be computed at static time. Slightly more formally, the following values are considered syntactic:
let f x = ...
Now, when we're pretty sure what is syntactic what is not, let's look at your examples more closely. Let's start with the Wright's example, actually:
let f = (fun x => x) (fun y => y)
or, by introducing let id = fun x -> x
let f = id id
You may see, that f
here is not a syntactic value, although id
is syntactic. But in order to get the value of f
you need to compute - so the value is defined at runtime, not in the compile time.
Now, let's desugar your example:
let x a = let x = (fun y -> y) a in x
==>
let x = fun a -> let x = (fun y -> y) a in x
We can see, that x
is a syntactic value, because to the left is a lambda expression. The type of lambda expression is 'a -> 'a
. You may ask, why the type of the expression is not '_a -> '_a
. This is because the value restriction is introduced only on the top-level, and lambda expression is not a value yet, it is an expression. In layman terms, first, the most general Hindley-Milner type is inferred under an assumption, that there are no side effects, and then the inferred type is weakened by the (relaxed) value restriction. The scope of type inference is a let
binding.
This is all theory, and sometimes it is not really obvious why some expressions are well-type, while expressions with the same semantics, but written slightly different, are not well typed. The intuition might say, that there is something wrong here. And yes it is, in fact, let f = id id
is a well-formed program that is declined by a typechecker, and this is an example of the over-approximation. And if we will transform this program to let f x = id id x
it suddenly becomes a well typed program with a general type, although the transformation doesn't change the semantics (and both programs are actually compiled to the same machine code). This is a limitation of a type system, that came as a compromise between simplicity and precision (the soundness can't be a part of compromise - typechecker must be sound). So, it is totally not obvious from the theory why the latter example, is always safe. Just for the sake of experiment let's try to play with your example, and try to break the program:
# let x = fun a -> let z = ref None in let x = (fun y -> z := Some y; y) a in x ;;
val x : 'a -> 'a = <fun>
So, we added a reference z
here, and we're trying to store the value, so that under different applications to different types, we should be able to store to the same reference values of different types. However, it is totally non-possible, since because x
is a syntactic value, it is guaranteed, that every type x k
is called an new reference is created, and this reference will never leak the scope of let-definition. Hope, that this helps :)