type-safety language-theory turing-complete

What does it mean for a language to be statically typed?


My understanding is that it means that one can potentially write a program to formally prove that a program written in a statically typed language will be free of a certain (small) subset of defects.

My problem with this is as follows:

Assume that we have two Turing-complete languages, A and B. A is presumed to be 'type safe' and B is presumed not to be. Suppose I am given a program L that checks the correctness of any program written in A. What is to stop me from translating any program written in B to A and then applying L? If P translates from B to A, then why isn't LP (first P, then L) a valid type checker for any program written in B?

I'm trained in Algebra and am only just starting to study CS so there might be some obvious reason that this doesn't work but I would very much like to know. This whole 'type safety' thing has smelt fishy to me for a while.


Solution

  • Let A be your Turing-complete language which is supposed to be statically typed and let A' be the language you get from A when you remove the type checking (but not the type annotations because they also serve other purposes). The accepted programs of A will be a subset of the accepted programs of A'. So in particular, A' will also be Turing-complete.

    Now consider your translator P from B to A (a translator in the opposite direction exists as well). What is it supposed to do? It could do one of two things:

    1. Firstly, it could translate every program y of B to a program of A. In this case, LPy would always return True as programs of A are by definition correctly typed.

    2. Secondly, P could translate every program y of B to a program of A'. In this case, LPy would return True if Py happens to be a program of A and False if not.

    As the first case doesn't yield anything interesting, let us stick to the second case, which is probably what you mean. Does the function LP defined on programs of B tell us anything interesting about programs of B? I say no, because it is not invariant under a change of P. As A is Turing-complete, even in the second case P could be chosen so that its image happens to lie in A. Then LP would be constantly True. On the other hand, P could be chosen so that some programs are mapped to the complement of A in A'. In this case LP would spit out False for some (possibly all) programs of B. As you can see, you don't get anything which only depends on y.

    I can also put it more mathematically in the following way: There is a category C whose objects are the programming languages and whose morphisms are translators from one programming language to another. In particular, if there is a morphism P: X -> Y, then Y is at least as expressive as X. Between each pair of Turing-complete languages there are morphisms in both directions. For each object X of C (i.e. for each programming language) we have an associated set, say {X} (bad notation, I know), of those partially defined functions that can be computed by programs of X. Each morphism P: X -> Y then induces an inclusion {X} -> {Y} of sets. Let us formally invert all those morphisms P: X -> Y for which the induced inclusion {X} -> {Y} is the identity. I will call the resulting category (which is, in mathematical terms, a localization of C) C'. Now the inclusion A -> A' is a morphism in C'. However, it is not preserved under automorphisms of A'; that is, the morphism A -> A' is not an invariant of A' in C'. In other words: from this abstract point of view the attribute "statically typed" is not definable and can be arbitrarily attached to a language.

    To make my point clearer, you can also think of C' as the category of, say, geometrical shapes in three-dimensional space together with the Euclidean motions as morphisms. A' and B are then two geometrical shapes, and P and Q are Euclidean motions bringing B to A' and A' to B, respectively. For example, A' and B could be two spheres. Now let us fix a point on A', which shall stand for the subset A of A'. Let us call this point "statically typed". We want to know whether a point of B is statically typed. So we take such a point y, map it via P to A' and test whether it is our marked point on A'. As one can easily see, this depends on the chosen map P or, to put it in other words: a marked point on a sphere is not preserved by automorphisms of that sphere (that is, Euclidean motions that map the sphere onto itself).
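The dependence of LP on the choice of P, argued in the second case above, can be illustrated with a small toy model. This is only a sketch under stated assumptions: the tuple encoding of programs, the operation names and the functions `check`, `run` and `translate` are all hypothetical and invented for illustration. Here `check` plays the role of L, `run` evaluates programs of A' without any type checking, and `translate` gives two different translators P from a tiny language B, selected by the `coerce` flag.

```python
# Toy model (all names and encodings here are hypothetical).
# A' programs: ("int", n), ("add", e1, e2), ("nonzero", e), ("if", c, t, f)
# B  programs: ("num", n), ("plus", a, b), ("ifnz", c, t, f)

def check(e):
    """L: return the type of e if e is well typed (lies in A), else None."""
    tag = e[0]
    if tag == "int":
        return "int"
    if tag == "add":
        both_int = check(e[1]) == "int" and check(e[2]) == "int"
        return "int" if both_int else None
    if tag == "nonzero":
        return "bool" if check(e[1]) == "int" else None
    if tag == "if":
        branch = check(e[2])
        ok = (check(e[1]) == "bool"
              and branch is not None
              and branch == check(e[3]))
        return branch if ok else None
    return None

def run(e):
    """Evaluate a program of A' with no type checks at all."""
    tag = e[0]
    if tag == "int":
        return e[1]
    if tag == "add":
        return run(e[1]) + run(e[2])
    if tag == "nonzero":
        return run(e[1]) != 0
    if tag == "if":
        # Any truthy condition (True or a nonzero int) picks the first branch.
        return run(e[2]) if run(e[1]) else run(e[3])
    raise ValueError(tag)

def translate(y, coerce):
    """P: translate a B program to A'. Both variants preserve behaviour;
    only coerce=True wraps conditions so that the result is well typed."""
    tag = y[0]
    if tag == "num":
        return ("int", y[1])
    if tag == "plus":
        return ("add", translate(y[1], coerce), translate(y[2], coerce))
    if tag == "ifnz":
        cond = translate(y[1], coerce)
        if coerce:
            cond = ("nonzero", cond)
        return ("if", cond, translate(y[2], coerce), translate(y[3], coerce))
    raise ValueError(tag)

# One and the same program y of B, translated by two different choices of P.
y = ("ifnz", ("plus", ("num", 1), ("num", 2)), ("num", 10), ("num", 20))
p1 = translate(y, coerce=False)
p2 = translate(y, coerce=True)

print(run(p1), run(p2))                              # 10 10
print(check(p1) is not None, check(p2) is not None)  # False True
```

Both translations compute the same result, yet the checker rejects one and accepts the other. So in this toy setting the verdict of LP reflects the choice of P and says nothing about y itself.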