
(Lazy) Haskell undefined/bottom in OCaml


Haskell has a really swell undefined value, which lazily raises an exception (upon evaluation). OCaml is of course strict, so as far as I can tell there is no equivalent of Haskell's undefined. This is unfortunate, because it means there is no bottom value that can stand in for any type. Let's say I wanted a

val a : int

I could of course do

let a = failwith "undefined"
let () =
  print_string "something unrelated\n"

And this happily compiles. Unfortunately, upon running it we get the Failure "undefined" exception (which is expected: a is evaluated as soon as the module is initialized, before the print ever runs).
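
To make the evaluation order concrete, here is a minimal sketch. The names a_lazy and a_thunk are hypothetical and not part of the original code; they only show that deferring the exception is easy once the type is allowed to change, which is exactly what the val a : int signature forbids.

```ocaml
(* Strict top-level binding: uncommenting this raises Failure "undefined"
   during module initialisation, before anything below runs. *)
(* let a : int = failwith "undefined" *)

(* Deferring the exception works, but only by changing the type,
   so neither binding can satisfy [val a : int]. *)
let a_lazy : int Lazy.t = lazy (failwith "undefined")
let a_thunk : unit -> int = fun () -> failwith "undefined"

(* Runs normally, because nothing above has been forced or called. *)
let () = print_string "something unrelated\n"
```

Forcing a_lazy or calling a_thunk () raises the exception at that point, which is the closest safe analogue of Haskell's behaviour.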

What I want is to let a be a bottom/undefined value without changing its type (so things like Lazy won't work). Is this possible?

Extra details: What I'm asking for probably sounds fairly silly. To curtail any commentary on why I shouldn't be doing this, allow me to briefly describe my use case. I'm writing a script that modifies the AST of an mli file to produce an "empty" ml file which matches its signatures. In the general case, it's possible to have val a : int in your mli, so I need a way to generically synthesize a bottom value. failwith "undefined" would work if all I needed was for compilation to succeed. Unfortunately, I also need to link this ml file against an OUnit test suite and run it (obviously the suite will fail, but the purpose is to be able to run it with -list-test so I can programmatically get a list of all tests).

More details: I recognize that the proper way to solve this is (probably) to write a function which can generate a placeholder value for any given type. For the builtin primitives (and list, option, etc.) this is simple enough (just verbose). It becomes more complicated with records (perhaps defined in the stdlib, but possibly defined in the same file or in a different package). To handle these, my AST transformer would need a complete understanding of the OCaml type system and of how packages are resolved and imported, which is a lot more logic than I want to (or should) include in it.
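
As a rough illustration of that type-directed approach, here is a sketch of the stubs such a generator might emit. The signature entries (a, xs, o, p) and the point record are hypothetical examples, not taken from any real mli file.

```ocaml
(* Primitives and common containers have obvious placeholder values. *)
let a : int = 0
let xs : string list = []
let o : float option = None

(* A record needs every field filled in, so the generator has to find
   and understand the type definition, wherever it lives. *)
type point = { x : int; y : int }
let p : point = { x = 0; y = 0 }
```

The record case is where the approach gets expensive: the placeholder cannot be written without the field list, and abstract types from other packages offer no constructor at all.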


Solution

  • Important note: As noted by @PatJ and @camlspotter, Obj.magic can lead to some horrendous behavior. In fact, I'd venture that using a value it returns, without some sort of proof (such as with Coq) that the use is warranted, is akin to undefined behavior in C. You could get nothing, you could get a segfault, or you could set your house on fire. All are possible. The solution below should only be used if you never use or call any of the undefined values. It is fine for my case, because I only need to link an OUnit test suite against this package of undefined values (OUnit itself never calls anything in the package). This allows you to run the OUnit test suite with -list-test (with let a = failwith "undefined", running the test binary fails immediately). Since -list-test never runs any of the code it's linked against, it never tries to use the undefined values, so this particular use case is safe. Proceed with caution.

    I managed to find a rather hacky solution. The Obj module enables some fancy runtime tricks that require loosely typed operations (and it doesn't appear to do any runtime checking). It offers some functions with promising signatures:

    type t 
    val repr : 'a -> t
    val obj : t -> 'a
    val magic : 'a -> 'b
    

    After some experimentation, it's clear these are unsafe operations (they don't do any runtime checks), which is perfect for our use case (since we don't really care what happens if you try to use these stubbed-out values). Either of these appears to be a sufficient equivalent to Haskell's undefined:

    Obj.obj (Obj.repr 0) (* 0 could be unit, another primitive, etc. *)
    Obj.magic 0
    

    Now of course the equivalence ends as soon as you try to use these values anywhere: you may hit segfaults. For my aforementioned use case this is fine, because OUnit's -list-test never actually runs the tests (and if it did, they would run in separate processes, so the segfaults wouldn't affect the test harness).
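
    Put together, a generated stub might look like the sketch below. The signature S and the names a, name and f are hypothetical, chosen only for illustration. Function-typed entries don't need Obj.magic at all, since wrapping failwith in a function already defers the exception until the call.

```ocaml
(* Hypothetical signature standing in for the contents of an mli file. *)
module type S = sig
  val a : int
  val name : string
  val f : int -> int
end

module Stub : S = struct
  (* Placeholders only: reading these values is undefined behaviour. *)
  let a : int = Obj.magic 0
  let name : string = Obj.magic 0

  (* No magic needed for functions: the exception is raised on call. *)
  let f _ = failwith "undefined"
end

(* Initialisation succeeds because no placeholder is ever inspected. *)
let () = print_string "something unrelated\n"
```

    The program links and runs as long as nothing touches Stub.a or Stub.name. Calling Stub.f raises Failure "undefined", which is well defined.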

    I'd be curious to hear if anyone with more experience with OCaml internals can comment on this approach. It would be really cool (although not really required in this context) if using the values resulted in an exception instead of a potential segfault (or some other undefined behavior--which for 0 sometimes appears to be nothing happening).