Saturday, May 22, 2010

Staying grounded in Haskell

I want to look a bit closer at the question: are λx.Ω and Ω observationally equivalent? In a CBV language, they are clearly distinguishable by a context of type unit:

(λf.()) []

Plugging in the first term gives a terminating program, while the second program leads to divergence. This behavior follows from two properties of CBV evaluation: arguments are evaluated before substitution, and λ-abstractions count as values (i.e., we do not evaluate under a λ). It follows that, for CBV languages, observational equivalence is robust under different choices of observable types (base types, function types, etc.).

In a CBN language, the above context does not serve to distinguish the two terms, as f is not forced by applying it to some argument. But of course, the terms λx.Ω and Ω themselves have different behavior under evaluation, because just as in CBV, λ-abstractions are considered values. So for CBN, observational equivalence is not robust: the types you choose to observe matter.
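To make the lazy case concrete, here is a minimal Haskell sketch of that unit-typed context. The names omega, lamOmega and context are my own, chosen for illustration, and omega is just one way of writing a term that never finishes evaluating. Since the context never forces whatever fills its hole, both programs should terminate with ().

```haskell
-- A term whose evaluation never terminates (hypothetical name).
omega :: a
omega = omega

-- A λ-abstraction with a diverging body (hypothetical name).
lamOmega :: a -> b
lamOmega _ = omega

-- The context (λf.()) []: it takes whatever fills the hole and ignores it.
context :: a -> ()
context = \_f -> ()

main :: IO ()
main = do
  -- Under lazy evaluation neither argument is ever forced.
  print (context (lamOmega :: () -> ()))
  print (context (omega :: () -> ()))
```

Both lines print (), so this particular context cannot tell the two terms apart.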

In the last post on this topic, I raised the natural question for a lazy language: how should you choose the observable types?

Haskell, it turns out, has a rather satisfying answer to this question. Haskell programs are allowed to have essentially one of two types: t or IO t, for any type t implementing Show. Both are allowed at the REPL in ghci, but a compiled program must have a main function of, roughly, type IO ().

The Show typeclass requires, in particular, a show function of type t -> String. The REPL essentially wraps a call to show around any expression the user enters. As a result, the REPL only allows observation at one ground type, String. Compiled programs are even simpler: they must produce a value of unit type, and are evaluated only for effect. So again we are observing at a ground type, unit, modulo side-effects.
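Here is a rough sketch of what observing at String looks like. Everything named here (omega, lamOmega, observe, context, terminates) is hypothetical, and observe only approximates what ghci does. I also assume a stand-in for Ω that raises an error instead of looping, so that the example finishes and the failure can be caught. A function cannot be passed to observe directly, because function types have no Show instance in the Prelude, so the context has to apply the hole to an argument before showing the result.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- ASSUMPTION: a stand-in for Ω that raises an error rather than looping,
-- so that the example terminates and the failure can be observed.
omega :: a
omega = error "omega"

-- A λ-abstraction whose body is the stand-in for Ω.
lamOmega :: () -> Int
lamOmega _ = omega

-- Roughly what the REPL does: observe a value at type String via show.
observe :: Show t => t -> String
observe = show

-- A context of observable type: apply the hole to (), then show the Int.
context :: (() -> Int) -> String
context hole = observe (hole ())

-- Force the whole String and report whether a result was produced.
terminates :: String -> IO Bool
terminates s = do
  r <- try (evaluate (length s)) :: IO (Either SomeException Int)
  return (either (const False) (const True) r)

main :: IO ()
main = do
  terminates (observe (42 :: Int)) >>= print  -- an ordinary value shows fine
  terminates (context lamOmega)    >>= print  -- fails once the body is forced
  terminates (context omega)       >>= print  -- fails when the hole is forced
```

With this context, both terms fail to produce a String, which is the sense in which observation at a ground type does not separate them here.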

As a result, I believe that λx.Ω and Ω are observationally equivalent in Haskell.

What about untyped, lazy languages? There, I think, you have no choice but to allow observation at "any type", and hence to distinguish the two terms. In operational terms, an expression at the REPL would have to be evaluated far enough to figure out whether it was a string (or other printable type) or something else -- which, alas, is too much to ask for Ω.
