Modeling union types using only functions

The lambda calculus is the basis for functional programming. Church's encodings in the untyped lambda calculus show an interesting pattern that parallels union types in languages like Haskell. Here I try to show how these discriminated union types can be implemented using only JavaScript functions.

As an example of a union type, I am going to show two definitions of a singly ended, singly linked list implementation that use the union concept. This simple union has two cases: A node or an empty list. The first, in Haskell, defines a union explicitly:

data List a = EmptyList
            | ListNode a (List a)

This JavaScript implements the same idea with objects, and the union is implicit:


function List() { }; 

function EmptyList () { List.call(this); };
EmptyList.prototype = new List();
EmptyList.prototype.isEmpty = true;
EmptyList.prototype.getHead = function () { throw "Empty list has no head"; };
EmptyList.prototype.getTail = function () { throw "Empty list has no tail"; };

function ListNode (head, tail) {
	List.call(this);
	this.head = head; // any value
	this.tail = tail; // a List instance
}
ListNode.prototype = new List();
ListNode.prototype.isEmpty = false;
ListNode.prototype.getHead = function () { return this.head; };
ListNode.prototype.getTail = function () { return this.tail; };

This JavaScript is not especially idiomatic, but it does represent a somewhat common approach in OO to representing a similar idea to the union type. Here I use polymorphism and subtyping (and a trivial use of the composite pattern). Hopefully the Haskell definition is reasonably clear, as I will refer back to that notation as we continue.

The point of this is to show that a union is a type that has a variety of potential forms it can take, but any instance is only one form at a time.

A simple case

To look at how these are modelled using only functions, as in the lambda calculus, lets first look at the one of the simplest union types: booleans. Booleans are a set of two values: True and False. An important detail here is that they are values that do not have an fields of their own, unlike the list example above. In Haskell notation this might look like:

data Boolean = True 
             | False

An encoding of this type using only functions might look like this:

function True(a, _) { return a; }
function False(_, b) { return b; }

Here are three functions that operate on booleans:

function not(a) { return a(False, True); }
function and(a, b) { return a(b, a); }
function or(a, b) { return b(b, a); }

Of particular interest is that both cases of the union (True, and False) take two arguments but return only one. Because we are representing everything with functions here, we can safely assume that both arguments are functions. The effect here is that the booleans choose which function should be evaluated. This is key: polymorphism is achieved by selecting which arguments is appropriate.

We can leverage this to implement simple pattern matching. The following example steps back to using some JavaScript values for clarity, but the logic we care about is all done using the encoding above:


function logBool (bool) {
    return bool(
	    "True",
	    "False"
    );
};

logBool(True)     //=> "True"
logBool(False)    //=> "False"

Option types

Another common type seen in functional programming is the Option (or Maybe) type. This is a way of implementing null as a wrapper around a non-nullable type. First, here is the Haskell definition:

data Option a = Some a
              | None

This is almost identical to booleans, except that an Option might contain a value. Here is a simple function encoding:

function Some(a) { 
    return function (destructureSome, destructureNone) { 
        return destructureSome(a);
    };
}
function None(destructureSome, destructureNone) {
    return destructureNone();
}

Here things start to get more complex. None looks similar to False; infact, only the names have changed. Some is more interesting. Here we see that because Some is parameterised, there is more than one value for Some, thus we have more than one function. A closure is used to capture the parameterized value.

We also see the first example of how lambda calculus uses lexical closures to carry data in a function, and treat that function as a structure. Because there are multiple cases in our type, we need multiple destructuring functions to access that captured data. Again, the parametric polymorphism we saw with booleans is in play to select the appropriate destructuring behaviour.

Here is an example:

function div(n, d) {
	if (d === 0) return None;
	return Some(n / d)
}

function logOption(opt) {
	opt(
	    function (v) { console.log("Some " + v);},
	    function () { console.log("None"); }
	);
}	

logOption(div(10, 2));      // logs "Some 5"
logOption(div(10, 0));      // logs "None"
logOption(div(10, -1));     // logs "Some -10"

This can obviously be extended to an Either type. We simply make the second case also capture a value, and pass it to its appropriate destructuring function, eg:

data AnEither a b = Left a 
                  | Right b

function Left(v) { 
    return function (destructureLeft, _) {
	     return destructureLeft(v);
	};
}

function Right(v) { 
    return function (_, destructureRight) {
	     return destructureRight(v);
	};
}

This is such a trivial extension of Option that I will leave experimenting with this to the reader.

Lists

Next, I would like to return to the List implementation at the top of this article, and define it using this technique.

function ListNode(head, tail) { 
    return function (destructureListNode, _) {
	     return destructureListNode(head, tail);
	};
}

function Empty(_, destructureEmpty) {
	return destructureEmpty();
}

Again, this is a simple extension of our Option type to carry multiple values in the ListNode case. Lists allow us to create some interesting functions as examples:

function head(list) { 
    return list(function(h, _) { return h; }, 
	            function() { console.log("Empty list has no head"); });
}

function tail(list) { 
    return list(function(_, t) { return t; }, 
	            function() { console.log("Empty list has no tail"); });
}

Note that this implementation throws exceptions if head or tail is called on an Empty. I will return to this shortly. First however, a more interesting function:

var l3 = ListNode(1, ListNode(2, ListNode(3, Empty)));

function reduce(f, acc, list) { 
    return list(function(h, t) {  // ListNode case
                    return reduce(f, f(h, acc), t);
                }, 
                function() {      // Empty case
                    return acc; 
                }
           );
}


function sum(list) {
    return reduce(function (v, acc) { return acc + v; }, 0, list);
}

sum(Empty); //=> 0
sum(l3);    //=> 6


function map(f, list) {
	return reduce(function (v, acc) {
                      return ListNode(f(v), acc); 
	              }, Empty, list);
}

head(map(function (a) { return a + 1; }, l3)); //=> 4


function reverse(list) { return reduce(ListNode, Empty, list); }
head(reverse(l3));   //=> 3

// etc…

Combining types

Returning to head and tail above. Instead of having the error be thrown as an exceptionⁱ, these versions return an Option (this could easily be an Either if we wished to carry data about the error).

function maybeHead(list) {
    return list(function (h, t) { 
                    return Some(h);
                },
                function () { return None; }
           );
}

function maybeTail(list) {
    return list(function (h, t) { 
                    return Some(t); 
                },
                function () { return None; }
           );
}

logOption(maybeHead(l3)); // logs "Some 1"
logOption(maybeHead(Empty)); // logs "None"

Wrapping up

Some final points. Obviously for each case in the union, the functions need to take that many arguments. This gets tedious fast. Likewise, if we wanted to process a List of Options, this would bog down in a pile of confusing functions.

Nevertheless, this technique is fascinating to consider. Interested readers may wish to look at Brian McKenna's Fantasy-Promises, to see practical application of function driven polymorphism with the fork function provided to the Promise (this code uses his Fantasy Land spec for category theory types in JavaScript).

Postscript

This post was inspired in part by John Bender's great article on System F in Coffeescript. Even if you didn't enjoy my article, I would recommend reading his.

Updated

13 June: After feedback from Alan Malloy, I have stripped out the redundant thunk code and related text, as well as corrected some errors with sum and head.

Unsurprisingly, exceptions form an implicit union type with the return type of the function, so it is trivial to adapt a function that throws an exception to return a union type.