Computability Part 8: Lambda Calculus
In the previous posts, we dug deeper into one particular model of computation, starting with Turing Machines in part 2, to the von Neumann computer architecture in part 6, to some of the implementation practicalities of machines - physical or virtual - in part 7.
We'll switch gears and cover another computational model this time around: lambda calculus. Lambda calculus was developed by Alonzo Church around the same time Alan Turing was proposing the Turing machine as a universal model for computation. The Church-Turing thesis1 proves the equivalence between the two models - anything a Turing machine can compute can also be computed by lambda calculus.
Lambda calculus consists of lambda terms and reductions applied to lambda terms.
The lambda terms are built with the following rules, where \(\Lambda\) is the set of all possible lambda terms:
- Variables, like \(x\), are lambda terms. \(x \in \Lambda\).
- Abstractions, \((\lambda x.M)\). This is a function definition where \(M\) is a lambda term and \(x\) becomes bound in the expression. For \(x \in \Lambda\) and \(M \in \Lambda\), \((\lambda x.M) \in \Lambda\).
- Applications, \((M \space N)\). This applies the function \(M\) to the argument \(N\), where \(M\) and \(N\) are lambda terms. For \(M \in \Lambda\) and \(N \in \Lambda\), \((M \space N) \in \Lambda\).
If a term \(y\) appears in \(M\) but is not bound, then \(y\) is free in \(M\), e.g. for \(\lambda x.y \space x\), \(x\) is bound and \(y\) is free. The reductions are:
- \(\alpha\)-equivalence: bound variables in an expression can be renamed to avoid collisions: \((\lambda x.M[x]) \rightarrow (\lambda y.M[y])\).
- \(\beta\)-reduction: bound variables in the body of an abstraction are replaced with the argument expression: \((\lambda x.t)s \rightarrow t[x := s]\).
- \(\eta\)-reduction: if \(x\) is a variable that does not appear free in the lambda term M, then \(\lambda x.(M x) \rightarrow M\). This can also be understood in terms of function equivalence: if two functions give the same result for all arguments, then the functions are equivalent.
Let's look at a few simple examples in Python:
lambda x: x
This is the identity function expressed as a lambda abstraction. In this case,
x (the lambda parameter), becomes bound in the body of the lambda.
lambda y: y
This is the same identity function, we're just using
y instead of
x to name
For function application, we can apply the identity function to any other lambda term and get back that lambda term:
(lambda x: x)(lambda y: y)
This applied the identify function
lambda x: x to the argument
lambda y: y,
which will give us back
lambda y: y.
Based on the above definition, lambda calculus consists exclusively of lambda
terms - while
(lambda x: x)(10) is valid Python code, applying an identity
lambda to the number
10, lambda calculus does not have a number
Church encoding: Alonzo Church came up with a way to encode logic values and
numbers as lambda terms.
Let's start with Boolean logic:
TRUE is defined as \(T := (\lambda x.\lambda
FALSE is defined as \(F := (\lambda x.\lambda y.y)\).
TRUE = lambda x: lambda y: x FALSE = lambda x: lambda y: y
Note with this definition, if we apply a first argument to
TRUE, and a second
argument to the returned lambda, we always get back the first argument. For
FALSE, we always get back the second argument.
We can defined
IF as \(IF := (\lambda x.x)\). This is the same as the identity
IF = lambda x: x
This works since we defined
TRUE to always return the first argument and
FALSE to always return the second argument. So when we call
TRUE, we get back
x (the if-branch), otherwise we get back
We can try this out (though again this is outside of lambda calculus, we are introducing numbers for clarity):
IF(TRUE)(1)(2) # This returns 1 IF(FALSE)(1)(2) # This returns 2
Now that we can express if-then-else, we can easily express other logic operators. Negation is \(\lambda x.(x \space F \space T)\).
NOT = lambda x: x(FALSE)(TRUE)
TRUE, we get back the first argument,
we get back the second argument,
x AND y can be expressed as if x then y else FALSE, or: \(\lambda x.\lambda
y.(x \space y \space F)\).
x OR y can be expressed as if x then TRUE else y,
or \(\lambda x.\lambda y.(x \space T \space y)\).
AND = lambda x: lambda y: x(y)(FALSE) OR = lambda x: lambda y: x(TRUE)(y)
Here are a few examples:
print(AND(TRUE)(TRUE) == TRUE) # prints True print(AND(TRUE)(FALSE) == TRUE) # prints False print(OR(TRUE)(FALSE) == TRUE) # prints True print(NOT(FALSE) == TRUE) # prints True
Using only lambda terms, we were able to implement Boolean logic! But Church encoding goes further - we can also represent natural numbers and arithmetic as lambda terms.
Alonzo Church encoded numbers as applications of a function \(f\) to a term \(x\).
0means applying \(f\) 0 times to the term: \(0 := \lambda f.\lambda x.x\).
1means applying \(f\) once to the term: \(1 := \lambda f.\lambda x.f x\).
2means applying \(f\) twice: \(2 := \lambda f.\lambda x.f (f x)\).
In general, the number
n is represented by
n applications of
f: \(n :=
\lambda f.\lambda x.f (f (... (f x)) ... ))\) or \(n := \lambda f.\lambda x.
ZERO = lambda f: lambda x: x ONE = lambda f: lambda x: f(x) TWO = lambda f: lambda x: f(f(x)) ...
ZERO is the same as
FALSE. With this definition of numbers, we can
define the successor function
SUCC as a function that takes a number
(represented with our Church encoding), the function
f, the term
f one more time. \(SUCC := \lambda n.\lambda f.\lambda x.f (n f x)\).
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
We can define addition as \(PLUS := \lambda m.\lambda n.m \space SUCC \space n\).
Since we define a number as repeatedly applying a function, we express
m + n
m times the successor function
PLUS = lambda m: lambda n: m(SUCC)(n)
We can similarly define multiplication as applications of the
MUL = lambda m: lambda n: m(PLUS)(n)
We'll stop here with arithmetic, but this should hopefully give you a sense of the expressive power of lambda calculus.
Some well-known lambda terms are called combinators:
- \(I\) is the identity combinator \(I := \lambda x.x\).
- \(K\) is the constant combinator \(K := \lambda x.\lambda y.x\). When applied to an argument \(x\), it returns a constant function \(K_x\) which returns \(x\) when applied to any argument.
- \(S\) is the substitution combinator \(S := \lambda x.\lambda y.\lambda z.x z (y z)\). \(S\) takes 3 arguments, \(x\), \(y\), and \(z\), applies \(x\) to \(z\), then applies the result of applying \(y\) to \(z\) to it.
I = lambda x: x K = lambda x: lambda y: x S = lambda x: lambda y: lambda z: x(z)(y(z))
Turns out these 3 combinators can together express any lambda term. The SKI
combinators are the simplest
programming language since they can express
anything expressable in lambda calculus, which we know is Turing-complete.
The Y combinator
Another interesting combinator is the \(Y\) combinator. In lambda calculus, there
is no way for a function to reference itself: within the body of a lambda like
lambda x: ... we can refer to the bound term
x, but we can reference the
lambda itself. The implication is that we can't define, using this syntax,
self-referential functions. We can only pass functions as arguments. How can we
then implement recursion? With the \(Y\) combinator, of course.
Let's take an example: we can recursively define factorial as:
def fact(n): return 1 if n == 0 else n * fact(n - 1)
This works, but note we reference
fact() within its body. In lambda calculus
we can't do that.
The \(Y\) combinator is defined as \(Y := \lambda f.(\lambda x.f (x x))(\lambda x.f (x x))\).
Y = lambda f: (lambda x: f(x(x)))(lambda x: f(lambda z: x(x)(z)))
Note the Python implementation is slightly different than the mathematical definition. This has to do with the way in which Python evaluates arguments. We won't go into the details here, but consider this a Python implementation detail irrelevant to the lambda calculus discussion2.
Here is a lambda version of factorial:
FACT = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)
With this definition, we pass the function to call as an argument (
f). We can
fully express this in lambda calculus (using Church numerals, arithmetic and
logic), but we'll keep the example simple. We can then use the \(Y\) combinator
print(Y(FACT)(5)) # prints 120
This should give you an intuitive understanding of how the \(Y\) combinator works: we pass it our function and argument, and it enables the recursion mechanism.
We can similarly implement Fibonacci as:
FIB = lambda f: lambda n: 1 if n <= 2 else f(n - 1) + f(n - 2) print(Y(FIB)(10)) # prints 55
The powerful \(Y\) combinator can be used to define recursive functions in programming languages that don't natively support recursion.
Let's also look at how we can express lists in lambda calculus. Let's start with pairs. We can define a pair as \(PAIR := \lambda x.\lambda y.\lambda f. f x y\). We can extract the first element of a pair with \(FIRST := \lambda p. p \space T\) and the second one with \(SECOND := \lambda p.p \space F\).
PAIR = lambda x: lambda y: lambda f: f(x)(y) FIRST = lambda p: p(TRUE) SECOND = lambda p: p(FALSE) print(FIRST(PAIR(10)(20))) # prints 10 print(SECOND(PAIR(10)(20))) # prints 20
We can define a
NULL value as \(NULL := \lambda x.T\) and a test for
\(ISNULL := \lambda p.p (\lambda x.\lambda y.FALSE)\).
NULL = lambda x: TRUE ISNULL = lambda p: p(lambda x: lambda y: FALSE)
We can now define a linked list as either
NULL (an empty list) or as a pair
consisting of a pair of elements - a head element and a tail list.
We can get the head of the list using
FIRST and the tail using
list \(L\), we can prepend an element \(x\) by forming the pair \((x, L)\).
HEAD = FIRST TAIL = SECOND PREPEND = lambda x: lambda xs: PAIR(x)(xs)
We can build a list by prepending elements to
NULL, and traverse it using
# Build the list [10, 20, 30] L = PREPEND(10)(PREPEND(20)(PREPEND(30)(NULL))) print(HEAD(TAIL(L))) # prints 20
Appending is more interesting: if our list is represented as a pair of head and
tail, we need to
traverse the list until we reach the end. This sounds a lot
like a recursive function: appending
xs entails returning the pair
PAIR(x, NULL) if
NULL, else the pair
x))). Fortunately, we just looked at the \(Y\) combinator which allows us
to express this.
Here is a simplified, readable implementation, using Python tuples:
_append = lambda f: lambda xs: lambda x: \ (x, None) if not xs else (xs, f(xs)(x)) append = Y(_append) print(append(append(append(None)(10))(20))(30)) # This will print (10, (20, (30, None)))
We can express the same using the lambdas we defined above (
_APPEND = lambda f: lambda xs: lambda x: \ ISNULL(xs) (lambda _: PAIR(x)(NULL)) (lambda _: PAIR(HEAD(xs))(f(TAIL(xs))(x))) (TRUE) APPEND = Y(_APPEND) L = APPEND(APPEND(APPEND(NULL)(10))(20))(30) print(HEAD(L)) # prints 10 print(HEAD(TAIL(L))) # prints 20
We covered logic, arithmetic, combinators, pairs, and lists, all expressed as lambda terms. Let's also sketch a proof of Turing completeness, like we did in previous posts.
A sketch of Turing completeness
We're calling this a
sketch, as lambda notation is not easy to read. We will
instead look at an implementation using more Python syntax than just lambdas,
but we will only use constructs which we know can be expressed in lambda
As usual, we will emulate another system which we know to be Turing-complete.
In part 3
we looked at tag systems. We talked about cyclic tag systems, which can emulate
m-tag systems, which are Turing-complete. As a reminder, a cyclic tag system is
implemented as a set of binary strings (strings containing only
which are production rules, and we process a binary input string by popping the
head of the string and, if it is equal to
1, appending the current production
rule to the string. We cycle through the production rules at each step. This is
the code we used in the previous post:
def cyclic_tag_system(productions, string): # Keeps track of current production i = 0 # Repeat until the string is empty while string: string = string[1:] + (productions[i] if string == '1' else '') # Update current production i = i + 1 if i == len(productions): i = 0 yield string
We used the productions
00 and the input
productions = ['11', '01', '00'] string = '1' print(string) for string in cyclic_tag_system(productions, string): print(string)
Let's sketch an alternative implementation using the constructs we covered in this post.
First, we can describe our production rules as lists of Boolean values. We
know how to represent Boolean values (
FALSE), and how to build
a list using
PAIR. Our productions can be represented as:
p1 = (True, (True, None)) # PAIR(TRUE)(PAIR(TRUE)(NULL)) p2 = (False, (True, None)) # PAIR(FALSE)(PAIR(TRUE)(NULL)) p3 = (False, (False, None)) # PAIR(FALSE)(PAIR(FALSE)(NULL)) productions = (p1, (p2, (p3, None)))
We can cycle through the list by processing the head, then appending it to the tail of the list. Here are simpler implementations of our list processing functions over Python tuples (though we know how to do these using only lambda terms):
def head(p): return p def tail(p): return p def append(xs, x): return (x, None) if not xs else (head(xs), append(tail(xs), x)) # If we want to cycle through our productions, we can do: # productions = append(tail(productions), head(productions))
We'll also need a function to concatenate two lists. We can easily build this
on top of
def concat(xs, ys): return xs if not ys else concat(append(xs, head(ys)), tail(ys))
While we still have
ys, we append the head of
xs, then recurse
with the tail of
We process our input string as follows: if it is empty, we are done. If not,
if the head is
1, we concatenate our current production to the end of the
string, and recurse, cycling productions:
def cyclic_tag_system(productions, input): return None if not input else \ cyclic_tag_system( # Cycle productions append(tail(productions), head(productions)), # If head is True, concatenate head production. Pop head input either way. concat(tail(input), head(productions)) if head(input) else tail(input))
Let's throw in a
print() and run this on the same input as our original
def cyclic_tag_system(productions, input): print(input) return None if not input else \ cyclic_tag_system( # Cycle productions append(tail(productions), head(productions)), # If head is True, concatenate head production. Pop head input either way. concat(tail(input), head(productions)) if head(input) else tail(input)) # The input is equivalent to the string '1' cyclic_tag_system(productions, (True, None))
This should produce output very similar to our original
but using lists of Booleans instead of strings of
We emulated a cyclic tag system in lambda calculus - well, we didn't write all the code as lambda terms, but everything is expressed as one-liner functions that use only if-then-else expressions, lists (pair, head, tail), and recursion (for which we have the \(Y\) combinator).
Lambda calculus has been extremely influential in computer science - it is the
root of functional programming. LISP, one of the earliest programming
languages, is heavily influenced by lambda calculus. Many ideas, like anonymous
functions, also known as lambdas, are now broadly available in most modern
programming languages (Python even uses the keyword
lambda for these, as we
saw in this post).
In this post we covered lambda calculus:
- Lambda terms, including variables, abstractions, and applications.
- Reductions: \(\alpha\)-equivalence, \(\beta\)-reduction, and \(\eta\)-reduction.
- Church encoding for Boolean logic and arithmetic using lambda terms.
- Combinators: the \(S\), \(K\), and \(I\) combinators which are sufficient to encode all lambda terms, and the \(Y\) combinator which enables recursion.
- Pairs and lists (defined using pairs), including an
- Emulating a cyclic tag systems in lambda calculus.