Type Checking
Earlier, we skipped over the type checking phase. Let's come back to that now. After lexing and parsing, the next phase of compilation is semantic analysis, and the primary task of semantic analysis is type checking.
A type system is a mathematical description of how to determine whether an expression is ill typed or well typed, and in the latter case, what the type of the expression is. A type checker is a program that implements a type system, i.e., that implements the static semantics of the language.
Commonly, a type system is formulated as a ternary relation , which means that expression has type in typing context . A typing context, aka typing environment, is a map from identifiers to types. The context is used to record what variables are in scope, and what their types are. The use of the Greek letter for contexts is traditional.
That ternary relation is typically written with infix notation, though, as . You can read the turnstile symbol as "proves" or "shows", i.e., the context shows that has type .
Let's make that notation a little friendlier by eliminating the Greek
and the math typesetting. We'll just write ctx |- e : t
to mean
that typing context ctx
shows that e
has type t
. Let's write
{}
for the empty context, and x:t
to mean that x
is bound to t
.
So, {foo:int, bar:bool}
would be the context is which foo
has type int
and bar
has type bool
. A context may bind an
identifier at most once. We'll write ctx[x -> t]
to mean
a context that contains all the bindings of ctx
, and also binds
x
to t
. If x
was already bound in ctx
, then that old binding
is replaced by the new binding to t
in ctx[x -> t]
.
With all that machinery, we can at last define what it means to be well
typed: An expression e
is well typed in context ctx
if there
exists a type t
for which ctx |- e : t
. The goal of a type checker
is thus to find such a type t
, starting from some initial context.
It's convenient to pretend that the initial context is empty. But in
practice, it's rare that a language truly uses the empty context to
determine whether a program is well typed. In OCaml, for example, there
are many built-in identifiers that are always in scope, such as
everything in the Stdlib
module.