Exceptions using Control Stacks

We will now give a direct semantics for exceptions, without
translation to another language. This model closely corresponds to a
second way of implementing exceptions, which uses an exception stack
and a run-time mechanism to walk up the stack when an exception
occurs.

To express the semantics of exceptions directly, we need a model for
control flow. None of the constructs that we've studied so far
required an explicit representation of control-flow, because they all
had local control-flow.

We will model control using a stack of control frames:

(control stack)  C ::= Ø | C :: F
(control frame)  F ::= [] e | v [] | [] + e | v + [] | []

The empty frame is a special marker that will be used to indicate that
an exception handler has been installed at that point.

Intuitively, a control stack describes the control point that we've
currently reached. For instance, consider the expression:

  (\x.x) (1 + (2+3)) + 4

and assume the program reaches the control point where it is about to
evaluate subexpression 2+3. We can represent this control point as
follows:

 [] + 4 :: (\x.x) [] :: 1 + []

The top of the stack is its rightmost frame. This says that when we're
done evaluating 2+3, we substitute the result for [] in 1 + []. When
we get a result for this expression, we plug it into (\x.x) [], and so
on.

To describe the execution of the program, each configuration will be a
triple, consisting of an exception handler stack E (not defined yet),
a control stack C, and the expression currently being evaluated.  The
evaluation is described by the following small-step operational rules:

(E, C, e1 + e2) -> (E, C::[] + e2, e1)  if e1 not a value
(E, C, v + e2) -> (E, C::v + [], e2)  if e2 not a value
(E, C, n1 + n2) -> (E, C, n) where n = n1 + n2

(E, C, e1 e2) -> (E, C::[] e2, e1)  if e1 not a value
(E, C, v e2) -> (E, C::v [], e2)  if e2 not a value
(E, C, (\x.e) v) -> (E, C, e[v/x])

(E, C::F, v) -> (E, C, F[v]) if F is not []

For instance, the expression above, without the trailing + 4,
evaluates as follows:

   (E, Ø, (\x.x) (1 + (2 + 3)))
-> (E, (\x.x) [], 1 + (2+3))
-> (E, (\x.x) [] :: 1 + [], 2+3)
-> (E, (\x.x) [] :: 1 + [], 5)
-> (E, (\x.x) [], 1 + 5)
-> (E, (\x.x) [], 6)
-> (E, Ø, (\x.x) 6)
-> (E, Ø, 6)
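
The rules above can be sketched as a small interpreter. The following
Python code is only an illustration under stated assumptions: the tuple
encoding of terms and the names is_value, subst, plug, step and run are
hypothetical choices, programs are assumed to be closed, and the
exception stack E is left out because none of these rules touch it.

```python
# Terms: an int, ("var", x), ("lam", x, body), ("add", e1, e2), ("app", e1, e2).
# Frames: ("add_l", e2) is [] + e2, ("add_r", v) is v + [],
#         ("app_l", e2) is [] e2,   ("app_r", v) is v [].
# The control stack C is a Python list whose last element is the top frame.

def is_value(e):
    return isinstance(e, int) or e[0] == "lam"

def subst(e, x, v):
    """Compute e[v/x]; v is assumed closed, so no capture can occur."""
    if isinstance(e, int):
        return e
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "lam":
        return e if e[1] == x else ("lam", e[1], subst(e[2], x, v))
    return (tag, subst(e[1], x, v), subst(e[2], x, v))  # add or app

def plug(frame, v):
    """Compute F[v]: fill the hole of a frame with a value."""
    kind, other = frame
    tag = kind[:3]                      # "add" or "app"
    return (tag, v, other) if kind.endswith("_l") else (tag, other, v)

def step(C, e):
    """One transition (C, e) -> (C', e') of the machine."""
    if is_value(e):                     # (C::F, v) -> (C, F[v])
        return C[:-1], plug(C[-1], e)
    tag, e1, e2 = e
    if not is_value(e1):                # focus on the left operand
        return C + [(tag + "_l", e2)], e1
    if not is_value(e2):                # focus on the right operand
        return C + [(tag + "_r", e1)], e2
    if tag == "add":                    # n1 + n2 -> n
        return C, e1 + e2
    _, x, body = e1                     # (\x.e) v -> e[v/x]
    return C, subst(body, x, e2)

def run(e):
    """Iterate step until a value remains with an empty control stack."""
    C = []
    trace = [(C, e)]
    while not (is_value(e) and not C):
        C, e = step(C, e)
        trace.append((C, e))
    return e, trace

ID = ("lam", "x", ("var", "x"))
PROG = ("app", ID, ("add", 1, ("add", 2, 3)))   # (\x.x) (1 + (2+3))
value, trace = run(PROG)
```

Here trace records one configuration per line of the evaluation shown
above, and value is the final result.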

Now that we have modeled control explicitly in the configurations, we
can use it to describe how exceptions work. Each entry in the
exception handler stack will be a triple (X, e, C), where X is an
exception, e is the handler for X, and C is the control point where
the handler has been installed. The rules for try and throw (handle
and raise in ML terminology) are as follows:

(E, C, try e1 with X -> e2) -> (E:: (X, e2, C), C::[], e1) if e1 not a value
(E::H, C::[], v) -> (E, C, v)    where H is any handler entry

These show that an exception handler needs to be installed in the
exception stack for each "try e1 with X -> e2" construct; this handler
is popped from the exception stack after the evaluation of e1.

(E::(Y,e,C), C', throw X) -> (E, C', throw X)  if Y is not X
(E::(X,e,C), C', throw X) -> (E, C, e)

These rules show that the execution searches the exception stack until
it finds a matching handler. At that point, it executes the handler
and proceeds with the control associated with the handler.
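
As a rough illustration of these rules, here is a sketch in Python of
the machine restricted to integers, addition, try and throw. The tuple
encoding and the names MARK, step and run are assumptions made for this
example. Two small liberties are taken and marked in the comments: a
handler is installed even when the body of a try is already a value
(the marker rule pops it right away), and an uncaught exception, for
which the rules give no transition, is reported as a Python error.

```python
# Expressions: an int, ("add", e1, e2), ("try", e1, X, e2), ("throw", X).
# Frames: ("add_l", e2) is [] + e2, ("add_r", v) is v + [], MARK is [].
# Handler entries are triples (X, handler, saved control stack).
# E and C are Python lists whose last element is the top of the stack.
MARK = "mark"

def is_value(e):
    return isinstance(e, int)

def step(E, C, e):
    """One transition (E, C, e) -> (E', C', e')."""
    if is_value(e):
        if C[-1] == MARK:               # (E::H, C::[], v) -> (E, C, v)
            return E[:-1], C[:-1], e
        kind, other = C[-1]             # (E, C::F, v) -> (E, C, F[v])
        plugged = ("add", e, other) if kind == "add_l" else ("add", other, e)
        return E, C[:-1], plugged
    tag = e[0]
    if tag == "add":
        _, e1, e2 = e
        if not is_value(e1):
            return E, C + [("add_l", e2)], e1
        if not is_value(e2):
            return E, C + [("add_r", e1)], e2
        return E, C, e1 + e2
    if tag == "try":
        # Assumption: the handler is installed even if body is a value.
        _, body, exn, handler = e
        return E + [(exn, handler, C)], C + [MARK], body
    # tag == "throw": search the handler stack from the top.
    if not E:
        # Assumption: the rules leave this case stuck; report it instead.
        raise RuntimeError("uncaught exception " + e[1])
    exn, handler, saved_C = E[-1]
    if exn == e[1]:                     # matching handler: restore its control
        return E[:-1], saved_C, handler
    return E[:-1], C, e                 # no match: discard entry, keep searching

def run(e):
    E, C = [], []
    while not (is_value(e) and not C):
        E, C, e = step(E, C, e)
    return e

# 1 + (try (2 + throw X) with X -> 10)
caught = run(("add", 1, ("try", ("add", 2, ("throw", "X")), "X", 10)))
# try (try (throw X) with Y -> 1) with X -> 2
nested = run(("try", ("try", ("throw", "X"), "Y", 1), "X", 2))
# try (1 + 2) with X -> 0
normal = run(("try", ("add", 1, 2), "X", 0))
```

In the first example the pending frame 2 + [] is discarded when the
exception is thrown, while the frame 1 + [] saved in the handler entry
is restored, so the handler's result 10 is added to 1.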

This description closely corresponds to an implementation that
maintains a stack of exception handlers. Each time a handle (or
try-catch block in Java) is entered, a new entry is pushed on the
exception stack. This entry indicates the exception that it is able to
handle; it also contains a pointer to the handler code, and a pointer
to the control point. When the handle expression (or try-catch block)
terminates, the program pops the exception stack.

When an exception is raised (or thrown in Java/C++), the program uses
a run-time mechanism that searches the exception stack to find the
appropriate handler. It then jumps to the control point indicated in
the exception stack entry. This point may be in a different function
than the one where the raise/throw has occurred (but it is guaranteed
that it belongs to one of the currently active functions).

Regarding efficiency, note that there are no checks on function calls
or other constructs. But there is run-time overhead when entering and
exiting handle constructs or try-catch blocks. However, this has a
smaller impact on run-time than the previous solution, because such
constructs are less frequent than function calls.  The approach using
exception stacks can be implemented in a portable manner using the C
routines setjmp (to save control points) and longjmp (to transfer
control to the saved points). This is a popular way of implementing
portable and fairly efficient exception mechanisms.

There is a third way of dealing with exceptions, which is very
efficient and has almost no run-time overhead in the absence of
exceptions. This relies on a run-time function to walk up the program
stack (i.e., the stack of activation records) each time an exception
is raised. At each frame, the run-time system determines the
address of the call instruction and checks if there exists a handler
for that address in the current function. Unfortunately, this approach
is not portable: the run-time system must know the structure of each
frame to be able to walk up the stack.

Note that raised exceptions cause the premature termination of
variable lifetimes. In general, when variables go out of scope,
certain cleanup tasks must be performed, e.g., reclaiming their
memory, calling destructors (for objects), or restoring saved
registers (for function scopes). The exception run-time system must
perform all of these tasks as it walks up the stack: it must
deallocate activation records, it must call destructors on all objects
that go out of scope, it must restore registers, etc.
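
The cleanup behavior can be observed in any language with exceptions.
The sketch below uses Python's with statement as a stand-in for
destructors; the Resource class, the log list and the functions inner
and outer are hypothetical names introduced only for this example.

```python
log = []

class Resource:
    """A scoped resource whose release plays the role of a destructor."""
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        log.append("acquire " + self.name)
        return self
    def __exit__(self, exc_type, exc, tb):
        log.append("release " + self.name)
        return False            # do not swallow the exception

def inner():
    with Resource("inner"):
        raise KeyError("X")     # raised two activation records below the handler

def outer():
    with Resource("outer"):
        inner()

try:
    outer()
except KeyError:
    log.append("handled")
```

After this runs, log shows that both resources were released,
innermost first, before the handler ran, even though neither function
returned normally.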


Continuations and control-flow

The control stack C in our semantic model describes the rest of the
computation after we're done with the evaluation of the current
expression. A function that represents "the rest of the computation"
is called a _continuation_. Such a function takes the current result
as parameter and yields the final value of the computation. Hence,
control points can be described as continuations.

Consider our model where C is a stack of frames. Each frame with a
hole [] can be viewed as a function whose argument is []. For
instance, regard [] + e as the function \y. y + e. Then a control
stack C is a function representing the composition of the functions
that its frames represent.

For instance, take the expression (\x.x) (1 + (2+3)). After a few steps,
the evaluation yields a control stack (\x.x)[]::1+[] and an expression
2+3. We can write this control stack as a stack of functions k1::k2,
where:

  k1 = \y. (\x.x) y
  k2 = \y. 1 + y

Then, the current continuation is k1 o k2, and describes the rest of
the computation after we're done evaluating 2 + 3. The final result of
the program is k1 (k2 (2 + 3)).
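
This reading of a control stack as a composition of functions can be
tried out directly. In the following Python sketch, the helper name
stack_to_continuation is a hypothetical choice; it applies the frames
of the stack from the top (the rightmost one) down to the bottom.

```python
k1 = lambda y: (lambda x: x)(y)    # the frame (\x.x) []
k2 = lambda y: 1 + y               # the frame 1 + []

def stack_to_continuation(frames):
    """Turn a stack of frame functions (top last) into one function."""
    def k(v):
        for frame in reversed(frames):   # top of the stack is applied first
            v = frame(v)
        return v
    return k

current = stack_to_continuation([k1, k2])   # behaves like k1 o k2
result = current(2 + 3)                     # same as k1(k2(2 + 3))
```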

The nice thing about continuations is that they make control
explicit, and this is especially useful in the case of functional
programs, where control is not explicit otherwise. In fact, we can
rewrite a program to make continuations more explicit. For instance,
we can write (\x.x) (1 + (2+3)) as:

  (\y. (\x.x) y) ((\z. 1 + z) (2 + 3))

and since "let x = e1 in e2" is syntactic sugar for "(\x.e2) e1", we
can re-write the above as:

  let z = 2 + 3 in
  let y = 1 + z in
      (\x.x) y

This is fairly close to some machine instructions of the form:

  add z, 2, 3
  add y, 1, z
  call id y


Using continuations, functions can be transformed into "functions that
don't return" -- functions that take, besides the usual arguments, an
additional argument representing a continuation. When the function
finishes, it invokes the continuation on its result, instead of
returning the result to its caller. Writing functions in this way is
usually referred to as Continuation-Passing Style, or CPS for
short. For instance, the CPS version of factorial (written in ML)
looks as follows:

  (* k is the continuation: it receives the result instead of the caller *)
  fun cps_fact n k =
      if n <= 1 then k 1
      else cps_fact (n - 1) (fn m => k (n * m))

  val n = cps_fact 3 (fn r => r)
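
For comparison, here is the same idea sketched in Python next to the
direct-style version. The names fact, cps_fact, result and shown are
illustrative, and the base case n <= 1 is a choice made for this
sketch. Python does not eliminate tail calls, so the return keyword
is still needed here; the point is only that every result is handed
to a continuation.

```python
def fact(n):
    """Direct style: the result is returned to the caller."""
    return 1 if n <= 1 else n * fact(n - 1)

def cps_fact(n, k):
    """CPS: the result is passed to the continuation k."""
    if n <= 1:
        return k(1)
    # The new continuation multiplies by n, then continues with k.
    return cps_fact(n - 1, lambda m: k(n * m))

result = cps_fact(3, lambda r: r)                  # identity continuation
shown = cps_fact(4, lambda r: "4! = " + str(r))    # any other "rest of the program"
```

Passing the identity function recovers the direct-style result, while
any other continuation decides what happens to the result next.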

Continuation-Passing Style is an important concept in the compilation
of functional languages and is used as an intermediate compiler
representation (it has been used in compilers for Scheme, ML,
etc). The main advantage is that CPS makes the control flow explicit
and makes it easier to translate functional code to machine code where
control is explicit (in the form of sequences of machine instructions
and jumps). For instance, a CPS call can be easily translated into a
jump to the invoked function, since that function never returns
control to its caller.


Continuations as first-class values

We will now study control and continuations as first-class values in a
language. By first-class values we mean values that the program can
manipulate: store continuations in variables, take them as arguments,
pass them as return values, etc. Note that, although exceptions allow
for non-local control-flow, they do not make control a first-class
value.

We will study a concept called call-with-current-continuation
(available in Scheme and in SML/NJ, among others). The idea is to
capture the current control in a program variable (or heap ref, or
parameter) that you can manipulate in the program. The program can
then jump to the control that such a variable represents.

We will use two constructs in our language to model continuations as
first-class values:

e ::= ... | letcc k in e1 | throw e1 e2

These constructs work as follows. The expression "letcc k in e1" binds
k to the current control point and then evaluates e1; inside e1, this
control point is an ordinary variable. In "throw e1 e2", e1 evaluates
to a continuation k and e2 to a value v. Control is then transferred
to point k, and value v is passed to that point. The evaluation rules
are as follows; they use two new frame forms, throw [] e and
throw C []:


(E, C, letcc k in e1) -> (E, C, e1[C/k])
(E, C, throw e1 e2) -> (E, C::throw [] e2, e1) if e1 not a value
(E, C, throw C' e2) -> (E, C::throw C' [], e2) if e2 not a value
(E, C, throw C' v) -> (E, C', v)

Such constructs can be used to implement algorithms efficiently, in a
way similar to the way break (or even goto) is used in C/Java
programs to jump out of structured control constructs. For instance, a
program that multiplies all elements of a list can be written as
follows:

   open SMLofNJ.Cont;

   fun mult l = 
       callcc (fn k =>
           let fun mult' [] = 1
                 | mult' (0::t) = throw k 0
                 | mult' (n::t) = n * mult'(t)
           in 
               mult' l
           end
       ) 
                                  

(callcc is an ML construct such that "callcc (fn k => e)" is equivalent
to "letcc k in e").
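
Python has no callcc, but the early exit used by mult can be imitated.
The sketch below is an emulation under a strong assumption: the
continuation is only ever used to escape while the call that captured
it is still active, so a private exception is enough. It is not a
general first-class continuation, and the names callcc_escape, Throw,
body and go are hypothetical.

```python
def callcc_escape(f):
    """Call f(k). Calling k(v) inside f abandons the pending work of f
    and makes callcc_escape return v. Escape-only emulation."""
    class Throw(Exception):         # fresh class per call, so nested uses
        def __init__(self, value):  # do not catch each other's jumps
            self.value = value
    def k(value):
        raise Throw(value)
    try:
        return f(k)
    except Throw as t:
        return t.value

def mult(xs):
    """Multiply the elements of xs, jumping out as soon as a 0 is seen."""
    def body(k):
        def go(i):
            if i == len(xs):
                return 1
            if xs[i] == 0:
                return k(0)         # skip all pending multiplications
            return xs[i] * go(i + 1)
        return go(0)
    return callcc_escape(body)
```

As in the ML version, when a 0 is found the multiplications that were
waiting on the stack are never performed.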