Exceptions using Control Stacks

We will now give a direct semantics for exceptions, without translation to another language. This model closely corresponds to a second way of implementing exceptions, which uses an exception stack and a run-time mechanism to walk up the stack when an exception occurs.

To express the semantics of exceptions directly, we need a model for control flow. None of the constructs that we've studied so far required an explicit representation of control flow, because they all had local control flow. We will model control using a stack of control frames:

    (control stack)  C ::= Ø | C :: F
    (control frame)  F ::= [] e | v [] | [] + e | v + [] | []

The empty frame [] is a special marker that will be used to indicate that an exception handler has been installed at that point.

Intuitively, a control stack describes the control point that we've currently reached. For instance, consider the expression:

    (\x.x) (1 + (2+3)) + 4

and assume the program reaches the control point where it is about to evaluate the subexpression 2+3. We can represent this control point as follows:

    [] + 4 :: (\x.x) [] :: 1 + []

This says that when we're done evaluating 2+3, we substitute the result for [] in 1 + []. When we get a result for this expression, we plug it into (\x.x) [], and so on.

To describe the execution of the program, each configuration will be a triple, consisting of an exception handler stack E (defined below), a control stack C, and the expression currently being evaluated.
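As a rough illustration, the frame grammar above can be written as an OCaml datatype. This is only a sketch: the constructor names (AppL, AppR, AddL, AddR, Marker) and the helpers plug and rebuild are assumptions made for this example, not part of the semantics. The stack is stored as a list whose head is the innermost frame, so the example stack from the text appears in reverse order.

```ocaml
(* Expressions of the small language used in the text. *)
type expr =
  | Num of int
  | Var of string
  | Lam of string * expr          (* \x. e  *)
  | App of expr * expr            (* e1 e2  *)
  | Add of expr * expr            (* e1 + e2 *)

(* Control frames: an expression context with exactly one hole []. *)
type frame =
  | AppL of expr                  (* [] e   *)
  | AppR of expr                  (* v []   *)
  | AddL of expr                  (* [] + e *)
  | AddR of expr                  (* v + [] *)
  | Marker                        (* the empty frame [] *)

(* A control stack; the head of the list is the innermost (top) frame. *)
type control = frame list

(* plug f e computes F[e]: fill the hole of frame f with e. *)
let plug (f : frame) (e : expr) : expr =
  match f with
  | AppL e2 -> App (e, e2)
  | AppR v -> App (v, e)
  | AddL e2 -> Add (e, e2)
  | AddR v -> Add (v, e)
  | Marker -> e

(* Rebuild the whole expression from a control point by plugging the
   frames one at a time, from the innermost outwards. *)
let rebuild (c : control) (e : expr) : expr =
  List.fold_left (fun acc f -> plug f acc) e c

let id = Lam ("x", Var "x")

(* The stack  [] + 4 :: (\x.x) [] :: 1 + []  written innermost-first. *)
let example : control = [AddR (Num 1); AppR id; AddL (Num 4)]

(* Plugging 2+3 back in recovers (\x.x) (1 + (2+3)) + 4. *)
let whole = rebuild example (Add (Num 2, Num 3))
```

Reading the control stack as "the expression with a hole in it" is what makes the later view of control stacks as continuations natural.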
The evaluation is described by the following small-step operational rules:

    (E, C, e1 + e2)   -> (E, C::[] + e2, e1)    if e1 not a value
    (E, C, v + e2)    -> (E, C::v + [], e2)     if e2 not a value
    (E, C, n1 + n2)   -> (E, C, n)              where n = n1 + n2
    (E, C, e1 e2)     -> (E, C::[] e2, e1)      if e1 not a value
    (E, C, v e2)      -> (E, C::v [], e2)       if e2 not a value
    (E, C, (\x.e) v)  -> (E, C, e[v/x])
    (E, C::F, v)      -> (E, C, F[v])           if F is not []

For instance, we have the following evaluation for the expression (\x.x) (1 + (2 + 3)), which is the earlier example without the trailing + 4:

       (E, Ø, (\x.x) (1 + (2 + 3)))
    -> (E, (\x.x) [], 1 + (2+3))
    -> (E, (\x.x) [] :: 1 + [], 2+3)
    -> (E, (\x.x) [] :: 1 + [], 5)
    -> (E, (\x.x) [], 1 + 5)
    -> (E, (\x.x) [], 6)
    -> (E, Ø, (\x.x) 6)
    -> (E, Ø, 6)

Now that we have modeled control explicitly in the configurations, we can use it to describe how exceptions work. Each entry in the exception handler stack will be a triple (X, e, C), where X is an exception, e is the handler for X, and C is the control point where the handler has been installed. The rules for installing and removing a handler are as follows:

    (E, C, try e1 with X -> e2)  -> (E::(X, e2, C), C::[], e1)    if e1 not a value
    (E::(X, e, C'), C::[], v)    -> (E, C, v)

These show that an exception handler needs to be installed in the exception stack for each "try e1 with X -> e2" construct; this handler is popped from the exception stack after the evaluation of e1. The rules for raising an exception are:

    (E::(Y,e,C), C', throw X)  -> (E, C', throw X)    if Y is not X
    (E::(X,e,C), C', throw X)  -> (E, C, e)

These rules show that the execution searches the exception stack until it finds a matching handler. At that point, it executes the handler and proceeds with the control associated with the handler.

This description closely corresponds to an implementation that maintains a stack of exception handlers. Each time a handle (or try-catch block in Java) is entered, a new entry is pushed on the exception stack. This entry indicates the exception that it is able to handle; it also contains a pointer to the handler code, and a pointer to the control point.
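The rules above can be turned almost line by line into a small abstract machine. The following OCaml sketch is one possible transcription, not a reference implementation: the constructor names, the string representation of exception names, and the helper functions (is_value, subst, plug, step, run) are all assumptions made for this example. Both stacks are lists whose head is the top of the stack, and substitution is the naive one, which is adequate only because the example substitutes closed values.

```ocaml
type expr =
  | Num of int
  | Var of string
  | Lam of string * expr
  | App of expr * expr
  | Add of expr * expr
  | Try of expr * string * expr   (* try e1 with X -> e2 *)
  | Throw of string               (* throw X *)

type frame = AppL of expr | AppR of expr | AddL of expr | AddR of expr | Marker

type control = frame list                        (* head = top frame *)
type handlers = (string * expr * control) list   (* head = newest handler *)

let is_value = function Num _ | Lam _ -> true | _ -> false

(* Naive substitution e[v/x]; assumes v is closed (no variable capture). *)
let rec subst x v e =
  match e with
  | Num _ | Throw _ -> e
  | Var y -> if y = x then v else e
  | Lam (y, b) -> if y = x then e else Lam (y, subst x v b)
  | App (a, b) -> App (subst x v a, subst x v b)
  | Add (a, b) -> Add (subst x v a, subst x v b)
  | Try (a, exn, h) -> Try (subst x v a, exn, subst x v h)

(* F[e]: fill the hole of a frame. *)
let plug f e =
  match f with
  | AppL e2 -> App (e, e2)
  | AppR v -> App (v, e)
  | AddL e2 -> Add (e, e2)
  | AddR v -> Add (v, e)
  | Marker -> e

(* One machine step; None means that no rule applies. As in the text,
   no rule covers "try v with ..." when v is already a value. *)
let step ((hs : handlers), (c : control), (e : expr)) =
  match hs, c, e with
  | _, _, Add (Num a, Num b) -> Some (hs, c, Num (a + b))
  | _, _, Add (e1, e2) when not (is_value e1) -> Some (hs, AddL e2 :: c, e1)
  | _, _, Add (v, e2) when not (is_value e2) -> Some (hs, AddR v :: c, e2)
  | _, _, App (Lam (x, b), v) when is_value v -> Some (hs, c, subst x v b)
  | _, _, App (e1, e2) when not (is_value e1) -> Some (hs, AppL e2 :: c, e1)
  | _, _, App (v, e2) when not (is_value e2) -> Some (hs, AppR v :: c, e2)
  (* Install a handler: remember the current control point, push the marker. *)
  | _, _, Try (e1, x, e2) when not (is_value e1) ->
      Some ((x, e2, c) :: hs, Marker :: c, e1)
  (* Search the handler stack; on a match, resume at the saved control point. *)
  | (y, h, c') :: rest, _, Throw x ->
      if x = y then Some (rest, c', h) else Some (rest, c, Throw x)
  (* Normal exit from a try: pop both the marker and the handler. *)
  | _ :: rest, Marker :: c', v when is_value v -> Some (rest, c', v)
  (* Return a value to an ordinary frame. *)
  | _, f :: c', v when is_value v && f <> Marker -> Some (hs, c', plug f v)
  | _ -> None

(* Iterate step until no rule applies. *)
let rec run cfg = match step cfg with Some next -> run next | None -> cfg

(* 1 + (try 2 + throw X with X -> 10) *)
let prog = Add (Num 1, Try (Add (Num 2, Throw "X"), "X", Num 10))
let result = run ([], [], prog)   (* final configuration: ([], [], Num 11) *)
```

In the example, the pending frame 2 + [] is discarded when the exception is thrown, because the machine resumes with the control stack saved in the handler entry rather than with the current one.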
When the handle expression (or try-catch block) terminates, the program pops the exception stack. When an exception is raised (or thrown in Java/C++), the program uses a run-time mechanism that searches the exception stack to find the appropriate handler. It then jumps to the control point indicated in the exception stack entry. This point may be in a different function than the one where the raise/throw has occurred (but it is guaranteed to belong to one of the currently active functions).

Regarding efficiency, note that there are no checks on function calls or other constructs. There is, however, run-time overhead when entering and exiting handle constructs or try-catch blocks. This has a smaller impact on run time than the previous solution, because such constructs are less frequent than function calls.

The approach using exception stacks can be implemented in a portable manner using the C routines setjmp (to save control points) and longjmp (to transfer control to the saved points). This is a popular way of implementing portable and fairly efficient exception mechanisms.

There is a third way of dealing with exceptions, which is very efficient and has almost no run-time overhead in the absence of exceptions. It relies on a run-time function that walks up the program stack (i.e., the stack of activation records) each time an exception is raised. At each frame, the run-time system determines the address of the call instruction and checks whether there exists a handler for that address in the current function. Unfortunately, this approach is not portable: the run-time system must know the structure of each frame to be able to walk up the stack.

Note that raised exceptions cause the premature termination of variable lifetimes. In general, when variables go out of scope, certain cleanup tasks must be performed, e.g., reclaiming their memory, calling destructors (for objects), or restoring saved registers (for function scopes).
The exception run-time system must perform all of these tasks as it walks up the stack: it must deallocate activation records, call destructors on all objects that go out of scope, restore registers, etc.

Continuations and control-flow

The control stack C in our semantic model describes the rest of the computation after we're done with the evaluation of the current expression. A function that represents "the rest of the computation" is called a _continuation_. Such a function takes the current result as parameter and yields the final value of the computation. Hence, control points can be described as continuations.

Consider our model where C is a stack of frames. Each frame with a hole [] can be viewed as a function whose argument is []. For instance, regard [] + e as the function \y. y + e. Then a control stack C is a function representing the composition of the functions that its frames represent. For instance, take the expression (\x.x) (1 + (2+3)). After a few steps, the evaluation yields a control stack (\x.x)[]::1+[] and an expression 2+3. We can write this control stack as a stack of functions k1::k2, where:

    k1 = \y. (\x.x) y
    k2 = \y. 1 + y

Then, the current continuation is k1 o k2, and describes the rest of the computation after we're done evaluating 2 + 3. The final result of the program is k1 (k2 (2 + 3)).

The nice thing about continuations is that they make the control explicit, and this is especially useful in the case of functional programs, where control is not explicit otherwise. In fact, we can rewrite a program to make continuations more explicit. For instance, we can write (\x.x) (1 + (2+3)) as:

    (\y. (\x.x) y) ((\z. 1 + z) (2 + 3))

and since "let x = e1 in e2" is syntactic sugar for "(\x.e2) e1", we can rewrite the above as:

    let z = 2 + 3 in
    let y = 1 + z in
    (\x.x) y

This is fairly close to a sequence of machine instructions of the form:

    add z, 2, 3
    add y, 1, z
    call id y

Using continuations, functions can be transformed into "functions that don't return": functions that take, besides the usual arguments, an additional argument representing a continuation. When the function finishes, it invokes the continuation on its result, instead of returning the result to its caller. Writing functions in this way is usually referred to as Continuation-Passing Style, or CPS for short. For instance, the CPS version of factorial (written in ML, here in OCaml syntax) looks as follows:

    let rec cps_fact = fun n -> fun k ->
      if n <= 1 then k 1
      else cps_fact (n-1) (fun m -> k (n * m))

    let n = cps_fact 3 (fun n -> n)

Continuation-Passing Style is an important concept in the compilation of functional languages and is used as an intermediate compiler representation (it has been used in compilers for Scheme, ML, etc). The main advantage is that CPS makes the control flow explicit and makes it easier to translate functional code to machine code where control is explicit (in the form of sequences of machine instructions and jumps). For instance, a CPS call can be easily translated into a jump to the invoked function, since the invoked function does not return control to its caller.

Continuations as first-class values

We will now study control and continuations as first-class values in a language. By first-class values we mean values that the program can manipulate: store continuations in variables, take them as arguments, pass them as return values, etc. Note that, although exceptions allow for non-local control flow, they do not make control a first-class value. We will study a concept called call-with-current-continuation (available in Lisp, Scheme, ML).
The idea is to capture the current control in a program variable (or heap ref, or parameter) that you can manipulate in the program. Then, the program will be able to jump to the control that a program expression represents. We will use two constructs in our language to model continuations as first-class values:

    e ::= ... | letcc k in e1 | throw e1 e2

These constructs work as follows. The expression "letcc k in e1" binds the variable k to the current control point and then evaluates the body e1, in which k can be used like any other variable. In the second construct, e1 evaluates to a continuation k and e2 to a value v. Control is then transferred to the point k, and the value v is passed to that point. The evaluation is as follows:

    (E, C, letcc k in e1)  -> (E, C, e1[C/k])
    (E, C, throw e1 e2)    -> (E, C::throw [] e2, e1)    if e1 not a value
    (E, C, throw C' e2)    -> (E, C::throw C' [], e2)    if e2 not a value
    (E, C, throw C' v)     -> (E, C', v)

Here captured control stacks count as values, and throw [] e2 and throw C' [] are two additional kinds of control frame.

Such constructs can be used to implement algorithms efficiently, in a way similar to the way break (or even goto) is used in C/Java programs to jump out of structured control constructs. For instance, a program that multiplies all elements of a list can be written as follows:

    open SMLofNJ.Cont;

    fun mult l =
      callcc (fn k =>
        let
          fun mult' [] = 1
            | mult' (0::t) = throw k 0      (* jump out: the product is 0 *)
            | mult' (n::t) = n * mult'(t)
        in
          mult' l
        end)

(callcc is an ML construct such that "callcc (fn k => e)" is equivalent to "letcc k in e").
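OCaml has no built-in callcc, but the same early exit can be approximated in continuation-passing style, where "throwing" to a continuation means calling it and dropping the local one. The sketch below is an illustration under that assumption, not a translation of the SML/NJ primitives: the names escape, go, and the counter multiplications are hypothetical and exist only to make the skipped work observable.

```ocaml
(* Counts how many multiplications were actually performed. *)
let multiplications = ref 0

(* Product of a list in CPS. The local continuation k stands for the
   pending multiplications; escape stands for the continuation captured
   on entry, playing the role of k in "callcc (fn k => ...)". *)
let mult (l : int list) : int =
  let escape (r : int) : int = r in
  let rec go l (k : int -> int) : int =
    match l with
    | [] -> k 1
    | 0 :: _ -> escape 0               (* drop k: pending work is skipped *)
    | n :: t -> go t (fun r -> incr multiplications; k (n * r))
  in
  go l escape

let p1 = mult [1; 2; 3; 4]             (* 24 *)
let count1 = !multiplications          (* 4 multiplications performed *)

let () = multiplications := 0
let p2 = mult [2; 3; 0; 5]             (* 0 *)
let count2 = !multiplications          (* 0: the zero case bypassed them all *)
```

The design point is the same as in the callcc version: once a zero is found, none of the multiplications queued up for the earlier elements is executed.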