Polymorphism 

Polymorphism refers to the ability of a piece of code to operate on
values of different types. Polymorphism applies to various language
constructs, including functions, datatypes, objects, and modules. For
instance, a polymorphic function is one that can be invoked with
different kinds of arguments. Similarly, a polymorphic datatype is one
that contains elements of unspecified types.

There are several kinds of polymorphism:

- parametric polymorphism: the code is written without knowledge of
  the actual type of the arguments and operates on any kind of
  arguments. Examples include polymorphic functions in ML, or generics
  in Java 1.5. 

- subtyping polymorphism: the code works on values whose type may be
  any subtype of a known type.

- ad-hoc polymorphism: the code appears to be polymorphic to the
  programmer, but the actual implementation is not. A typical example
  is overloading: using the same function name for functions with
  different kinds of parameters. Although the name looks like a
  polymorphic function to the code that uses it, there are actually
  multiple function implementations (none being polymorphic) and the
  compiler invokes the appropriate one. Templates in C++ can be
  considered in this category, as the compiler instantiates them,
  rather than using one single polymorphic version.
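
As a rough illustration, the three kinds can be placed side by side in
Java. This is only a sketch: the class and method names
(KindsOfPolymorphism, first, twice, show) are made up for this example
and do not come from the notes.

```java
import java.util.Arrays;
import java.util.List;

class KindsOfPolymorphism {
    // Parametric: a single implementation, written without knowing T.
    static <T> T first(List<T> xs) {
        return xs.get(0);
    }

    // Subtyping: works on any subtype of Number (Integer, Double, ...).
    static double twice(Number n) {
        return 2 * n.doubleValue();
    }

    // Ad-hoc (overloading): two separate, non-polymorphic implementations
    // share one name; the compiler picks one from the argument's static type.
    static String show(int n) {
        return "int " + n;
    }

    static String show(boolean b) {
        return "bool " + b;
    }

    public static void main(String[] args) {
        System.out.println(first(Arrays.asList("a", "b")));
        System.out.println(first(Arrays.asList(1, 2)));
        System.out.println(twice(3));
        System.out.println(twice(1.5));
        System.out.println(show(7));
        System.out.println(show(true));
    }
}
```

Only first is polymorphic in the parametric sense: the same compiled
body serves every element type.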


Parametric polymorphism

With parametric polymorphism, types are "parameterized" on unknown
types, i.e. defined in terms of unknown types. Consider the identity
function id = \x . x. Type inference in ML (or the algorithm from last
lecture) yields a type of the form 'a -> 'a, where 'a is an unknown
type. Here, the type of id is parameterized on the unknown type 'a. To
make explicit the fact that 'a can be any type, we can write the type
of id as:

   \forall 'a . 'a -> 'a.

To apply id to an integer, we have to instantiate 'a to int. To apply
id to a boolean, we must instantiate 'a to bool. 
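
For comparison, here is a minimal sketch of the same function as a Java
generic method. The names Identity and id are chosen for this example.
Java type parameters range over reference types only, so Integer and
Boolean stand in for int and bool. The type arguments are written out
explicitly to mirror the instantiation step described above.

```java
class Identity {
    // The type parameter A plays the role of 'a in \forall 'a . 'a -> 'a.
    static <A> A id(A x) {
        return x;
    }

    public static void main(String[] args) {
        Integer n = Identity.<Integer>id(3);     // instantiate A to Integer
        Boolean b = Identity.<Boolean>id(true);  // instantiate A to Boolean
        System.out.println(n + " " + b);
    }
}
```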

To study parametric polymorphism, we will extend the typed lambda
calculus with constructs that describe polymorphic functions and
types. This augmented calculus is known as the polymorphic lambda
calculus, or "System F":

(expr)   e ::= n | x | \x : t . e | e1 e2 | /\ 'a . e | e [t]
(types)  t ::= int | t1 -> t2 | 'a | \forall 'a . t
(values) v ::= n | \x : t . e | /\ 'a . e

The expression /\ 'a . e is a type abstraction: it takes type 'a as a
parameter, and yields the value of e given that type. The expression e
[t] is a type instantiation: it instantiates the polymorphic type of e
with type t. Note that instantiation does not require the program to
keep run-time type information, or to perform type checks at run-time;
it is rather a way to statically check type safety in the presence of
polymorphism. 

Finally, types include polymorphic types \forall 'a . t, which
universally quantify occurrences of 'a in t over all possible types.
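
The remark that type instantiation needs no run-time type information
has a concrete counterpart in Java, whose generics are compiled by
erasure. The sketch below (class and method names are illustrative
only) checks that two different instantiations of the same generic
class share a single run-time class.

```java
import java.util.ArrayList;
import java.util.List;

class Erasure {
    // Two instantiations that differ only in their type argument.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<Integer>();
        List<String> strs = new ArrayList<String>();
        // The type arguments exist only at compile time, so both
        // objects report the same run-time class.
        return ints.getClass() == strs.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```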

In this language, the polymorphic identity function is written as:

  poly_id =  /\ 'a . \x : 'a . x

And we can apply this function to integers via type instantiation:

 poly_id[int]  is  \x : int . x, with type int->int


The evaluation rules for the polymorphic system are the same as in the
typed lambda calculus, augmented with new rules for evaluating the new
constructs:

    e -> e'
 -------------
 e[t] -> e'[t]

 
 (/\ 'a . e) [t] -> e[t/'a]

 
Type checking expressions is slightly different than before. Besides
the type environment E, we also need to keep track of a set D of type
variables 'a. The reason is to ensure each type variable 'a is bound
to an enclosing /\ 'a abstraction. The typing judgments are of the
form D, E |- e : t. The rules are as follows:

  D, E |- n : int

     E(x) = t
  --------------
  D, E |- x : t


  D, E |- e1 : t->t'  D, E |- e2 : t
  ------------------------------------
          D, E |- e1 e2 : t'


     D, E[x -> t] |- e : t'
  ----------------------------  FTV(t) subset D
  D, E |- \x : t . e : t -> t'


In the last rule, FTV(t) represents the set of free type variables of
t. The rule ensures that all free type variables of t are bound to
enclosing /\'s.

The remaining rules are:

        D U {'a}, E |- e : t 
  ---------------------------------- 'a not in D
   D, E |- /\ 'a . e : \forall 'a . t


     D, E |- e : \forall 'a . t'
  --------------------------------- FTV(t) subset D
     D, E |- e [t] : t'[t/'a]

The side condition in the rule for type abstractions forbids shadowing
type parameters (what can go wrong if we omit this condition?).  The
rule for type instantiations requires that the free type variables of
t are all bound.
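
The typing rules can be turned into a small checker. The sketch below
is one possible reading, not a reference implementation. It assumes
Java 16 or later (records and pattern matching for instanceof), and
every class and method name in it is invented for this example. Two
simplifications are assumptions: substitution does not rename bound
type variables, and types are compared structurally rather than up to
renaming of bound variables. For type instantiation it uses the rule
that if e has type \forall 'a . t', then e [t] has type t'[t/'a].

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SystemFCheck {
    // t ::= int | t1 -> t2 | 'a | \forall 'a . t
    interface Type {}
    record TInt() implements Type {}
    record TVar(String name) implements Type {}
    record TArrow(Type from, Type to) implements Type {}
    record TForall(String var, Type body) implements Type {}

    // e ::= n | x | \x : t . e | e1 e2 | /\ 'a . e | e [t]
    interface Expr {}
    record Num(int n) implements Expr {}
    record Var(String name) implements Expr {}
    record Lam(String param, Type paramType, Expr body) implements Expr {}
    record App(Expr fun, Expr arg) implements Expr {}
    record TLam(String tvar, Expr body) implements Expr {}
    record TApp(Expr expr, Type typeArg) implements Expr {}

    // FTV(t): the free type variables of t.
    static Set<String> ftv(Type t) {
        Set<String> out = new HashSet<>();
        if (t instanceof TVar v) {
            out.add(v.name());
        } else if (t instanceof TArrow a) {
            out.addAll(ftv(a.from()));
            out.addAll(ftv(a.to()));
        } else if (t instanceof TForall f) {
            out.addAll(ftv(f.body()));
            out.remove(f.var());
        }
        return out;
    }

    // t[s/a], without renaming bound variables (assumes no capture occurs).
    static Type subst(Type t, String a, Type s) {
        if (t instanceof TVar v) return v.name().equals(a) ? s : t;
        if (t instanceof TArrow r) return new TArrow(subst(r.from(), a, s), subst(r.to(), a, s));
        if (t instanceof TForall f && !f.var().equals(a)) return new TForall(f.var(), subst(f.body(), a, s));
        return t;
    }

    static IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg);
    }

    // Computes t such that D, E |- e : t, or throws if no rule applies.
    static Type check(Set<String> d, Map<String, Type> env, Expr e) {
        if (e instanceof Num) return new TInt();
        if (e instanceof Var v) {
            if (!env.containsKey(v.name())) throw error("unbound variable " + v.name());
            return env.get(v.name());
        }
        if (e instanceof Lam l) {
            // Side condition: FTV(t) subset D.
            if (!d.containsAll(ftv(l.paramType()))) throw error("unbound type variable in annotation");
            Map<String, Type> env2 = new HashMap<>(env);
            env2.put(l.param(), l.paramType());
            return new TArrow(l.paramType(), check(d, env2, l.body()));
        }
        if (e instanceof App a) {
            Type funType = check(d, env, a.fun());
            Type argType = check(d, env, a.arg());
            if (funType instanceof TArrow r && r.from().equals(argType)) return r.to();
            throw error("ill-typed application");
        }
        if (e instanceof TLam tl) {
            // Side condition: 'a not in D (no shadowing).
            if (d.contains(tl.tvar())) throw error("shadowed type variable " + tl.tvar());
            Set<String> d2 = new HashSet<>(d);
            d2.add(tl.tvar());
            return new TForall(tl.tvar(), check(d2, env, tl.body()));
        }
        if (e instanceof TApp ta) {
            // Side condition: FTV(t) subset D.
            if (!d.containsAll(ftv(ta.typeArg()))) throw error("unbound type variable in type argument");
            if (check(d, env, ta.expr()) instanceof TForall f) return subst(f.body(), f.var(), ta.typeArg());
            throw error("instantiation of a non-polymorphic expression");
        }
        throw error("unknown expression");
    }

    public static void main(String[] args) {
        // (/\ 'a . \x : 'a . x) [int] 2
        Expr polyId = new TLam("a", new Lam("x", new TVar("a"), new Var("x")));
        Expr applied = new App(new TApp(polyId, new TInt()), new Num(2));
        System.out.println(check(new HashSet<>(), new HashMap<>(), applied));
    }
}
```

The set d and the map env correspond to D and E in the judgments. Each
branch of check mirrors one rule, with the side conditions checked
before the recursive calls.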

With these rules, we can check that (/\ 'a . \x : 'a . x) [int] 2 is
well-typed. We can also apply poly_id to itself, after the appropriate
type instantiation:

  (poly_id[\forall 'a . 'a ->'a]) poly_id -> 
  (\x : \forall 'a . 'a ->'a . x) poly_id -> 
  poly_id
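
The self-application can be imitated in Java by encoding the type
\forall 'a . 'a -> 'a as an interface with a generic method. This is
only an approximation of System F, and the names SelfApplication,
PolyId and POLY_ID are invented for this example. An anonymous class is
used because a Java lambda expression cannot implement a generic
method.

```java
class SelfApplication {
    // Encodes \forall 'a . 'a -> 'a: one value, usable at every type A.
    interface PolyId {
        <A> A apply(A x);
    }

    static final PolyId POLY_ID = new PolyId() {
        public <A> A apply(A x) {
            return x;
        }
    };

    public static void main(String[] args) {
        // Mirrors (poly_id[\forall 'a . 'a -> 'a]) poly_id.
        PolyId result = POLY_ID.<PolyId>apply(POLY_ID);
        System.out.println(result == POLY_ID);
        // The result is still polymorphic and can be instantiated again.
        System.out.println(result.<Integer>apply(2));
    }
}
```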

In real languages such as ML, programmers don't have to annotate their
programs with things like /\ 'a . e or e[t]. Both are automatically
inferred by the compiler (although programmers can specify the former
if they wish). For instance, we can write "fun f x = (x,x)", and have ML
figure out that we mean "fun f (x : 'a) : 'a * 'a = (x, x)". Or we can
write the latter directly and ML will type-check it. Conceptually, the
system inserts a type abstraction /\ 'a for each type variable 'a
around the outermost enclosing expression. Then, if we want to apply
f to an integer, we don't have to make the instantiation explicit and
write "f[int] 2", but we can directly write "f 2" and have the system
infer the "[int]".
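
Java offers a limited form of the same convenience for calls to generic
methods, where the type argument may be written or left to the
compiler. The sketch below is an illustration under assumptions: the
names Dup and dup are invented, and a two-element list stands in for
the ML pair (x, x).

```java
import java.util.Arrays;
import java.util.List;

class Dup {
    // Rough counterpart of fun f (x : 'a) : 'a * 'a = (x, x).
    static <A> List<A> dup(A x) {
        return Arrays.asList(x, x);
    }

    public static void main(String[] args) {
        List<Integer> explicit = Dup.<Integer>dup(2);  // like f[int] 2
        List<Integer> inferred = dup(2);               // like f 2
        System.out.println(explicit.equals(inferred));
    }
}
```

Unlike ML, Java still requires the declaration of dup to name its type
parameter A; only the instantiation at the call site is inferred.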

In Java 1.5, generics provide support for parametric polymorphism. For
instance, we can write a class that is parameterized on an unknown
reference type T:

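  class Pair<T> {
     // Both components share the same, as yet unknown, type T.
     T x, y;

     Pair(T x, T y) {
        this.x = x;
        this.y = y;
     }

     // p is instantiated at the same T as this, so p.x can be
     // stored into this.x and returned as a T.
     T fst(Pair<T> p) {
        this.x = p.x;
        return p.x;
     }
  }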
 
This class holds a pair of two elements of the same, unknown type T.
The parameterization /\ T is implicit around the class declaration.
Java 1.5 does not infer the type arguments of a generic class, so type
instantiations must be written explicitly, by giving the actual type
in angle brackets:

  Pair<Boolean> p;
  p = new Pair<Boolean>(Boolean.TRUE, Boolean.FALSE);
  Boolean x = p.fst(p);