In the fifth programming assignment, we will be extending our Iota compilers to support a more sophisticated language called Iota+. This language extends Iota by adding support for object-oriented programming. It is backwards compatible with Iota. As with the Iota language specification given earlier, if you see ambiguity in this specification, you have the flexibility to resolve it in the manner you wish as long as you can justify the reasonableness of your decision. However, you should contact the course staff to resolve ambiguities.
At the high level, Iota and Iota+ are similar: a program is composed of some number of modules that are described by interfaces. However, both modules and interfaces support additional kinds of declarations. New expression and statement forms also are supported in Iota+. Rather than describe the entire Iota+ language from scratch, this document describes only the additions and changes to the Iota language.
PrettyPrinter (180 lines)
New io
and file
interfaces
We will take the same approach as in the earlier language definition: grammatical productions are intermixed with descriptions of the new language features they correspond to.
As in Iota, a module implementation file starts with a declaration of what external components it accesses (uses), then continues with a series of definitions. uses declarations are needed for all components referenced from external interface files, including interface types and constants. The syntax for a uses declaration is the same as in Iota.
A module contains one or more definitions of variables or functions. The definitions may not be separated by semicolons. All top-level definitions are visible throughout the module; there is no need for forward declarations. Two definitions may not use the same name; nor may they conflict with any identifier defined by a uses declaration.
In addition to definitions of functions and variables, as in Iota, new kinds of definitions are supported: interface type definitions, class definitions, and constant definitions. In addition, variable definitions are extended to support initialization expressions that are evaluated when the program starts. The initialization expressions within a module are evaluated in the order in which they occur within the module. When a program contains more than one module, the initialization expressions of the modules are evaluated in some order that is specified at link time. However, all of the initialization expressions of one module in the order are evaluated before the initialization expressions of the next module in the order are begun.
defn ::= formal [ = expr ]
| fn_header = expr
| id = constant
| interface-type
| class
The syntax of an interface also is extended to support new kinds of declarations. In addition to variable and function declaration, an interface supports interface type declarations and constant definitions.
decl ::= formal
| fn_header
| interface-type
| id = constantfn_header ::= id ( [ formals ] ) [ : type ]
An interface also may begin with a series of uses
declarations, which were
not permitted in Iota.
interface ::= [ uses ] decl*
These declarations are useful because they allow an interface to talk about interface
types and symbolic constants defined in the interfaces of other modules. Each compilation
unit in the language -- module or interface -- may have a set of uses declarations. The uses
declarations of other compilation units are not in force in a compilation unit unless it
explicitly contains these declarations. The only exception to this rule is that the
identifiers introduced by uses
declarations of the interface to a module are
in scope in the body of the module as well, just as the interface types and constants
defined by the module's interface are in scope. This rule ensures that a module is
consistent with its own interface.
The uses
declarations in all the interfaces in a program define a
dependency graph among these interfaces, in the sense that one interface depends on
another if it uses an identifier from the second. It is a static error for the graph
of interface dependencies in the program to contain a cycle. For example, two interfaces
may not each contain a uses
declaration that mentions an identifier in the
other interface. (Implementation of this rule is optional.)
The identifiers introduced by a uses
declaration in one
interface may also be referenced by a uses
declaration in another
interface. (This is optional for Spring 2000.) For example, if interface foo
contains a declaration uses bar.x, bz = bar.z
, then another
interface baz
can import the identifiers foo.x
and foo.bz
by including the declaration uses foo.x
, foo.bz
.
Both interfaces and modules may contain definitions of constants. When an identifier
is declared to have a constant value, the compiler treats that identifier everywhere as
representing that value, exactly as if the constant expression were used in place of the
identifier. Constants only may be integers, booleans, strings, or the new constant null
.
The type of the identifier is not declared explicitly; it is determined from the constant.
Note that constant identifiers may not be assigned to!
In addition to the built-in types supported by Iota, Iota+ supports two new
kinds of types: class types and interface types. Both class types and interface types are
denoted by using their declared names, or in the case of interface types referenced
through a uses
declaration, the appropriate name introduced by the
declaration. An interface type defined in an interface file may be referenced by both
interface files and module files; however, a uses
declaration is necessary if
the referring module is different from the defining module. A class or interface type may
be used anywhere that a built-in type may be used; in particular, it may be used as a
variable type, a function or method argument type, a return type, and as a parameter to an
array
type.
type ::= id
|object
|int
|string
|bool
|array
[ type ]
A class or interface type supports various operations that are described in the declaration of the type. The compiler permits exactly the operations described in the type declaration and the declarations of any supertypes of the type. An interface type declaration permits only the declaration of object methods:
interface-type ::=
interface
id [extends
id ] { fn_header* }
The method declarations look exactly like function declarations at the top level of an interface; however, they are methods rather than functions. The object of the interface type is implicitly an argument to the method.
An interface type I optionally may declare that it extends
another
interface type I'. In this case, the signature of each method of I must conform to the
signature of the method of the same name in I' (if any), and similarly in every other
interface that it transitively extends. Unlike in Java, different methods of interface and
class types never have the same name; method overloading is not permitted.
If a subtype I
of the interface type I' has a method m with the same name as some method m' declared in I',
then the signature of m must conform to the signature of m'. Conformance requires that the
number of arguments in m and m' must be the same. In addition, the type of each argument
in m must be the same as, or a supertype of, the argument in the corresponding position in
m'. The names of arguments need not be identical in the two interface types. The return
type of m must the same as or a subtype of the return type of m'. If m' has no return
type, then m may declare any return type or simply not declare a return type.
Just as an interface type declaration is similar to an entire Iota interface, a class
definition is similar to an entire Iota module. A class definition may contain definitions
of fields (instance variables) and methods. Unlike in Java, fields and methods do not
have visibility modifiers such as public
and private
; such
modifiers are unnecessary because interfaces determine visibility. Components of a module
that are declared in the module's interface are public, and other components are
automatically private. Note that unlike in Java, classes do not have final fields or
methods; nor are there static fields or methods. In Iota+, the role of static
fields and methods is fulfilled by module variables and functions, which are not attached
to a specific class.
class ::=
class
id [implements
id ] { classdefn* }
classdefn ::= formal
| fn_header = expr
When class C declares that it implements
an
interface type I, it must implement every
method that I declares. The signatures of class methods must conform to the
signatures of the interface that it extends, transitively. This checking is
performed in exactly the manner described above for interface subtype declarations.
Neither a class nor an interface may introduce a new member (field or method) with the
same name as a member of a supertype. However, the signature of a supertype method may be
refined as described above.
The type system of Iota+ is more complex than that of Iota. In addition to
the new class types and interface types that may be declared by programmers and the
pseudo-types unit and none, we introduce the type Null to represent the type of the
expression null
, representing a null object reference. The new type object
is the supertype of all class and interface types, and of string and array types.
As in Iota, the pseudo-type none is used to represent the type of statements or
expressions that cannot terminate normally.
All types in the Iota+ type system are considered formally to be subtypes of unit. This subtype relation makes sense because a containing context expecting a unit will discard whatever value it receives; therefore, every value type may be used as a unit.
As in Java, null references are considered to be members of every object type. Therefore, the type Null is considered to be a subtype of every class or interface type, and also a subtype of the built-in types string and array[T]. However, it is not a subtype of the built-in types bool and int.
The pseudo-type none represents non-termination, and is considered to be the
subtype of every other type in the type system, including Null. It is the bottom of the pre-order (actually, lattice) on types defined by
the subtype relation. The top of the pre-order is the type unit. The type none is the universal subtype because an
expression that does not terminate normally will never produce a value that its containing
context does not expect. For example, the return
statement is given the
type none, since it does not terminate with an ordinary value. Consider the
following statement:
x: int = ( return "foo" )
This statement cannot result in a run-time type error, because no value ever will be assigned to x.
Iota+ allows expressions of the object type to be used in conditional expressions. We thus need an additional pseudo-type for expressions that can be either bool or object. This pseudo-type will be denoted bool _ object.
Note that Iota+ gives no way for the programmer to talk about the types unit,
bool _ object, Null, and none. These types are part of the type system but may not be
used in declarations. Their names are not keywords or reserved words and can thus potentially be used as legal identifiers
in an Iota+ program. However, the type object may be denoted in a program using the
keyword object
, just as with the types int and bool.
The subtype hierarchy of Iota+ just described is diagrammed as follows:
A method is defined as an expression that is evaluated at the time that the function is
called, initializing the variables of the formal arguments with the actual values passed.
As in Java, when a value of class type or interface type is passed, the called method
receives references to the same objects that the calling code passed. A method body
differs from a function body in that the additional implicit argument this
,
representing the receiver object, is in scope within the method body. Unlike the other
arguments, this
is not an assignable variable.
The syntax for formal arguments of both functions and methods is extended to allow a
more compact denotation of multiple formals arguments in a row sharing the same type. For
example, the argument list x,y: int, z: bool
is equivalent to the argument
list x:int, y:int, z:bool
.
formals ::= formals , formal
| formal
formal ::= ids : typeids ::= id | ids , id
Iota+ adds several new kinds of expressions to Iota.
Expressions may take one of several new forms.
expr ::= ...
|new
typeprimary ::= ...
|this
| primary . function_call
|cast
( expr , type )variable ::= ...
| primary . idconstructor ::=
new
type [ expr ] ( expr )
The expression this
may be used within method bodies, as described
earlier, to denote the receiver object. In an ordinary function body, it is a static error
to use this expression. Note that the identifier this
is a keyword and cannot
be used as the name of an ordinary variable.
The expression new
C creates a new object of the class C,
where C is the name of the class. When a new object of a class is created, all of
its fields are initialized to their default values. For all types that are supertypes of Null,
the default value of the type is null
. For integers, the default value is 0,
and for booleans, it is false
. Unlike in Java, classes have no constructors.
By convention, classes declare a method init
that performs proper
initialization of an object, establishing necessary representation invariants. The
pretty-printer sample code above contains some examples of this convention.
The dot operator (.) is used to indicate an access to one of the members of an object, either a field or a method. These accesses operate as in Java. The static type of the left-hand-side of the dot operator (the receiver) is used to determine whether the field access or method invocation is legal. A field access can be legal only if the type of the receiver is a class type, because interface types have no fields. In the case of a method invocation, the receiver type may be either a class type or an interface type. The arguments passed to the method must agree with the types of formal arguments of the method, as declared in the static type of the receiver, or, if the static type of the method does not declare the method, in its closest (lowest) supertype that declares the method.
Within a method, the fields and methods of the receiver object are automatically placed
in scope and can be accessed as if they were module variables or functions. For example,
if the class of the receiver object contains a field named a
, then the
expressions this.a
and a
are equivalent within that method body.
No scoping conflict between fields and methods of the object and other identifiers is
permitted. The names of fields of a class must differ from the names of all visible
variables, functions, classes and interface types.
The binary operators are extended to complete the usual set of six relational
operators, with the usual semantics. Note that booleans can be compared using the
relational operators, and that false
is consider to be less than true
.
Strings are compared lexicographically, unlike in Java (This is true in Iota as well).
Objects of array, class, and interface types only support the ==
and !=
relational operators, which
implement pointer equality: two objects are considered equal only if they are exactly the
same object. Two values may be tested for equality or inequality even if they do not have
exactly the same static type. Two values may be compared as long as they have a common supertype that is neither unit or none or
bool _ object.
This rule allows a value of static type object
to be compared with a value of static type string
; in this case, the two values are equal only if they are exactly the same object.
The pseudo-function cast
provides a way to change the type of an object
reference. It takes two arguments: an expression and a new type. The first argument is the
expression whose value is to be cast; the second argument is the desired type of the cast
expression. This expression requires that both the static type of the expression being
cast and the desired target type are subtypes of object. The validity of the cast
expression is checked at runtime: invalid casts evaluate to the null object.
binary_op ::= ... |
<=
|>=
|!=
The new constant expression null
returns the null object, whose type is Null,
as described earlier. Since Null is a subtype of every object type, the null object
is a member of every object type. The effect of this membership is that every access to an
object member (field or method) must check first whether the object is the null object. If
so, the program halts immediately with an appropriate error message. This check also is
required for all string and array operations, such as array and string index expressions
and string concatenation.
constant ::= ...
|null
Array and string index expressions are required in Iota+ to be checked at run time to ensure that the index is not out of bounds.
In certain contexts where a value of type bool is expected in Iota, a value of
any subtype of object may be used instead. The value is automatically converted
to a boolean in the following way: null
is converted to false
,
and all other values are converted to true
. The contexts where an automatic
conversion is performed are the following: the test expressions of if
and while
statements, and the arguments to the logical operators !
, &
and |
. For example, the following two if
statements are
equivalent:
if (o != null) ... if (o) ...
Note that there is no automatic conversion from integers to booleans, as in the C language.
Iota+ adds a few more statement forms to Iota. The new increment and
decrement statements increase or decrease the value of an integer variable by
one. Unlike in C and Java, the value of the statement is the result of the
increment. The statement v++
is not precisely equivalent to the
statement v=
v+1
, however, because the
location of the variable v is computed only once. For example, the
statement a[(i++)]++
is equivalent to (i++; a[i]++)
because the index expression (i++)
is evaluated just once.
stmt ::= ...
| variable++
| variable--
|  break
The new statement break
terminates the closest enclosing while
statement, just as in Java. A break
statement may occur only within a while
statement.
The syntax of assignment is extended to permit multiple assignment within a single
statement. An assignment may take the form a = b = c =
E, where a-c
are variables, and E is the expression to be assigned to each of the variables. The
production stmt ::= variable = expr is replaced by a new
production stmt ::= assignment. The non-terminal assignment is
defined as follows:
assignment ::= variable = expr
| formal = expr
| variable = assignment
| formal = assignment
Note that assignment is a right-associative operator: first, the expression is evaluated, then each of the expressions denoting variables are evaluated in turn from right to left, with each assignment completed before the evaluation of the previous variable.
In Iota, a local variable declaration may include an initializing expression. In
Iota+, multiple local variables may be declared in the same statement, because
of the redefinition of the formal non-terminal. For example, the statement i,j:
int = 1
introduces two variables i
and j
, and initializes
them both to 1. (Note that this differs from C and Java.) The right-hand side expression is evaluated only once, not once for each
variable being initialized. The right-hand side expression may be an assignment itself,
and the declaration and initialization of new variables may serve as assignments as well.
Note that a statement may still consist of a variable declaration without
initialization, as in Iota: the production stmt ::= id :
type [ = expr ] is replaced by the production stmt ::= formal
Certain statements in Iota+ have the type none: break
statements and return
statements. In addition, an if
statement
produces the type none if both of its arms have that type, as in the following
example:
if (b) return 0 else return 1
However, the type of this if
statement results simply from the application
of the usual rule for checking if
statements statically.
A statement whose type is none may not be followed by any other statement, since that following statement could not be executed. In other words, if such a statement occurs within a block of statements, it must occur at the end of the block.
This typing has the advantage that it permits functions to end with return
statements, which avoids confusing programmers used to C or Java syntax. For example, the
function
inc(x: int): int = ( return x+1 )
is legal because the type of the function body is none, which is a subtype of int, and the return statement returns the proper type.
Iota+ introduces a number of new keywords that have been mentioned in this
document. The keywords of the language include all the keywords of Iota ( array
,
bool
, false
, if
, int
, length
,
return
, string
, true
, uses
, while
),
and the new keywords mentioned here ( break
, cast
, new
,
null
, object
, this
).
Note also that Iota+ adds new multi-character non-alphabetic tokens, such as
++
, --
, <=
, >=
, !=
.
The following list summarizes all the new features of Iota+, for convenience in checking whether the compiler properly implements the language.
uses
in interface with acyclic constraint extends
clause for interfaces typesimplements
clause for classesnull
constant/null checking new C
expression init()
method this
expression cast
expression a = b = c
) return
and break
as none Iota+ provides the same two standard modules for programming: io
and conv
. However,
the io
module is extended with some additional functionality. It also adds
new interfaces for input and output stream objects, and provides objects stdin
and stdout
that may be used to read and write data from and to the standard
input and output device. The new declarations in the interface are as follows:
interface OutputStream { print(s: string) // output the string s putc(c: int) // output the character whose ASCII value is c flush() // make sure any pending output is flushed } interface InputStream { readln(): string // read a line from the input getc(): int // read a single character from the input eof(): bool // return whether the end of file has been reached on the input } stdout: OutputStream stdin: InputStream
In addition, a new file
module is provided for file
I/O:
uses io.InputStream, io.OutputStream interface FileOutput extends OutputStream { close() } interface FileInput extends InputStream { close() } read(filename: string): FileInput // Open the named file for reading. Return null on failure. create(filename: string): FileOutput // If the file does not exist, create it and return a FileOutput that // will write to the file. If the file cannot be created, return null. append(filename: string): FileOutput // If the file can be opened, return a FileOutput that appends to it. // Otherwise, return null. remove(filename: string): bool // Remove the file and return true if successful