due Wednesday, March 8, 10AM
In this programming assignment, you will build an intermediate code
generator for the
language Iota. The latest version of the Iota language definition is at
http://courses.cs.cornell.edu/cs412/2000sp/iota/iota.html.
We expect you to build upon the code that you wrote for Programming
Assignments 1 and 2, and to fix any problems with your lexical analysis,
parsing, or semantic analysis that interfere with IR generation for correct
Iota programs.
Your translator must support a number of dump options for debugging purposes:
You may choose to add additional options not mentioned here as long as they do not interfere with the required functionality. For example, your IR generator might accept more than one source file on the command line or support more debugging options than those listed. You may allow multiple dump options to be specified; we require support for only a single dump option on a particular command invocation.
Certain language features need not be implemented in this assignment. However, you will be required to implement them for a later programming assignment, and you may do them now for extra credit. If these language features are encountered in the program, the compiler should print a message: "Language feature not implemented: <description>" and then exit.
Logic Operators. Iota has the short-circuiting logical operators "&" and "|". You are not required to implement these operations.
Bounds checks. The Iota language specification requires that arrays and strings be bounds-checked. For this programming assignment, we do not require implementation of bounds checking.
String comparisons. The operators ==, <, and
> all may be used to compare strings to determine their
lexicographic equality or their lexicographic ordering. You need not implement
these operators for this assignment.
For IR translation, define a method on AST nodes that creates a corresponding IR node, as discussed in lecture. Write a placeholder method that generates an empty IR node or, even better, a special IR node that can contain the entire AST node that was to be transformed. When dumped, the special IR node uses your existing AST dumping facility. This placeholder method can then be implemented for each of the Iota language constructs in turn, so that you can test IR translation on simple programs and gradually add the functionality for translating all of the AST nodes to IR.
For canonicalization, you can take the same approach. First, attack the problem of bringing side-effecting nodes up to the top level. Elimination of two-way jumps can be worked on as a parallel problem. For the hoisting of side-effecting nodes, define a transformation method for each IR node that reduces it to a sequence of statement nodes plus an expression node in the case of canonicalizing expressions, as described in class. Again, you can use a placeholder method to allow incremental development and testing. The placeholder transformation method simply returns the statement node or expression node it is invoked on. This approach will allow you to test out your canonicalization functionality without having implemented all of the relevant code.
f(x: int) = x + g(x)
might be compiled to IR and printed out as follows:
MOVE(RV, MEM(FP + 8) + CALL(NAME(g), MEM(FP + 8)))
For compactness, + nodes in the trees
are printed out in infix notation here; CONST
nodes are printed simply as the corresponding constant, and FP
and RV denote the frame pointer and return
value registers, respectively. During canonicalization, the CALL
node must
be hoisted to the top of the tree, resulting in the following output:
MOVE(TEMP(t1), CALL(NAME(g), MEM(FP + 8)))
MOVE(RV, MEM(FP + 8) + TEMP(t1))
interface Unparse {
public void unparse(Iota.util.CodeWriter cw);
// Write a human-readable representation of the node to the given
// CodeWriter cw.
}
CodeWriter automatically introduces newlines to avoid excessively long lines, while also maintaining indentation levels. In expected use, you will call begin(0) whenever you write a parenthesis and end() whenever you write its closing parenthesis. allowBreak(0) is called at every comma and after binary operators. With this usage, the IR expression above will be formatted as follows with a narrow formatting width, resulting in a more readable output than simply wrapping lines:
MOVE(TEMP(t1),
CALL(NAME(g),
MEM(FP + 8)))
MOVE(RV, MEM(FP + 8) +
TEMP(t1))
Do not use the newline() method to insert newlines except at the
end of each statement in canonicalized IR.
As an example, the ESEQ node described in Appel might be dumped with the following code:
class ExpSeqNode extends IRNode implements Unparse {
StmNode stm;
ExpNode exp;
public void unparse(CodeWriter cw) {
cw.write("ESEQ(");
cw.begin(0);
stm.unparse();
cw.write(",");
cw.allowBreak(0);
exp.unparse();
cw.write(")");
cw.end();
}
}
Global variables and functions names in Iota are translated to assembler names as follows. A global identifier I in module M is translated as an assembler symbol _M__I; that is, two underscore characters are placed after the name of the module, and one before. For example, if a call is made to a function F in another module M, the following IR might be generated for the call, even though the Iota program expression will look like F(...) :
CALL(NAME(_M__F), ...)
These fully-expanded names are used even when the global function or variable is found within the same module as the call itself. This name mangling scheme is chosen for simplicity. Obviously you can create confusion by writing Iota programs in which double underscores appear in the identifiers. A better mangling would avoid this problem but be more complex to implement.
The only exception to this name-mangling strategy is the function main,
which is the first function executed in the program. To allow the run-time
system to find this function, it must be mangled as
| a.length | |
| a ® | a[0] |
| a[1] | |
| a[2] | |
| ... | |
| a[a.length-1] |
The Iota language specification requires that arrays and strings be bounds-checked, although it is not required for this assignment. To implement such a feature, it is necessary to have a way to halt the program in mid-execution. The procedure iota__abort() can be used to accomplish this.
Your electronic submission is expected at the same time as your written
submission: at the beginning of class on the due date. Electronic submissions
after 10AM will be considered a day late. To submit Programming Assignment 3, please drop your files in \\goose\courses\cs412-sp00\grpX\pa3,
where grpX is your group identifier. Please organize your
top-level directory structure as follows :
src\ - all of your source code and class files. For example,
we expect to find a class file for your main program in src\Iota\IRGen.class,
and a companion .java file in the same directory.doc\ - documentation, including your write-up and a README.TXT
containing information on how to compile and run your project, a description
of the class hierarchy in your src\ directory, brief
descriptions of the major classes, any known bugs, and any other information
that we might find useful when grading your assignment.test\ - any test cases you used in testing your project.Note: Failure to submit your assignment in the proper format may result in deductions from your grade.