due Monday, March 12, in class
[2/23] Fixed runtime library routine names to include leading underscore.
In this programming assignment, you will build an intermediate code
generator for the
language Iota, defined in http://www.cs.cornell.edu/cs412/2001sp/iota/iota.html.
We expect you to build upon the code that you wrote for Programming
Assignments 1 and 2, and to fix any problems with your lexical analysis,
parsing, or semantic analysis that interfere with IR generation for correct
Iota programs.
Your translator must support a number of dump options for debugging purposes:
You may choose to add additional options not mentioned here as long as they do not interfere with the required functionality. For example, your IR generator might accept more than one source file on the command line or support more debugging options than those listed. You may allow multiple dump options to be specified; we require support for only a single dump option on a particular command invocation.
Certain language features need not be implemented in this assignment. However, you will be required to implement them for a later programming assignment, and you may do them now for extra credit. If these language features are encountered in the program, the compiler should print a message: "<filename>:<lineno>:Language feature not implemented: <description>" and then exit. We strongly recommend that you get the rest of the IR generation working before tackling the extra credit items.
Logic Operators. Iota has the short-circuiting logical operators & and |. You are not required to implement these operations.
Bounds checks. The Iota language specification requires that the indices used to access elements of arrays and strings be bounds-checked. For this programming assignment, we do not require implementation of bounds checking.
Arithmetic checks. The Iota specification says that division by zero is caught and terminates the program. You do not need to implement this yet.
For IR translation, define a method on AST nodes that creates a corresponding IR node, as discussed in lecture. Write a placeholder method inherited by all AST nodes that generates an empty IR node or, even better, a special IR node that can contain the entire AST node that was to be transformed. When dumped, the special IR node uses your existing AST dumping facility. This placeholder method can then be implemented for each of the Iota language constructs in turn, so that you can test IR translation on simple programs and gradually add the functionality for translating all of the AST nodes to IR.
For tree flattening, you can take the same approach. First, attack the problem of bringing side-effecting nodes up to the top level. Elimination of two-way jumps can be worked on as a parallel problem. For the hoisting of side-effecting nodes, define a transformation method for each IR node that reduces it to a sequence of statement nodes plus an expression node (when flattening expressions, as described in class.) Again, you can use a placeholder method to allow incremental development and testing. The placeholder transformation method simply returns the statement node or expression node it is invoked on. This approach will allow you to test out your tree-flattening functionality without having implemented all of the relevant code.
f(x: int) = x + g(x)might be compiled to IR and printed out as follows:
Function f: MOVE(RV, MEM(FP + 8) + CALL(NAME(g), MEM(FP + 8)))For compactness, + nodes in the trees are printed out in infix notation here; CONST nodes are printed simply as the corresponding constant, and FP and RV denote the frame pointer and return value registers, respectively. In canonical form, CALL nodes must be hoisted to the top of the tree, resulting in output like the following:
Function f: MOVE(TEMP(t1), CALL(NAME(g), MEM(FP + 8))) MOVE(RV, MEM(FP + 8) + TEMP(t1))
Unparse
interface from Programming
Assignment 2:
interface Unparse { public void unparse(Iota.util.CodeWriter cw); // Write a human-readable representation of the node to the given // CodeWriter cw. }Do not use the newline() method to insert newlines except perhaps at the end of each statement in the canonical IR. As an example, the ESEQ node described in Appel might be dumped with the following code:
class ExpSeqNode extends IRNode implements Unparse { StmNode stm; ExpNode exp; public void unparse(CodeWriter cw) { cw.write("ESEQ("); cw.begin(0); stm.unparse(); cw.write(","); cw.allowBreak(0); exp.unparse(); cw.write(")"); cw.end(); } }
Global variables and function names in Iota are translated to assembler names as follows. A global identifier I in module M is translated as an assembler symbol _M__I; that is, two underscore characters are placed after the name of the module, and one before. For example, if the Iota program expression contains a call like F(...), an invocation of a function F in another module M, the following IR might be generated for the call:
CALL(NAME(_M__F), ...)
These fully-expanded names are used even when the global function or
variable is found within the same module as the call itself. This name mangling
scheme is chosen for simplicity. Obviously you can create confusion by writing
Iota programs in which double underscores appear in the identifiers. A
better mangling scheme would avoid this problem but be more complex to implement.
Note that the actual name that is used by the referencing module is not used for
mangling; the important name is the name that the defining module uses. This
distinction is important for handling uses
clauses of the form a=b.c
.
Where the identifier a
is used, the IR should contain the mangled
name _b__c
instead.
The only exception to this name-mangling strategy is the function main,
which is the first function executed in the program. To allow the run-time
system to find this function, its name must be mangled as _iota__main
regardless of what module it occurs in.
You will also have the ability to define your own runtime support routines in the next assignment. You may define such routines during this assignment and use them in your IR. If you do so, you must submit the C source code that implements your routines with this assignment. You should be aware that a call to a runtime routine will likely be more expensive than the IR code that it replaces, even if that IR code is fairly complicated. Be sparing with the number of routines you introduce, because frequent calls to runtime routines will likely degrade performance of the code your compiler emits. As another consequence, use of such routines in the implementation of one of the extra credit items above will disqualify you from receiving that particular extra credit.
Strings are stored in a packed representation in which every character is placed in a consecutive byte. The string is always terminated by an extra null character (0), and the string is also padded out with nulls until it fills out a 32-bit word completely. Here are some examples of string layouts in memory.
An array is laid out similarly in memory, although each element of the array takes up a full 32-bit word and there is no terminator:
The Iota language specification requires that accesses to elements of arrays and strings be
bounds-checked, although it is not required for this assignment. To implement
such a feature, it is necessary to have a way to halt the program in mid-execution.
The procedure _iota__abort can be used to accomplish this. Code that
invokes _iota__abort
should print some useful message first.
_epilogue
where F is the name of the function in question. The function prologue
does not need to be represented in
the IR at all; it will be generated in the next compiler phase.
javadoc
output that we can browse on-line. We should be able to browse
through all your classes and interfaces, including all their methods and
fields.Your electronic submission is expected at the same time as your written
submission: at the beginning of class on the due date. Electronic submissions
after 10AM will be considered a day late. To submit Programming Assignment 3, please drop your files in \\goose.csuglab.cornell.edu\courses\cs412-sp01\grpX\pa3
,
where grpX
is your group identifier. Please organize your
top-level directory structure as follows :
src\
- all of your source code and class files. For example,
we expect to find a class file for your main program in src\Iota\IRGen.class,
and a companion .java file in the same directory.doc\
- documentation, including your write-up and a README.TXT
containing information on how to compile and run your project, a description
of the class hierarchy in your src\
directory, brief
descriptions of the major classes, any known bugs, and any other information
that we might find useful when grading your assignment.test\
- any test cases you used in testing your project.Warning: Failure to submit your assignment in the proper format will result in deductions from your grade.