Design report due Friday, May 4
Code and demo due Thursday,
May 10 or Friday, May 11
In this programming assignment, you will complete your Iota+ compilers by implementing graph-coloring register allocation, constant folding and unreachable-code elimination.
The name of the compiler class and its syntax for invocation remain the same as in Programming Assignments 3–5. However, the compiler must support three additional options: -dump_live, -dump_reg_alloc, and -noopt. The -dump_live option causes the abstract assembly code to be dumped by the compiler, but with a list of variables live on input to and output from each assembly code statement printed with each statement. Abstract assembly statements are dumped in exactly the format shown in the following example:
add t1, t2 ; live-in: t1, t2, t3 live-out: t1, t3
The -dump_reg_alloc option causes the compiler to print out for each method and function the register allocations that were performed. For a method f of class C in module M, the output states which temporary variable was allocated to each of the allocable registers in the register set:
M.C.f:
ax: t1, t2, t4
bx: t3, t8
cx: t5,
t6
...
di:
<nothing>
For a global function f in module M, the report of register allocations is introduced by the line M.f: instead.
By default, the compiler will perform these optimizations: graph-coloring register allocation, constant folding and unreachable-code elimination. To turn off these optimizations, the -noopt option must be supported. This option causes code to be generated in the same way as in the earlier programming assignments.
The constant-folding optimization replaces expressions that have constant arguments with either their result or a simplified expression. You are expected to implement at least the following kinds of simplifications:
These optimizations should be performed on IR trees, where it will be easy to express the rewriting necessary. Some optimizations can be performed at the AST level as well; for example, the simplification of if and while statements is more easily recognized in AST form. On the other hand, some opportunities for constant folding will become apparent only when the code is expanded into IR. When the test condition of a conditional jump in the IR code is always true or false, it can be replaced with an unconditional jump. In general, some code may then become unreachable and can be removed from the program. For this assignment it is acceptable to implement constant folding only for IR trees.
We recommend implementing these optimizations using a functional style: rather than modifying the IR (or possibly AST) tree in-place, construct a new tree. A nice trick that improves performance is to share unmodified nodes with the old tree. For example, when implementing constant folding on a binary operation node, recursively fold constants on both operand subtrees. If the recursive calls return exactly the same subtrees, and no constant folding is possible at the current node, then the node itself can be returned as a result. If one or both of the recursive calls do some constant folding, or some constant folding is possible on the subtrees, then a new node must be constructed as the result of constant folding. This approach significantly reduces object allocation and garbage collection overhead.
Unreachable code is code that cannot possibly be executed, regardless of the
input data. Unreachable-code elimination should be performed on the canonical IR
code. A straightforward approach is to generate a control flow graph
from the canonical IR code and then identify and remove those
basic blocks to which there is no path from the entry block. It's possible that
some edges in a control flow graph are not
traversable. For example, there is usually an edge from the instruction CJUMP(e,l1,l2)
to the instruction l1:s
,
but if e
is the constant false
,
then that edge will never be traversed and could be removed from
the control flow graph. Finding and removing non-traversable edges can help to
identify more unreachable code. Unreachable code elimination should be performed
after constant folding conditionals and before ordering the basic blocks with
traces.
Your compiler will implement the graph-coloring register allocation algorithm described in Appel, except that register coalescing and removal of redundant mov instructions are optional and can be done for 5 points of extra credit. Rather than being allocated immediately to stack locations, local variables are allocated to temporaries that are spilled to stack locations only if necessary. It is necessary to deal properly with callee-save and caller-save registers. If any callee-save registers are used during a function, they must be saved on the stack and restored from it before the function returns. Similarly, the content of caller-save registers must be assumed to be destroyed by all function calls. The register allocation algorithm will cause this spilling automatically if the def and use for call instructions are set up properly.
It will be necessary to perform live variable analysis over the control flow graph of each method and function in the program. To aid you in constructing live variable analysis, we are providing a Dataflow package that provides a framework for the work-list iterative dataflow analysis described in class. To obtain a live variable analyzer, you will need to extend dataflow.Analysis and implement dataflow.Value and dataflow.FlowGraph, by defining the appropriate flow functions and meet operations for live variable analysis.
You will be demonstrating your compiler in person on May 11 or 12. Signups for demo time slots will be arranged later. The demo will be a half hour long, consisting of two roughly 15-minute segments. In the first segment, you will give a short presentation about your compiler and impress the course staff with its design and functionality. We will be interested to hear about interesting challenges you addressed and lessons you learned. Prepare your presentation carefully and think about what points you want to make. In the second segment, the course staff will run your program through several benchmarks. We will be looking at the quality of your compiler and the quality of your generated code. For each benchmark, we will measure (at least) the following: the time it takes to run the compiled benchmark, the size of your generated code, and the time it takes your compiler to compile the benchmark. The groups producing the best compiler, based on these results and measures of quality, will receive a small prize and future bragging rights. See the benchmark competition page for more details.
We are requiring less complete documentation of your project for this programming assignment because you will be demoing your system in person. However, we are still interested in receiving a complete design document that shows you have thought through all of the implementation issues. You should turn in the following information in hardcopy in class on May 4 (no extensions!):
java Iota.Compiler foo.imwith the directory
\\goose\courses\cs412-sp01\grpX\pa6\src
in the classpath.
Your electronic submission is expected at the same time as your demo. To
submit Programming Assignment 6, please drop your files in
\\goose\courses\cs412-sp01\grpX\pa6
, where grpX
is
your group identifier. Please organize your top-level directory structure
as follows :
src\
- all of your source code and class files. For
example, we expect to find a class file for your main program in
src\Iota\Compiler.class, and a companion .java file in
the same directory.
doc\
- documentation, including your write-up and a
README.TXT
containing information on how to compile and run your
project, a description of the class hierarchy in your src\
directory, brief descriptions of the major classes, any known bugs, and any
other information that we might find useful when grading your assignment.
test\
- any test cases you used in testing your project.
Note: Failure to submit your assignment in the proper format may result in deductions from your grade.