CS 412/413
Introduction to Compilers
Spring 2002

Assignment 3: IR  and CFG Generation

due: Friday,  March 15, in class


In this programming assignment, you will generate the intermediate code and the control flow graph representation of the program. In this assigment, you will build upon the code that you wrote for Programming Assignments 1 and 2.

Assignment Description

The intermediate code generation will have a main class, IC.IRGen. The IR code generator will parse and type-check the program using the code from Programming Assignments 1 and 2, then it will convert the AST for each function into an high intermediate representation. Further, it will translate this  high-level representation to  low-level  intermediate code. Finally, it will build the control flow graph for each function in the program. When invoked from the command line, the intermediate code generator should implement the following functionality: 

High-level IR

The high-level intermediate code should only consist of  the following statement and expression nodes:

    - assignment, block, if, while, break, continue, return, and call statements
    - binary expression, unary expression, constant expression, array access expression, variable expression, constructor expression, function call expression.

There should be no other nodes in your high IR representation. Block statements should have a list of children statements representing the all the statement in the block. Call statements should have a list of expressions representing the actual arguments at the call site. 

The high-level intermediate representation may be similar to your AST from the previous assignment. You can either translate the AST to high IR, or you can change your AST construction code to directly generate the high-level IR according to the above specification.

Low-level IR

You will next translate the high level intermediate code to sequences of low-level intermediate code instructions, which describe a simple abstract machine. Your low level IR should consist of the following instructions: 

    - binary operation instructions
    - unary operation instructions
    - copy instruction
    - load instruction
    - store instruction 
    - symbolic address  instruction
    - call instruction
    - return instruction
    - conditional jump instructions (branching on either true or false condition)
    - unconditional jump instruction

The operands of these instructions can be either constants, program variables, or temporary variables created during the translation to low-level code.

The IR lowering phase should correctly translate all high-level constructs, including short-circuit conditional expressions, and break and continue statements. Lowering array access expressions need not check for array bounds violations in this assignment. Lowering array constructor expressions (new) should invoke a library function __allocate, whose argument represent the number of bytes for the allocated array; the function returns the address of the allocated memory block. 

For lowering string comparison expression, you will use another library function __stringcompare which takes as arguments the addresses of two string objects s and s' and returns -1 if s< s', 0 if s== s', and 1 if s>s'. To translate the concatenation of two strings, you will invoke a third library function __stringcat, which takes the addresses of two string objects s and s', creates a new string object which is the concatenation of s and s', and returns the address of the created string object.

Control Flow Graph construction

You next have to design the class hierarchy for the control flow graph (CFG) representation. The control flow graph of each procedure is a graph whose nodes represent basic blocks. Each basic block consists of a sequence of low-level IR instructions. There should be no label instructions, except at the beginning of basic blocks. Similarly, there should be no jump instructions except at the end of basic blocks. Basic blocks don't necessarily begin with label instructions or end with jump instructions. Also, each CFG has a special start basic block which represents the starting point of the computation in the graph.

In this assignment, you have to build the CFG of the program starting from the low-level intermediate code. The resulting CFG should be such that:
    - there are no empty basic blocks (i.e. blocks with no instructions other than labels and jumps);
    - there are no blocks B and B' such that: B' is the only successor of B, and B is the only predecessor of  B'.


What to turn in

Turn in on paper:

Turn in electronically:

Electronic submission instructions

Place your files in \\goose\courses\cs412-sp02\grpX\pa3, where grpX is your group identifier.  Use the same directory structure for the electronic submission as in the previous assignment : 
1) src\ - all of your source code and class files;
2) doc\ - documentation, including your write-up and a README.TXT containing information on how to compile and run your project, a description of the class hierarchy of your source code, brief descriptions of the major classes, any known bugs;
3) test\ - test cases you used in testing your project.