CS 412/413
Introduction to Compilers
Spring 2002

Assignment 4: Dataflow Analysis and Optimizations

due: Friday,  April 5, in class

In this programming assignment, you will implement several dataflow analyses and optimizations. These will build upon the code that you wrote for previous assignments.

Assignment Description

You will implement a generic dataflow analysis framework and then instantiate it for several analyses: live variable analysis, available expressions, reaching definitions, and constant folding analysis. You will then use the result of these analyses to perform dead code elimination, common sub-expression elimination, loop-invariant code motion, constant folding, and strength reduction of induction variables.

The compiler will first parse the program, check for semantic error, generate intermediate code, and build the control flow graph.  When invoked from the command line, the compiler must support the following optimizations:

The compiler will perform the optimizations in the order they occur in the command line. The above arguments may be invoked multiple times in the same command line - in that case the compiler will execute them multiple times, in the order specified. It will only perform the analyses necessary for the optimizations requested.

Dataflow Analysis Framework

Your dataflow analysis code will consist of an abstract dataflow analysis class IC.DFA.DFAEngine, and an abstract dataflow information class IC.DFA.DFAInfo

The constructor of your dataflow information object must return the top element in the lattice, and must have a Meet method which takes another dataflow information object as argument and returns the meet operation between this object and the other object. It must also support an Equals method which tests if two dataflow information objects correspond to the same element in the lattice.

The dataflow analysis class DFAEngine has a method Solve which takes as argument a control flow graph which iteratively solves the dataflow equations (using the worklist algorithm) and computes dataflow information at each program point in the flow graph. The dataflow analysis class also contains a TransferFunction method which models the transfer functions in the dataflow analysis framework. It takes a low-level IR instruction and a dataflow information object as parameters and returns the resulting dataflow information. You should allow your dataflow analysis class to be instantiated either a a forward or as a backward analysis. 

Analysis Instances

You will then use your dataflow analysis framework to implement several dataflow analyses and use their results for subsequent optimizations. There are four dataflow analyses that you will implement in this assignment:

  1. Live variables analysis - computes variables which may be live at each program point
  2. Available expressions -  computes expressions which are available in all executions of the program
  3. Reaching definitions - for each program point, it computes the definitions which may reach that point 
  4. Constant folding analysis - determines variables whose values are constant 


Next, you will implement the following optimizations in the compiler:

  1. Dead code elimination - remove code which updates variables whose values are not used in any executions. This optimization will use the results from the live variable analysis. 
  2. Common subexpression elimination - reuse expressions already computed in the program. Here you will use the information computed with the available expressions analysis. If an expression is available before an instruction that re-computes that expression, you have to replace the its computation by the variable which holds the result of that expression in all executions of the program. If there is no such variable, you will create a temporary variable to store the result of the expression.
  3. Loop invariant code motion - for a given loop, detect computation which doesn't change in different iterations of the loop and hoist that computation out of the loop. Use reaching definition information to determine loop invariant code. You also need to identify loops in your flow graph; for this, add loop information to your basic blocks: each block will had a reference to the loop header block of the innermost loop, and each loop header block will keep a list of all blocks in its loop body.
  4. Constant folding - use the results from constant folding analysis and replace each constant variable with the computed constant.
  5. Strength reduction of induction variables - detect induction variables in loops and replace expensive multiplication operators on these operations by semantically equivalent code which updates induction variables using addition instructions.



Your compiler must still support the command line switches -dump_lir and -dump-cfg from last assignment, which produce textual outputs of the low-level code and of the flow graph. In addition, your compiler will detect function calls of the form showDFAinfo() in the program and will produce a textual output of the dataflow information at the program point where the function is invoked. This function must not be moved out of loops, as part of the loop invariant code motion transformation.

What to turn in

Turn in on paper:

Turn in electronically:

Electronic submission instructions

Place your files in \\goose\courses\cs412-sp02\grpX\pa4, where grpX is your group identifier.  Use the same directory structure for the electronic submission as in the previous assignments : 
1) src\ - all of your source code and class files;
2) doc\ - documentation, including your write-up and a README.TXT containing information on how to compile and run your project, a description of the class hierarchy of your source code, brief descriptions of the major classes, any known bugs;
3) test\ - test cases you used in testing your project.