CS 412/413
Introduction to Compilers
Spring 2001

Homework 2: Syntactic Analysis

due: Wednesday, February 14, in class

  1. Appel 3.3(f, g)
     
  2. Consider a simple markup language that uses tags. Its only terminals are {<, >, /, =, word}.  Every tag begins with < and ends with >. The first token in an open tag is a word representing its name, followed by an optional list of attributes which are pairs of words related by =. Every open tag must be paired with a close tag, which is a tag with / before the name and no attributes. Any number of words or tags may appear between and open and close tags.
     
    For example, here is a valid string from this grammar:
     
    <word word=word word=word><word>word word word</word><word></word></word>
     
    1. Write a grammar for this language.
    2. Find the nullable, FIRST, and FOLLOW sets for your grammar.
    3. Construct the LL(1) parse table. Clearly identify any conflicts in the table.
    4. What, fundamentally, makes this language not LL(1)? Is there a k for which it is LL(k)? 
      Hint: what if you could change the terminals?

      

  3. Consider the following grammar:

    E !  E op E | (E) | num
    op !  + | * | ^ 

    This grammar is ambiguous; we would like the grammar to have parse trees in which exponentiation (^) has higher precedence than multiplication (*), and both have higher precedence than addition (+).
     

    1. Write an LL(1) grammar that accepts the same language as this grammar and respects the desired operator precedence.  (You need not ensure that the operators are left-associative, however.)
    2. Show the derivation of the expression 2^3 + 4*5 using the grammar of part (a).
    3. Write an LR(1) grammar that accepts the same language, but enforces both the correct precedence and left-associativity of the operators.
       
  4. Java allows the assignment operator to occur multiple times in a single expression, e.g., a = b = c. The operator is defined as right associative, so the previous example is equivalent to a = (b = c). The following grammar captures the essence of this assignment operator and its interaction with variables and addition.
      
    S !  A
    A
    !  id = R
    R
    !  A | E
    E
    !  E + P
    E
    !  P
    P
    !  id | (A)
      
    1. Build the LR(0) DFA for this grammar.
    2. Is this an LR(0) grammar? Give evidence.
    3. Is this an SLR grammar? Give evidence.
    4. Is this an LR(1) grammar? Give evidence.
       
  5. [Dragon book, 4.40] Show that the following grammar is LR(1) but not LALR(1).
     
    S !  Aa | bAc | Bc | bBa
    A
    !  d
    B !  d
     
  6. The two dominant parsing techniques in real compilers are LL(1) and LALR(1). As might be expected, each has its own particular strengths. Develop a set of criteria for comparing parsing techniques, and compare these two techniques with respect to your criteria. Some possibilities are: simplicity (of drivers and grammars), generality, and table size. Assume that automated parser generators are available for each technique. Your answer should be brief (no longer than half a page).