due Friday, February 4
In this programming assignment, you will build a lexical analysis phase for the language Iota, defined on the web page at http://courses.cs.cornell.edu/cs412/2000sp/iota/iota.html. Your lexer, or tokenizer, should have the following interface (it may have additional methods, of course):
class Lexer {
    Lexer(InputStream i);
    // Create a lexer that reads characters from the input stream i.
    Token getToken() throws LexicalError;
    // Return the next language token on the input stream. Returns
    // a token representing the end of file as the last token, assuming
    // that no lexical error is encountered first.
}
The class Token may be defined in whatever way you prefer, but it should at least implement the following interface:
interface LexerResult {
    void unparse(OutputStream o);
    // Print a human-readable representation of this token on the
    // output stream o; one that contains all the relevant information
    // associated with the token. The representation has the form
    // <token-type, attribute, line-number>
    int lineNumber();
    // Return the number of the line that this token came from.
}
The class LexicalError should also implement the LexerResult interface, though you will want to choose a different output format for the unparse method.
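One possible shape for these two classes is sketched below. It is only an illustration: the field names, the constructor signatures, and the error output format are assumptions, not requirements of the assignment. Because getToken is declared to throw LexicalError, the sketch makes LexicalError a subclass of Exception.

```java
import java.io.OutputStream;
import java.io.PrintStream;

interface LexerResult {
    void unparse(OutputStream o);
    int lineNumber();
}

// Hypothetical token representation: a type name, an attribute string
// (for example the identifier text), and the source line number.
class Token implements LexerResult {
    private final String type;
    private final String attribute;
    private final int line;

    Token(String type, String attribute, int line) {
        this.type = type;
        this.attribute = attribute;
        this.line = line;
    }

    // Writes "<token-type, attribute, line-number>" followed by a newline.
    public void unparse(OutputStream o) {
        PrintStream out = new PrintStream(o);
        out.println("<" + type + ", " + attribute + ", " + line + ">");
        out.flush(); // flush but do not close: the caller owns the stream
    }

    public int lineNumber() {
        return line;
    }
}

// Must be Throwable because getToken() is declared to throw it.
class LexicalError extends Exception implements LexerResult {
    private final int line;

    LexicalError(String message, int line) {
        super(message);
        this.line = line;
    }

    // An assumed error format; the handout only asks that it differ
    // from the token format.
    public void unparse(OutputStream o) {
        PrintStream out = new PrintStream(o);
        out.println("lexical error, line " + line + ": " + getMessage());
        out.flush();
    }

    public int lineNumber() {
        return line;
    }
}
```

In a real submission these classes would live in the Iota package and Token would likely carry a richer token-type representation than a string.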
You must also implement a lexer test-bed program. This program must be a class LexTest that implements the following behavior. When run from the command line, the LexTest program takes a single filename as an argument. It reads the file, breaks it into tokens, and uses the Token.unparse method to dump a representation of the file as a series of tokens. If a lexical error is encountered, it prints an error message that includes the line number on which the error occurred.
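The driver loop described above might look roughly like the following sketch. To keep it compilable on its own, it includes abbreviated stand-ins for Lexer, Token, and LexicalError. The stand-in lexer is a toy that is NOT Iota: it recognizes only words and integers, and its token names (WORD, INT, EOF), the isEof helper, the dump method, and the exit codes are all assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;

// Abbreviated stand-in; a real Token would implement LexerResult.
class Token {
    final String type, attribute;
    final int line;
    Token(String type, String attribute, int line) {
        this.type = type; this.attribute = attribute; this.line = line;
    }
    boolean isEof() { return type.equals("EOF"); } // hypothetical helper
    void unparse(OutputStream o) {
        PrintStream out = new PrintStream(o);
        out.println("<" + type + ", " + attribute + ", " + line + ">");
        out.flush();
    }
}

class LexicalError extends Exception {
    private final int line;
    LexicalError(String message, int line) { super(message); this.line = line; }
    int lineNumber() { return line; }
}

// Toy lexer over single-byte (ASCII) input with one character of lookahead.
class Lexer {
    private final InputStream in;
    private int line = 1;
    private int next; // lookahead character, or -1 at end of input

    Lexer(InputStream i) { in = i; next = read(); }

    private int read() {
        try { return in.read(); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    Token getToken() throws LexicalError {
        while (next == ' ' || next == '\t' || next == '\r' || next == '\n') {
            if (next == '\n') line++; // count lines while skipping whitespace
            next = read();
        }
        if (next == -1) return new Token("EOF", "", line);
        if (Character.isLetter(next)) return new Token("WORD", scan(true), line);
        if (Character.isDigit(next)) return new Token("INT", scan(false), line);
        int bad = next;
        next = read(); // consume the offending character
        throw new LexicalError("illegal character '" + (char) bad + "'", line);
    }

    // Collects digits, plus letters when lettersAllowed is true.
    private String scan(boolean lettersAllowed) {
        StringBuilder text = new StringBuilder();
        while (next != -1 && (Character.isDigit(next)
                || (lettersAllowed && Character.isLetter(next)))) {
            text.append((char) next);
            next = read();
        }
        return text.toString();
    }
}

class LexTest {
    // Dumps tokens (including the end-of-file token) to out.
    // Returns false if a lexical error stopped the scan.
    static boolean dump(InputStream source, PrintStream out, PrintStream err) {
        Lexer lexer = new Lexer(source);
        try {
            Token t;
            do {
                t = lexer.getToken();
                t.unparse(out);
            } while (!t.isEof());
            return true;
        } catch (LexicalError e) {
            err.println("line " + e.lineNumber() + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LexTest <filename>");
            System.exit(2);
        }
        boolean ok;
        try (InputStream file = new FileInputStream(args[0])) {
            ok = dump(file, System.out, System.err);
        }
        if (!ok) System.exit(1);
    }
}
```

Separating dump from main is a design choice of this sketch: it lets the same loop run against in-memory input when testing, while main only handles the command line and the file.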
All of the classes you write should be in or under the package Iota, so the Lexer class will be Iota.Lexer, the testbed will be Iota.LexTest, etc.
You may use a lexer generator such as JLex to do this assignment. However, we do not take responsibility for helping you figure out how to use JLex; if you use it, you are on your own. If you use a lexer generator, you should turn in the lexer generator input rather than the Java source code that it emits!
We will test your lexer rigorously against our own test cases -- including programs that are lexically correct, and also programs that contain lexical errors. The correctness of your lexer will be important, and we will be more rigorous in our expectations for correctness if you use a lexer generator. We expect you to perform your own testing of the lexer. Often student projects do not handle erroneous input properly -- make sure that yours does! Testing your program on corner cases is also a good idea.
Because groups in this class are relatively large, we will be expecting a higher level of quality in your product than in some other courses you have taken. Much of the value in a compiler (or any other large program) is in how easily it can be maintained. A high value is placed here on both clarity and brevity -- both in documentation and code.
To submit your Programming Assignment 1, please drop your files in \\goose\courses\cs412-sp00\grpX\pa1, where grpX is your group identifier. Please organize your top-level directory structure as follows:
src\ - all of your source code and class files. For example, we expect to find a class file for your main program in src\Iota\LexTest.class
doc\ - documentation, including your write-up and a README.TXT containing information on how to compile and run your project, a description of the class hierarchy in your src\ directory, brief descriptions of the major classes, any known bugs, and any other information that we might find useful when grading your assignment.
test\ - any test cases you used in testing your project.

Note: Failure to submit your assignment in the proper format may result in deductions from your grade.