CS 430
Information Discovery
Fall 2001

Syllabus

This preliminary syllabus can be expected to change as the course progresses.

Class Formats

Classes are divided into two formats. 

Lectures
Each week there will be two lectures in conventional format.  PowerPoint slides will be available on the web.
 
Discussions
Class discussions on Wednesday evenings will center on readings from the textbook or other papers, which must be read before the class.  It is essential that everybody comes to class well prepared.

Week 1

Date Event Topic
Thursday 8/30 Lecture 1 Overview of information discovery

Week 2

Date Event Topic
Tuesday 9/4 Lecture 2 Basic concepts of information retrieval
Wednesday 9/5 Discussion 1 Inverted files [reading] 
Thursday 9/6 Lecture 3 Inverted files

Week 3

Date Event Topic
Tuesday 9/11 Lecture 4 Data structures for information retrieval
Wednesday 9/12 Discussion 2 Lexical analysis and stoplists [reading]
Thursday 9/13 Lecture 5 Library catalogs, MARC cataloguing

Week 4

Date Event Topic
Tuesday 9/18 Lecture 6 Library catalogs, Dublin Core
Wednesday 9/19 Introduction to Perl 1 [No discussion class]
Thursday 9/20 Lecture 7 Automatic extraction of catalog records
Friday 9/21 Assignment 1 due  

Week 5

Date Event Topic
Tuesday 9/25 Lecture 8 Vector methods
Wednesday 9/26 Discussion 3 Stemming [reading]
Thursday 9/27 Lecture 9 Term weighting and ranking

Week 6

Date Event Topic
Tuesday 10/2 Lecture 10 Cranfield and TREC
Wednesday 10/3 Introduction to Perl 2 [No discussion class]
Thursday 10/4 Lecture 11 Evaluation of retrieval effectiveness

Week 7

Date Event Topic
Tuesday 10/9 [fall break]  
Wednesday 10/10 Discussion 4 Dublin Core [reading]
Thursday 10/11 Lecture 12 Extending the Boolean model
Friday 10/12 Assignment 2 due  

Week 8

Date Event Topic
Tuesday 10/16 Lecture 13 NSDL case study
Wednesday 10/17 Discussion 5 User interfaces [reading]
Thursday 10/18 Lecture 14 Usability 1

Week 9

Date Event Topic
Tuesday 10/23 Lecture 15 Usability 2
Wednesday 10/24 Discussion 6 Ranking methods [reading]
Thursday 10/25 Lecture 16 Thesaurus examples

Week 10

Date Event Topic
Tuesday 10/30 Lecture 17 Probabilistic information retrieval
Wednesday 10/31 Midterm examination  
Thursday 11/1 Lecture 18 Guest lecture: Carl Lagoze, Distributed information retrieval

Week 11

Date Event Topic
Monday 11/5 Assignment 3 due  
Tuesday 11/6 Lecture 19 Web crawlers
Wednesday 11/7 Discussion 7 Google [reading]
Thursday 11/8 Lecture 20 Web search systems

Week 12

Date Event Topic
Tuesday 11/13 Lecture 21 Non-textual materials 1
Wednesday 11/14 Discussion 8 Informedia [reading]
Thursday 11/15 Lecture 22 Non-textual materials 2

Week 13

Date Event Topic
Tuesday 11/20 Lecture 23 Query refinement / Latent semantic analysis
Wednesday 11/21 [break]  
Thursday 11/22 [break]  

Week 14

Date Event Topic
Tuesday 11/27 Lecture 24 Cluster analysis 1
Wednesday 11/28 Discussion 9 Thesaurus construction [reading]
Thursday 11/29 Lecture 25 Cluster analysis 2 and thesaurus construction

Week 15

Date Event Topic
Tuesday 12/4 Lecture 26 [no lecture]
Wednesday 12/5 Discussion 10 Clustering [reading]
Thursday 12/6 Lecture 27 Automatic classification and review
Friday 12/7 Assignment 4 due  

Examinations

Date Event
Friday 12/14 Final examination, Phillips 101, 9:00 to 10:30.


William Y. Arms
(wya@cs.cornell.edu)
Last changed: December 12, 2001