With the increasing amount of textual information, such as web pages, it is important to develop effective systems that help users manage and exploit this information. Web search engines, such as Google, are good examples of such systems. In this course, you will learn the underlying technologies of information retrieval systems and gain hands-on project experience. Topics include information retrieval models, information filtering, text classification, and text clustering.
Prerequisites
Students should come with good programming skills. If you
are not sure whether you have the right background, please contact the instructor.
Grading
Class discussion (26%): Students are expected to participate in all the discussions.
Although students are not required to write paper critiques, they need to read
the paper before the presentation and think about the answers to the following four questions:
(1) What are the strongest points of the paper? (2) What are the weakest points of the paper?
(3) How would you address the weakest points? (4) What can we learn from the paper and how
can it be applied to the project? We will discuss these questions in class.
In addition, every student needs to bring at least one question about the paper/topic
for discussion.
Paper presentation (24%): Every student is expected to give two presentations or,
if there are not enough presentation slots, to give one presentation and write two paper
critiques. Each presentation slot has 25 minutes for the presentation and 15 minutes
for discussion. Presenters need to email their slides to the instructor
one day before the presentation.
Course project (50%): Students need to conduct a research-oriented project, either in a group
or individually. The timeline is as follows:
Form a team (deadline: April 1, 2008)
Pick a topic
Read related work
Write a proposal and present it (deadline: April 15, 2008)