CSE 888J02: Advanced Seminar in Knowledge Discovery and Data Mining
Fall 2005

Instructor:        Srinivasan Parthasarathy
Office:               DL 693
Phone:               292-2568
Email:                srini@cse.ohio-state.edu
Class Hours:     Friday 1:30-3:00pm
Office Hours:   By appointment
Web Page:         http://www.cse.ohio-state.edu/~srini/888J02
 

Introduction
With the unprecedented rate at which data is being collected today in almost all fields of human endeavor, there is an emerging economic and scientific need to extract useful information from it. Data mining is the process of automatic discovery of patterns, changes, associations, sequences and anomalies in massive databases. This research seminar will survey the main topics in data mining and knowledge discovery. Topics will be drawn from among: mining techniques (classification, clustering, association rules, sequence similarity, etc.); high-performance implementation issues (parallel/distributed data mining, active resource-aware data mining); and application domains such as web mining, scientific simulations, e-commerce and bioinformatics.

Prerequisites
There are no formal prerequisites. However, students should ideally have taken at least one of the following courses: database systems (CIS 670), statistics (500/600 level course), or parallel computing (CIS 720). The ability to program in C/C++ and/or with statistical programming packages, and to work on quarter-long (team or individual) projects, is expected.

Class Format and Requirements
The class will be a mix of student presentations, paper discussions and a research-oriented project. Generally, one of you will introduce a topic, and then we'll discuss some of the latest work on that topic. You will have to explain and defend what the paper says, as well as present weaknesses and shortcomings as you see fit. The rest of the class will be expected to contribute to the discussion as well, and some points will be assigned for class participation. Ideally, criticisms should be constructive in nature and should identify possible solutions to the problems raised.

You will be expected to compile an annotated bibliography covering all the papers discussed during the quarter and submit it to me by the end of the quarter. The best time to write each entry is as soon as possible after the paper has been discussed in class, while the points covered are still fresh. A schedule will be found here. Feedback forms can be downloaded here. In addition to the papers discussed in class, you will need to read and summarize papers not covered in class (the number and subject area will be determined on an individual-by-individual basis).

Optionally, you may choose to work on a research-oriented project. How much you get out of this class is completely up to you. The research component is stressed, as is evidenced by the fact that many of the projects started by students taking this course over the years have resulted in publications in prestigious conferences and workshops. Projects may be group or individual in nature (in a group project, each member's tasks will be non-overlapping). Project topics, along with relevant references, will be discussed individually with each of you during the first week of class, based on your interests. I will try to meet with each one of you during the first two weeks to help determine projects.

The final grade (S or U) will be primarily determined by:
    80% Class Participation and Presentations
    20% Research Project or Readings