Instructor: Srinivasan Parthasarathy
Office: DL 693
Phone: 292-2568
Email: srini@cse.ohio-state.edu
Class Hours: Friday 1:30-3:00pm
Office Hours: By appointment
Web Page: http://www.cse.ohio-state.edu/~srini/888J02
Introduction
With the unprecedented rate at which data is being
collected today in almost all fields of human endeavor,
there is an emerging economic and scientific need to extract
useful information from it. Data mining
is the process of automatic discovery of patterns,
changes, associations,
sequences and anomalies in massive databases.
This research seminar will survey the
main topics in data mining and knowledge discovery.
Topics will be drawn from: mining techniques (classification, clustering,
association rules, sequence similarity, etc.);
high-performance implementation issues (parallel/distributed data mining,
active resource-aware data mining); and
application domains such as web mining,
scientific simulations, e-commerce and bioinformatics.
Prerequisites
There are no formal prerequisites. However,
students should ideally have taken
at least one of the following courses:
database systems (CIS 670),
statistics (a 500/600-level course), or
parallel computing (CIS 720).
The ability to program in C/C++ and/or with
statistical programming packages, and to work on quarter-long
(team or individual) projects, is expected.
Class Format and Requirements
The class will be a mix of student presentations, paper discussions and
a research-oriented project.
Generally, one of you will introduce a topic, and then we'll discuss
some of the latest work on that topic.
You will have to explain and defend what the
paper says, as well as present weaknesses and
shortcomings as you see fit.
The rest of the class will be expected to contribute to the discussion
as well, and there will be some points assigned for class participation.
Ideally, criticism should be constructive and should include
possible remedies for the weaknesses identified.
You will be expected to compile
an annotated bibliography covering all the papers discussed in class during the quarter
and submit it to me by the end of the quarter.
The best time to write each entry
is as soon as possible after the paper has been discussed in class, while
the points covered in the discussion are still fresh.
A schedule will be found here.
Feedback forms can be downloaded here.
In addition to the papers discussed in class, you will need to read and summarize
papers not covered in class (the number and subject area are to be
determined individually for each student).
Optionally, you may choose to work on a research-oriented
project. How much you get out of this class is completely
up to you. The research component is stressed: many of the
projects started by students taking this
course over the years have resulted in publications in prestigious conferences
and workshops.
Projects may be done
in groups or individually (in group projects, each member's tasks will be non-overlapping).
Project topics, along with relevant references, will be discussed with each of you
individually, based on your interests, during the first week of class.
I will try to meet with each one of you
during the first two weeks to help determine projects.
The final grade (S or U) will be primarily determined by:
80% Class Participation and Presentations
20% Research Project or Readings