CS5984 Introduction to Data Mining

Instructor: Chang-Tien Lu
Office: NVC Room 310 
Tel: 703-538-8373

Email: ctlu@vt.edu
Office Hours: Tuesday 4-5PM, Friday 11AM-Noon, or by appointment.

Class Time and Location: Thursday 4-6:45PM, NVC 322
Class Website: http://europa.nvc.cs.vt.edu/~ctlu/Course/2008/CS5984/index.html

Course Description:

This course examines the basic principles of data mining, including data analysis and uncertainty, modeling, data mining algorithms, pattern and rule discovery, spatial-temporal data analysis, data integration and management, and various data mining applications. The key objectives of this course are twofold: (1) to teach the fundamental concepts of data mining and (2) to provide extensive hands-on experience in applying those concepts to real-world applications.

Textbook

Introduction to Data Mining
Pang-Ning Tan, Michael Steinbach, Vipin Kumar
Addison-Wesley, 2005
ISBN-10: 0321321367
ISBN-13: 978-0321321367


Reference Book

Data Mining: Concepts and Techniques (2nd Edition)
Jiawei Han, Micheline Kamber
Publisher: Morgan Kaufmann, 2005
ISBN-10: 1558609016
ISBN-13: 978-1558609013

Supplementary Material

A collection of papers.

Tentative Schedule:

The schedule indicates the concepts and material to be covered in each week under the column labeled "Lecture Topics".

Week | Date  | Lecture Topics                                            | Read     | Due                           | Feedback
1    | 8/28  | Introduction, Data                                        | Ch. 1, 2 |                               |
2    | 9/4   | Data, Exploring Data                                      | Ch. 2, 3 |                               |
3    | 9/11  | Classification                                            | Ch. 4    | HW0                           |
4    | 9/18  | Classification                                            | Ch. 4, 5 | Project Proposal              |
5    | 9/25  | Classification: Alternative Techniques                    | Ch. 5    | HW1                           |
6    | 10/2  | Association Analysis                                      | Ch. 6    | Project Checkpoint Report I   | HW1
7    | 10/9  | Midterm I                                                 |          |                               |
8    | 10/16 | Association Analysis: Advanced Concepts                   | Ch. 7    |                               | Midterm I
9    | 10/23 | Association Analysis: Advanced Concepts, Cluster Analysis | Ch. 7, 8 | Project Checkpoint Report II  |
10   | 10/30 | Cluster Analysis                                          | Ch. 8    |                               |
11   | 11/6  | Midterm II                                                |          |                               |
12   | 11/13 | Cluster Analysis: Additional Issues and Algorithms        | Ch. 9    |                               | Midterm II
13   | 11/20 | Anomaly Detection                                         | Ch. 10   | Project Checkpoint Report III |
14   | 11/27 | (Thanksgiving Holiday)                                    |          |                               |
15   | 12/4  | Final Project Presentation                                |          | Final Report (6PM, 12/7)      |


Examinations and Assignments:

There are three homework assignments. Homework assignments are due at the start of class. If you have an excused absence from a class, turn in the homework assignment before that class session. All assignments must include your name, student ID, and the course name and number.

The weighting scheme used for grading is as follows. Three homework assignments: 20%, consisting of HW0 (1%), HW1 (9%), and HW2 (research presentation, 10%). Midterm I: 20%. Midterm II: 25%. Final project: 35% (final presentation: 10%, final report: 25%). Class discussion and participation: 5%. Students are responsible for all material covered in lectures. Examinations will heavily emphasize conceptual understanding of the material.
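
For students who want to track their standing, the weighting scheme above can be sketched as a short Python calculation. This is only an illustrative sketch, not the official grade computation. The component weights are taken from the list above; as listed they add up to 105, so this sketch divides by the sum of the weights, which is an assumption made here and not something the syllabus states. The names WEIGHTS and weighted_total are hypothetical.

```python
# Component weights in percent, as listed in the syllabus.
WEIGHTS = {
    "HW0": 1,
    "HW1": 9,
    "HW2": 10,
    "Midterm I": 20,
    "Midterm II": 25,
    "Final Presentation": 10,
    "Final Report": 25,
    "Participation": 5,
}


def weighted_total(scores):
    """Return an overall percentage from per-component scores (0-100 each).

    Components missing from `scores` count as 0. The result is divided by
    the sum of the listed weights (an assumption of this sketch), so a
    perfect score on every component gives 100.
    """
    total_weight = sum(WEIGHTS.values())
    earned = sum(WEIGHTS[name] * scores.get(name, 0) for name in WEIGHTS)
    return earned / total_weight


if __name__ == "__main__":
    example = {name: 90 for name in WEIGHTS}  # hypothetical scores
    print(weighted_total(example))
```

Calling weighted_total with a dictionary of your own scores gives a rough estimate only; the instructor's computation is authoritative.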

Late Submission Policy: 

Assignments must be handed in at the beginning of class on the specified due date (Thursday of the designated week). A penalty of 30% will be deducted from your score if the assignment is submitted within the first 24 hours after the deadline. A penalty of 70% will be deducted if it is 24 hours or more late. Weekend days are counted. You are encouraged to type your answers for all assignments.
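
The late penalty can be sketched in Python as follows. This is a minimal illustration under two assumptions that the policy text does not spell out: the penalty is taken as a percentage of the raw score, and lateness is measured in hours from the start of class on the due date. The function names late_penalty_percent and adjusted_score are hypothetical.

```python
def late_penalty_percent(hours_late):
    """Return the penalty in percent for a submission `hours_late` hours late.

    On time (0 hours or less): no penalty.
    Less than 24 hours late: 30%.
    24 hours or more late: 70%.
    """
    if hours_late <= 0:
        return 0
    if hours_late < 24:
        return 30
    return 70


def adjusted_score(raw_score, hours_late):
    """Apply the late penalty to a raw score.

    Assumes the penalty is a percentage of the raw score (an assumption
    of this sketch, not stated explicitly in the policy).
    """
    penalty = late_penalty_percent(hours_late)
    return raw_score * (100 - penalty) / 100


if __name__ == "__main__":
    for hours in (0, 5, 24, 72):  # hypothetical lateness values
        print(hours, adjusted_score(80, hours))
```

Because weekend days are counted, an assignment due Thursday and handed in the following Monday falls in the 70% bracket under this reading.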

Honor System: 

All work is to be done under the provisions of the Virginia Tech Honor System. Students may discuss the interpretation of an assignment; however, the actual solutions to problems must be one's own. The tenets of the Virginia Tech Graduate Honor Code will be strictly enforced in this course, and all assignments shall be subject to the stipulations of the Graduate Honor Code. Whenever I learn that a student has violated the honor code, I am obligated to report the violation. For more information on the Graduate Honor Code, please refer to the GHS Constitution, located online at http://ghs.grads.vt.edu/.

Disabilities:

Any student who needs special accommodations due to a disability, as recognized by the Americans with Disabilities Act, should contact the Services for Students with Disabilities (SSD) in the Dean of Students Office. "Students with disabilities are responsible for self-identification. To be eligible for services, documentation of the disability from a qualified professional must be presented to SSD upon request. Academic adjustments may include, but are not limited to: priority registration, auxiliary aids, program and course adjustment, exam modifications, oral or sign language interpreters, cassette taping of text/materials, notetakers/readers, or assistive technology." (see http://www.hr.vt.edu/supervisorscorner/adainfo/)

If you need adaptations or accommodations because of a disability (learning disability, attention deficit disorder, psychological, physical, etc.), if you have emergency medical information to share with me, or if you need special arrangements in case the building must be evacuated, please make an appointment with me as soon as possible. If you need captioning for videos, please let me know no later than two weeks before the date listed on the syllabus for reviewing them.

Helpful Comments: 

This class is very interesting and useful for anyone interested in database systems research or in Master's/Doctoral projects. We will explore a number of current research areas that are very important yet still fairly open for research. Spatial databases continue to be at the heart of information management in areas ranging from business to scientific domains (e.g., earth observation systems, genomics).

To get the full benefit of the class, you have to work independently and regularly. Read the textbook and papers before each meeting and bring comments for discussion. Plan to spend at least 12 hours a week on projects or reading for this class (a little more during the first few weeks, until you feel comfortable with geographic information and queries).

Good Luck, and Welcome to CS 5984!
Chang-Tien Lu