Spring 2006 Data and Information Ph.D.
Philosophy of Examination
- Edward Fox
- Chang-Tien Lu (Primary Contact)
- Naren Ramakrishnan
Process and Format
- Since students vary in their abilities regarding written and oral communication,
and since doctoral students are expected to have some skill in each medium,
students will explain their solutions both in writing and orally. Solutions
will be graded on the clarity achieved by the two modes together.
- Students are expected to have studied all works in the reading list. Students
are also expected to acquire any prerequisite or background knowledge required
to understand those works.
- Students are expected to understand those works at the level of a doctoral
student who has taken the equivalent of courses such as CS5604 and CS5614.
- Students are expected to be able to understand a real situation/context/problem
in the information/data area, to be able to synthesize/apply the findings
of multiple papers from the reading list to such problems, and to be able
to formulate an answer outlining how they would approach and solve that problem.
- The examination includes a take-home examination that is expected to be administered
at the beginning of 2006.
- At the beginning of the examination period, all students will receive a
document that contains three questions.
- By the end of the examination period, each student must turn in a written
solution to one of those questions, i.e., the student must choose one out
of three. Solutions are expected to be no longer than about 15
pages (excluding references) in 11-point type or larger. Specific details about
format and length will be provided along with the questions.
- Also at this time, each student must turn in a PowerPoint presentation or
equivalent that will be used for an oral explanation of the written solution.
Oral explanations, lasting no longer than 30 minutes, will be scheduled as
soon after the end of the exam week as feasible, using VTEL or equivalent
as needed to ensure coverage by students and/or faculty in either Blacksburg
or N. Virginia.
- Written solutions might be expected to have the following approximate format
(although detailed guidelines will be provided during the exam):
- a motivation section making clear the context of the problem/situation
- a clear statement of the problem in terms of concepts and terminology
in the information/data area, that addresses the situation/context
- a review of related literature, drawn mostly from multiple relevant
works in the reading list
- a statement of how the problem can be approached
- a description of the approach to solve the problem
It is important that any assumptions made be clearly stated in the written
solution.
- Oral presentations must follow what is given in the previously turned-in
PowerPoint file or equivalent. They must be completed within a 30-minute period,
in which roughly 25 minutes are for presentation and 5 minutes for answering
questions posed by faculty examiners.
- Each solution will be graded by at least 2 faculty members. The area committee
will then assign each student a combined grade, based on all faculty input,
on a scale of 0-3, as called for by GPC policies.
- 11/1, 2005: Reading List Available.
- 1/14 (Saturday), 2006: Written Examination Available.
- 1/27 (Friday) 5PM, 2006: Written Examination Due.
- 1/30 (Monday) 5PM, 2006: PowerPoint Presentation Slides Due.
- 2/6 (Monday), 2006: Oral Examination.
- 2/15: Exam Results Due to GPC.
Oral Examination Schedule (Monday, 2/6, 2006, NVC206-VT Whit 281)
- 9:35 - 10:10AM : Ray Dos Santos
- 10:10 - 10:45AM : Ying Jin
- 10:50 - 11:25AM : Jing Dai
- 11:25AM - 12:00PM : Yi Ma
- 1:05 - 1:40PM : Arnold Boedihardjo
- 1:40 - 2:15PM : Pengbo Liu
- 2:20 - 2:55PM : Manu Shukla
(Note: Some of the hyperlinks below lead to web pages maintained by the
respective publishers. You may or may not be able to download the articles directly
from these web pages - this depends on the host computer from which the access
is made. To access the articles, we recommend that you go through the VT-subscribed
ACM Digital Library or IEEE Xplore interface.)
- Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan: Mining
Very Large Databases. IEEE Computer 32(8): 38-45 (1999)
- Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Sanjeev Saluja, Hannu
Toivonen, Ram Sewak Sharma: Discovering
All Most Specific Sentences. ACM Trans. Database Syst. 28(2): 140-174
- Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao: Mining
Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach.
Data Min. Knowl. Discov. 8(1): 53-87 (2004)
- Andreas Paepcke, Chen-Chuan K. Chang, Terry Winograd, Héctor García-Molina,
Interoperability for Digital Libraries
Worldwide, April 1998 Communications of the ACM, Volume 41 Issue 4.
- Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han,
Isaac Councill, Service Applications:
A Service-oriented Architecture for Digital Libraries, November 2004,
Proceedings of the 2nd international conference on service oriented computing.
Publisher: ACM Press.
- Greg Janée, James Frew, Digital
Libraries for Spatial Data: The ADEPT Digital Library Architecture, July
2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on digital libraries
Publisher: ACM Press.
- Michael G. Christel, David B. Winkler, C. Roy Taylor, Multimedia
Abstractions for a Digital Video Library, July 1997 Proceedings of the
second ACM international conference on digital libraries Publisher: ACM Press.
- Petros Maniatis, Mema Roussopoulos, T. J. Giuli, David S. H. Rosenthal,
Mary Baker, The LOCKSS Peer-to-Peer
Digital preservation system, February 2005 ACM Transactions on Computer
Systems (TOCS), Volume 23 Issue 1.
- Volker Gaede, Oliver Gunther, Multidimensional
Access Methods (Summary), ACM
Computing Surveys, Vol. 30, No. 2, June 1998. (Slide)
- Bongki Moon, H.V. Jagadish, Christos Faloutsos, Joel H. Saltz, Analysis
of the Clustering Properties of the Hilbert Space-Filling Curve, IEEE
Transactions on Knowledge and Data Eng., Vol. 13, No. 1, pp. 124-141, January/February,
- Mohamed F. Mokbel, Walid G. Aref, and Ibrahim Kamel "Analysis
of Multi-dimensional Space-Filling Curves," GeoInformatica, 7(3),
pp. 179-209, Sep. 2003.
- D.H. Lee; H.J. Kim, An Efficient
Technique for Nearest-Neighbor Query Processing on the SPY-TEC, IEEE Transactions
on Knowledge and Data Eng., Vol. 15, No. 6, pp. 1472-1486, Nov./Dec., 2003.
- Yufei Tao, D. Papadias, Range
Aggregate Processing in Spatial Databases, IEEE Transactions on Knowledge
and Data Eng., Vol. 16, No. 12, pp. 1555-1570, Dec., 2004.
- M. Zhu, D. Papadias, J. Zhang, D.L. Lee, Top-k
Spatial Joins, IEEE Transactions on Knowledge and Data Eng., Vol. 17,
No. 4, pp. 567-579, April 2005.