Computational Data Analysis

Detailed Information

Course Form

This course is given over 13 weeks during the spring semester. It is open both to regular DTU students and to everyone else via Open University. DTU students should sign up using CampusNet. For information on how to apply via Open University, see this link.

A typical course day consists of a one- to two-hour lecture immediately followed by exercises.

The course is also given as a one-week PhD course in August with course number 02910. The PhD course is intended for PhD students throughout DTU (and other institutions) who may or may not have a mathematical or statistical background. Thus 02910 places greater emphasis on hands-on experience than on the theoretical parts. Consult this page for more information.

Microarray example (from Wikipedia).

Course Material

The course material consists of chapters from electronic textbooks and electronic papers. Most lectures will refer to the book "Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman. This book is freely available from this link. References to other material will be given on CampusNet.

Hand-in Exercises

There will be two cases during the course. These will give hands-on experience with real applications to complement the in-depth theory covered in the lectures. Both cases must be passed in order to take the exam.

Examination

There will be an oral exam where students will be assessed individually. Grades are based solely on the oral exam, though questions concerning the case work done during the course may be asked. An exam schedule will be issued later.

Lecturers

LAAR: Lars Arvastson, External Lecturer, DTU Data Analysis, Lundbeck, larv[at]lundbeck.com
LHC: Line H. Clemmensen, Assistant Professor, DTU Data Analysis, lkhc[at]dtu.dk
MM: Morten Mørup, Associate Professor, DTU Informatics, Cognitive Systems, mmor[at]dtu.dk

Course Calendar

Week	Date	Subjects	Lecturer	Literature
1	1/2	Introduction to Computational Data Analysis. [OLS, Ridge, Fisher LDA, KNN]	LHC, LAAR	ESL Chapters 1, 2, 3.1, 3.2, 3.4.1, 4.1 and 13.3
2	8/2	Model Selection. [CV, Bootstrap, Cp, AIC, BIC, ROC]	LHC	ESL Chapter 7 and 9.2.5. You may safely skip sections 7.8 and 7.9
3	15/2	Sparse Regression. [Lasso, Elastic Net]. Case 1 presentation	LHC	ESL Chapter 3.3, 3.4 and 18
4	22/2	Linear Classifiers and Basis Expansion. [LDA, QDA, Logistic regression, Splines]	LAAR	ESL Chapter 4.3, 4.4, 5.1 and 5.2
5	1/3	Support Vector Machine and Convex Optimization	LAAR	ESL Chapter 4.5, 12.1, 12.2 and 12.3.1
6	8/3	Sub-Space Methods [PCA, CCA, PCR, PLS]	LHC	ESL Chapter 14.5.1, 14.5.5 and 3.5
7	15/3	Unsupervised Clustering [Hierarchical clustering, K-means, GMM, Gap statistic]	LAAR	ESL Chapter 14.3
-	20/3	Deadline for handing in Case 1 - short report	-	-
8	22/3	Classification and Regression Trees. Discussion of Case 1 + competition	LAAR	ESL Chapter 9.2
-	29/3	Easter Holiday	-	-
9	5/4	Ensemble Methods [Bagging, Boosting and Random Forest]	LHC	ESL Chapter 8.7, 10.1 and 15
10	12/4	Unsupervised Decomposition [SC, NMF, AA, ICA]	MM	ESL Chapter 14.6 and 14.7, [Sparse Coding, Nature]
11	19/4	Multi-Way Models	MM	WireOverview.pdf available from CampusNet
12	26/4	Neural Networks and Self Organizing Maps	LAAR	ESL Chapter 11.1-11.5 and 14.4
-	1/5	Deadline for handing in Case 2 - poster pdf	-	-
13	3/5	Case 2 poster presentation. Industrial examples		-
-	18/5	Oral examination	-	-