Detailed Information

Course Form

This course is given over 13 weeks during the spring semester. It is open both to regular DTU students and to everyone else via Open University. DTU students should sign up using CampusNet. For information on how to apply via Open University, see this link.

A typical course day consists of a one- to two-hour lecture immediately followed by exercises.

The course is also given as a one-week PhD course in August with course number 02910. The PhD course is intended for PhD students throughout DTU (and other institutions) who may or may not have a mathematical/statistical background. Thus 02910 places greater emphasis on hands-on experience than on the theoretical parts. Consult this page for more information.

Microarray example (from Wikipedia).

Course Material

The course material consists of chapters from electronic textbooks and electronic papers. Most lectures will refer to the book "The Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman. This book is freely available from this link. References to other material will be given on CampusNet.

Hand-in Exercises

There will be two cases during the course. These will give hands-on experience with real applications to complement the in-depth theory covered in the lectures. Both cases must be passed to qualify for the exam.

Examination

There will be an oral exam in which students will be assessed individually. Grades are based solely on the oral exam, though questions that concern the case work done during the course may be asked. An exam schedule will be issued later.

Lecturers

LAAR: Lars Arvastson, External Lecturer, DTU Data Analysis, Lundbeck, larv[at]lundbeck.com
LHC: Line H. Clemmensen, Assistant Professor, DTU Data Analysis, lkhc[at]dtu.dk
MM: Morten Mørup, Associate Professor, DTU Informatics, Cognitive Systems, mmor[at]dtu.dk

Course Calendar

Week | Date | Subjects | Lecturer | Literature
1 | 1/2 | Introduction to Computational Data Analysis. [OLS, Ridge, Fisher LDA, KNN] | LHC, LAAR | ESL Chapters 1, 2, 3.1, 3.2, 3.4.1, 4.1 and 13.3
2 | 8/2 | Model Selection. [CV, Bootstrap, Cp, AIC, BIC, ROC] | LHC | ESL Chapters 7 and 9.2.5. You may safely skip sections 7.8 and 7.9
3 | 15/2 | Sparse Regression. [Lasso, Elastic Net] Case 1 presentation | LHC | ESL Chapters 3.3, 3.4 and 18
4 | 22/2 | Linear Classifiers and Basis Expansion. [LDA, QDA, Logistic regression, Splines] | LAAR | ESL Chapters 4.3, 4.4, 5.1 and 5.2
5 | 1/3 | Support Vector Machine and Convex Optimization | LAAR | ESL Chapters 4.5, 12.1, 12.2 and 12.3.1
6 | 8/3 | Sub-Space Methods. [PCA, CCA, PCR, PLS] | LHC | ESL Chapters 14.5.1, 14.5.5 and 3.5
7 | 15/3 | Unsupervised Clustering. [Hierarchical clustering, K-means, GMM, Gap statistic] | LAAR | ESL Chapter 14.3
- | 20/3 | Deadline for handing in Case 1 - short report | - | -
8 | 22/3 | Classification and Regression Trees. Discussion of Case 1 + competition | LAAR | ESL Chapter 9.2
- | 29/3 | Easter Holiday | - | -
9 | 5/4 | Ensemble Methods. [Bagging, Boosting and Random Forest] | LHC | ESL Chapters 8.7, 10.1 and 15
10 | 12/4 | Unsupervised Decomposition. [SC, NMF, AA, ICA] | MM | ESL Chapters 14.6 and 14.7, [Sparse Coding, Nature]
11 | 19/4 | Multi-Way Models | MM | WireOverview.pdf available from CampusNet
12 | 26/4 | Neural Networks and Self-Organizing Maps | LAAR | ESL Chapters 11.1-11.5 and 14.4
- | 1/5 | Deadline for handing in Case 2 - poster pdf | - | -
13 | 3/5 | Case 2 poster presentation. Industrial examples | - | -
- | 18/5 | Oral examination | - | -