This course is given over 13 weeks during the spring semester. It is open to both regular DTU students and everyone else via Open University. DTU students should sign up using CampusNet. For information on how to apply via Open University, see this link.
A typical course day consists of a one- to two-hour lecture immediately followed by exercises.
The course is also given as a one-week PhD course in August with course number 02910. The PhD course is intended for PhD students throughout DTU (and other institutions) who may or may not have a mathematical/statistical background. Thus 02910 places greater emphasis on hands-on experience than on the theoretical parts. Consult this page for more information.
The course material consists of chapters from electronic textbooks and electronic papers. Most lectures will refer to the book "The Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman. This book is freely available from this link. References to other material will be given on CampusNet.
There will be two cases during the course. These will give hands-on experience with real applications to complement the in-depth theoretical ground covered in the lectures. Both cases must be passed in order to go to the exam.
There will be an oral exam in which students are assessed individually. Grades are based solely on the oral exam, though questions concerning the case work done during the course may be asked. An exam schedule will be issued later.
LAAR: Lars Arvastson, External Lecturer, DTU Data Analysis, Lundbeck, larv[at]lundbeck.com
LHC: Line H. Clemmensen, Assistant Professor, DTU Data Analysis, lkhc[at]dtu.dk
MM: Morten Mørup, Associate Professor, DTU Informatics, Cognitive Systems, mmor[at]dtu.dk
|1||2/2||Introduction to Computational Data Analysis.
[OLS, Ridge, Fisher LDA, KNN]
|LAAR||ESL Chapters 1, 2, 3.1, 3.2, 3.4.1, 4.1 and 13.3|
[CV, Bootstrap, Cp, AIC, BIC, ROC]
|LAAR||ESL Chapter 7 and 9.2.5. You may safely skip sections 7.8 and 7.9|
[Lasso, Elastic Net]
Case 1 presentation
|LAAR||ESL Chapter 3.3, 3.4 and 18|
|4||23/2||Linear Classifiers and Basis Expansion.
[LDA, QDA, Logistic regression, Splines]
|LAAR||ESL Chapter 4.3, 4.4, 5.1 and 5.2|
|5||2/3||Support Vector Machine and Convex Optimization||LAAR||ESL Chapter 4.5, 12.1, 12.2 and 12.3.1|
[PCA, CCA, PCR, PLS]
|LAAR||ESL Chapter 14.5.1, 14.5.5 and 3.5|
[Hierarchical clustering, K-means, GMM, Gap statistic]
|LAAR||ESL Chapter 14.3|
|-||21/3||Deadline for handing in Case 1 - small report||-||-|
|8||23/3||Classification and Regression Trees
Discussion of Case 1 + competition
|LAAR||ESL Chapter 9.2|
[Bagging, Boosting and Random Forest]
|LAAR||ESL Chapter 8.7, 10.1 and 15|
[SC, NMF, AA, ICA]
|MM||ESL Chapter 14.6, 14.7, [Sparse Coding, Nature]|
|11||20/4||Multi-Way Models||MM||WireOverview.pdf available from CampusNet|
|12||27/4||Neural Networks and Self Organizing Maps||LAAR||ESL Chapter 11.1-11.5 and 14.4|
|-||2/5||Deadline for handing in Case 2 - poster pdf||-||-|
|13||4/5||Case 2 poster presentation
Data analysis at Maersk A/S by Line Clemmensen