Detailed Information

Course Form

This course runs for 13 weeks during the spring semester. It is open both to regular DTU students and to everyone else via Open University. DTU students should sign up using CampusNet. For information on how to apply via Open University, see this link.

A typical course day consists of a one- to two-hour lecture immediately followed by exercises. This year we will work on a case from a company throughout the course.

The course is also given as a one-week PhD course in August with course number 02910. The PhD course is intended for PhD students throughout DTU (and other institutions) who may or may not have a mathematical/statistical background. Thus 02910 places greater emphasis on hands-on experience than on the theoretical parts. Consult this page for more information.

Microarray example (from Wikipedia).

Course Material

The course material consists of chapters from electronic textbooks and electronic papers. Most lectures will refer to the book "Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman. This book is freely available from this link. References to other material will be given on CampusNet.

Hand-in Exercises

There will be two cases: one covering supervised methods (regression and classification) and one covering unsupervised analyses (principal component analysis and similar). These will give hands-on experience with real applications to complement the in-depth theoretical ground covered in the lectures.

Examination

There will be an oral exam where students will be assessed individually. Grades are based solely on the oral exam, though the exam may include questions about the case work done during the course. An exam schedule will be issued later.

Lecturers

LHC: Line H. Clemmensen, Assistant Professor, DTU Data Analysis, lhc[at]imm.dtu.dk
LAAR: Lars Arvastson, External Lecturer, DTU Data Analysis, Lundbeck, larv[at]lundbeck.com
MM: Morten Mørup, Associate Professor, DTU Informatics, Cognitive Systems, mm[at]imm.dtu.dk

Course Calendar

| Week | Date | Subjects | Lecturer | Literature |
|------|------|----------|----------|------------|
| 1 | 3/2 | Introduction to computational data analysis. Ordinary least squares regression and linear discriminant analysis for classification | LHC | ESL Chapters 3.1, 3.2, 3.4.1, 4.1, and 4.3 |
| 2 | 10/2 | Model selection. Ridge regression and k-nearest-neighbor classification | LHC | ESL Chapter 7 (sections 7.8 and 7.9 may safely be skipped) |
| 3 | 17/2 | Logistic regression, optimal separating hyperplanes and convex optimization | LAAR | ESL Chapters 4.4, 5.1, 5.2 |
| 4 | 24/2 | Basis expansions, splines, kernels and Support Vector Machines | LAAR | ESL Chapters 4.5, 12.1, 12.2, 12.3.1 |
| 5 | 3/3 | Sparse regression and classification | LHC | ESL Chapters 3.3, 3.4, 18 |
| 6 | 10/3 | Classification and Regression Trees (CART) | LAAR | ESL Chapter 9.2 |
| 7 | 17/3 | Bagging, Boosting and Random Forests | LHC | ESL Chapter 15 |
| 8 | 24/3 | Bayesian statistics and case competition | LAAR | - |
| - | 23/3 at 12.00 | Deadline for handing in case on supervised modelling | - | - |
| 9 | 7/4 | Principal Component Analysis, Sparse Principal Component Analysis | LHC | ESL Chapters 14.5.1, 14.5.5 |
| 10 | 14/4 | Sparse coding, NMF, Archetypal Analysis and ICA | MM | ESL Chapters 14.6 - 14.7, [Sparse Coding, Nature] |
| 11 | 21/4 | Cluster Analysis | LAAR | ESL Chapter 14.3 |
| 12 | 28/4 | Multiway Models | Rasmus Bro, KU | WireOverview.pdf available from CampusNet |
| 13 | 5/5 | Spectral unmixing and case competition | LHC | - |
| - | 4/5 | Deadline for handing in exercise on unsupervised modelling | - | - |
| - | 28/5 | Oral examination | - | - |