This course is given over 13 weeks during the spring semester. It is open both to regular DTU students and to everyone else via Open University. DTU students should sign up via CampusNet. For information on how to apply via Open University, see this link.
A typical course day consists of a one- to two-hour lecture immediately followed by exercises. This year we will be working on a case from a company throughout the course.
The course is also given as a one-week PhD course in August under course number 02910. The PhD course is intended for PhD students throughout DTU (and other institutions) who may or may not have a mathematical/statistical background. Thus 02910 places greater emphasis on hands-on experience than on the theoretical parts. Consult this page for more information.
The course material consists of chapters from electronic textbooks and electronic papers. Most lectures refer to the book "The Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman, which is freely available from this link. References to other material will be given on CampusNet.
There will be two cases: one covering supervised methods (regression and classification) and one covering unsupervised analyses (principal component analysis and related methods). These give hands-on experience with real applications, complementing the in-depth theoretical ground covered in the lectures.
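As a flavour of the supervised case, here is a minimal sketch of ridge regression (a week 2 topic), fitted from its closed-form solution in NumPy. The toy data and variable names are illustrative only and are not taken from the course case material:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression via the closed form (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Illustrative toy data (not the course case): two predictors, known coefficients.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + 0.1 * rng.standard_normal(100)

beta_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=10.0)  # lam > 0 shrinks coefficients toward zero
print(beta_ols, beta_ridge)
```

Comparing the two fits shows the characteristic shrinkage effect of the ridge penalty; choosing `lam` by cross-validation is part of the model-selection material in week 2.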
There will be an oral exam where students are assessed individually. Grades are based solely on the oral exam, though questions may concern the case work done during the course. An exam schedule will be issued later.
LHC: Line H. Clemmensen, Assistant Professor, DTU Data Analysis, lhc[at]imm.dtu.dk
LAAR: Lars Arvastson, External Lecturer, DTU Data Analysis, Lundbeck, larv[at]lundbeck.com
MM: Morten Mørup, Associate Professor, DTU Informatics, Cognitive Systems, mm[at]imm.dtu.dk
| Week | Date | Topic | Lecturer | Reading |
|------|------|-------|----------|---------|
| 1 | 3/2 | Introduction to computational data analysis. Ordinary least squares regression and linear discriminant analysis for classification | LHC | ESL Chapters 3.1, 3.2, 3.4.1, 4.1, and 4.3 |
| 2 | 10/2 | Model selection. Ridge regression and k-nearest-neighbor classification | LHC | ESL Chapter 7. You may safely skip sections 7.8 and 7.9 |
| 3 | 17/2 | Logistic regression, optimal separating hyperplanes and convex optimization | LAAR | ESL Chapters 4.4, 5.1, 5.2 |
| 4 | 24/2 | Basis expansions, splines, kernels and Support Vector Machines | LAAR | ESL Chapters 4.5, 12.1, 12.2, 12.3.1 |
| 5 | 3/3 | Sparse regression and classification | LHC | ESL Chapters 3.3, 3.4, 18 |
| 6 | 10/3 | Classification and Regression Trees (CART) | LAAR | ESL Chapter 9.2 |
| 7 | 17/3 | Bagging, Boosting and Random Forests | LHC | ESL Chapter 15 |
| 8 | 24/3 | Bayesian statistics and case competition | LAAR | - |
| - | 23/3 at 12.00 | Deadline for handing in case on supervised modelling | - | - |
| 9 | 7/4 | Principal Component Analysis, Sparse Principal Component Analysis | LHC | ESL Chapters 14.5.1, 14.5.5 |
| 10 | 14/4 | Sparse coding, NMF, Archetypical Analysis and ICA | MM | ESL Chapters 14.6 - 14.7, [Sparse Coding, Nature] |
| 11 | 21/4 | Cluster Analysis | LAAR | ESL Chapter 14.3 |
| 12 | 28/4 | Multiway Models | Rasmus Bro, KU | WireOverview.pdf available from CampusNet |
| 13 | 5/5 | Spectral unmixing and case competition | LHC | - |
| - | 4/5 | Deadline for handing in exercise on unsupervised modelling | - | - |
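The unsupervised part of the schedule centers on principal component analysis (week 9 onward). A minimal NumPy sketch of the standard SVD-based formulation, using illustrative toy data rather than anything from the course exercises:

```python
import numpy as np

def pca(X, k):
    """PCA via SVD of the centered data matrix.

    Returns the scores (projections onto the first k principal components),
    the component directions, and the fraction of variance each explains.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T
    var_explained = S[:k] ** 2 / np.sum(S ** 2)
    return scores, Vt[:k], var_explained

# Illustrative toy data: 200 samples in 3 dimensions,
# with most of the variance along the first axis.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3)) * np.array([5.0, 1.0, 0.2])

scores, components, var_explained = pca(X, 2)
print(var_explained)  # the first component dominates for this data
```

The sparse variants covered in week 9 modify this basic construction by penalizing the component loadings; see the ESL sections listed in the schedule.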