Section for Cognitive Systems
DTU Compute

02450 Introduction to Machine Learning and Data Mining

Bjørn Sand Jensen
Jes Frellsen
Tue Herlau
Morten Mørup
Mikkel N. Schmidt
François R. J. Cornet
Anne Kjerstine Desler (online only in week 9-13)
Hugo Henri Joseph Sénétaire (not available in week 10)
Lenka Hýlová
Santiago Maldonado Hernández
Rune Dodensig Kjærsgaard
Nikolai Beck Jensen
Bjarke Arnskjær Hastrup
Gonzalo Eduardo Mazzini
Changzhi Ai
Alvaro Carrera Cardeli
Jonas Thusgaard Elsborg
Hatef Abdollahi
Sai Shaurya Iyer
Jens Perregaard Thorsen
Yevhenii Osadchuk
Vimal Velusamy Bharathi

Machine learning and data mining

The course is designed around a data modeling framework shown in the figure. Each lecture/assignment will focus on an aspect of the data modeling framework.

[Figure: data modeling framework]

We emphasize the holistic view of modeling in order to motivate and stress the relevance of the individual components and building blocks, disseminate the obtained competences (see the course learning objectives), and make them applicable to a broad spectrum of engineering problems in, e.g., biomedical engineering, chemistry, electrical engineering, and informatics.

Resources

DTU Learn

If you are enrolled in the course you can access material and participate in the course through the DTU Learn homepage.

Lectures

The lectures will take place on Tuesdays from 13:00 to 15:00, either fully online or with some physical presence, depending on the Covid-19 situation.

In the first couple of weeks, lectures will be fully online as a consequence of the current Covid-19 situation and the large class size. We aim to provide an opportunity for in-person lectures later in the semester; however, the allocated auditorium cannot accommodate everyone, so you will not be able to attend the lecture in person every week.

It is possible to stream the live lecture when it is not possible to attend in person. Additionally, all lectures will be recorded and made available online.

Exercises

Exercise sessions will take place immediately after the Tuesday lectures, from 15:00 to 17:00, either online via Microsoft Teams or in person, depending on the Covid-19 situation.

In the first couple of weeks, the exercise sessions will be fully online via Microsoft Teams and Piazza. We aim to provide an opportunity for in-person exercise sessions later in the semester at the locations indicated below.

We expect you will have access to your own laptop/computer during the exercise sessions. Exercises will be available in Matlab, R, and Python and we recommend selecting a language you are familiar with. If you are unfamiliar with any of the languages, we recommend Python.

The exercise teams (including virtual and physical locations) are listed below, with the programming language in parentheses. Room/team capacity is limited to 34 students per team. Each exercise team is allocated a physical room and a corresponding channel on Microsoft Teams.

Reading material, lecture slides and exercises

The course will use lecture notes and other freely available material. Lecture notes, slides, course assignment instructions, etc. are available on the DTU Learn course page (requires formal enrolment in the course).

Online demos

We have developed several online demos which illustrate key concepts from the course. The topics currently covered include PCA, regression, classification and density estimation.

Course description

A description of the course can be found at the DTU Coursebase.

Help and support

Support outside the scheduled sessions is primarily available through the Piazza forum.

Teachers

Schedule

No. | Date | Subject | Reading | Homework
1 | 1 February, 2022 | BSJ: Introduction | C1 |

Data: Feature extraction, and visualization
2 | 8 February, 2022 | BSJ: Data, feature extraction and PCA | C2, C3 | P3.1, P2.1, P3.2
3 | 15 February, 2022 | BSJ: Measures of similarity, summary statistics and probabilities | C4, C5 | P4.1, P4.2, P4.3
4 | 22 February, 2022 | BSJ: Probability densities and data visualization | C6, C7 | P6.1, P6.2, P7.1

Supervised learning: Classification and regression
5 | 1 March, 2022 | BSJ: Decision trees and linear regression | C8, C9 | P9.1, P8.1, P8.2
6 | 8 March, 2022 | BSJ: Overfitting, cross-validation and Nearest Neighbor (Project 1 due before 13:00) | C10, C12 | P10.1, P10.2, P12.1
7 | 15 March, 2022 | BSJ: Performance evaluation, Bayes, and Naive Bayes | C11, C13 | P13.1, P13.2, P12.2
8 | 22 March, 2022 | JF: Artificial Neural Networks and Bias/Variance | C14, C15 | P15.1, P15.2, P15.3
9 | 29 March, 2022 | JF: AUC and ensemble methods | C16, C17 | P16.1, P16.2, P17.1

Unsupervised learning: Clustering and density estimation
10 | 5 April, 2022 | JF: K-means and hierarchical clustering | C18 | P18.1, P18.2, P18.3

Holiday
11 | 19 April, 2022 | BSJ: Mixture models and density estimation (Project 2 due before 13:00) | C19, C20 | P20.1, P19.1, P19.2
12 | 26 April, 2022 | JF: Association mining | C21 | P21.1, P18.2, P18.3

Recap
13 | 3 May, 2022 | BSJ: Recap and discussion of the exam | C1-C21 |

(Cx refers to Chapter x of the course notes. Px.y refers to problem number y in Chapter x of the course notes.
The first listed problem will be that week's discussion question at the exercises.)

FAQ
