Course module: JM0150-M-6
Data Mining
Course info
Course module: JM0150-M-6
Credits (ECTS): 6
Category: MA (Master)
Course type: Course
Language of instruction: English
Offered by: Tilburg University; Tilburg School of Economics and Management; TiSEM: Management
Is part of
M Data Science and Entrepreneurship (joint degree)
Lecturer J. Vanschoren
Academic year: 2020
Starting block: SM 1
Course mode
Remarks: Caution: this information is subject to change
Registration open: from 25/08/2020 up to and including 20/08/2021
After completing the course, students should be able to:
  • understand the mathematical principles of state-of-the-art machine learning methods
  • apply and evaluate the performance of state-of-the-art machine learning techniques on real problems
  • prepare data to optimize the execution of machine learning algorithms
  • write programs that build predictive models from training data
  • evaluate predictive models and compare different predictive models against each other
  • diagnose and address common issues with predictive models (e.g. overfitting, underfitting, bias)
  • optimize models using data-driven techniques
  • understand open problems and current developments in machine learning 
Machine learning is the science of making computers act without being explicitly programmed. Instead, algorithms are used to find patterns in data. It is so pervasive today that you probably use it dozens of times a day without knowing it, for instance in web search, speech recognition, and (soon) self-driving cars. It is also a crucial component of data-driven industry (Big Data), scientific discovery, and modern healthcare.

In this course, you'll learn the key concepts behind state-of-the-art machine learning algorithms as well as their mathematical formulations and solving techniques. Moreover, you will gain hands-on experience in applying them to real-world problems, building predictive models, and empirically validating and optimizing these models. These algorithms include linear models, ensembling techniques (e.g. random forests, gradient boosting, stacking), Bayesian learning, support vector machines, and neural networks.
You will learn how each technique represents models and how these models are fitted to the data. We will also discuss the relationships between these techniques and their individual benefits and drawbacks. Particular attention is paid to the proper analysis of model performance (e.g. under- and overfitting, bias-variance analysis, ROC analysis) and to the efficient optimization of predictive models (model selection, optimization, meta-learning).
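
As a small illustration of the ROC analysis mentioned above, here is a minimal sketch in plain Python. It is not taken from the course material. The function name roc_auc and the toy labels and scores are assumptions made for this example only. The sketch computes the area under the ROC curve as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # Count pairs where the positive example is ranked above the negative one.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four examples with true labels and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. The pairwise loop is quadratic in the number of examples, so in practice a library routine would normally be used instead.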

Recommended background
This course requires both a decent mathematical background and programming experience. A working knowledge of algebra and statistics is highly recommended. Programming is part of the assignments, so programming experience is highly recommended as well. The assignments and the programming examples in the course will be based on Python.

Type of instructions
Lectures, Q&A sessions, interactive discussions

Compulsory Reading
  1. Selected book chapters, survey articles, and research articles (freely available online or through the university subscription).
Contact person J. Vanschoren
Timetable information
Data Mining
Written test opportunities
Written test opportunities (HIST)
Data Mining: exam (50%); EXAM_01; SM 1; opportunity 1; 16-12-2020
Data Mining: exam (50%); EXAM_01; SM 1; opportunity 2; 20-01-2021
Required materials
Recommended materials
Data Mining: exam (50%)

Final grade

Data Mining: group project (50%)
