Course module: JM0150-M-6
Data Mining
Course info
Course module: JM0150-M-6
Credits (ECTS): 6
Category: MA (Master)
Course type: Course
Language of instruction: English
Offered by: Tilburg University; Tilburg School of Economics and Management; TiSEM: Management
Is part of: M Data Science and Entrepreneurship (joint degree)
Lecturer(s): -
Academic year: 2019
Starting block: SM 1
Course mode: Full-time
Remarks: -
Registration open: from 19/08/2019 up to and including 24/01/2020
Aims
After completing the course, students should be able to:
  • understand the mathematical principles of state-of-the-art machine learning methods
  • apply and evaluate the performance of state-of-the-art machine learning techniques on real problems
  • prepare data to optimize the execution of machine learning algorithms
  • write programs that build predictive models from training data
  • evaluate predictive models and compare different predictive models against each other
  • diagnose and address common issues with predictive models (e.g. overfitting, underfitting, bias)
  • optimize models using data-driven techniques
  • understand open problems and current developments in machine learning 
Content
Machine learning is the science of making computers act without being explicitly programmed. Instead, algorithms are used to find patterns in data. It is so pervasive today that you probably use it dozens of times a day without knowing it, for instance in web search, speech recognition, and (soon) self-driving cars. It is also a crucial component of data-driven industry (Big Data), scientific discovery, and modern healthcare.

In this course, you'll learn the key concepts behind state-of-the-art machine learning algorithms, as well as their mathematical formulations and solving techniques. You will also gain hands-on experience in applying them to real-world problems, build predictive models, and learn how to empirically validate and optimize these models. These algorithms include linear models, ensembling techniques (e.g. random forests, gradient boosting, stacking), Bayesian learning, support vector machines, and neural networks.
You will learn how each technique represents models and how they are fitted to the data. We will also discuss the relationships between these techniques and their individual benefits and drawbacks. Particular attention is paid to the proper analysis of model performance (e.g. under- and overfitting, bias-variance analysis, ROC analysis) and to the efficient optimization of predictive models (model selection, optimization, meta-learning).

 
Recommended background
This course requires both a solid mathematical background and programming experience. A working knowledge of algebra and statistics is highly recommended. Programming is part of the assignments, so programming experience is highly recommended as well. The assignments and the programming examples in the course are based on Python.

 
Type of instructions
Lectures, Q&A sessions, interactive discussions


Compulsory Reading
  1. Selected book chapters, survey and research articles (freely available online or through the university subscription).
Contact person
dr.ir. J. Vanschoren
Timetable information
Data Mining
Written test opportunities
Description | Test | Block | Opportunity | Date
Written test opportunities (HIST)
Description | Test | Block | Opportunity | Date
Data Mining: exam (50%) / Data Mining: exam (50%) | EXAM_01 | SM 1 | 1 | 09-12-2019
Data Mining: exam (50%) / Data Mining: exam (50%) | EXAM_01 | SM 1 | 2 | 06-01-2020
Required materials
-
Recommended materials
-
Tests
Data Mining: exam (50%)

Final grade

Data Mining: group project (50%)
