The "Introduction to Machine Learning" course covers basic topics in Data Mining and Machine Learning, beginning with the design of a proper data-science study, which starts with data mining and data preparation and proceeds to experimentation with ML algorithms. Established frameworks for Data Mining (e.g., CRISP-DM) will be considered and applied in practice. Furthermore, the student will learn the basics of research design and of hypothesis formulation and testing. Subsequently, the student will get to grips with the most commonly used machine-learning techniques, including decision trees, instance-based learning, and artificial neural networks. Finally, the student will learn the basics of model evaluation, model generalization, and the bias-variance tradeoff.
The student will experiment practically with the studied techniques, both in a project and in in-course practical sections.
This introductory course covers the following topics:
- The end-to-end data mining process, starting from translation of the business problem into data mining task(s) and data preparation (e.g., feature subset selection and data transformation) for modeling, and ending with evaluation of the data mining outcomes and reporting.
- Machine-Learning techniques for classification (Instance-based methods, decision trees, ANNs and ensembles).
- Evaluation of Machine-Learning output, optimization of model performance through boosting, and avoidance of overfitting by trading off bias against variance.
- Comparing performance of different techniques.
During this course, students are expected to learn the foundations of Machine-Learning as well as data mining for machine-learning purposes, and to gain hands-on experience of applying both in practice.
After taking the course, each student:
- Understands and can explain the basic principles and techniques of machine learning and data mining.
- Is aware of the relevant application areas.
- Understands and can explain when data mining and machine-learning are useful, in the sense of generating value.
- Is capable of translating business problems to data mining as well as machine-learning tasks and choosing appropriate data mining techniques.
- Has the skills for designing, developing and evaluating machine-learning solutions using existing specific software packages, including:
  - Transforming raw data, such as a collection of texts or a database of transactions, into a representation that the studied techniques can process.
  - Choosing appropriate techniques for data preprocessing, basic modeling and evaluation, and optimizing parameters for defined KPIs (e.g., in cost-sensitive classification) for the algorithms available in Weka, R, or other software.
  - Drawing valid conclusions about the performance of the models and their utility for addressing the identified business problem.
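
To illustrate what optimizing a parameter for a defined KPI can look like in cost-sensitive classification, here is a minimal sketch in plain Python. It is not part of the course material: the scores, labels, cost values, and the helper names `total_cost` and `best_threshold` are all assumptions made up for this example. The sketch picks the decision threshold that minimizes total misclassification cost when a false negative is assumed to be five times as costly as a false positive.

```python
# Hypothetical classifier scores and true labels (invented for illustration).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

# Assumed business costs: missing a positive is 5x worse than a false alarm.
COST_FALSE_POSITIVE = 1.0
COST_FALSE_NEGATIVE = 5.0


def total_cost(threshold, scores, labels):
    """Sum misclassification costs when predicting 1 for score >= threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += COST_FALSE_POSITIVE
        elif predicted == 0 and label == 1:
            cost += COST_FALSE_NEGATIVE
    return cost


def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(t, scores, labels))


candidates = [0.2, 0.35, 0.5, 0.7, 0.9]
chosen = best_threshold(scores, labels, candidates)
print(chosen, total_cost(chosen, scores, labels))
```

With these made-up numbers, the cheapest threshold is lower than the conventional 0.5, because accepting one extra false positive is cheaper than missing a positive case. In practice the threshold would be chosen on validation data rather than on the data used for the final evaluation.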
Type of instructions
Lectures and instructions/labs
- A blend of research articles, class notes, and material from reference books will be used in this course.