The objective of the Data Challenge courses is to teach students how to perform large-scale data-driven analyses themselves, combining the technical skills acquired earlier with insights gained in methodological courses. As an example, consider the classification of images for detecting a disease or any other outcome variable of interest. The challenge in this setting involves training a model on the classification task and reflecting on the quality of the trained classifier with regard to its practical application. Ethical considerations should also be taken into account.
After successfully completing the course, students are able to
- work on data-driven analytics, e.g., dividing the required process into tasks/phases and applying relevant techniques in each of these phases.
- recognize, describe, and apply the steps for building a model to analyze image data, that is, prepare an image dataset for the modelling task, build and train a model to classify images, and evaluate the performance of such a model.
- consider the advantages and limitations of the technology used with respect to the ethical implications of the developed solution.
- apply best practices that help team members to develop and complete a project, such as planning the required tasks, estimating the effort for these tasks, and monitoring/refining the execution of tasks.
- communicate the final solution and the evaluation results through presentations and reports that are suitable for the given audience.
The course follows the objective of all Data Challenge courses and thus teaches students how to set up, execute, and evaluate scientifically sound data analyses that are adequate for particular stakeholders (i.e., users, enterprise, and society) using available data sets. Data Challenge 1 focuses on a data-driven solution that has particular economic and societal impact. The problem is well-defined in terms of the available data, target requirements, and group of stakeholders. Due to the nature of the application, the analysis also involves important ethical considerations, which the students must also address.
The students are given a project on which they work in teams. Each team must identify and execute a suitable modelling and analysis approach, and then reflect on the adequacy of the analysis for the different stakeholders. An example of a project is to create a decision support system for the diagnosis of a disease. The main task is to develop a system that can automate the diagnosis, at least partially. For instance, the task of analyzing images presents significant challenges since it requires processing high-dimensional image data to discover weak signals that are correlated with the outcome. Due to the dimensionality and the highly non-linear relationship between the features, many machine learning methods struggle to produce models with acceptable performance. Recent developments in Deep Convolutional Neural Networks (DCNNs) have been very successful in addressing these challenges. Beyond the technical challenges, the course also requires the students to gain a good understanding of the domain and of the application of data science methods in such a setting, and to deal with the ethical implications that arise when using machine learning models in applications with multiple stakeholders.
It is essential that students have at least basic knowledge of the programming language Python and of machine learning techniques, and that they have acquired all knowledge and skills provided by the course Data Analytics for Engineers. This knowledge and these skills are assumed in Data Challenge 1.