The objective of the course is to provide students with:
(1) theoretical knowledge of the basic image processing methods,
(2) practical skills for sampling, representing and analysing images,
(3) an understanding of the basics of the field of computer vision and of the non-deep-learning techniques used in it, and
(4) an understanding of the relation between biological and artificial vision.
Computer vision focuses on the task of endowing computers with vision-related capabilities. Although deep learning has become the most successful method for computer vision, knowledge of traditional computer vision methods is still required in practice. This course provides students with the understanding and skills to apply computer vision methods to image processing, image manipulation, image representation, and image recognition. To the extent that the computer vision methods generalize to 1D and ND signals, students will acquire some knowledge of signal processing methods as well. Machine learning approaches are briefly presented to compare the effect of different features and image representations in the context of object recognition. Throughout the course, parallels between biological and artificial vision systems are highlighted.
The lectures start with a brief historical overview of computer vision, culminating in state-of-the-art systems, and with a comparison to the human visual system. An introduction to visual sampling and color representation will be given from the perspectives of cognitive science (sampling by the human eye) and AI (digital sampling). Subsequently, edge detection will be presented and used as an application example to introduce image filtering and convolutions. Then, the notion of spatial frequencies, the Fourier transform, and the convolution operation will be presented for the 2D case and generalized to the ND case. The global nature of the Fourier transform will be contrasted with local (windowed) Fourier transforms and wavelet (Gabor) transforms as more effective and cognitively more plausible methods. Scale space and multi-scale pyramid representations will be explained and applied to building effective image (and signal) representations. Computer vision tasks such as segmentation, feature detection and matching, and object detection and recognition will be reviewed.
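As an illustration of how edge detection introduces filtering and convolution, the following is a minimal sketch in Python with NumPy. It is not taken from the course material: the Sobel kernels, the helper names (convolve2d, edge_magnitude) and the synthetic test image are assumptions chosen for illustration only.

```python
import numpy as np


def convolve2d(image, kernel):
    """Plain 'valid' 2D convolution: the kernel is flipped, no padding is used."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the flipped kernel
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


# Sobel kernels (an assumed choice of edge filter)
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def edge_magnitude(image):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = convolve2d(image, SOBEL_X)
    gy = convolve2d(image, SOBEL_Y)
    return np.hypot(gx, gy)


# Synthetic 6x6 image: dark left half, bright right half (one vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

edges = edge_magnitude(image)
print(edges)
```

On this synthetic image the response is non-zero only in the output columns whose 3x3 window straddles the vertical step, and zero in the uniform regions. The explicit loops are written for readability rather than speed.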
During skill classes, students are trained in applying the methods to natural images.
The final score for the course is based on two parts: a midterm exam (40%) and a final written exam (60%). Both exams consist of open questions that test your knowledge of the foundations of computer vision and your skills (by means of code fragments).
Excerpts from: Szeliski, R. (2010). Computer Vision: Algorithms and Applications (http://szeliski.org/Book), and selected papers.
Due to limited capacity, this course is currently not open for external students.