At the end of the unit, students will be able to:
- describe natural language processing components at the lexical, syntactic, and semantic levels, from both a functional and a technical point of view, and use these components;
- analyze and design complex end-user natural language processing systems built from lower-level language processing components;
- compare different approaches (such as symbolic or probabilistic approaches) to natural language processing tasks, identifying pros and cons of the different approaches.
Manually analyzing large collections of text, such as newspaper articles, blogs, or tweets, quickly becomes infeasible because of the sheer volume of data. Content analysis of, for instance, social media data or company texts often requires linguistic analysis to identify and extract useful information. Techniques that automate this process are called natural language processing techniques.
Natural language processing comprises a vast collection of tasks, algorithms, and theoretical frameworks that aim, at various levels, to make human language understandable to computers. To build working computer systems that process natural language automatically, a thorough understanding of how these ingredients work is essential. During the course, students acquire this knowledge through both theoretical study of techniques and practical experience with basic language processing systems.
Topics that will be covered by the course include (among others):
- Syntactic annotation of language,
- User-oriented applications, such as machine translation.
The final grade is based on the written exam (70%), two individual assignments (10% each), and a paper (10%). Assignment deadlines are non-negotiable: assignments handed in after the deadline will not be accepted and will result in a fail for the course.
Students may be expected to present part of the course material during lectures.
Daniel Jurafsky and James H. Martin, Speech and Language Processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd edition, Prentice-Hall: Upper Saddle River, New Jersey, 2009, ISBN 978-0-13-504196-3.
Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, https://www.nltk.org/book/