Computer Science Module Information System


Text Mining

Module code: infTM-01a
English title: Text Mining
Module coordinator: Prof. Dr. Ralf Krestel
Offered: irregularly (summer semester 2023, winter semester 2024/25, summer semester 2026)
Contact hours: 2V 2Ü (2 weekly hours of lecture, 2 weekly hours of exercise)
Workload: 30 hrs lecture, 30 hrs in-class exercise, 150 hrs self-study
Duration: one semester
Module categories: BSc-Inf-WP (BSc Inf (21)), BSc-WInf-WP-WInf (BSc WInf (21)), 2F-MEd-Inf-WP (MEd-Hdl Inf (21)), 2F-MA-Inf-WP (2F-MA Inf (21))
Language of instruction: English
Prerequisites: Info infEInf-01a


The digital age, and the success of the Internet in particular, has led to a vast number of publicly available documents and a wealth of textual information. The task of text mining is to process this unstructured information and extract knowledge from it. To this end, we present methods, algorithms, and models that support a diverse set of text mining applications, ranging from regular expressions and lexicon-based sentiment analysis to more complex machine learning methods such as probabilistic topic models.
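To give a flavour of the simpler end of this range, the following minimal sketch combines a regular-expression tokenizer with lexicon-based sentiment scoring. It is an illustration only and not taken from the course materials: the lexicon entries, their weights, and the function names are hypothetical, and the approach deliberately ignores negation and context.

```python
import re

# Hypothetical toy lexicon mapping words to sentiment weights.
# A real application would use a curated sentiment lexicon instead.
LEXICON = {
    "good": 1,
    "great": 2,
    "excellent": 2,
    "bad": -1,
    "boring": -1,
    "terrible": -2,
}

# Word tokens: runs of letters, optionally followed by an apostrophe part ("it's").
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return TOKEN_PATTERN.findall(text.lower())


def sentiment_score(text):
    """Sum the lexicon weights of all tokens; words not in the lexicon count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def sentiment_label(text):
    """Map the summed score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in [
        "The lecture was great, but the slides were boring.",
        "A terrible, bad example.",
        "Text mining processes unstructured information.",
    ]:
        print(sentiment_label(sentence), "<-", sentence)
```

Because the score is a plain sum of word weights, a sentence such as "not good" would still be scored as positive; handling such cases is one motivation for the machine learning methods mentioned above.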


Students are able to...

  • understand basic concepts of text mining and their commercial application
  • explain, apply, and evaluate text mining methods
  • implement text mining applications in Python


  • Foundation of linguistics
  • Information extraction
  • Named entities
  • Opinion mining & sentiment analysis
  • Preprocessing of textual data
  • Representation of documents
  • Word associations
  • Topic Modeling
  • Foundations of machine learning
  • Document classification
  • Clustering
  • Sequence labeling
  • Visualizing textual data
  • Ethics & bias

Additional prerequisites:


  • Written exam at the end of the semester
  • Successful participation in the exercises (homework assignments) is a prerequisite for taking the exam

Teaching and learning methods:

Concepts are introduced in the lectures with the help of examples and specific application tasks. In the exercise sessions, this knowledge is deepened and applied, guided by weekly homework assignments.



  • Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Dan Jurafsky and James H. Martin (Third Edition Draft)

