Computer Science Module Information System

Master Project - Knowledge Discovery in Very Large Data and Document Collections

Module code: WInf-Proj-KD12
English title: Master Project - Knowledge Discovery in Very Large Data or Document Corpora
Module coordinator: Prof. Dr. Ansgar Scherp
Offered: irregularly (SS15, WS15/16, SS16, SS17, WS17/18, SS18; SS = summer semester, WS = winter semester)
Contact hours: 4PÜ
ECTS: 12
Workload: 60 hours of supervised exercises or practical work, 300 hours of independent project work
Duration: one semester
Module categories: Prakt (MSc Inf)
Language of instruction: English
Prerequisites: Info

Summary:

The project group covers different topics in the area of Knowledge Discovery. Example topics are the analysis, interlinking, and enrichment of unstructured documents, or the analysis and use of semi-structured data. The students work in small groups on different innovative and applied problems. Besides a requirements analysis and a conceptual specification of the problem, a major task is the implementation and proper evaluation of the proposed solution.

Learning objectives:

The students learn how to work independently in small teams. They gain knowledge of and practical experience with methods of Knowledge Discovery and their application to, e.g., unstructured documents or semi-structured data.

Course content:

Knowledge Discovery deals with the content-driven identification and localization of digital objects, such as semi-structured data on the web (e.g., Linked Open Data), documents, profiles, or communities, and with understanding the relationships among them. The module involves the design of innovative methodologies and algorithms and their application to extensive data and document corpora of different origin and quality. Of particular interest are the analysis, interlinking, and enrichment of unstructured data such as multimedia and textual content, as well as the analysis and use of Linked Open Data (LOD) on the web. LOD is a technological approach for publishing and interlinking data on the web from different sources and of varying quality and purpose. Since its advent in 2007, LOD has grown tremendously in size and variety and has helped semantic technologies achieve widespread adoption and success. Today, LOD is used not only by universities, research organizations, and public bodies such as libraries and federal agencies, but also in commercial sectors, including the media industry and web search engines.
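To illustrate what interlinking and enriching data from different sources can look like, the following is a minimal sketch and not part of the official module material. It assumes two hypothetical datasets under the made-up example.org namespace, represents statements as plain (subject, predicate, object) tuples instead of using an RDF library, and follows owl:sameAs links for one hop only. The helper name enrich is likewise an assumption chosen for illustration.

```python
# Minimal sketch: interlinking two hypothetical Linked Data sources.
# All example.org URIs are invented for illustration; the predicate URIs
# are taken from the Dublin Core terms and OWL vocabularies.

DCT = "http://purl.org/dc/terms/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Source A: a hypothetical library catalogue describing a book.
library = {
    ("http://example.org/library/book/42", DCT + "title", "Web Data Mining"),
    ("http://example.org/library/book/42", DCT + "creator", "Liu"),
}

# Source B: a hypothetical publisher dataset describing the same book
# under a different identifier.
publisher = {
    ("http://example.org/publisher/item/9", DCT + "publisher", "Springer"),
    ("http://example.org/publisher/item/9", DCT + "issued", "2007"),
}

# Interlinking: a statement that both identifiers denote the same thing.
links = {
    ("http://example.org/library/book/42", OWL_SAME_AS,
     "http://example.org/publisher/item/9"),
}


def enrich(subject, datasets, links):
    """Collect all properties of `subject` across the datasets.

    owl:sameAs links are followed for a single hop in both directions,
    so statements about direct aliases of `subject` are merged in.
    """
    aliases = {subject}
    for s, p, o in links:
        if p != OWL_SAME_AS:
            continue
        if s == subject:
            aliases.add(o)
        elif o == subject:
            aliases.add(s)

    description = {}
    for dataset in datasets:
        for s, p, o in dataset:
            if s in aliases:
                description.setdefault(p, set()).add(o)
    return description


if __name__ == "__main__":
    merged = enrich("http://example.org/library/book/42",
                    [library, publisher], links)
    for prop, values in sorted(merged.items()):
        print(prop, sorted(values))
```

A real project would more likely use an RDF store or library and handle transitive links, conflicting values, and data quality, which this sketch deliberately leaves out.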

The students of the practical course are encouraged to independently organize and carry out a project for a real or fictitious partner in industry or research. An important requirement of the practical course is a proper conceptual design, implementation, and evaluation of the solution. In addition, the proposed solution must show a sufficient level of innovation, and an in-depth analysis of the problem and documentation of the results are required. This includes continuous evaluation of intermediate results and active participation of the students in designing the solution. Students are therefore strongly encouraged to contribute their own view of the problem and to suggest improvements to the applied methods and results.

Further prerequisites:

Experience in object-oriented programming and knowledge of Knowledge Discovery methods and of Semantic Web and Linked Open Data technologies are an advantage.

Assessment:

Intermediate presentations and a final presentation of the project results. Documentation of the project results and, where appropriate, an oral exam.

Teaching and learning methods:

Learning material will be provided in the form of presentation slides, scientific publications, and other references.

Applicability:

The acquired knowledge and skills can be applied in a thesis in the area of Knowledge Discovery.

Literature:

Tan, Steinbach, Kumar: Introduction to Data Mining, Addison-Wesley, 2006.

Liu: Web Data Mining, Springer, 2007.

Witten, Frank, Hall: Data Mining, Morgan Kaufmann, 2011.

Cross-references:

None.

Comments:

None.