GESIS Training Courses

Scientific Coordination

Andreas Niekler
Tel: +49-341-97-32239

Administrative Coordination

Loretta Langendörfer M.A.
Tel: +49-221-47694-143

Practical Introduction to Text Mining

Dr. Andreas Niekler, Dr. Arnim Bleier, Dr. Gregor Wiedemann

Date: 18.04 - 19.04.2018

Location: Cologne


The workshop provides an introduction to Natural Language Processing (NLP) with a special emphasis on the analysis of job advertisements. NLP techniques enable researchers to describe the contents of a document collection or to filter a large collection for specific thematic aspects. This workshop concentrates on the basic concepts needed for quantitative text analysis. It provides an overview that starts with data import and frequency analysis and continues with co-occurrence analysis. Participants attend short theoretical lectures and receive R scripts to compute their own models in exemplary tutorials.


The course is targeted at labour market researchers and researchers from the humanities who are interested in analyzing large textual data sets.


Participants will learn about the opportunities and limits of text mining methods for analyzing qualitative and quantitative aspects of large text collections. With example scripts provided in the programming language R, participants will learn how to carry out the individual steps of such an analysis on a corpus of job advertisements. We cover a range of text mining methods, starting with simple lexicometric measures such as word frequencies, key term extraction and co-occurrence analysis. Furthermore, we provide a short overview of more complex machine learning approaches such as topic models and supervised text classification. The goal is to provide a broad overview of technologies that are already established in the social sciences and that have the potential to be used in NLP-based labour market research.


The workshop is hands-on and we will use the programming language R. We therefore strongly recommend some basic knowledge of R.
If you already have some knowledge of another programming language, learning R will be easy for you. However, since R is a statistical programming language, some of its concepts differ considerably from those of other languages.
For participants without basic knowledge of R, we strongly recommend learning at least a little in preparation for the course. For this purpose, we provide links to material and online tutorials prior to the course through the GESIS e-learning platform.


Lecturer information - Dr. Andreas Niekler

Lecturer information - Dr. Arnim Bleier

Lecturer information - Dr. Gregor Wiedemann
