GESIS Training Courses

Scientific Coordination

Verena Kunz

Administrative Coordination

Claudia O'Donovan-Bellante
Tel: +49 621 1246-221

Introduction to Computational Text Analysis with R

Online via Zoom
Students: 330 €
Academics: 495 €
Commercial: 990 €
Lecturer(s): Lea Kaftan, Jan Schwalbach

Course description

Computational text analysis is a fast-growing and widely used methodological field that allows researchers from a wide range of theoretical backgrounds to structure and study large text corpora in depth. The workshop introduces key concepts, standard methods, and the research logic underlying computational text analysis. It is designed for researchers with little or no experience in computational text analysis who want to analyze text data for their own research.
Participants will learn to preprocess their text data and to apply and validate basic supervised and unsupervised methods, such as dictionaries (e.g., for sentiment analysis), classification methods (e.g., support vector machines or random forests), and clustering methods (e.g., topic models), on their own text data using the programming language R. Participants will also learn about the benefits and pitfalls of research designs based on the quantitative analysis of large text corpora, including ethical issues, potential biases, and the implications of choices regarding document selection and preprocessing. After the course, participants will be able to choose appropriate designs for their own research questions and apply standard methods of text analysis in R.

Target group

Participants will find the course useful if:
  • they want to work with text data in their research for the first time.
  • they want to refresh prior basic knowledge in text analysis.
  • they want to learn how to preprocess text data for quantitative analyses.
  • they want an overview of standard computational approaches to analyzing text data.

Learning objectives

By the end of the course, participants will:
  • be able to design and execute their own research based on large text corpora.
  • be able to preprocess, analyze, and visualize text data using R.
  • understand the principles of supervised and unsupervised methods for computational text analysis.
  • be able to independently familiarize themselves with more advanced methods.

Organizational structure of the course

The workshop contains both lectures and practical exercises. The lectures aim at providing a general understanding of the research logic and methods. The practical exercises deepen the participants' understanding of the methods and teach them how to apply the methods to their own data. During the exercises, the lecturers will be available for support, troubleshooting, and questions. Participants can bring their own data or work with data provided by the lecturers.


Prerequisites

  • Basic knowledge of R is a prerequisite. The course cannot provide a general introduction to R. Participants should know how to import data (e.g., CSV files), install packages, and work with objects in R.

Software requirements

R (at least version 4.0.0) and RStudio. Prior to the workshop, participants will receive an R script to install all required packages.


Recommended readings