GESIS Training Courses

Scientific Coordination

Verena Kunz

Administrative Coordination

Noemi Hartung
Tel: +49 621 1246-211

Introduction to Computational Text Analysis with R

Online via Zoom
Students: 300 €
Academics: 450 €
Commercial: 900 €
Lecturer(s): Marco Wähner, Lena Masch

Course description

Computational text analysis is a fast-growing method used in a wide range of research fields: A computer scientist might ask how to extract information from unstructured text data, a communication scientist might want to detect hate speech, and a political scientist might be interested in comparing party manifestos.
The workshop introduces key concepts and methods of quantitative computational text analysis using the programming language R, which will allow researchers to analyze large quantities of text data (“big data”) in an efficient and automated way. It is aimed at those who have little or no prior experience with computational text analysis but want to use text data in their own research.
Participants will learn about common pipelines for computational preprocessing of text, such as importing and cleaning text data as well as creating corpora and extracting features (e.g., word counts or sentiments). In addition, we will analyze and visualize text data using different methodological approaches, e.g., supervised and unsupervised machine learning. To this end, the workshop will provide hands-on exercises using R to study different text data sources (e.g., text data from social media). Participants can also work with data from their own research. Each session will consist of a short introduction by the lecturer followed by hands-on exercises.
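As a rough illustration of the preprocessing pipeline described above (cleaning text, tokenizing, and extracting word counts as features), here is a minimal sketch in base R. It is not taken from the course materials: the toy corpus, the helper name `tokenize`, and the object names are assumptions made for this example, and in practice dedicated text analysis packages would typically handle these steps.

```r
# Toy corpus, invented for illustration (not course data)
docs <- c(
  d1 = "Text analysis with R is fun!",
  d2 = "R makes text cleaning easy.",
  d3 = "Cleaning text, counting words."
)

# Hypothetical helper: lowercase, replace punctuation with spaces,
# then split each document on runs of whitespace
tokenize <- function(x) {
  cleaned <- gsub("[[:punct:]]+", " ", tolower(x))
  tokens <- strsplit(trimws(cleaned), "[[:space:]]+")
  setNames(tokens, names(x))
}

tokens <- tokenize(docs)

# Vocabulary: all distinct tokens in the corpus
vocab <- sort(unique(unlist(tokens)))

# Document-feature matrix of raw word counts (rows = documents, columns = words)
dfm <- t(vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = vocab))),
  integer(length(vocab))
))
colnames(dfm) <- vocab

# Total frequency of each word across the corpus, most frequent first
word_totals <- sort(colSums(dfm), decreasing = TRUE)
print(head(word_totals, 3))
```

The resulting matrix `dfm` is the kind of numeric representation that later steps, such as supervised or unsupervised machine learning, usually take as input.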

Target group

Participants will find the course useful if:
  • they want to work with text data in their research for the first time.
  • they want to refresh their basic knowledge of text analysis.
  • they want to learn how to preprocess and represent text data for quantitative analyses.
  • they want an overview of current computational approaches to analyzing text data.

Learning objectives

By the end of the course, participants will:
  • be able to preprocess, analyze, and visualize text data using R.
  • understand the principles of different approaches to computational text analysis.
  • be able to independently familiarize themselves with more advanced methods of computational text analysis.

Organizational structure of the course

The workshop is divided into lectures and practical exercises. While the lectures provide an overview of key terms and concepts, the hands-on exercises involve individual text analysis tasks. During the exercises, the lecturer will be available for support and troubleshooting. Participants can work with their own data or with the data provided by the lecturer.


Prerequisites

Basic knowledge of R is a prerequisite, as the course cannot provide a general introduction to the programming language. Participants should know how to import data (e.g., CSV files), install packages, and work with data objects in R.

Software requirements

R (version 4.0.0 or later) and RStudio. Prior to the workshop, participants will receive an R script to install all required packages.


Recommended readings