GESIS Training Courses

Scientific Coordination

André Ernst

Administrative Coordination

Janina Götsche

Introduction to Social Media as Research Data: Potentials and Pitfalls

Online via Zoom
Students: 160 €
Academics: 240 €
Commercial: 480 €
Lecturer(s): Dr. Katrin Weller, Indira Sen

Course description

The activities and interactions of hundreds of millions of people worldwide are recorded as digital traces, including social media data from platforms such as Facebook, Twitter, Instagram, Reddit and more. These data offer increasingly comprehensive pictures of both individuals and groups on different platforms, and also allow inferences about broader target populations beyond those platforms. Despite these advantages, it is essential to study the errors that can occur when digital traces are used to learn about humans and social phenomena.
In this workshop, we combine theory, data and methods to demonstrate both the pitfalls and potentials of digital traces from social media users. The workshop includes both hands-on sessions and general reflections on how to design a social media study.
In sessions spread over four days, participants will:
  • Get an overview of the field of social media research.
  • Conduct exemplary hands-on exercises in data collection, data preprocessing and data analysis.
  • Learn how error frameworks and data documentation initiatives can help them reflect on potential limitations of research designs.
  • Collaboratively draft a research design for an example case and receive feedback.
For the practical part, we will mainly work with examples from publicly available social media data, most likely from the platform Reddit. Participants can work with scripts prepared by the lecturers to collect and interact with the data and to learn about potential limitations and pitfalls. Prior programming skills are not required, and the necessary preparation instructions will be shared before the course. We will demonstrate how existing Python scripts can be run in different environments such as GESIS Notebooks or Google Colab.
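To give a rough impression of what a preprocessing script of this kind might look like, here is a minimal sketch. It is not one of the course scripts: the sample records, the field names (`author`, `body`), the list of excluded accounts and the helper functions are all assumptions made for illustration, and no data is actually collected from any platform.

```python
import re
from collections import Counter

# Hypothetical sample records. The field names loosely mirror typical
# comment data ("author", "body") but are assumptions for illustration.
SAMPLE_COMMENTS = [
    {"author": "user_a", "body": "Great thread! See https://example.org for more."},
    {"author": "[deleted]", "body": "[deleted]"},
    {"author": "user_b", "body": "I agree, great points overall."},
    {"author": "AutoModerator", "body": "Please read the rules."},
]

URL_PATTERN = re.compile(r"https?://\S+")
TOKEN_PATTERN = re.compile(r"[a-z']+")


def preprocess(comments, excluded_authors=("[deleted]", "AutoModerator")):
    """Drop deleted or automated comments and return lowercase token lists."""
    cleaned = []
    for comment in comments:
        # Filtering decision: which accounts count as "real" users?
        if comment["author"] in excluded_authors:
            continue
        # Remove URLs, then lowercase and split into simple word tokens.
        text = URL_PATTERN.sub(" ", comment["body"]).lower()
        cleaned.append(TOKEN_PATTERN.findall(text))
    return cleaned


def token_counts(token_lists):
    """Count how often each token occurs across all kept comments."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts


tokens = preprocess(SAMPLE_COMMENTS)
counts = token_counts(tokens)
print(len(tokens), "comments kept;", counts.most_common(3))
```

Even in a toy example like this, each step (excluding certain accounts, stripping URLs, the choice of tokenizer) is a design decision that can introduce error and is therefore worth documenting.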
For the theoretical part, we will draw on existing approaches that critically reflect on the possibilities and limitations of social media research, including our own approach of a Total Error Framework for Digital Traces of Humans Online (TED-On) [the first paper in the literature list]. This conceptual framework helps to identify potential sources of error in research based on digital traces, organized by the different phases of the research process such as data collection, data preprocessing and data analysis. To help understand the utility of TED-On for digital traces, we apply it to diagnose and document errors in existing computational social science studies such as Understanding Political Opinion using Twitter and Using Search Queries for Inferring Health Statistics.

Learning objectives

Participants will gain insights into:
  • typical scenarios for research based on digital trace data from the web, including their potential for social science research
  • how to critically reflect on the research design of studies based on social media or web data, and how to systematically spot and document errors in their own studies
  • how to use existing scripts for basic data collection and preprocessing tasks

Prerequisites

The course is open to participants from all disciplines but is primarily aimed at those:
  • who have some prior experience in survey research and want to learn how digital behavioral data might serve as additional data sources for their research questions
  • who have already worked with digital behavioral data from the web and want to learn about additional ways to critically reflect on research designs and their limitations.

No programming skills are required. We will provide reusable scripts and show how to run them in order to demonstrate parts of the research process in practice.


    Recommended readings
