GESIS Training Courses

Scientific Coordination

Sebastian E. Wenz
Tel: +49 221 47694-159

Administrative Coordination

Jacqueline Schüller
Tel: +49 221 47694-160

Course 3: Data Science Techniques for Survey Researchers

Online via Zoom
Course duration: 5 days
10:00-13:00 CEST
14:30-17:30 CEST
Students: 500 €
Academics: 750 €
Commercial: 1500 €
Lecturer(s): Dr. Christoph Kern, Dr. Malte Schierholz

Course description

Please note: This class is taught online only, live and in real time. Recordings will not be available.
A variety of digital data sources are providing new avenues for empirical social science research. To effectively utilize these data for answering substantive research questions, a modern methodological toolkit paired with a critical perspective on data quality is needed. Organized and offered in collaboration with BERD@NFDI, this course introduces state-of-the-art data science techniques that are suited for collecting and analyzing digital behavioral data, so-called "big data", and traditional survey data. In addition, aspects of data quality and error frameworks for digital (big) data sources are discussed.
The course will cover the following topics and techniques:
  • Overview of Big Data: What is it and why does it matter?
  • Total Survey Error for Big Data
  • Git and GitHub
  • Web Scraping
  • Databases and SQL
  • Data quality for gathered data types
  • Sampling from online material (e.g., Twitter)
  • (Supervised) Machine Learning for Social Scientists, including:
      • Regularized Regression
      • Decision Trees and Random Forest
      • Boosting
      • Applications
  • Working with textual data: Text Mining and Topic Models
    After the course, participants will have a profound understanding of important methods from the data science toolkit for collecting and analyzing the data types mentioned. They will be able to apply these methods and techniques in their research using statistical software.
    For additional details on the course and a day-to-day schedule, please download the full-length syllabus.

    Target group

    Participants will find the course useful if:
  • they are interested in learning some fundamental techniques in data science
  • they want to collect and work with digital behavioural data, be it administrative data or data found online
  • they want to understand what (supervised) machine learning is

    Learning objectives

    By the end of the course participants will:
  • Understand the challenges when analyzing digital behavioural data
  • Be familiar with some of the software tools commonly used to analyze such data
  • Know additional data science tools and techniques
  • Know the promises and benefits of (supervised) machine learning
  • Be able to use (supervised) machine learning for data analysis
  • Be able to use common routines for analyzing textual data
  • Learn some of the metrics used to assess data quality for gathered data types
  • Learn about two different approaches to sampling from Twitter
    Organizational Structure of the Course
    The course is partly theoretical, partly practical. Each topic will be introduced in a lecture. The best way to deepen one's understanding is with practical hands-on exercises. Files written in R Markdown will be provided to help participants execute the prepared scripts on their own computer and complete the assignments. The instructors are available to assist and answer questions during the practical sessions.


    Prerequisites
  • general knowledge of statistics and statistical modelling (i.e., regression)
  • prior experience with syntax-based software (like R, Stata, or Python)
    Some basic experience with programming in R is helpful, but not strictly necessary. For those without prior exposure to R, we will ensure everyone is able to execute R Markdown files. Students without any R knowledge are encouraged to work through one or more R tutorials prior to the course. Some resources can be found here:
    Software and Hardware Requirements
    We will use R for the practical sessions. Please have R and RStudio installed on your computer. Both programs are free and open source. We will inform you a few days before the course starts about recommended steps to set up your system. You should be able to access the internet and install additional packages during the course.
    Scholarships sponsored by BERD@NFDI
    Anyone who participates in the online course “Data Science Techniques for Survey Researchers” at the GESIS Summer School 2022 and is eligible for the student rate may apply for one of six available scholarships funded by BERD@NFDI. The scholarship works as a fee waiver and is worth EUR 500. If you apply for a scholarship funded by the DAAD, you cannot apply for a scholarship funded by BERD@NFDI. However, you may simultaneously apply for an ESRA-sponsored scholarship for a different course. Calls for all scholarship programs are available here.
    Applications will be evaluated based on academic and social criteria, in that order. To apply, please submit the following documents as one PDF file via our online application form below:
  • A brief cover letter in which you explain
      • why it is important to you and your work to participate in the online course “Data Science Techniques for Survey Researchers” at the GESIS Summer School 2022; and
      • your need for a scholarship.
  • A brief up-to-date CV including a list of publications (if applicable).
    Both documents must be submitted via our online application form below as one PDF file by 15 May 2022 (German time). Applicants will be informed by the end of May.
    Before applying for this scholarship, please register to secure a place in the course “Data Science Techniques for Survey Researchers”. You will not have to pay any course fees. You may cancel your registration after learning about our decision but no later than four weeks before the course. Please note that we will have to charge you the full student rate if you do not successfully participate in the course (i.e., if you miss more than 20% of course time).
    Should you not be granted a scholarship and wish to cancel your participation in the GESIS Summer School, you may send a cancellation request. You will receive a full refund if you cancel up to four weeks before the course starts (see terms and conditions).
    All questions may be sent to
    Submit your application via our online application form