GESIS Training Courses

Scientific Coordination

Sebastian E. Wenz
Tel: +49 221 47694-159

Administrative Coordination

Angelika Ruf
Tel: +49 221 47694-162

Course 5: Statistical Analysis of Incomplete Data

Online via Zoom / Time: 09:30-12:30 & 15:00-17:00 (CEST)
Students: 400 €
Academics: 600 €
Commercial: 1200 €
Lecturer(s): Dr. Florian Meinfelder, Angelina Hammon

Course description

[This is a 30 hour class.]
This course introduces the theory and application of Multiple Imputation (MI) (Rubin 1987), which has become a very popular way of handling missing data because it allows for correct statistical inference in the presence of missing data. With MI algorithms now implemented in standard statistical software (R, SAS, Stata, SPSS, …), the method has become more accessible to data analysts. For didactic purposes, we start with some naive ways of handling missing data and examine their weaknesses to build an understanding of the MI framework. The first day of the course is somewhat theoretical, but we believe that a fundamental understanding of the MI principle helps participants adapt to a wider range of practical problems than a focus on a few select situations would. We then shift to the more practical aspects of statistical analysis with missing data and address frequent problems such as regression with missing data. Further examples, predominantly based on the statistical language R, are covered throughout the course. We recommend basic R skills, but the course contents can be understood without prior knowledge of R, as the main MI algorithms are almost identical across all major software packages.
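
To make the MI principle concrete, here is a minimal sketch of the combining rules from Rubin (1987): an analysis is run on each of the m completed data sets, and the m estimates and their variances are then pooled into a single inference. The function name `pool_rubin` and the example numbers are hypothetical and only serve as an illustration. Python is used here to keep the arithmetic self-contained, whereas the course itself works predominantly in R.

```python
from math import inf, sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates of a scalar parameter (Rubin 1987).

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")

    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between

    # Degrees of freedom of the reference t distribution (large-sample version).
    if between == 0:
        df = inf
    else:
        ratio = (1 + 1 / m) * between / within
        df = (m - 1) * (1 + 1 / ratio) ** 2

    return {
        "estimate": q_bar,
        "within_var": within,
        "between_var": between,
        "total_var": total,
        "se": sqrt(total),
        "df": df,
    }


if __name__ == "__main__":
    # Hypothetical regression coefficients from m = 3 imputed data sets.
    pooled = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
    for name, value in pooled.items():
        print(f"{name}: {value:.4f}")
```

The between-imputation component is what naive single imputation ignores: it carries the extra uncertainty caused by the missing values into the pooled standard error.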

Target group

Participants will find the course useful if they:
  • are survey methodologists working with incomplete data;
  • are researchers who want to learn more about the analysis of incomplete data in general;
  • are already aware of MI and its benefits, but feel uncomfortable about the available parameter settings in MI algorithms implemented in their preferred statistical software.

Learning objectives

By the end of the course, participants will:
  • be familiar with the theoretical implications of the MI framework and be aware of its explicit and implicit assumptions (e.g. be able to explain in an article why missing at random (MAR) was assumed);
  • know when to use MI (and when not to);
  • know how to specify a "good" imputation model and how to use diagnostics;
  • be familiar with the various MI algorithms available;
  • be able to not only replicate situations akin to the case studies covered in the course, but also know how to handle incomplete data in general.

Organizational Structure of the Course:
We aim to intersperse teaching with many breaks for doodling and trying out ideas on your own. To prevent fatigue (on your side and ours), uninterrupted lecture-style teaching will last no longer than about an hour before we 'break out'. In total, the pure teaching part will amount to at most 4 hours a day; including lab and break-out parts, the total course time will be about 6 hours per day.
We will also prepare some shorter videos covering fundamentals that you can download and watch anytime. Course notes and other material (videos, R Markdown documents, …) will be made available via ILIAS.


Prerequisites

  • general knowledge of data preparation and data analysis;
  • an advanced understanding of the (generalized) linear model;
  • familiarity with statistical distributions;
  • basic knowledge of matrix algebra (helpful);
  • solid skills in either R, SPSS, or Stata (recommended for exercises).

Software and Hardware Requirements:
We recommend that participants install Zoom (www.zoom.us) as well as R and RStudio (from www.r-project.org/ and www.rstudio.com/; install R first, then the RStudio editor) on the computer or notebook they plan to use for the online course. If you feel comfortable enough with R, feel free to install the VIM and mice packages in advance.
We also recommend a multi-display setup if your OS supports it, so that you can work in R/RStudio on one screen while the Zoom functions (Zoom controls, tiles, chat, …) are displayed on another. We annotate the course slides during lectures, so we recommend that you either have a printout of the course notes at hand or use a touch screen and software for annotating PDF files.