GESIS Training Courses

Scientific Coordination

Julia Leesch
Tel: +49 221 47694-169

Administrative Coordination

Claudia O'Donovan-Bellante

Multiple Imputation of Missing Data: Theory and Application in Stata

Dr. Jan Paul Heisig, Dr. Ferdinand Geißler

Date: 28.08 - 30.08.2019

Location: Mannheim B2, 8 / Course language: English

Course description

Missing data are a pervasive problem in the social sciences. Data for a given unit may be missing entirely, for example because a sampled respondent refused to participate in a survey (survey nonresponse). Alternatively, information may be missing only for a subset of variables (item nonresponse), for example because a respondent refused to answer some of the questions. The traditional way of dealing with item nonresponse, referred to as “complete case analysis” (CCA) or “listwise deletion”, excludes any observation with missing information from the analysis. While easy to implement, complete case analysis discards the information in partially observed cases and can lead to biased estimates. Multiple imputation (MI) seeks to address these issues and, if certain conditions are met, provides unbiased estimates that are more efficient than those from CCA.
The course has two goals: to introduce participants to the basic concepts and statistical foundations of missing data analysis and MI, and to enable them to use MI in their own work. It places heavy emphasis on the practical application of MI and on the complex decisions and challenges that researchers face when applying it. The focus is on MI using iterated chained equations (also known as “fully conditional specification”) and its implementation in the software package Stata. Participants should have a good working knowledge of Stata in order to follow the applied parts of the course and to complete the exercises. Participants who are not familiar with Stata may still benefit from the course, but will likely find the exercises quite challenging.


Target group

Participants will find the course useful if they:
  • use survey or other types of quantitative data and want to learn about MI as an alternative to CCA;
  • are already using MI, but want to gain a better understanding of the underlying assumptions, of current best practice recommendations, and/or of how to solve specific problems that arise in its application (e.g., imputation diagnostics, convergence problems, imputation of transformed variables such as interactions, imputation of hierarchical data).

Learning objectives

By the end of the course, participants will:
  • understand basic concepts of missing data analysis such as “missing at random”;
  • be familiar with different approaches to handling item nonresponse and with their advantages and drawbacks;
  • have a solid understanding of the main assumptions and statistical theory underlying MI and of the main steps of an analysis involving MI (imputation, diagnostics, and analysis);
  • know how to implement multiple imputation using chained equations in Stata;
  • know how to deal with various (Stata-specific and general) practical complications that arise in the application of MI using chained equations.
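
As a rough orientation, the imputation and analysis steps named above might look as follows in Stata. This is a minimal sketch under stated assumptions, not course material: it assumes that Stata's example dataset mheart5 (with the variables attack, smokes, age, bmi, female and hsgrad, of which age and bmi contain missing values) can be loaded via webuse. The imputation models, the number of imputations and the random seed are arbitrary illustrative choices.

```stata
* Assumption: Stata's mheart5 example data, in which age and bmi have missing values
webuse mheart5, clear

* Declare the data as mi data and register the variables to be imputed
mi set mlong
mi register imputed age bmi

* Imputation step: chained equations with a linear regression model
* for each incomplete variable (20 imputations, fixed seed for reproducibility)
mi impute chained (regress) age bmi = attack smokes female hsgrad, add(20) rseed(2019)

* Analysis step: fit the model in each imputed dataset and pool the results
mi estimate: logit attack smokes age bmi female hsgrad
```

The diagnostics step is omitted here for brevity. In practice, the choice of imputation models and of the number of imputations depends on the data and the analysis at hand.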


Prerequisites

  • Experience in the analysis of quantitative data
  • Good knowledge of regression analysis
  • Good working knowledge of Stata
  • Basic understanding of probability theory and sampling

