Scientific Coordination
Sebastian E. Wenz
Tel: +49 221 47694-159
Administrative Coordination
Jacqueline Schüller
Tel: +49 221 47694-160
Course 8: Missing Data and Multiple Imputation
About
Location:
Online via Zoom
Course duration:
9:00-16:00 CEST
General Topics:
Course Level:
Format:
Software used:
Duration:
Language:
Fees:
Students: 500 €
Academics: 750 €
Commercial: 1500 €
Keywords
Additional links
Lecturer(s): Florian Meinfelder, Angelina Hammon
Course description
This online course provides an introduction to the theory and application of Multiple Imputation (MI) (Rubin 1987), which has become a very popular method for handling missing data because it allows for correct statistical inference in the presence of missing values. With MI algorithms now implemented in standard statistical software (R, SAS, Stata, SPSS, …), the method has become more accessible to data analysts. For didactic purposes, we start by introducing some naive ways of handling missing data and use their weaknesses to build an understanding of the MI framework. The first day of the course is somewhat theoretical, but we believe that a fundamental understanding of the MI principle helps participants adapt to a wider range of practical problems than focusing on a few select situations would. We will then shift to the more practical aspects of statistical analysis with missing data and address frequent problems such as regression with missing data. Further examples, predominantly based on the statistical language R, will be covered throughout the course. We recommend basic R skills, but it is possible to follow the course without prior knowledge of R, as the main MI algorithms are almost identical across all major software packages.
For additional details on the course and a day-to-day schedule, please download the full-length syllabus.
Target group
Participants will find the course useful if they:
Learning objectives
By the end of the course participants should be:
Organizational structure of the course
We aim to include many short breaks so that no lecture-style block runs longer than about an hour. In addition to the lectures, there will be several virtual lab sessions per day, giving you the opportunity to implement and practice the covered material directly. There will also be room for individual consultations on the treatment of missing data in your own projects. Course notes and other material (videos, R Markdown documents, …) will be made available via the e-learning platform ILIAS.
Prerequisites
Software and hardware requirements
Course participants will need a computer or laptop with R (https://cran.r-project.org/) and RStudio (https://www.rstudio.com/) installed. Both programs are free and open source. For the best online teaching experience, we recommend using the Zoom desktop client.
Agenda
Monday, 14.08. | |
09:00-16:00 | Introduction to Missing-Data Terminology: missing-data mechanisms; missing-data patterns. Introduction to Multiple Imputation (MI): why MI?; basic concept of MI; how to use Rubin's Rules |
Tuesday, 15.08. | |
09:00-16:00 | Implementation of MI in R: sequential regression and joint modeling; introduction to the mice package; overview of similarities and differences between the MI implementations in Stata and SPSS |
Wednesday, 16.08. | |
09:00-16:00 | Digging deeper into MI: imputation methods; analysis of multiply imputed data |
Thursday, 17.08. | |
09:00-16:00 | Empirical problems: dealing with skips and implausible values; rounded and heaped data; passive imputation and logical consistency |
Friday, 18.08. | |
09:00-16:00 | (Generalized Linear) Modelling with multiply imputed data: missing values in covariates and response variables; imputation of squares and interactions; multilevel modelling. Further Applications of MI: data fusion and split questionnaire designs; the Rubin Causal Model |
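As a preview of the pooling step listed under Monday ("how to use Rubin's Rules"), the following is a minimal sketch of how point estimates and variances from several imputed data sets can be combined. It is written in plain Python only to underline that the pooling step is software-independent; the course itself works predominantly in R. The function name `pool_rubin` and the input numbers are hypothetical and chosen for illustration, not taken from the course material.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules.

    Illustrative helper (hypothetical name), not part of any library.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with one variance each")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, t


# Made-up results from m = 3 imputed data sets, for illustration only.
est = [1.0, 2.0, 3.0]
var = [0.5, 0.5, 0.5]
pooled_estimate, total_variance = pool_rubin(est, var)
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputed data sets. This is how the uncertainty caused by the missing values enters the final standard error.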