GESIS Training Courses

Scientific Coordination

Sebastian E. Wenz
Tel: +49 221 47694-159

Administrative Coordination

Loretta Langendörfer M.A.
Tel: +49 221 47694-143

Causal Machine Learning for Cross-sectional and Panel Data

About
Location:
Cologne / Unter Sachsenhausen 6-8
 
General Topics:
Course Level:
Format:
Software used:
Duration:
Language:
Fees:
Students: 550 €
Academics: 825 €
Commercial: 1650 €
 
Lecturer(s): Martin Spindler, Jannis Kück

About the lecturer - Martin Spindler

About the lecturer - Jannis Kück

Course description

Participants in this course will learn and apply recent Causal Machine Learning methods to analyze effects in cross-sectional or panel data. Causal Machine Learning combines the field of machine learning, which was developed for prediction and is based on correlation, with the field of causal inference. The course focuses on the so-called Double Machine Learning (DML) approach, which allows for valid inference in high-dimensional settings.
This course focuses on tools that practitioners can easily implement in R and Python. It covers three blocks:
  • Basics of causal inference
  • Basics of machine learning
  • Double Machine Learning (DML) for cross-sectional and panel data (including difference-in-differences, instrumental variables, and mediation analysis)
    The final day will also give participants the opportunity to discuss their own work and applications (if requested).
    The course rests on three pillars for teaching the new methods: (i) lecture-based introduction of the theoretical concepts, (ii) hands-on examples / notebooks provided by the lecturer for getting to know the methods, (iii) supervised application to provided datasets or participants' own datasets.
     
    For additional details on the course and a day-to-day schedule, please download the full-length syllabus.
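To make the Double Machine Learning idea from the course description more concrete, here is a minimal sketch of the partialling-out estimator with cross-fitting for a partially linear model. It is not course material: the simulated data, the function names (`fit_line`, `dml_plr`, `simulate`), and the use of a simple least-squares line as a stand-in for a machine learning learner are all assumptions made for illustration. In practice the nuisance functions would be fitted with learners such as Lasso or random forests, for example via the DoubleML packages.

```python
import random
import statistics


def fit_line(x, y):
    """Fit y ~ a + b*x by least squares (stand-in for an ML learner)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return lambda xi: a + b * xi


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Partialling-out DML estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]

    y_res = [0.0] * n
    d_res = [0.0] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Nuisance fits use only the training folds (cross-fitting).
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])
        # Residuals are computed on the held-out fold.
        for i in test:
            y_res[i] = y[i] - l_hat(x[i])
            d_res[i] = d[i] - m_hat(x[i])

    # Regress outcome residuals on treatment residuals (no intercept).
    theta = sum(u * v for u, v in zip(y_res, d_res)) / sum(v * v for v in d_res)

    # Standard error based on the orthogonal score (y_res - theta*d_res)*d_res.
    j = sum(v * v for v in d_res) / n
    psi_sq = sum(((u - theta * v) * v) ** 2 for u, v in zip(y_res, d_res)) / n
    se = (psi_sq / j ** 2 / n) ** 0.5
    return theta, se


def simulate(n=2000, theta=0.5, seed=42):
    """Simulated data (an assumption): x confounds both d and y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate()
theta_hat, se = dml_plr(y, d, x)
print(f"theta_hat = {theta_hat:.3f} (se = {se:.3f}), true theta = 0.5")
```

The design choice worth noting is the sample splitting: each observation's residuals come from nuisance models that were fitted without that observation, which is what permits flexible learners without invalidating inference on the treatment effect.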


    Target group

    Participants will find the course useful if:
  • They are familiar with the basics of causal inference and regression analysis and are curious about how machine learning methods could enter their empirical toolbox.
  • They work with experimental and observational data in social science or related fields.
  • They want to learn and understand the new field of Causal Machine Learning, in particular Double Machine Learning.


    Learning objectives

    By the end of the course participants will:
  • Understand popular methods that are likely to appear in future studies they consume.
  • Know in which settings and for which research questions the current state of Causal Machine Learning provides attractive alternatives to standard tools.
  • Be able to apply Causal Machine Learning in basic settings.
  • Have the background knowledge to learn about Causal Machine Learning methods for more complex settings that are not covered in the course.
  • Understand the Double Machine Learning approach.

    Organizational structure of the course
  • During lab time, participants will apply the methods introduced in the morning session to synthetic and real datasets. The lecturer will provide a suggested workflow for the analysis. Participants are encouraged to bring their own datasets if these come from a research design that is covered in this course.
  • The lecturer will support participants' work on the datasets and is available for questions as well as for individual consultations on participants' projects.


    Prerequisites

  • Basic understanding of probability theory (conditional expectations) and regression analysis (OLS)
  • Basic understanding of causal research designs, in particular randomized experiments and observational designs that control for confounding factors
  • Basic experience with the software R or Python
  • (Not required, but an advantage:) Basic understanding of machine learning methods, in particular shrinkage methods (e.g., Lasso, Ridge) and tree-based methods (regression trees, random forests)

    Software and hardware requirements
    Course participants will need to bring a laptop with the latest versions of R (https://cran.r-project.org/) and RStudio (https://www.rstudio.com/) as well as Python installed. All three programs are freely available for download and use. Please install the DoubleML packages for both R and Python: https://docs.doubleml.org/stable/index.html
     
    Participants will need to be able to download files from the internet (free Wi-Fi is provided by GESIS) and must have permission to install packages on their laptops during the course.
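As a quick way to check the Python side of the setup described above before the course starts, the following sketch reports whether required packages can be found in the current environment. The helper name `missing_packages` is hypothetical, and the sketch assumes the Python DoubleML package is importable under the name `doubleml`.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: the Python DoubleML package is importable as "doubleml".
required = ["doubleml"]
missing = missing_packages(required)

print("Python version:", sys.version.split()[0])
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If the package is reported as missing, the installation instructions linked above apply.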


    Recommended readings