GESIS Training Courses

Scientific Coordination

Verena Kunz

Administrative Coordination

Claudia O'Donovan-Bellante

Explainable AI and Fair Machine Learning with R and Python

About
Location:
Online via Zoom
 
General Topics:
Course Level:
Format:
Software used:
Duration:
Language:
Fees:
Students: 330 €
Academics: 495 €
Commercial: 990 €
 
Keywords
Additional links
Lecturer(s): Paul Bauer, Lion Behrens

About the lecturer - Paul Bauer

About the lecturer - Lion Behrens

Course description

In today's rapidly evolving research landscape, machine learning has emerged as an indispensable tool, enabling new insights and efficiencies across diverse domains. As social scientists have begun to use machine learning tools in their research, the need to understand and explain the complex behavior of seemingly black-box machine learning models has grown considerably. Why did a model arrive at its prediction for a particular data instance? Which variables were most important for the model's behavior globally, and how do they interact? Does the model show unfair and discriminatory behavior towards vulnerable groups?
 
This workshop introduces the field of interpretable machine learning, often also referred to as explainable AI (xAI), with R and Python. Throughout the workshop, we will build a solid conceptual understanding and gain hands-on experience in applying the most popular xAI techniques: SHAP, LIME, Counterfactual Explanations, Permutation Feature Importance, Partial Dependence Plots (PDP), and Individual Conditional Expectation (ICE) plots. In addition, we will get to know techniques designed not only to understand models' behavior in general, but also to assess fairness, bias, and discrimination. The workshop focuses on supervised machine learning models for tabular data. During the coding labs, participants can individually decide whether to implement the discussed techniques in R or in Python.
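As a rough illustration of one of the techniques named above, Permutation Feature Importance, the following is a minimal sketch in plain Python. It is not taken from the course material: the function name `permutation_importance`, the toy data, and the `predict` stand-in for a fitted model are all assumptions made for illustration. The idea is to shuffle one feature column at a time and measure how much the model's error increases.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error when one feature column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle feature j only; all other columns stay untouched
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy data: the outcome depends on feature 0 only
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def predict(row):
    # Stand-in for a fitted model (assumption): uses feature 0, ignores feature 1
    return 3.0 * row[0]


importances = permutation_importance(predict, X, y)
print(importances)
```

In this toy setup, shuffling feature 0 should raise the error noticeably, while shuffling feature 1 should leave it unchanged because the stand-in model never uses it. In practice one would compute the importance on held-out data with a fitted model rather than a hand-written function.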
 
Organizational Structure of the Course
Each day is structured around one three-hour block in the morning and one three-hour block in the afternoon. Each block consists of 90 minutes of conceptual lectures and discussions of the material as well as 90 minutes of hands-on lab sessions. During the labs, participants can individually decide whether they prefer to work in R or Python. During the lab sessions, lecturers will be available to support participants, to facilitate discussions about the application of the material, and for individual consultations on participants' projects as long as they relate to the application of explainable AI and machine learning in general.


Target group

You will find the course useful if:
  • you are a regular R or Python user who is interested in learning about explainable/interpretable machine learning/AI,
  • you are interested in applying, or are already applying, machine learning methods in your projects, or
  • you are interested in using machine learning methods for your academic research or in your non-academic career.


Learning objectives

By the end of the course, you will:
  • be able to categorize different methods from explainable/interpretable machine learning/AI into distinct groups
  • understand the most prominent approaches used in the field of explainable/interpretable machine learning/AI
  • be able to critically assess these approaches, discuss their advantages and disadvantages, and be able to reflect on the scenarios in which these approaches will fail to provide valid insights
  • be able to apply the discussed material in projects that use supervised machine learning models on tabular data using either R or Python
  • be able to visualize the insights gained about the behavior of machine learning models using explainable/interpretable machine learning/AI methods


Prerequisites

If you have ever fitted a machine learning model in R or Python, you have the necessary prerequisites for this course. As a refresher, we will briefly re-introduce basic ML concepts during the workshop.

Specifically, this means:
  • Basic knowledge of the machine learning workflow, including train/test splits, cross-validation, and model evaluation
  • Knowledge of at least one machine learning model suitable for tabular data for which explainable AI methods are needed (for instance Ridge/Lasso, Regression and Classification Trees, Random Forests, Gradient Boosting, Deep Neural Networks)
  • Understanding of applied data analysis with either R or Python, regardless of coding style (base R vs. tidyverse) or IDE (RStudio, Spyder, VS Code, PyCharm)
  • Basic knowledge of introductory statistics, including regression analysis (e.g. logistic regression) and frequentist methods


Software and Hardware Requirements
The workshop is based on the open-source programming languages R and Python; participants can individually choose between them. We follow the principles of 'Open Data,' 'Open Code,' and the integration of narrative text and code (no commercial software is needed). If you choose R, please install R and RStudio before the workshop; if you choose Python, please install Python and an IDE of your choice (e.g. Spyder, VS Code). Participants will receive an email with further installation instructions before the workshop (e.g., regarding required R and Python packages).


Schedule

Recommended readings