GESIS Training Courses

Scientific Coordination

Verena Kunz

Administrative Coordination

Janina Götsche

Workflows for Reproducible Research with R & Git

Online via Zoom
Students: 200 €
Academics: 300 €
Commercial: 600 €
Lecturer(s): Johannes Breuer, Bernd Weiß, Arnim Bleier

Course description

The workshop focuses on reproducible research in the quantitative social and behavioral sciences. In the context of this workshop, reproducibility means that other researchers can fully understand and rerun your data preparation and statistical analyses. The workflows and tools covered will also make your own work easier, as they allow you to automate and track analysis and reporting tasks.

In addition to a conceptual introduction to the methods and key terms of reproducible research, the workshop covers procedures for maximizing the reproducibility of data analyses in R. After discussing essential definitions and dimensions of reproducibility, we will cover some basics of computer literacy and project organization that are helpful for conducting reproducible research (e.g., folder structures, naming schemes, and command-line interfaces). We will then turn to version control, dependency management, and computational reproducibility. The tools we will use include Git and GitHub, R packages for dependency management, and Binder, a tool for packaging and sharing reproducible and interactive analysis environments.

Target group

The workshop is targeted at participants who have (at least some) experience with R and want to learn (more) about workflows and tools for making the results of their research reproducible.

Learning objectives

By the end of the workshop, participants should
  • have gained insights into key concepts of reproducible research and recommended best practices
  • be able to work with frameworks and tools that can be used for maximizing reproducibility, such as Git, packages for dependency management, or Binder
  • be able to publish reproducible computational analysis pipelines with R

Prerequisites

Participants should have some basic knowledge of R and RStudio (e.g., installing and loading packages, importing different data types, basic data wrangling and analyses). Beyond that, the only prerequisite is that you should not be afraid to use your keyboard more often than your mouse. To make it easier to apply the methods covered in the workshop to their own work, we recommend that participants install all necessary software on their own computers. Information on how to do this will be shared with all registered participants prior to the workshop.

Organizational structure of the course

The workshop is structured into segments of instructive lectures and interactive hands-on sessions. During the interactive sessions, participants will, e.g., create Git repositories, use GitHub, generate interactive copies of (different parts of) computational environments in R, and publish those using Binder. The lecturers will be available for support during the hands-on segments and can also consult on participants' projects.

Software requirements

All software used in the workshop is free and open source and is available for Windows, macOS, and Linux. Detailed instructions for installing and setting up the required software will be provided before the start of the workshop.

Thursday, 16.11.

  • Introduction: What is reproducible research?
  • Computer literacy
  • 12:00-13:00 Lunch Break
  • Git & GitHub I
  • Git & GitHub II (incl. Git & RStudio)
  • Q & A

Friday, 17.11.

  • Recap first day
  • Dependency management
  • Saving computational environments
  • 12:00-13:00 Lunch Break
  • Notebooks & Binder
  • Build your own Binder
  • Recap & Outlook
