GESIS Training Courses

Scientific Coordination

Sabina Haveric
Tel: +49 (0221) 47694 - 166

Administrative Coordination

Laura Rüwe

Introduction to Machine Learning for the Social Sciences

Bruno Castanho Silva

Date: 20.01 - 22.01.2020

Location: Cologne / Course language: English


Course description

Machine learning is an analytical approach in which users build statistical models that 'learn' from data in order to make accurate predictions and decisions. From customer recommendation systems (think of Netflix suggesting which movies you should watch) to policy design and implementation, machine learning algorithms are becoming ubiquitous in a big data world.
Their potential, however, is only starting to be explored in the social sciences, and only in a few specific areas. In this course, you will learn the fundamentals of machine learning as a data analysis approach and get an overview of the most common and versatile classes of ML algorithms in use today, with all practical examples in R.
By the end of the course, you will be able to identify which kind of technique is most suitable for your research question and data, and know how to design, test, and interpret your models. You will also have enough basic knowledge to move on independently to more advanced algorithms and problems. This is an introductory course, so math and programming technicalities will be kept to a minimum. If you can run and interpret multivariate regressions in R, you can (and should!) take this course.


Learning objectives

By the end of this course, you will be able to identify which kind of machine learning method best suits the data and question at hand (whether it is a classification or regression problem, supervised or unsupervised, etc.), and to fit and interpret the appropriate models covered in class. This is an introductory course for social science students with no prior knowledge of machine learning or of algorithms in general. Those who are already familiar with these methods and have used them, or who have experience with computational sciences, programming, or software development, are likely to find the basic level of teaching frustrating.


This is a beginner's course. I expect you to be familiar only with concepts and practices of traditional multivariate regression analysis.
All examples will be given in R, so you should also have a minimal working knowledge of R (alternatively, advanced knowledge of another programming language, such as Python, is sufficient if you are prepared to adapt the code and exercises on your own).
Workshop PCs with the required software packages will be provided on site. You are also welcome to bring your own laptop. To transfer data to your device, please bring a USB stick. Free WiFi access is available.


Recommended readings
