GESIS Training Courses

Scientific Coordination

Dr. Marlene Mauk
Tel: +49 221 47694-579

Administrative Coordination

Noemi Hartung
Tel: +49 621 1246-211

Introduction to Computational Social Science with R

About
Location: Mannheim B6, 4-5
Course duration: 10:00-17:00 CEST
Fees:
Students: 500 €
Academics: 750 €
Commercial: 1500 €
 
Lecturer(s): Aleksandra Urman, Max Pellert

Course description

The course provides an overview of the methods used in computational social science (CSS) and their real-world applications. It combines theoretical explanations of different methods with hands-on practical exercises in which participants apply the discussed techniques in R. The course is aimed at participants with little or no experience with computational methods. Topics include web scraping, foundations of computational text analysis, data visualization, and ethical aspects of CSS.

The course takes place in person and consists of a combination of lectures and practical exercises. By the end of the course, each participant will have practical experience in R in retrieving web data, applying basic text analysis techniques to it, and visualizing the results. Participants gain this experience through supervised practical exercises and through group projects on which they work semi-independently, with guidance from the lecturers, throughout the course.

To make full use of the course, participants should know the very basic concepts of programming in R (for example, writing a loop, reading in a CSV file, and working with data types such as a data.frame). A self-assessment test is linked below (see Prerequisites), along with several online crash courses covering these basics. Participants who have never worked with R, or who have only very limited experience with it, are expected to work through some of those materials before the course.
 
For additional details on the course and a day-to-day schedule, please download the full-length syllabus.


Target group

Participants will find the course useful if:
  • they are social scientists with experience using R but little or no experience with computational methods who would like to learn more about methods such as digital trace data collection, web scraping, and computational text analysis, and potentially use them in their research


Learning objectives

By the end of the course, participants will:
  • be able to define what constitutes the field of computational social science and know which methodologies are commonly utilized in the field as well as which types of research questions can be handled using these methodologies
  • be familiar with the major ethical aspects of conducting computational social science research
  • have hands-on experience gathering digital trace data from online sources through direct web scraping and APIs using R
  • know about the basic computational text analysis methods and have practical experience utilizing some of them using R
  • be able to visualize their data using various techniques in R
  • be equipped to use provided pointers to advanced materials to further improve their skills
  • have hands-on experience realizing their own small project using computational methods

    Organisational Structure of the Course
    The course consists of a combination of lectures and practical hands-on lab sessions that take place in person. The lab sessions have two components. The first is a set of scripted practical exercises on a specific topic, through which the lecturers guide the participants. The second is a semi-independent group project in which participants apply the skills gained from the different topics covered in the course. Throughout this project, participants are supported through individual and (for group projects) group consultations with the lecturers.


    Prerequisites

  • basic knowledge of R. If you are unsure whether your R knowledge is sufficient, take the self-assessment test we prepared for you. If the test proves too difficult, we have also included links to several free online R crash courses that you should work through to prepare for our course: https://seafile.ifi.uzh.ch/f/63542a5ab4be4d37846d/
  • working command of English
  • knowledge of basic statistics (distributions, correlation)
  • basic programming knowledge (variables, loops, conditions) in R (see the self-assessment test above)
  • For those who would like a primer or refresher in R, we recommend taking the online workshop “Introduction to R” that takes place from 05-07 September 2023.
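
As a rough indication of the level of R assumed by the prerequisites above (reading in a CSV file, working with a data.frame, writing a loop), here is a minimal sketch in base R. The file contents and the column names `user` and `posts` are made up for illustration and are not taken from the course materials or the self-assessment test.

```r
# Hypothetical example data, written to a temporary CSV file
# so that the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("user,posts", "anna,3", "ben,5", "cara,2"), csv_path)

# Read the CSV file into a data.frame.
posts <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the structure and the data type of each column.
str(posts)

# Write a loop: add up the values in the `posts` column row by row.
total <- 0
for (i in seq_len(nrow(posts))) {
  total <- total + posts$posts[i]
}
print(total)
```

If each of these steps looks familiar, your R knowledge is probably in the right range; the self-assessment test remains the reference for deciding.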

    Software and hardware requirements
    Participants should bring their own laptop for use in the course.
    All participants should have R and RStudio installed on their laptops; ideally, R should be updated to the latest version. We will let participants know which specific packages to install shortly before the course and, if necessary, will help with package installation problems on Day 1 of the course. The lecturers are most familiar with Linux environments (e.g., Ubuntu or Debian) for running R and RStudio, but they can also provide support for Windows and macOS.
     
    Agenda  
    Monday, 11.09.
    Morning Session
  • On Day 1, we will start with a lecture that presents an overview of the field of computational social science and its development, including examples of CSS research in different subdomains (e.g., CSS research focusing on political processes, economic phenomena, communication science, etc.). We will also provide background on the ethical conduct of CSS research and give participants practical guidelines on how to make sure their research adheres to ethical (and legal) standards.
  • Group task: each group of participants will receive a case study about ethics in CSS. Within this task, the participants will need to evaluate different CSS study designs from an ethical standpoint and propose ways to mitigate potential harm. Before lunch, each group will have ~45 minutes to get familiar with the study and start discussing it. After lunch, the groups will first finish their discussions and then briefly present their outcomes (the details and guidelines will be provided during the course).
    Afternoon Session
  • Group task: participants continue discussing the case studies and prepare short summaries of their discussions to present those to other groups.
  • Group presentations on the group task, joint discussion
  • Lecture on Digital Trace Data that will provide the participants with background information on what digital trace data is and how it can be leveraged within CSS. This will serve as the basis for the next day's practical lectures and assignments. The lecture is followed by a short introduction to the Group Project work that will take place during the course.
  • Participants start discussing ideas for the final group projects
    Tuesday, 12.09.
    Morning Session
  • Hands-on tutorial on the use of APIs for data collection using R
  • Practical task where participants use R to collect data from an API based on the tutorial
  • Solutions to the practical task are presented
    Afternoon Session
  • Hands-on tutorial on the use of web scraping for data collection using R
  • Practical task where participants use R to collect data using web scraping based on the tutorial
  • Solutions to the practical task are presented
  • Group work further developing ideas/potentially starting to collect the data for the final project
    Wednesday, 13.09.
    Morning Session
  • During Day 3, we will give the background on contemporary methods commonly employed for computational text analysis. We will discuss and showcase several different methods for sentiment analysis and talk about how to validate results. We will provide participants with practical skills in foundational text analysis methods based on the bag-of-words approach, such as frequency analysis, co-occurrence analysis, and LDA topic modeling. We will also give participants an overview of more advanced methods that they might want to explore if they are interested in the topic.
    Afternoon Session
  • Participants will do practical exercises on automated text analysis. In project groups, they will further develop their ideas and decide how to address the question(s) they are interested in using the computational text analysis methods they learned on Day 3.
    Thursday, 14.09.
    Morning Session
  • We will cover the basics of data visualization using R, including different types of plots and diagrams, with a focus on the ggplot2 package. We will also point to more advanced visualization techniques in R, such as interactive graphs (plotly), for participants who are interested in the topic.
    Afternoon Session
  • Participants will do practical exercises on data visualization using R. They will further develop their project ideas and come up with ways to visualize their group project results.
    Friday, 15.09.
    Morning Session
  • In the morning, the participants will keep working on the projects they started during the previous practical sessions, and finalize them (under the guidance of the course instructors).
    Afternoon Session
  • In the afternoon, the participants will present their group projects.
  • Optional: Participants can choose to take part in a short session giving an overview of more advanced packages and methods for data analysis in R (like “data.table”) and a quick introduction to version control with git.
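
To give a flavour of the bag-of-words frequency analysis listed for Day 3 of the agenda, here is a minimal sketch in base R. The two example sentences are made up, and the tokenization (lowercasing and splitting on anything that is not a letter a-z) is a simplifying assumption; the course itself may use dedicated text analysis packages and different preprocessing.

```r
# Two hypothetical documents.
docs <- c("The cat sat on the mat", "The dog sat on the log")

# Tokenize: lowercase, then split on any run of non-letter characters.
tokens <- strsplit(tolower(docs), "[^a-z]+")

# Bag of words: ignore word order and count how often each term occurs
# across all documents, sorted from most to least frequent.
freq <- sort(table(unlist(tokens)), decreasing = TRUE)
print(freq)
```

Counting terms while discarding their order is the core idea behind the bag-of-words approach; co-occurrence analysis and topic models build on the same kind of term counts.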