GESIS Training Courses

Scientific Coordination

Dr. Sabina Haveric
Tel: +49 (0221) 47694 - 166

Administrative Coordination

Claudia O'Donovan-Bellante
Tel: +49 621 1246-221

Workshop - Big Data: Introduction to Data Science with Python

Lecturer(s):
Dr. Fabian Flöck, Dr. Arnim Bleier

Date: 16.09 - 18.09.2019

Lecturer information - Dr. Fabian Flöck

Lecturer information - Dr. Arnim Bleier

Course content

Data Science is the interdisciplinary science of extracting interpretable and useful knowledge from potentially large datasets. Due to the rapid surge of digital trace data (often referred to as “Big Data”) in a wide range of application areas, Data Science is also increasingly used in the social sciences and humanities. In contrast to empirical social science, Data Science methods often serve purposes of exploration and inductive inference. This course aims to provide an introduction to Data Science for practitioners. In particular, we want to impart a basic understanding of the main methods and algorithms and to show how these can be deployed in practical application scenarios, focusing on the analysis of digital behavioral data found on the Web. We cover aspects of data collection, preprocessing, exploration, visualization and machine learning using basic Python and key packages such as pandas, numpy and scikit-learn.
We would like to call your attention to our symposium, which will take place after the second workshop day, on Tuesday, 17 September 2019. The topic of the discussion will be "Legal Challenges of Web Scraping in the Data Science Context". The venue will be Mannheim University. Further information and registration details will be provided shortly.


Keywords



Target group

The course is targeted at social scientists and researchers from the humanities who are interested in analyzing digital trace data.


Learning objective

Participants will learn about typical data types and structures encountered when dealing with digital behavioral data, as well as state-of-the-art data analysis methods and tools in Python. This will enable them to identify benefits and pitfalls in their field of interest and to select and appropriately apply data analysis and machine learning methods for large datasets in their own research. The knowledge obtained in this course provides a starting point for participants to investigate specialized methods for their individual research projects.


Prerequisites

Participants should be willing to study algorithmic approaches on abstract and applied levels. Previous knowledge of (i) statistics and (ii) programming in Python, another programming language (such as R or Java) or at least a scripting language (syntax code in SPSS or Stata) is very helpful for following the coursework. To ensure a common starting level among participants, it is mandatory for attendees to familiarize themselves with the most basic concepts of Python, such as variables, lists, and loops, using learning materials provided beforehand; these concepts will be refreshed at the beginning of the course. Please note that participants have to bring their own laptop for this course. All software used is available free of charge as open source for Windows, macOS, and Linux. Detailed installation instructions for the suggested development environments will be provided before the start of the course.
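
As a rough indication of the level meant by "variables, lists, and loops", the following minimal sketch may help. It is not taken from the official preparatory materials; the example texts and the names `min_length`, `posts` and `long_posts` are hypothetical and chosen only for illustration.

```python
# Illustrative only: hypothetical data, not part of the course materials.

# A variable holding a single value
min_length = 20

# A list of made-up post texts, standing in for digital trace data
posts = [
    "Data Science with Python",
    "Short post",
    "Web data can be analyzed with pandas",
]

# A loop that counts the posts with at least min_length characters
long_posts = 0
for post in posts:
    if len(post) >= min_length:
        long_posts += 1

print(long_posts)
```

Participants who can read and modify a snippet of this kind should be able to follow the refresher at the beginning of the course.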


Schedule

Recommended reading

Further information