Introduction

In this 3-hour session we introduce the concepts of workflow, openness and reproducibility. In the first part, we argue why they are important and what we as social scientists can learn from data scientists. Our main argument is that, even though complete reproducibility is often infeasible in the social sciences, we should strive to make research as reproducible as possible.

In the second part we lay out the roadmap for the rest of the workshop. Most importantly, we explain why in this workshop we make use of a particular set of tools, namely:

  1. R and RStudio (with Yihui Xie’s knitr package)
  2. Markdown language
  3. BibDesk/Mendeley
  4. Git and GitHub
  5. GNU Make

We are aware that learning a particular data analysis tool costs time, and that the choice of tool depends on idiosyncratic preferences and needs. However, in this workshop we decided to use the combination of R and RStudio for two main reasons: (i) it works best out of the box for our purposes and (ii) at the moment most researchers probably work with this combination for reproducibility (at least it gets the biggest buzz…).

Requirements

You will need several tools installed on your laptop to follow along with the workshop. Head over to the Requirements page to see how to install them if you haven’t yet.

Outcomes

After this session you should:

Slides

References

Reproducibility: