In the social sciences, little attention is paid to which tools to use (and why they make sense)
Increasing need for openness & transparency
Why we want to give this workshop
We are mostly interested in the principles behind a good open (scientific) workflow, aware of the facts that
However, being a practical workshop we do
Inspired by the work of Kieran Healy (associate professor of sociology): Choosing Your Workflow Applications
Workflow: Progression of steps (tasks, events, interactions) that comprise a work process, involve two or more persons, and create or add value to the organization’s activities (BusinessDictionary)
Open workflow: One that enhances transparency, collaboration and reproducibility
Good scientific practice: document how you have achieved your results; this ensures
A journey of a thousand miles begins with a single step
Lao-tzu
In science consensus is irrelevant. What is relevant is reproducible results. The greatest scientists in history are great precisely because they broke with the consensus (Michael Crichton)
The data and code used to make a finding are available and they are sufficient for an independent researcher to recreate the finding (Peng, 2011)
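As a minimal sketch of what "data and code are sufficient to recreate the finding" can mean in practice, the snippet below fixes the random seed and records a fingerprint of the input data, so that an independent rerun can be compared with the original. The toy data, the function names and the seed value are assumptions made for illustration only.

```python
import hashlib
import random
import statistics


def data_fingerprint(rows):
    """Return a SHA-256 hex digest of the data so others can check they use the same input."""
    payload = "\n".join(",".join(str(v) for v in row) for row in rows)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_analysis(rows, seed=2011):
    """Bootstrap the mean of the second column with a fixed seed, so reruns give identical output."""
    rng = random.Random(seed)  # a private generator: no hidden global state
    values = [row[1] for row in rows]
    boot_means = []
    for _ in range(200):
        sample = [rng.choice(values) for _ in values]
        boot_means.append(statistics.mean(sample))
    return {
        "mean": statistics.mean(values),
        "boot_se": statistics.stdev(boot_means),
        "data_sha256": data_fingerprint(rows),
        "seed": seed,
    }


# Hypothetical toy data: (respondent id, score)
ROWS = [(1, 3.0), (2, 4.5), (3, 2.5), (4, 5.0), (5, 4.0)]

if __name__ == "__main__":
    print(run_analysis(ROWS))
```

Publishing the seed and the data fingerprint next to the result lets a reader verify that they are rerunning the same analysis on the same data.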
Literate programming (Donald E. Knuth, 1984):
Synonyms
All based on text files
Only the output is displayed/interpreted differently (e.g., in a browser or PDF viewer)
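The idea of literate programming on plain text files can be sketched as follows: prose and code live in one text source, and a small "weave" step runs the code and places its output next to it. This is only an illustration of the principle, not how knitr or the IPython notebook work internally; the chunk markers `<<code` and `>>` are hypothetical and invented for this example.

```python
import contextlib
import io

# One plain-text source that mixes prose with code chunks.
SOURCE = """\
Mean score of the pilot sample.
<<code
scores = [3, 4, 5]
print(sum(scores) / len(scores))
>>
Variables are shared across chunks.
<<code
print(len(scores))
>>
"""


def weave(text):
    """Run each code chunk and return the document with the chunk output placed after the code."""
    namespace = {}  # shared by all chunks, like one session
    out_lines = []
    chunk = None  # None while reading prose, a list while inside a chunk
    for line in text.splitlines():
        if line.strip() == "<<code":
            chunk = []
        elif line.strip() == ">>" and chunk is not None:
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)
            out_lines.extend("    " + code for code in chunk)
            out_lines.extend("    #> " + out for out in buffer.getvalue().splitlines())
            chunk = None
        elif chunk is not None:
            chunk.append(line)
        else:
            out_lines.append(line)
    return "\n".join(out_lines)


if __name__ == "__main__":
    print(weave(SOURCE))
```

Because the source is plain text, the same file can be rendered to different outputs while the code and the prose never drift apart.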
What we want is that with one single command we
All of this under a full-fledged version control system
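A single-command build can be sketched in a few lines: each step is rerun only when its output is missing or older than its input, which is the rule GNU make applies. The file names (`data.txt`, `result.txt`, `paper.txt`) and the two steps are hypothetical; a real project would more likely put these rules in a Makefile.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(target, sources):
    """True if the target is missing or older than any of its sources."""
    if not target.exists():
        return True
    target_time = target.stat().st_mtime
    return any(src.stat().st_mtime > target_time for src in sources)


def analyse(data_file, result_file):
    """Step 1: compute the mean of the numbers in the data file."""
    numbers = [float(x) for x in data_file.read_text().split()]
    result_file.write_text(f"{sum(numbers) / len(numbers):.2f}\n")


def report(result_file, report_file):
    """Step 2: write the 'paper' from the analysis result."""
    report_file.write_text(f"Mean score: {result_file.read_text().strip()}\n")


def build(workdir):
    """Run every outdated step in dependency order; return the names of the steps that ran."""
    data = workdir / "data.txt"
    result = workdir / "result.txt"
    paper = workdir / "paper.txt"
    ran = []
    if needs_rebuild(result, [data]):
        analyse(data, result)
        ran.append("analyse")
    # A downstream step also reruns whenever the step it depends on has just run.
    if "analyse" in ran or needs_rebuild(paper, [result]):
        report(result, paper)
        ran.append("report")
    return ran


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        data = workdir / "data.txt"
        data.write_text("3 4 5\n")
        print(build(workdir))  # first run: both steps run
        print(build(workdir))  # nothing changed: no step runs
        # Simulate a later edit of the data by bumping its timestamp.
        data.write_text("3 4 8\n")
        later = data.stat().st_mtime + 10
        os.utime(data, (later, later))
        print(build(workdir))  # data changed: both steps run again
```

Keeping the whole chain behind one entry point means that the paper can always be regenerated from the raw data without remembering manual steps.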
Terminal tools (GNU make, diff, pandoc)
Version control system (Git & VCN)
Reference manager (BibDesk/Mendeley)
Statistical software (purely command-line driven): Python and R
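One reason these tools work well together is that plain text can be compared line by line. The sketch below uses Python's standard `difflib` module to produce the kind of unified diff that `diff` and Git show; the two draft texts and the file names are made up for illustration.

```python
import difflib

# Two hypothetical versions of the same paragraph, stored as plain text.
OLD = [
    "We surveyed 120 respondents.\n",
    "The mean score was 3.8.\n",
]
NEW = [
    "We surveyed 125 respondents.\n",
    "The mean score was 3.8.\n",
]


def text_diff(old, new):
    """Return the unified diff between two lists of lines."""
    return list(
        difflib.unified_diff(old, new, fromfile="paper_v1.txt", tofile="paper_v2.txt")
    )


if __name__ == "__main__":
    print("".join(text_diff(OLD, NEW)), end="")
```

With binary formats such a comparison is not possible, which is why a text-based workflow fits naturally under version control.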
R packages http://cran.r-project.org/
IPython Notebook Viewer http://nbviewer.ipython.org/
Reproducible Research with R and RStudio (book)
Amsterdam paper example using IPython Notebook:
knitr package
We make use of LaTeX, BibTeX, HTML and pandoc only implicitly (all under the hood of RStudio)
[Lunch]
[Break]
[Dinner]
[Break]
[Lunch]
This workshop is financially supported by FOSTER.