Step-by-step of my workflow


  1. Create a new repository on GitHub
  2. Create a new RStudio Project by choosing the option to “Checkout a project from a version control repository”
    • I find doing 1 then 2 easier than the other way around
  3. Create folders/directories:
    • rawdata
    • data - intermediate data outputs
    • reports - usually R Notebooks or R Markdown files recording summary statistics, analyses
    • scripts
      • If you’re using multiple languages, you can have subdirectories for each language, e.g. scripts/R, scripts/python
      • functions.R - a script that keeps function definitions separate from the rest of the code
      • source.R - a script that runs all the other scripts and (at least in theory!) can generate the final output from the raw data
    • presentations - slides for presentations; most recently I’ve been using revealjs
    • manuscripts - papers, extended abstracts
    • plots - I often save plots/graphs as .rds files instead of image files (like JPEG, PNG etc.) so that I can do the editing in R Markdown with ggplot2
    • Others:
      • models - save model output after, say, running a regression
      • results - typically regression coefficients stored as a dataframe
  4. GitHub Issues (click on the Issues tab in your GitHub repository): I use this as my to-do list. It’s a nice way of keeping your tasks organized by project
  5. Communicating with collaborators: I use Slack but there are other options; good ol’ email is fine but it can get hard to dig up old conversations
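
The folder setup in step 3 and the idea behind source.R can be sketched in base R. This is only a rough sketch, not a prescribed setup: the function names `create_project_dirs` and `run_all_scripts` are made up for illustration, and running the scripts in alphabetical order is an assumption about how you might name them (e.g. with numeric prefixes).

```r
# Hypothetical helper: create the folders from step 3 under `root`.
create_project_dirs <- function(root = ".") {
  dirs <- c("rawdata", "data", "reports", "scripts", "presentations",
            "manuscripts", "plots", "models", "results")
  paths <- file.path(root, dirs)
  for (p in paths) {
    # recursive = TRUE also creates `root` if it does not exist yet;
    # showWarnings = FALSE keeps re-runs quiet when folders already exist
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Hypothetical body for source.R: load functions.R first, then run every
# other .R script in the folder in alphabetical order (an assumed convention).
run_all_scripts <- function(script_dir = "scripts") {
  scripts <- list.files(script_dir, pattern = "\\.R$", full.names = TRUE)
  # Skip functions.R (loaded separately) and source.R (the runner itself)
  scripts <- scripts[!basename(scripts) %in% c("functions.R", "source.R")]
  functions_file <- file.path(script_dir, "functions.R")
  if (file.exists(functions_file)) source(functions_file)
  for (s in sort(scripts)) source(s)
  invisible(scripts)
}
```

Calling `create_project_dirs()` once from the project root sets up the folders, and `run_all_scripts()` is the kind of thing source.R could contain. If you keep scripts in language subdirectories, you would point `script_dir` at scripts/R instead.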

There are plenty of great resources out there that go into detail on the hows and whys of setting up a reproducible project. A couple of places to get started:
