Home
Thomas Leo Scherer edited this page Jul 16, 2020 · 58 revisions
Research projects at the cPASS Machine-Learning for Social Science Lab (MSSL) are complex in terms of authors, methods, and data. MSSL has adopted a project management system based on GitHub, RStudio, and LaTeX that is:
- efficient - everyone has access to every part of the project.
- organized - establishes a central repository for data, analysis, issues, and tasks.
- transparent - changes are tracked and previous iterations can be instantly recalled.
- replicable - when ready, the repository can be made public and serve as the replication package.
This GitHub repository, titled MSSLStyleGuide, documents this system as it develops and guides new users in setting up a workstation, joining a project, and developing and documenting content.
- Workstation Setup - how to install and set up the software that you will use.
- Repository Setup - how to get started on a specific project.
- Project/Package Setup - initial setup for a new project / package.
- Writing Setup - alternative setup and workflow instructions for those who will only be contributing to papers and will not need R/RStudio.
- Project Organization - some guidelines on where to keep what.
- Dropbox Organization - using Dropbox with Git Version Control.
- Using Git - some tips for pushing and pulling to and from a repository.
Documentation for MSSL projects should be kept in one of four places:
- Functions Documentation - roxygen2 comments describing the function's purpose, usage, and parameters.
- Issues - anything that needs action, including data coding, code writing, analysis, paper writing, and admin.
- Using R-Markdown and R-Notebook - documentation related to data and analysis.
- Wiki - information relevant to the project that doesn't need action and isn't tied to the code or data.
- Apache Spark and sparklyr - allows users to query data in Spark using SQL.
- Ubuntu - a free and open-source Linux operating system.
- Loading and Storing Data - how to read in and save datasets in a clear and reproducible manner.
- Functions - how to correctly automate and document operations.
- General Coding Practices - rules of thumb for clear and concise coding.
- Packages - how to call and use packages in R.
- Tools - an overview of useful tools for data entry and analysis.
- Plots - a guide to visualizing data with general coding practices and plot examples.
- Tables - create clear and reproducible tables using R and LaTeX.
- Paper Writing Guide - a comprehensive guide on writing papers using LaTeX.
- Collaborative Writing - how to set up a repository to ease collaboration within projects.
- Zotero - use Zotero to manage bibliographies.
- Authorship - information on authorship and collaboration.
- Writing in Bookdown - Bookdown can be useful for writing manuscripts because it combines LaTeX and R with ease.
- Writing in RMarkdown
- R Resources - extra reading for those looking to expand their R skills.
- R Troubleshooting - examples and solutions to common R-related problems.
- Miscellaneous Troubleshooting - examples and solutions to non-coding-related issues.