
A repo that holds raw data and data collection scripts for tutorials.compjour.org


schetudiante/tutorial-data-stash

 
 


tutorial-data-stash

The raw data, and the scripts that collect it, as used and mirrored for the Stanford Computational Journalism data tutorials at http://tutorials.compjour.org.

This is currently hacked together, but it will eventually become a more formalized framework for fetching and packaging data in different formats and at different stages of cleaning, so that the datasets can be used as practice for both data gathering and analysis.

The data-holding directory contains the downloaded files and some of their compiled versions. Some of the bigger files have been split into smaller files so that they fit more gracefully into version control, but I haven't yet written the compilation scripts to reassemble them. The ultimate goal is to have scripts that produce downloadable links to easy-to-use CSVs and SQLite databases for class exercises (as soon as I finish learning SQLAlchemy).
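Until those compilation scripts exist, here is a minimal sketch of what reassembly could look like. It is not this repo's actual tooling: the function names `reassemble_csv_parts` and `load_csv_into_sqlite` are hypothetical, it assumes each split part is a CSV that repeats the same header row, and it uses the standard-library `sqlite3` module rather than SQLAlchemy.

```python
import csv
import sqlite3


def reassemble_csv_parts(part_paths, combined_path):
    """Concatenate CSV parts into one file, writing the header only once.

    Assumption: every part begins with an identical header row.
    Returns the number of data rows written.
    """
    header = None
    rows_written = 0
    with open(combined_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for part_path in part_paths:
            with open(part_path, newline="", encoding="utf-8") as part_file:
                reader = csv.reader(part_file)
                part_header = next(reader, None)
                if part_header is None:
                    continue  # skip a completely empty part
                if header is None:
                    header = part_header
                    writer.writerow(header)
                elif part_header != header:
                    raise ValueError(f"Header mismatch in {part_path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


def load_csv_into_sqlite(csv_path, db_path, table_name):
    """Load a CSV into a SQLite table, storing every column as TEXT.

    Any existing table with the same name is dropped first.
    Returns the number of rows in the new table.
    """
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        reader = csv.reader(csv_file)
        header = next(reader)
        columns = ", ".join(f'"{name}" TEXT' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
            conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
            conn.executemany(
                f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader
            )
            conn.commit()
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table_name}"'
            ).fetchone()[0]
        finally:
            conn.close()
    return count
```

The parts need to be passed in their original order (a sorted list of file names works if the parts were numbered when split). Storing everything as TEXT keeps the sketch short; a real packaging script would probably want per-dataset column types.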

The scripts directory contains the (mostly Python 3) scripts for fetching the data. I've been writing them as I go, so each subfolder/project is a bit different depending on my mood at that moment and whether I had learned from mistakes in fetching the other datasets.

Inventory so far

Todo:


Languages

  • Python 99.0%
  • Shell 1.0%