Replication code to accompany the paper Measuring the Landscape of Civil War

tlscherer/MeasuringLandscape

Measuring the Landscape of Civil War

This repository contains the package, documentation, and replication code for the paper "Measuring the Landscape of Civil War" (2017, provisionally accepted for publication).

The Paper

Measuring the Landscape of Civil War - Read the Paper

Measuring the Landscape of Civil War - Read the Online Appendix

The Authors

Replication Code and Analysis

Self Contained Package

All of the files necessary for reproducing our analysis are included in a self-contained R package, "MeasuringLandscape." You can install the package from GitHub with:

# install.packages("devtools")
devtools::install_github("rexdouglass/MeasuringLandscape")

R-Notebooks

The analysis and figures in the paper and statistical appendix are produced in a series of R Notebooks:

  • 00 Project Setup: Useful commands for installing necessary packages and setting up the project.

File Preparation:

  • 01 Prep Events Counts: Loads and cleans a novel dataset of violent events observed during the 1950s Mau Mau Rebellion.
  • 02 Prep Gazeteers: Cleans and combines a large number of gazetteers of place names, used to look up locations by name and retrieve their coordinates.

Fuzzy Matcher: A supervised learning pipeline for matching two place names to one another even when they are spelled slightly differently.
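As a rough illustration of the kind of input a fuzzy matcher might work from, the sketch below computes a normalized edit-distance similarity for pairs of place names using base R's `utils::adist`. The toy pairs, the `edit_similarity` helper, and the choice of Levenshtein distance are assumptions made for this example; they are not taken from the package, whose supervised pipeline may use different features.

```r
# Hypothetical toy pairs; the names are illustrative, not project data.
pairs <- data.frame(
  a = c("Nyeri", "Kiambu", "Fort Hall"),
  b = c("Nyeri", "Kiambuu", "Nakuru"),
  stringsAsFactors = FALSE
)

# Normalized Levenshtein similarity in [0, 1], using base R's utils::adist.
# edit_similarity is a hypothetical helper, not a function from the package.
edit_similarity <- function(x, y) {
  x <- tolower(x)
  y <- tolower(y)
  d <- mapply(function(s, t) utils::adist(s, t)[1, 1], x, y)
  1 - unname(d) / pmax(nchar(x), nchar(y))
}

pairs$similarity <- edit_similarity(pairs$a, pairs$b)
print(pairs)
```

A score like this could serve as one feature among several in a classifier trained on hand-labeled matching and non-matching pairs.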

Georeferencer: A supervised learning pipeline for assigning a real-world coordinate to a place name.

  • 05 Georeferencer: Takes in locations of events described as text, and returns all possible matches across different gazetteers.
  • 06 Ensemble and Hand Rules: Ranks the returned matches from best to worst. It first applies simple hand rules about which kinds of match to prefer over others, and then a supervised model that attempts to predict which match will be geographically closest to the true location (fewest kilometers away from the right answer).
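The hand-rule ranking step can be sketched roughly as follows. The candidate table, the column names, and the rule itself (prefer exact matches, then higher string similarity) are assumptions chosen for illustration; the actual rules live in notebook 06.

```r
# Hypothetical candidate matches for two events; all values are illustrative.
candidates <- data.frame(
  event_id   = c(1, 1, 1, 2, 2),
  gazetteer  = c("A", "B", "C", "A", "C"),
  match_type = c("fuzzy", "exact", "fuzzy", "fuzzy", "fuzzy"),
  similarity = c(0.90, 1.00, 0.75, 0.80, 0.85),
  stringsAsFactors = FALSE
)

# Assumed hand rule: prefer exact matches, then higher string similarity.
rule_priority <- c(exact = 1, fuzzy = 2)
candidates$priority <- unname(rule_priority[candidates$match_type])

# Sort within each event, then number the candidates from best to worst.
ranked <- candidates[order(candidates$event_id,
                           candidates$priority,
                           -candidates$similarity), ]
ranked$rank <- ave(seq_len(nrow(ranked)), ranked$event_id, FUN = seq_along)

# Keep the top-ranked candidate per event.
best <- ranked[ranked$rank == 1, c("event_id", "gazetteer", "match_type")]
print(best)
```

A supervised ranker would replace the fixed `rule_priority` ordering with a model's predicted distance to the true location, sorting candidates by that prediction instead.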

Analysis: Main analysis of the paper.

  • 07 Recall Accuracy: Rates georeferencing options in terms of recall (how many event locations they recover) and accuracy (how far their imputed locations tend to be from the true location).
  • 08 Predict Missingness DV: Rates georeferencing options in terms of how systematically they recover locations for certain kinds of events but not others.
  • 09 Predicted Effects: Demonstrates which kinds of events tend to be systematically excluded, here in terms of whether the event would have received an original military coordinate or not.
  • 10 Bias: Demonstrates that the kinds of locations that are imputed differ from the true locations in terms of population, distance from roads, ruggedness, etc.
  • 11 So What: Demonstrates that different georeferencing decisions produce different results in a simple linear regression model, in terms of both statistical significance and substantive effects.
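The recall and accuracy measures described for notebook 07 might be computed along these lines. This is a minimal sketch on made-up coordinates: the `events` table, the `haversine_km` helper, and the use of median great-circle error as the accuracy summary are assumptions for this example, not the notebook's actual code.

```r
# Great-circle distance in kilometers (haversine formula, spherical Earth).
# haversine_km is a hypothetical helper, not a function from the package.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical events: true coordinates plus imputed ones (NA = not recovered).
events <- data.frame(
  true_lat = c(-0.42, -1.17, -0.72, -0.30),
  true_lon = c(36.95, 36.83, 37.15, 36.08),
  imp_lat  = c(-0.42, -1.20, NA, -0.30),
  imp_lon  = c(36.95, 36.90, NA, 37.08)
)

# Recall: share of events for which a location was recovered at all.
recovered <- !is.na(events$imp_lat) & !is.na(events$imp_lon)
recall <- mean(recovered)

# Accuracy: distance between imputed and true locations, where recovered.
error_km <- haversine_km(events$true_lat, events$true_lon,
                         events$imp_lat, events$imp_lon)
median_error_km <- median(error_km[recovered])

print(c(recall = recall, median_error_km = median_error_km))
```

Reporting the two numbers side by side makes the trade-off visible: a georeferencing option can recover more locations while placing them farther from the truth.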
