Predictive model for health and safety inspections in the USA. It is based on data from the Occupational Safety and Health Administration (OSHA), curated by Enigma Public.
The model is based on the random forest algorithm and predicts both the outcome (violation or no violation) and its probability.
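To illustrate how a random forest can yield both a predicted outcome and a probability, here is a toy sketch in plain Python. It is not this project's model: the trees are single-split stumps, and the feature columns (prior violations, number of employees), the rows, and all function names are hypothetical. Each tree votes, the share of votes for "violation" serves as the probability, and the majority vote is the predicted outcome.

```python
import random


def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best_threshold, best_errors = None, None
    for threshold in sorted(set(row[feature] for row in rows)):
        # Rule: predict 1 (violation) when the value is above the threshold.
        errors = sum(int(row[feature] > threshold) != label
                     for row, label in zip(rows, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return feature, best_threshold


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each on one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(rows) row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict_proba(forest, row):
    """Fraction of trees that vote 1 (violation)."""
    votes = [int(row[feature] > threshold) for feature, threshold in forest]
    return sum(votes) / float(len(votes))


def predict(forest, row):
    """Majority vote: 1 if at least half of the trees vote 1."""
    return int(predict_proba(forest, row) >= 0.5)


# Made-up placeholder rows: [prior_violations, employees]. Not OSHA data.
rows = [[0, 10], [1, 20], [0, 15], [1, 12],
        [5, 200], [6, 150], [7, 300], [5, 250]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
new_site = [3, 100]
print("probability of violation: {0}".format(predict_proba(forest, new_site)))
print("predicted outcome: {0}".format(predict(forest, new_site)))
```

Library implementations may compute the probability differently, for example by averaging per-tree class probabilities instead of counting hard votes.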
This Python code:
- creates the SQL database
- trains and validates the model, optimizing hyperparameters
- evaluates its performance
- creates maps of violations by US state
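As a rough sketch of how the database and mapping steps could connect, the example below builds a small table and aggregates violations per state, which is the kind of figure a state map could be colored by. Everything here is an assumption made for illustration: the table name `inspections`, its columns, and the rows are hypothetical, and an in-memory SQLite database from the standard library stands in for PostgreSQL so that the snippet runs without a server.

```python
import sqlite3

# SQLite in memory stands in for PostgreSQL (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inspections ("
    " id INTEGER PRIMARY KEY,"
    " state TEXT NOT NULL,"
    " violation INTEGER NOT NULL)"  # 1 if the inspection found a violation
)

# Made-up placeholder rows, not OSHA data.
toy_rows = [("CA", 1), ("CA", 0), ("CA", 1), ("TX", 0), ("TX", 1), ("NY", 0)]
conn.executemany(
    "INSERT INTO inspections (state, violation) VALUES (?, ?)", toy_rows
)
conn.commit()

# One row per state: number of inspections and number of violations.
query = (
    "SELECT state, COUNT(*) AS inspections, SUM(violation) AS violations "
    "FROM inspections GROUP BY state ORDER BY state"
)
by_state = conn.execute(query).fetchall()

# Violation rate per state, a value a map could use to shade each state.
violation_rate = {}
for state, n_inspections, n_violations in by_state:
    violation_rate[state] = float(n_violations) / n_inspections

for state in sorted(violation_rate):
    print("{0}: {1:.2f}".format(state, violation_rate[state]))
```

With PostgreSQL, the same SQL would typically be issued through a sqlalchemy engine or a psycopg2 connection instead of `sqlite3`.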
This code is written for Python 2.7 and relies on the following packages:
- numpy
- matplotlib
- pandas
- postgresql (the database server itself, not a Python package)
- sqlalchemy
- psycopg2
- basemap
Please ensure these packages are installed before attempting to run this code.
- Adrián Soto
This project is licensed under the MIT License - see the LICENSE.md file for details.
- OSHA for collecting the data
- Enigma Public for releasing the data
- The almighty Stack Overflow community, since the code in map/ is a modification of https://stackoverflow.com/questions/39742305/how-to-use-basemap-python-to-plot-us-with-50-states, and for the many other answers I found there while working on this project.