This project explores obesity trends in the United States using data from the Behavioral Risk Factor Surveillance System. Through detailed data analysis and visualization, it examines the relationship between obesity rates and various behavioral factors across different demographic and socioeconomic stratifications from 2011 to 2022. The final report includes a comprehensive analysis of obesity trends, with summary tables and distribution plots highlighting key findings.
To replicate the computing environment necessary to run this project, follow these steps:
- Clone this repository to your local machine.
- Ensure that you have R installed on your system.
- Navigate to the project directory and run `make install` from the terminal. This will set up the R environment using the `renv` package.
- Ensure Docker is installed on your system to handle the project containerization.
- Use the provided Makefile to build the Docker image, which sets up the R environment with all necessary dependencies: `make project_image`
- Accessing the Docker Image: View and pull the Docker image directly from DockerHub to avoid local builds: DockerHub Repository for Data 550 Final Project
- Pulling the Docker Image: To pull the latest version of the Docker image, run `docker pull yiweishi/data550_final_project:latest`
- Generate Report Using Docker: After pulling the image, generate the final report by running the Docker container: `docker run --rm -v "${PWD}/final_report:/project/final_report" yiweishi/data550_final_project`

The final report is generated from the `final_report.Rmd` document, which compiles the analysis results, including tables and figures, into a cohesive narrative. To generate the report:
- Ensure all prerequisites are installed.
- Open a terminal and navigate to the project's root directory.
- Execute the command `make all`. This will run the data processing, analysis, and plotting scripts, then knit `final_report.Rmd` into an HTML or PDF document.
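
The steps above can be wrapped in a small shell helper that checks for prerequisites before building. This is a minimal sketch, not part of the repository: the `missing_tools` and `build_report` functions are hypothetical, and the prerequisite list (`Rscript` and `make`) is an assumption based on the tools this README mentions.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical wrapper: stop early if a prerequisite is missing,
# otherwise preview the build and then run it.
build_report() {
  missing=$(missing_tools Rscript make)   # assumed prerequisite list
  if [ -n "$missing" ]; then
    echo "Missing prerequisites:" >&2
    printf '  %s\n' $missing >&2
    return 1
  fi
  make -n all &&   # dry run: print the commands without executing them
    make all       # run the pipeline and knit the report
}
```

After sourcing these definitions, call `build_report` from the project's root directory. The dry run (`make -n all`) is optional; it only lists the commands that `make all` would execute.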
The report includes:
- An introduction to the dataset and research objectives.
- A data processing section detailing the cleaning and preparation steps.
- Descriptive analysis findings, with summary tables highlighting key statistics.
- Visualizations of obesity trends over time and across various stratifications.
- Conclusions and potential areas for further research.
- Summary Tables: Generated in the `analysis.R` script, summary tables provide an overview of key statistics for each stratification. These tables are saved in the `output/tables` directory.
- Figures: The main figure illustrating the trend of obesity over the years is created in the `plotting.R` script. This and other figures are saved in the `output/figures` directory.
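
After a build, a quick check that both output directories were populated can catch a step that failed silently. This is a minimal sketch, assuming it is run from the project's root directory; the `count_outputs` and `report_outputs` helpers are hypothetical, and no specific file names are assumed because this README does not list them.

```shell
#!/bin/sh
# Hypothetical helper: count regular files directly inside a directory.
# Prints 0 if the directory does not exist.
count_outputs() {
  dir=$1
  if [ -d "$dir" ]; then
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Print a one-line summary for each output directory named in this README.
report_outputs() {
  for dir in output/tables output/figures; do
    echo "$dir: $(count_outputs "$dir") file(s)"
  done
}
```

Calling `report_outputs` after `make all` prints one line per directory; a count of 0 suggests the corresponding script did not write its results.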