The project aims to analyze Mexico's COVID-19 dataset to identify patterns, relationships, and impacts of the pandemic. The team will clean and preprocess the data, conduct statistical tests to explore associations between variables, and evaluate the performance of statistical models used in the analysis.
The process involves:
- Data Pre-processing: Cleaning and preparing the data for analysis.
- Chi-square Analysis: Testing the relationship between categorical variables.
- Associations Analysis: Using Logistic Regression to explore deeper associations in the data.
- Model Evaluation: Assessing the accuracy and effectiveness of our models.
- Dynamic Report: Compiling all findings into a comprehensive report, tailored to suit different audience needs.
The goal is to provide insights into the COVID-19 impact in Mexico through detailed statistical analysis and modeling.
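
As a rough illustration of the chi-square and logistic regression steps listed above, here is a minimal sketch in base R. It runs on simulated data, not the project dataset; the column names (`sex`, `diabetes`, `hospitalized`) and the effect sizes are assumptions made up for the example.

```r
# Simulated stand-in data: column names and probabilities are hypothetical,
# not taken from the Mexico COVID-19 dataset.
set.seed(42)
n <- 500
toy <- data.frame(
  sex      = factor(sample(c("female", "male"), n, replace = TRUE)),
  diabetes = factor(sample(c("no", "yes"), n, replace = TRUE, prob = c(0.8, 0.2)))
)

# Make the outcome depend on diabetes so there is an association to detect.
p_hosp <- ifelse(toy$diabetes == "yes", 0.5, 0.2)
toy$hospitalized <- factor(ifelse(runif(n) < p_hosp, "yes", "no"),
                           levels = c("no", "yes"))

# Chi-square test of independence between two categorical variables.
tab <- table(diabetes = toy$diabetes, hospitalized = toy$hospitalized)
chi <- chisq.test(tab)
print(chi)

# Logistic regression: models the probability of the second factor level ("yes").
fit <- glm(hospitalized ~ diabetes + sex, data = toy, family = binomial)
print(exp(coef(fit)))  # exponentiated coefficients, i.e. odds ratios
```

The chi-square test only says whether two variables are associated. The regression additionally estimates the direction and size of each association while adjusting for the other predictors.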
The COVID-19 report has been broken into two subprojects.
- `data/` contains the code, original file, and output related to the project
  - `data/clean.R` produces `data_clean.rds`
    - teammates can choose to use the original file or `data_clean.rds`
- `subproject1/` contains all code and output related to the descriptive analysis
  - `subproject1/code/descriptive_analysis.R` produces tables and plots
    - output should be saved to `subproject1/output/descriptive_output/`
  - `subproject1/code/chi-square.R` produces the chi-square results, including tables and plots
    - output should be saved to `subproject1/output/chi-square/`
  - `subproject1/report.Rmd` reads in output from `subproject1/output/` and creates the report for the descriptive analysis
  - `subproject1/render_report.R` renders the report for subproject1
  - `subproject1/Makefile` lets the user produce the report for subproject1 from the command line
- `subproject2/` contains all code and output related to the regression analysis and model evaluation
  - `subproject2/code/models.R` fits multinomial logistic regression models
    - summary tables and plots should be saved to `subproject2/output/model/`
  - `subproject2/code/model_evaluation.R` evaluates the effectiveness of the models
    - summary tables and plots should be saved to `subproject2/output/model_evaluation/`
  - `subproject2/report.Rmd` reads in output from `subproject2/output/` and creates the report for the regression analysis
  - `subproject2/render_report.R` renders the report for subproject2
  - `subproject2/Makefile` lets the user produce the report for subproject2 from the command line
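
To give a feel for what the model fitting and evaluation scripts in subproject2 do, the sketch below fits a multinomial logistic regression and scores it on a held-out split. It is not the project's code. It assumes `nnet::multinom` as the fitting function (the scripts may use a different package), and it uses the built-in `iris` data as a stand-in for the cleaned COVID-19 data.

```r
library(nnet)  # assumption: the project may rely on a different package

# Stand-in data: iris has a three-level outcome (Species).
set.seed(1)
idx     <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train   <- iris[idx, ]
holdout <- iris[-idx, ]

# Fit a multinomial logistic regression on the training split.
fit <- multinom(Species ~ Sepal.Length + Petal.Length,
                data = train, trace = FALSE)

# Evaluate on the held-out split: confusion matrix and overall accuracy.
pred      <- predict(fit, newdata = holdout, type = "class")
confusion <- table(predicted = pred, actual = holdout$Species)
accuracy  <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```

Overall accuracy is only one possible summary. Per-class metrics can be read off the same confusion matrix if the outcome classes are unbalanced.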
The subprojects are inserted into `dynamic_report.Rmd` dynamically. To show a different part of the analysis, change the `params` value in the YAML header from `subproject1` to `subproject2`.
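
As an alternative to editing the YAML header by hand, a parameter can be overridden when rendering from R. The helper below is a hypothetical sketch. The function `render_dynamic_report` and the parameter name `subproject` are assumptions; the real name is whatever `dynamic_report.Rmd` declares under `params`.

```r
# Hypothetical helper: the parameter name `subproject` is an assumption and
# must match a parameter declared in the YAML header of dynamic_report.Rmd.
render_dynamic_report <- function(subproject = c("subproject1", "subproject2"),
                                  input = "dynamic_report.Rmd") {
  # Reject anything other than the two known subprojects.
  subproject <- match.arg(subproject)

  # rmarkdown::render() overrides the YAML defaults with the values in `params`.
  rmarkdown::render(input = input, params = list(subproject = subproject))
}

# Example call (requires the rmarkdown package and the Rmd file):
# render_dynamic_report("subproject2")
```

`rmarkdown::render()` only accepts parameters that are already declared in the document's YAML header, so the name used here has to match the one in the file.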
- Make sure you have installed the `renv` package (`install.packages("renv")`)
- Use `make install` under the main folder to synchronize packages
- Use `make` under the main folder to get the dynamic report
- If you want to see a different part of the analysis, change the `params` in the YAML header by switching the value from `Subproject1` to `Subproject2`, or change the alpha level
- Use `make` under the main folder to produce the report again