Final Project 8

Project Environment Setup

Clone the repository from GitHub to your local machine
Navigate to the project directory in a terminal

Report Generation

There are two ways to generate the report: locally on your computer or using a Docker image.

Locally:

Make sure make and R are installed on your system
Make sure the renv R package is installed
Open a terminal in the project directory
Run make install to restore the R package environment using renv
Run make report to compile the final report
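
The local steps above can be sketched as a small shell helper. This is only an illustration: the function names require_tools and build_report_locally are hypothetical and not part of the repository, and the sketch assumes Rscript is on your PATH whenever R is installed.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the repository.

# Return 0 if every named command is on PATH; otherwise report the first missing one.
require_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing prerequisite: $tool" >&2
      return 1
    fi
  done
}

# Check the prerequisites, then restore the renv library and compile the report.
build_report_locally() {
  require_tools make Rscript || return 1
  # Stop early if the renv package is not installed in R.
  Rscript -e 'if (!requireNamespace("renv", quietly = TRUE)) quit(status = 1)' || {
    echo "renv is not installed in R" >&2
    return 1
  }
  make install && make report
}

# Usage, from the project directory:
#   build_report_locally
```

Running make install and make report by hand does the same thing; the helper only adds the prerequisite checks.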

Using Docker:

Pull the image from the Docker Hub repository
Run make mount-report in the terminal to generate the report
(This step works on Windows, macOS, and Linux)
The compiled report will be in your local report/ folder
(Optional) If you prefer to build the image yourself instead of downloading it from Docker Hub, run make build_image. This builds an image called "wwwivy111/data550_final_project".
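
As a rough sketch, the Docker route could be wrapped like this. The function name report_via_docker is hypothetical, and the sketch assumes the image on Docker Hub carries the same name as the one produced by make build_image.

```shell
#!/bin/sh
# Assumption: the Docker Hub image uses the name given for make build_image.
IMAGE="wwwivy111/data550_final_project"

# Hypothetical helper: get the image (pull it, or build it when called with
# "build"), then let the Makefile run the container and write the report.
report_via_docker() {
  if [ "${1:-}" = "build" ]; then
    make build_image || return 1
  else
    docker pull "$IMAGE" || return 1
  fi
  make mount-report
}

# Usage, from the project directory:
#   report_via_docker          # pull the image, then generate the report
#   report_via_docker build    # build the image locally instead
```

Either way, the compiled report should end up in the report/ folder once make mount-report finishes.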

Repository structure

The raw dataset Thyroid_Diff.csv is saved in the data/ folder.
Code is saved in the code/ folder.
The final report is saved in the report/ folder.

README.md
Makefile
Dockerfile
renv.lock
renv/
data/
code/
output/
report/

Report contents

Key sections include:

Introduction: Introduces the study
Method and Analysis: Describes the methods and results for data preparation, exploratory data analysis, modeling, and model evaluation
Discussion: Discusses the implications for clinical management and future research

Code description

code/01_split_data.R

cleans and formats the data
splits the data into training and test sets
saves the new datasets as separate .rds objects in the data/ folder
(clean_data.rds, train.rds, test.rds)

code/02_EDA.R

conducts exploratory data analysis (EDA)
generates Table 1 and saves it as table1.rds in the output/ folder
generates descriptive plots for the outcome, continuous, and categorical variables and saves them as .png files in the output/ folder
(descriptive_age_plots.png, descriptive_bar_outcome.png, descriptive_pie_charts.png)

code/03_modeling.R

generates new training and test data, train_1.rds and test_1.rds, in the data/ folder
fits univariate models, a multivariable model, a stepwise selection model, and the final model
saves the models and corresponding tables as separate .rds objects in the output/ folder
conducts model evaluation for the final model
saves the evaluation matrix and ROC plot as .rds and .png objects in the output/ folder

code/04_render_report.R

renders report.Rmd

report.Rmd

reads outputs from code/01_split_data.R, code/02_EDA.R, code/03_modeling.R
makes the final report

Makefile

contains rules for building the final report and other targets
make report compiles the report into an .html file
make split_data generates the outputs of code/01_split_data.R
make EDA generates the outputs of code/02_EDA.R
make modeling generates the outputs of code/03_modeling.R
make clean removes all generated outputs
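
To run the stages one at a time (for example, when debugging a single script), the targets can be called in the order implied by the script numbering. The sketch below uses a hypothetical function, run_pipeline, and assumes that order; make report on its own may already trigger the earlier stages through the Makefile's dependencies.

```shell
#!/bin/sh
# Hypothetical helper: run each Makefile target in script order and stop at
# the first stage that fails.
run_pipeline() {
  for target in split_data EDA modeling report; do
    echo "==> make $target"
    make "$target" || return 1
  done
}

# Usage, from the project directory:
#   make clean      # optional: start from a clean state
#   run_pipeline
```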

