Olympic Medal Data Analysis 🥇

Welcome to the Olympic Medal Data Analysis project! This repository focuses on analyzing Olympic medal data using R. By combining TidyTuesday records with World Bank indicators, we assess raw medal counts, efficiency metrics, and the economic context surrounding Olympic performance.

Introduction

The Olympic Games serve as a global stage for athletic excellence. This project examines the medal data behind the Games to show how countries perform and which economic factors are associated with their success.

Project Overview

This project integrates data from TidyTuesday and World Bank indicators to explore several key areas:

Raw medal counts for different countries.
Efficiency metrics that evaluate how effectively countries convert resources into medals.
Economic context that provides background on the countries' performances.
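As a rough illustration of what an efficiency metric can look like, the sketch below computes medals per million inhabitants and medals per billion US dollars of GDP. The data frame, its column names (country, total_medals, population, gdp_usd), and every value in it are made up for illustration; they are not taken from the project's scripts, which may define efficiency differently.

```r
# Hypothetical toy data; values are illustrative, not real Olympic results.
medals <- data.frame(
  country      = c("A", "B", "C"),
  total_medals = c(100, 20, 5),
  population   = c(300e6, 10e6, 1e6),    # inhabitants
  gdp_usd      = c(20e12, 500e9, 50e9)   # GDP in US dollars
)

# Medals per million inhabitants and per billion USD of GDP.
medals$medals_per_million <- medals$total_medals / (medals$population / 1e6)
medals$medals_per_bn_gdp  <- medals$total_medals / (medals$gdp_usd / 1e9)

# Rank countries by population-adjusted efficiency, highest first.
ranked <- medals[order(-medals$medals_per_million), ]
ranked
```

In this toy data, country C wins the fewest medals but ranks first per capita, which is the kind of contrast between raw counts and efficiency that the project explores.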

We use R to build visualizations that show patterns and trends in Olympic performance over the years.

Data Sources

The primary data sources for this project include:

TidyTuesday: A weekly data project that provides datasets for analysis.
World Bank Indicators: Economic data that gives context to Olympic performance.

We clean and preprocess this data to ensure it is ready for analysis.
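A minimal sketch of what such a cleaning and merging step could look like with dplyr is shown below. The two data frames, the join key country_code, and all values are hypothetical stand-ins; the actual TidyTuesday and World Bank files may use different column names and need different cleaning.

```r
library(dplyr)

# Hypothetical medal records, one row per medal (values are made up).
medal_records <- data.frame(
  country_code = c("AAA", "AAA", "BBB", "bbb ", NA),
  medal        = c("Gold", "Silver", "Gold", "Bronze", "Gold")
)

# Hypothetical economic indicators keyed by the same country code.
indicators <- data.frame(
  country_code   = c("AAA", "BBB", "CCC"),
  gdp_per_capita = c(50000, 12000, 3000)
)

medal_counts <- medal_records %>%
  filter(!is.na(country_code)) %>%                          # drop rows with no country
  mutate(country_code = toupper(trimws(country_code))) %>%  # normalise the join key
  count(country_code, name = "total_medals")                # medals per country

# Keep every country that won a medal; unmatched indicators become NA.
combined <- left_join(medal_counts, indicators, by = "country_code")
combined
```

A left join is used here so that medal-winning countries without a matching indicator row stay in the table with NA values instead of silently disappearing.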

Features

This project includes several key features:

Data Cleaning: Scripts to clean and preprocess raw data.
Visualizations: A variety of charts and graphs to illustrate findings.
Statistical Analysis: Regression and clustering methods to identify patterns.
Economic Context: Analysis of how economic factors relate to Olympic success.

Getting Started

To get started with this project, you need to clone the repository and install the required packages.

Prerequisites

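Make sure you have R and RStudio installed on your machine. You will also need the following R packages. Note that tidyverse is a meta-package that already installs ggplot2, dplyr, readr, and tidyr, so installing it alone covers the whole list: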

tidyverse
ggplot2
dplyr
readr
tidyr

Installation

Clone the repository:

git clone https://github.com/synth001/Olympic-Medal-Data-Analysis.git

Navigate to the project directory:

cd Olympic-Medal-Data-Analysis

Install the required packages:

install.packages(c("tidyverse", "ggplot2", "dplyr", "readr", "tidyr"))

Usage

Once you have installed the required packages, you can run the analysis. The main script in the repository is Olympic_Medal_Analysis.R. It loads the data, performs the analysis, and generates the visualizations.

Run the script in RStudio from the project directory:

source("Olympic_Medal_Analysis.R")

This will execute the analysis and produce the visualizations in your RStudio environment.

Visualizations

Visualizations are a critical part of this project. They help convey complex data in an understandable way. We create various types of charts, including:

Bar Charts: To display the total number of medals won by each country.
Line Graphs: To show trends in Olympic performance over time.
Heat Maps: To visualize the efficiency of countries in converting resources to medals.

Here’s an example of a bar chart that displays the total medal counts:

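library(ggplot2)

# `data` is assumed to hold one row per country with `country` and `total_medals` columns.
ggplot(data, aes(x = reorder(country, total_medals), y = total_medals)) +
  geom_col() +      # same result as geom_bar(stat = "identity")
  coord_flip() +    # horizontal bars keep long country names readable
  theme_minimal() +
  labs(title = "Total Olympic Medals by Country", x = "Country", y = "Total Medals")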
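
The line graphs listed above can be sketched in the same way. The example below is only an illustration: the medals_by_year data frame, its columns (year, country, total_medals), and its values are invented, and the project's own plots may be built differently.

```r
library(ggplot2)

# Hypothetical medal totals per country and Games; values are made up.
medals_by_year <- data.frame(
  year         = rep(c(2008, 2012, 2016, 2020), times = 2),
  country      = rep(c("Country A", "Country B"), each = 4),
  total_medals = c(40, 46, 42, 45, 12, 15, 19, 22)
)

# One line per country, with a point marking each Games.
trend_plot <- ggplot(medals_by_year,
                     aes(x = year, y = total_medals, colour = country)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = unique(medals_by_year$year)) +
  theme_minimal() +
  labs(title = "Olympic Medals over Time (illustrative data)",
       x = "Year", y = "Total Medals", colour = "Country")

trend_plot
```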

Regression and Clustering

We apply regression analysis to estimate how economic indicators relate to medal counts, and clustering to group countries with similar profiles. Together these help identify which factors are most strongly associated with Olympic success.

Regression Analysis

We use linear regression to model the relationship between GDP per capita and the number of medals won. The following code snippet illustrates this:

model <- lm(total_medals ~ gdp_per_capita, data = data)
summary(model)
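
Medal totals are non-negative counts and GDP per capita is usually right-skewed, so a straight-line fit can be a poor match. One common alternative, which the repository is not confirmed to use, is a Poisson regression on the logarithm of GDP per capita. The sketch below uses an invented olympics data frame with made-up values.

```r
# Hypothetical data; values are made up for illustration.
olympics <- data.frame(
  total_medals   = c(2, 5, 9, 14, 30, 48, 70, 110),
  gdp_per_capita = c(1500, 3000, 6000, 9000, 18000, 30000, 45000, 60000)
)

# Poisson regression: log(expected medals) is linear in log(GDP per capita).
count_model <- glm(total_medals ~ log(gdp_per_capita),
                   family = poisson(link = "log"),
                   data = olympics)
summary(count_model)

# The slope acts as an elasticity: a 1% rise in GDP per capita is associated
# with roughly a `slope` percent change in expected medals.
slope <- coef(count_model)[["log(gdp_per_capita)"]]
slope
```

Real medal data is often more dispersed than a Poisson model assumes, so a quasi-Poisson or negative binomial fit may be worth comparing before drawing conclusions.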

Clustering Analysis

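Clustering groups countries with similar performance metrics. We use k-means clustering for this purpose. Because k-means relies on distances, the two columns are standardised first so that GDP per capita, which has much larger values, does not dominate the result:

set.seed(123)  # k-means starts from random centres; fix the seed for reproducibility
# Standardise both columns to mean 0 and standard deviation 1
features <- scale(data[, c("total_medals", "gdp_per_capita")])
clusters <- kmeans(features, centers = 3, nstart = 25)
data$cluster <- clusters$cluster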
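
The call above fixes the number of clusters at three. One common heuristic for checking that choice is the elbow method: run k-means for several values of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The sketch below uses simulated data in a hypothetical olympics data frame; it is not the project's data, and the repository is not confirmed to select k this way.

```r
set.seed(123)  # k-means starts from random centres, so fix the seed

# Hypothetical data: 30 countries in three loose groups (values are simulated).
olympics <- data.frame(
  total_medals   = c(rnorm(10, 5, 2), rnorm(10, 30, 5), rnorm(10, 90, 10)),
  gdp_per_capita = c(rnorm(10, 3000, 500), rnorm(10, 20000, 3000),
                     rnorm(10, 50000, 5000))
)

# Standardise so that GDP (large numbers) does not dominate the distances.
features <- scale(olympics)

# Total within-cluster sum of squares for k = 1..6.
k_values <- 1:6
wss <- vapply(k_values, function(k) {
  kmeans(features, centers = k, nstart = 25)$tot.withinss
}, numeric(1))

# The "elbow" is the k after which the curve flattens out.
plot(k_values, wss, type = "b",
     xlab = "Number of clusters (k)",
     ylab = "Total within-cluster sum of squares")
```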

Contributing

We welcome contributions to this project. If you have suggestions or improvements, please follow these steps:

Fork the repository.
Create a new branch for your feature or bug fix.
Make your changes and commit them.
Push your branch to your fork.
Create a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for more information.

Acknowledgments

We thank the TidyTuesday community for providing valuable datasets. Their efforts help promote data analysis and visualization skills across various domains.

For further details and updates, visit our Releases section.

Happy analyzing!
