🚌 University Transport Optimization Analysis (STU BUAP)

📋 Project Overview

The University Transport System (STU) connects the main campus (CU) with the new engineering campus (CU2). This project analyzes user data to identify saturation patterns, optimize transit schedules, and reduce the environmental impact of inefficient trips.

Authors: Danna Patricia Riveroll Martínez & Josué Salvador Marín Nieva.

🎯 Objectives

Optimize Logistics: Reduce wait times and improve bus scheduling efficiency.
User Experience: Analyze student satisfaction and overcrowding (standing frequency).
Sustainability: Minimize the carbon footprint by maximizing unit capacity per trip.

🛠️ Tech Stack & Tools

Language: Python 3.x
Data Processing: Pandas, NumPy (ETL, Regex cleaning).
Machine Learning: Scikit-Learn (Decision Tree Classifier).
Visualization: Matplotlib, Power BI (Dashboards).

📊 Methodology

1. ETL & Data Cleaning

Dataset: ~3,000 survey records after cleaning (originally ~3,400 raw entries).
Techniques:
- Handling missing values via mean/mode imputation.
- Removing duplicates and standardizing time formats using Regex.
- Outlier analysis using boxplots (outliers were kept so that the data reflects realistic wait times).
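
The cleaning techniques listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`departure_time`, `wait_minutes`, `satisfaction`), the helper names, and the sample rows are all hypothetical, and the regex only covers a few common ways of writing a time.

```python
import re
import pandas as pd


def standardize_time(raw):
    """Normalize strings such as '7:30 am', '07.30' or '18h05' to 'HH:MM'.

    Returns None when no time can be recognized.
    """
    if not isinstance(raw, str):
        return None
    match = re.search(r"(\d{1,2})\s*[:.h]\s*(\d{1,2})\s*([ap])?", raw.strip().lower())
    if match is None:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    meridiem = match.group(3)
    if meridiem == "p" and hour < 12:
        hour += 12
    elif meridiem == "a" and hour == 12:
        hour = 0
    if hour > 23 or minute > 59:
        return None
    return f"{hour:02d}:{minute:02d}"


def clean_survey(df):
    """Standardize times, drop duplicates, then impute missing values."""
    out = df.copy()
    # Standardize first so that '7:30 am' and '07.30' count as duplicates.
    out["departure_time"] = out["departure_time"].map(standardize_time)
    out = out.drop_duplicates().reset_index(drop=True)
    # Assumption: mean imputation for the numeric field, mode for the rating.
    out["wait_minutes"] = out["wait_minutes"].fillna(out["wait_minutes"].mean())
    mode = out["satisfaction"].mode()
    if not mode.empty:
        out["satisfaction"] = out["satisfaction"].fillna(mode.iloc[0])
    return out


# Hypothetical sample rows, only to exercise the functions.
raw = pd.DataFrame({
    "departure_time": ["7:30 am", "07.30", "1:15 PM", "18h05"],
    "wait_minutes": [20.0, 20.0, None, 40.0],
    "satisfaction": [2, 2, 2, None],
})
cleaned = clean_survey(raw)
print(cleaned)
```

Standardizing the time column before calling `drop_duplicates` matters here: two rows that differ only in how the time was typed would otherwise both survive.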

2. Exploratory Data Analysis (EDA)

Key insights derived from the data:

Peak Hours: Identified high-demand windows at 6:00-8:00 AM and 1:00-3:00 PM.
Wait Times: Most users wait between 15 and 60 minutes.
Satisfaction: Predominantly low (Level 2/5), correlated with overcrowding.
Correlation: Found a weak positive correlation (0.22) between wait time and the likelihood of standing during the trip.
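
As a rough illustration of how insights like these can be computed with Pandas, the sketch below counts trips per boarding hour and correlates wait time with an ordinal standing score. The column names and the toy values are hypothetical, and the README does not state which correlation coefficient was used; Pearson (the Pandas default) is assumed here.

```python
import pandas as pd

# Hypothetical toy records; the real dataset uses its own (Spanish) column names.
trips = pd.DataFrame({
    "boarding_hour": [6, 7, 7, 7, 13, 14, 14, 18],
    "wait_minutes": [30, 45, 60, 50, 40, 35, 55, 15],
    # Assumed ordinal encoding: 0 = never standing ... 3 = always standing.
    "standing_level": [2, 3, 3, 2, 2, 1, 3, 0],
})

# Number of trips per boarding hour; the largest count marks the peak hour.
demand_by_hour = trips["boarding_hour"].value_counts()
peak_hour = demand_by_hour.idxmax()

# Pearson correlation between wait time and standing level (assumed metric).
corr = trips["wait_minutes"].corr(trips["standing_level"])

print(demand_by_hour.sort_index())
print("Peak hour:", peak_hour)
print("Correlation:", round(corr, 2))
```

The toy values will not reproduce the 0.22 reported above; they only show the shape of the calculation.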

3. Machine Learning Model

Algorithm: Decision Tree Classifier.
Target: Predicting "Standing Frequency" (Saturation level).
Results: The model reached an accuracy of 51%. It predicted the extreme classes (Always Standing and Never Standing) well, but separating the intermediate states will likely require richer features.
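
A minimal sketch of this kind of model with Scikit-Learn is shown below. It is trained on synthetic data generated inside the example, so the features (`wait`, `hour`), the four-level label encoding, and the hyperparameters are assumptions for illustration and do not reproduce the notebook or its 51% result.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the survey data (not the real dataset).
rng = np.random.default_rng(0)
n = 400
wait = rng.integers(5, 91, size=n)    # wait time in minutes
hour = rng.integers(6, 21, size=n)    # boarding hour
noise = rng.normal(0, 15, size=n)
score = wait + 20 * np.isin(hour, [6, 7, 13, 14]) + noise
# Assumed encoding: 0 = never standing ... 3 = always standing.
standing = np.digitize(score, bins=[30, 55, 80])

X = np.column_stack([wait, hour])
X_train, X_test, y_train, y_test = train_test_split(
    X, standing, test_size=0.25, stratify=standing, random_state=0
)

# A shallow tree keeps the model readable and limits overfitting.
model = DecisionTreeClassifier(max_depth=4, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

accuracy = accuracy_score(y_test, pred)
print(f"Accuracy: {accuracy:.2f}")
# Per-class precision and recall show how extreme and intermediate classes differ.
print(classification_report(y_test, pred, zero_division=0))
```

Looking at the per-class rows of the report, rather than overall accuracy alone, is one way to check whether the extreme classes are indeed easier to predict than the intermediate ones.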

💡 Proposed Solution

Based on the data analysis, we propose a staggered logistical schedule:

Coordination: Buses depart every 15 or 30 minutes, with departures synchronized between CU and CU2.
Efficiency: Ensuring units do not return empty ("deadheading") by aligning departures with arrival peaks, reducing unnecessary emissions.
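
The idea can be sketched as a small timetable generator. The peak windows come from the EDA above; everything else is an assumption made for illustration: that the 15-minute headway applies during peak windows and the 30-minute headway otherwise, and the placeholder values for travel and turnaround time. None of these names or values come from the project's reports.

```python
from datetime import datetime, timedelta

PEAK_WINDOWS = [(6, 8), (13, 15)]  # peak hours reported in the EDA (24h clock)
PEAK_HEADWAY = 15                  # minutes between departures at peak (assumed)
OFFPEAK_HEADWAY = 30               # minutes between departures off-peak (assumed)
TRAVEL_MINUTES = 25                # hypothetical CU -> CU2 travel time
TURNAROUND_MINUTES = 5             # hypothetical boarding time at CU2


def is_peak(moment):
    """True when the given time falls inside a peak window."""
    return any(start <= moment.hour < end for start, end in PEAK_WINDOWS)


def build_schedule(first="06:00", last="20:00"):
    """Return (CU departure, CU2 return departure) pairs as 'HH:MM' strings.

    Each bus leaving CU is paired with a return departure from CU2 timed
    right after its arrival, so the unit does not travel back empty.
    """
    current = datetime.strptime(first, "%H:%M")
    end = datetime.strptime(last, "%H:%M")
    schedule = []
    while current <= end:
        back = current + timedelta(minutes=TRAVEL_MINUTES + TURNAROUND_MINUTES)
        schedule.append((current.strftime("%H:%M"), back.strftime("%H:%M")))
        headway = PEAK_HEADWAY if is_peak(current) else OFFPEAK_HEADWAY
        current += timedelta(minutes=headway)
    return schedule


schedule = build_schedule("06:00", "09:00")
for cu_departure, cu2_departure in schedule:
    print(f"CU {cu_departure} -> CU2 return {cu2_departure}")
```

Pairing each outbound trip with a return departure is only one way to model the synchronization; a real timetable would also need fleet size and measured travel times.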

Note: The source code, datasets, and detailed PDF reports contained in this repository are in Spanish.

