⏱️ A machine learning project that predicts flight arrival delays and classifies flights as delayed or on-time based on various delay factors.
- 📉 Regression Modeling using Linear Regression to predict numeric arrival delay
- ✅ Classification Modeling with Decision Tree to detect delayed flights (accuracy: 98%)
- 📊 Visualization Dashboard with scatter plots, feature importance, and confusion matrix
- 🧼 Data cleaning, encoding, and feature engineering for improved model performance
- 🧪 Evaluated using R², MAE, MSE for regression and accuracy/F1-score for classification
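The cleaning, encoding, and feature engineering step listed above could look roughly like the sketch below. The column names (`airline`, `dep_delay`, `nas_delay`, `arr_delay`), the toy rows, and the 15-minute cutoff for "delayed" are all assumptions made for illustration; they are not taken from the project's dataset or code.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions, not the project's schema.
raw = pd.DataFrame({
    "airline": ["AA", "DL", "AA", None, "UA"],
    "dep_delay": [5.0, None, 42.0, 10.0, -3.0],
    "nas_delay": [0.0, 0.0, 20.0, 5.0, 0.0],
    "arr_delay": [2.0, 1.0, 60.0, 18.0, -8.0],
})


def prepare(df: pd.DataFrame, delay_threshold: float = 15.0) -> pd.DataFrame:
    """Clean, engineer a binary target, and one-hot encode categoricals."""
    # Cleaning: drop rows with no airline, fill missing departure delays.
    out = df.dropna(subset=["airline"]).copy()
    out["dep_delay"] = out["dep_delay"].fillna(out["dep_delay"].median())
    # Feature engineering: binary label for the classifier (threshold is assumed).
    out["is_delayed"] = (out["arr_delay"] > delay_threshold).astype(int)
    # Encoding: one indicator column per airline.
    out = pd.get_dummies(out, columns=["airline"], prefix="airline")
    return out.reset_index(drop=True)


clean = prepare(raw)
print(clean)
```

Deriving `is_delayed` from `arr_delay` keeps the regression and classification targets consistent with each other, which is one plausible way to support both models from a single dataset.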
| Component | Tool/Library |
|---|---|
| Language | Python 3.10 |
| ML Models | LinearRegression, DecisionTreeClassifier |
| Data Handling | pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Evaluation Metrics | scikit-learn (R², MAE, MSE, accuracy, F1-score) |
git clone https://github.com/akasha456/Flight-Delay-Detection
cd Flight-Delay-Detection
pip install -r requirements.txt

flowchart TD
A[Load Flight Dataset] --> B[Preprocess & Clean Data]
B --> C[Train Regression Model]
B --> D[Train Classification Model]
C --> E[Predict Arrival Delays]
D --> F[Classify Flights as Delayed/On-Time]
E --> G[Evaluate Regression Metrics]
F --> H[Evaluate Accuracy and Confusion Matrix]
G --> I[Visualize Predictions]
H --> I
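The workflow in the flowchart can be sketched end to end as follows. This is a minimal illustration on synthetic data, not the project's actual training script: the two features, the noise model, the 15-minute delay threshold, and the `max_depth` setting are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load + preprocess: synthetic stand-in for the flight dataset (assumed features).
rng = np.random.default_rng(0)
n = 500
dep_delay = rng.exponential(20.0, n)
nas_delay = rng.exponential(10.0, n)
X = np.column_stack([dep_delay, nas_delay])
arr_delay = dep_delay + nas_delay + rng.normal(0.0, 5.0, n)  # regression target
is_delayed = (arr_delay > 15.0).astype(int)                  # classification target

# One split shared by both models so they are evaluated on the same flights.
X_train, X_test, yr_train, yr_test, yc_train, yc_test = train_test_split(
    X, arr_delay, is_delayed, test_size=0.2, random_state=0
)

# Train the regression and classification branches.
reg = LinearRegression().fit(X_train, yr_train)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, yc_train)

# Predict and evaluate each branch on held-out data.
r2 = r2_score(yr_test, reg.predict(X_test))
acc = accuracy_score(yc_test, clf.predict(X_test))
print(f"R2={r2:.3f} accuracy={acc:.3f}")
```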
Regression results (Linear Regression):

| Metric | Score |
|---|---|
| R² Score | 0.972 |
| MAE | 6.97 |
| MSE | 90.12 |
| Explained Variance | 0.972 |
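For reference, the four regression metrics in the table can be written out directly in NumPy. The sketch below uses made-up numbers rather than the project's predictions; the helper name `regression_metrics` is hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, R² and explained variance from first principles."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))
    mse = np.mean(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - np.sum(residuals ** 2) / ss_tot
    # Explained variance ignores any constant bias in the residuals.
    explained_variance = 1.0 - np.var(residuals) / np.var(y_true)
    return {"mae": mae, "mse": mse, "r2": r2, "explained_variance": explained_variance}


# Illustrative values only (not the project's data).
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
print(metrics)
```

R² and explained variance coincide whenever the residuals average to zero, which would explain why both are reported as 0.972. Taking the square root of the reported MSE of 90.12 gives an RMSE of roughly 9.49 in the units of the target (presumably minutes).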
Classification results (Decision Tree):

| Metric | Score |
|---|---|
| Accuracy | 98% |
| Precision | 1.00 (Not Delayed), 0.95 (Delayed) |
| Recall | 0.97 (Not Delayed), 1.00 (Delayed) |
| F1-Score | 0.98 |
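The single F1 figure can be related to the per-class precision and recall with a short calculation. The sketch below assumes the reported 0.98 is an average over the two classes; the README does not say whether it is a macro or a weighted average.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Per-class precision and recall taken from the table above.
f1_not_delayed = f1(1.00, 0.97)
f1_delayed = f1(0.95, 1.00)

# Unweighted (macro) average over the two classes; an assumption about
# how the single reported score was aggregated.
macro_f1 = (f1_not_delayed + f1_delayed) / 2
print(round(f1_not_delayed, 3), round(f1_delayed, 3), round(macro_f1, 3))
```

This gives about 0.985 for the not-delayed class and 0.974 for the delayed class. Their unweighted mean rounds to 0.98, which is consistent with the table.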
Feature importance:

| Feature | Importance |
|---|---|
| NAS_Delay | 0.5882 |
| Dep_Delay | 0.4118 |
| Others | 0.0000 |
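Importances like these are typically read from a fitted tree's `feature_importances_` attribute in scikit-learn, which is normalized to sum to 1 (as 0.5882 + 0.4118 does). A minimal sketch on synthetic data follows; the `Noise` feature and the labeling rule are assumptions for illustration only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the label depends on the first two columns only (assumed setup).
rng = np.random.default_rng(0)
feature_names = ["NAS_Delay", "Dep_Delay", "Noise"]
X = rng.uniform(0, 60, size=(400, 3))
y = ((X[:, 0] + X[:, 1]) > 60).astype(int)

clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)

# Pair each feature with its impurity-based importance and sort descending.
ranked = sorted(
    zip(feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.4f}")
```

An importance of 0.0000 means the fitted tree never split on that feature. It does not by itself show that the feature carries no information about delays.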
- ✈️ Integrate live flight data via airline APIs
- 📍 Add geographical visualization of delays by airport
- 🧠 Explore ensemble models (Random Forest, XGBoost)
- 🗂️ Summarize delays by day, airline, or region
- 📱 Build a simple UI for user input and results visualization
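As a starting point for the delay summaries mentioned in the list above, a pandas `groupby` with named aggregation might be enough. The sketch below is only illustrative: the column names, the sample rows, and the 15-minute threshold are hypothetical.

```python
import pandas as pd

# Hypothetical sample; column names are assumptions, not the project's schema.
flights = pd.DataFrame({
    "airline": ["AA", "AA", "DL", "DL", "UA"],
    "day_of_week": ["Mon", "Tue", "Mon", "Mon", "Tue"],
    "arr_delay": [10.0, 30.0, 0.0, 20.0, 50.0],
})

summary = (
    flights.assign(is_delayed=flights["arr_delay"] > 15.0)  # assumed threshold
    .groupby("airline")
    .agg(
        mean_delay=("arr_delay", "mean"),
        delayed_share=("is_delayed", "mean"),
        flights=("arr_delay", "size"),
    )
    .sort_values("mean_delay", ascending=False)
)
print(summary)
```

Swapping `"airline"` for `"day_of_week"` (or a region column, if one exists) yields the other summaries with no further changes.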
This project is licensed under the MIT License.
Acknowledgments:

- Scikit-learn for ML algorithms
- Matplotlib and Seaborn for visualizations
- Kaggle for access to flight datasets