This project analyzes air quality index (AQI) data to understand pollution trends, seasonal variations, and key contributors affecting air quality. Through data visualization and statistical analysis, we gain insights into air pollution levels across different regions.
- Source: Publicly available air quality dataset
- Columns:
  - Date: Timestamp of data collection
  - Location: Monitoring station
  - PM2.5, PM10, NO2, SO2, CO, O3: Pollutant concentrations
  - AQI: Air Quality Index value
  - Category: Air quality classification (Good, Moderate, Poor, etc.)
- Data Preprocessing
  - Handling missing values
  - Data type conversions
  - Feature engineering for time-series analysis
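The preprocessing steps above could look roughly like the sketch below. It is not the notebook's actual code: the sample rows are made up, the choice of per-station linear interpolation for missing values is an assumption, and the `SEASONS` mapping (Northern Hemisphere meteorological seasons) and the `preprocess` helper are hypothetical names introduced for illustration.

```python
import pandas as pd

# Tiny made-up sample standing in for the real dataset (values are illustrative only).
raw = pd.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03", "2023-07-01"],
    "Location": ["Station A", "Station A", "Station A", "Station A"],
    "PM2.5": ["55.0", None, "61.0", "20.0"],  # numbers stored as text, one gap
    "AQI": [150, 160, None, 70],              # one gap
})

# Assumption: Northern Hemisphere meteorological seasons.
SEASONS = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: convert types, fill gaps, add time features."""
    out = df.copy()

    # Data type conversions: parse timestamps and force numeric columns.
    out["Date"] = pd.to_datetime(out["Date"])
    numeric_cols = ["PM2.5", "AQI"]
    for col in numeric_cols:
        out[col] = pd.to_numeric(out[col], errors="coerce")

    # Handling missing values: linear interpolation within each station,
    # in time order (one possible strategy among several).
    out = out.sort_values(["Location", "Date"]).reset_index(drop=True)
    out[numeric_cols] = out.groupby("Location")[numeric_cols].transform(
        lambda s: s.interpolate(limit_direction="both")
    )

    # Feature engineering for time-series analysis.
    out["Year"] = out["Date"].dt.year
    out["Month"] = out["Date"].dt.month
    out["Season"] = out["Month"].map(SEASONS)
    return out

clean = preprocess(raw)
print(clean)
```

Interpolating per station avoids filling a gap at one monitor with readings from another. Dropping incomplete rows or filling with a median are equally valid choices depending on how much data is missing.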
- Exploratory Data Analysis (EDA)
  - Distribution of pollutants
  - Monthly and yearly AQI trends
  - Seasonal variations in air quality
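As a rough illustration of the EDA steps above, the sketch below summarizes a pollutant's distribution and aggregates AQI by month, by year, and by calendar month. The data frame is invented for the example, and the variable names are assumptions rather than names taken from the notebook.

```python
import pandas as pd

# Made-up readings; the real dataset would be loaded from file instead.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2022-01-15", "2022-01-20", "2022-07-10", "2023-01-05", "2023-07-22"]
    ),
    "AQI": [180, 160, 60, 200, 80],
    "PM2.5": [90.0, 80.0, 25.0, 110.0, 30.0],
})

# Distribution of a pollutant: count, mean, spread, and quartiles.
pm25_summary = df["PM2.5"].describe()

# Monthly and yearly AQI trends: average AQI per calendar period.
monthly_aqi = df.groupby(df["Date"].dt.to_period("M"))["AQI"].mean()
yearly_aqi = df.groupby(df["Date"].dt.year)["AQI"].mean()

# Seasonal variation: average AQI per month of the year, pooled across years.
aqi_by_month_of_year = df.groupby(df["Date"].dt.month)["AQI"].mean()

print(pm25_summary)
print(monthly_aqi)
print(yearly_aqi)
print(aqi_by_month_of_year)
```

Grouping by `dt.to_period("M")` keeps each year's months separate (a trend view), while grouping by `dt.month` pools the same month across years (a seasonal view).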
- Data Visualization
  - Line charts for AQI trends
  - Heatmaps for correlation analysis
  - Bar charts for pollution levels by location
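The three chart types listed above can be sketched with matplotlib alone, as below. The data is invented, and the heatmap is drawn with `imshow` to keep the example dependency-light; `seaborn.heatmap` would be a natural alternative given the project's dependencies. The layout and styling are assumptions, not the notebook's actual figures.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up readings for two hypothetical stations.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-02-01", "2023-03-01"] * 2),
    "Location": ["Station A"] * 3 + ["Station B"] * 3,
    "PM2.5": [80.0, 60.0, 40.0, 30.0, 25.0, 20.0],
    "NO2": [45.0, 40.0, 30.0, 20.0, 22.0, 15.0],
    "AQI": [170, 130, 95, 70, 60, 50],
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Line chart: AQI over time, one line per location.
for name, group in df.groupby("Location"):
    axes[0].plot(group["Date"], group["AQI"], marker="o", label=name)
axes[0].set_title("AQI trend")
axes[0].set_ylabel("AQI")
axes[0].tick_params(axis="x", rotation=45)
axes[0].legend()

# Bar chart: mean AQI by location.
mean_by_location = df.groupby("Location")["AQI"].mean()
axes[1].bar(mean_by_location.index, mean_by_location.values)
axes[1].set_title("Mean AQI by location")

# Heatmap: pairwise correlation between pollutants and AQI.
corr = df[["PM2.5", "NO2", "AQI"]].corr()
image = axes[2].imshow(corr.values, vmin=-1, vmax=1, cmap="coolwarm")
axes[2].set_xticks(range(len(corr.columns)))
axes[2].set_xticklabels(corr.columns, rotation=45)
axes[2].set_yticks(range(len(corr.columns)))
axes[2].set_yticklabels(corr.columns)
axes[2].set_title("Correlation")
fig.colorbar(image, ax=axes[2])

fig.tight_layout()
```

In a notebook the figure renders inline once the backend line is removed; in a script, `fig.savefig(...)` with a path of your choosing writes it to disk.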
- Insights & Findings
  - Identification of high-pollution regions
  - Correlation between pollutants and AQI
  - Effect of seasons on air quality
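One way the findings above might be derived is sketched here: rank locations by mean AQI, correlate each pollutant with AQI, and compare seasonal averages. All values are fabricated for illustration and do not represent results from the dataset; the `Season` column is assumed to have been added during preprocessing.

```python
import pandas as pd

# Made-up readings; not results from the actual dataset.
df = pd.DataFrame({
    "Location": ["Station A", "Station A", "Station B",
                 "Station B", "Station C", "Station C"],
    "Season": ["Winter", "Summer", "Winter", "Summer", "Winter", "Summer"],
    "PM2.5": [120.0, 50.0, 60.0, 20.0, 90.0, 40.0],
    "O3": [20.0, 60.0, 15.0, 45.0, 18.0, 55.0],
    "AQI": [245, 110, 135, 50, 190, 90],
})

# High-pollution regions: locations ordered from worst to best mean AQI.
location_rank = (
    df.groupby("Location")["AQI"].mean().sort_values(ascending=False)
)

# Correlation between pollutants and AQI (Pearson, strongest positive first).
pollutant_corr = (
    df[["PM2.5", "O3"]].corrwith(df["AQI"]).sort_values(ascending=False)
)

# Effect of seasons: mean AQI per season.
season_means = df.groupby("Season")["AQI"].mean()

print(location_rank)
print(pollutant_corr)
print(season_means)
```

A high correlation shows that a pollutant moves with AQI in the sample, not that it causes the AQI value; the index is typically driven by whichever pollutant is worst at a given time.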
# Install dependencies
pip install pandas numpy matplotlib seaborn notebook
# Clone the repository
git clone <repo_url>
cd <repo_folder>
# Open Jupyter Notebook
jupyter notebook air-quality-index-data-analysis.ipynb

- This project helps explain how air pollution varies over time and across locations.
- The insights can guide policymakers in making data-driven decisions for environmental improvement.