
JeetMajumdar2003/Vendor-Performance-Analysis


Vendor Performance Data Analysis


A comprehensive data analytics solution for vendor performance evaluation, procurement optimization, and inventory efficiency analysis.

This project delivers actionable insights from vendor transaction data to support supply chain optimization and commercial strategy. The analysis covers $441M+ in sales across 119 vendors and 7,707 SKUs, identifying opportunities for margin improvement, cost reduction, and working capital efficiency.


📊 Key Findings Summary

| Metric | Value | Insight |
|---|---|---|
| Total Sales | $441.41M | Strong portfolio performance |
| Gross Profit | $134.07M | 38.7% average margin |
| Top 10 Vendor Concentration | 65.7% | Moderate supply chain risk |
| Hidden Margin Opportunities | 198 brands | $15-25M revenue potential |
| Order Size Savings | 72% | Large vs. small order unit costs |
| Idle Inventory Capital | $2.71M | Working capital optimization target |

πŸ—οΈ Project Structure

The repository now follows a standard Python data-analysis layout inspired by the Cookiecutter Data Science conventions.

Vendor Performance Data Analytics/
├── configs/                     # YAML/JSON parameter files for pipelines and experiments
├── data/
│   ├── raw/                     # Immutable source data dumps
│   ├── interim/                 # Staging outputs from cleaning notebooks/scripts
│   └── processed/               # Curated datasets ready for modeling & reporting
├── docs/                        # Architecture decisions, meeting notes, specs
├── logs/                        # Runtime and ingestion logs (non-versioned)
├── models/                      # Serialized models, experiment artifacts
├── notebooks/                   # Jupyter notebooks (EDA, ingestion, modeling)
├── reports/
│   ├── dashboards/              # BI artifacts (e.g., Power BI)
│   ├── figures/                 # Static visual exports
│   └── tables/                  # Final KPI tables & summaries
├── src/
│   └── vendor_performance/      # Reusable Python package with data pipelines & utils
├── tests/                       # Pytest suite for regression coverage
├── LICENSE
└── README.md

Key highlights:

  • All Python modules live under src/vendor_performance, making it easy to install the package with pip install -e . later.
  • Notebooks are isolated in notebooks/, keeping the repo root clean while preserving exploratory work.
  • Data is separated into raw, interim, and processed zones to enforce reproducible pipelines.
  • Final deliverables (dashboards, figures, summary tables) sit in reports/ for quick stakeholder access.

📈 Reports & Visual Assets

| Deliverable | Path | Description |
|---|---|---|
| Full Analysis Report | reports/vendor_performance_analysis_report.md | Comprehensive 5-page industry-ready analysis |
| Vendor KPI Dashboard | reports/dashboards/vendor_performance_dashboard.pbix | Interactive Power BI dashboard |
| Top Vendors & Brands | reports/figures/top_vendors_and_brands.png | Revenue leaders visualization |
| Inventory at Risk | reports/figures/unsold_inventory.png | Capital lock-up analysis |
| Vendor Sales Summary | reports/tables/vendor_sales_summary.csv | Curated dataset (8,564 rows) |

Dashboard Preview

Vendor Performance Dashboard

Sample Visualizations

[Figures: Top Vendors & Brands (reports/figures/top_vendors_and_brands.png) and Unsold Inventory Analysis (reports/figures/unsold_inventory.png)]

💡 Tip: Open the CSV in Power BI or Excel, or feed it to downstream ML experiments. The Power BI file already points to this table, so simply refresh it after regenerating the summary via the notebooks.


🔬 Analysis Highlights

The analysis addresses 8 diagnostic business questions:

  1. Q1 - Hidden Gems: 198 high-margin brands with low sales identified for promotional campaigns
  2. Q2 - Revenue Leaders: DIAGEO, MARTIGNETTI, and PERNOD RICARD lead with $139M combined sales
  3. Q3 - Purchase Concentration: Top 5 vendors account for 45.7% of procurement spend
  4. Q4 - Vendor Dependency: 65.7% concentration in top 10 vendors signals moderate risk
  5. Q5 - Order Economics: Unit costs drop 72% from small (≤85 units) to large (>1,500 units) orders
  6. Q6 - Low Turnover: 10+ vendors with turnover < 1.0 (inventory exceeds annual sales)
  7. Q7 - Idle Capital: $2.71M locked in unsold inventory (DIAGEO, JIM BEAM, PERNOD = 64%)
  8. Q8 - Statistical Validation: Welch's t-test (p < 0.0001) confirms significant margin differences
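
The concentration figures in Q3 and Q4 are top-N shares of a total. As a minimal sketch of that calculation, assuming line items arrive as (vendor, dollars) pairs: the function name `top_n_share` and the sample figures are hypothetical, and the project itself may compute this differently (for example with Pandas).

```python
from collections import defaultdict

def top_n_share(rows, n):
    """Return the share of total spend held by the n largest vendors.

    rows: iterable of (vendor_name, dollars) pairs, one per line item.
    """
    totals = defaultdict(float)
    for vendor, dollars in rows:
        totals[vendor] += dollars
    grand_total = sum(totals.values())
    if grand_total == 0:
        return 0.0  # avoid dividing by zero on empty input
    largest = sorted(totals.values(), reverse=True)[:n]
    return sum(largest) / grand_total

# Made-up figures for illustration only, not the project's data.
sample = [
    ("VENDOR A", 500.0), ("VENDOR B", 300.0), ("VENDOR A", 100.0),
    ("VENDOR C", 60.0), ("VENDOR D", 40.0),
]
print(f"Top-2 share: {top_n_share(sample, 2):.1%}")
```

Calling the same function with `n=5` and `n=10` on purchase dollars would give figures comparable to the 45.7% and 65.7% quoted above.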

Statistical Confidence

| Vendor Group | Mean Margin | 95% CI |
|---|---|---|
| Top Performers (≥75th %ile sales) | 31.18% | [30.74%, 31.61%] |
| Low Performers (≤25th %ile sales) | 41.57% | [40.50%, 42.64%] |

Welch's t-statistic: -17.67 | p-value: < 0.0001
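
To make the test statistic concrete, here is a dependency-free sketch of Welch's t-statistic and its Welch-Satterthwaite degrees of freedom. It is not the project's code: the helper names and sample margins are hypothetical, and the confidence interval uses a normal approximation rather than the t-distribution. For an exact p-value, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs Welch's test directly.

```python
import math
from statistics import NormalDist, mean, variance

def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def mean_ci_normal(x, level=0.95):
    """Normal-approximation confidence interval for the mean of x."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(variance(x) / len(x))
    return mean(x) - half_width, mean(x) + half_width

# Made-up margins (%) for illustration only.
top = [30.0, 31.0, 32.0, 31.0, 30.0, 32.0]
low = [40.0, 43.0, 41.0, 44.0, 39.0, 42.0]
t, df = welch_t(top, low)
print(f"t = {t:.2f}, df = {df:.1f}")
print("95% CI (top):", tuple(round(v, 2) for v in mean_ci_normal(top)))
```

A negative t-statistic here, as in the reported result, means the first group (top performers by sales) has the lower mean margin.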



🚀 Getting Started

Prerequisites

  • Python 3.9+
  • SQLite (included with Python)
  • Power BI Desktop (optional, for dashboard)

Installation

  1. Clone the repository:

    git clone https://github.com/JeetMajumdar2003/Vendor-Performance-Analysis.git
    cd Vendor-Performance-Analysis
  2. Create a virtual environment (recommended):

    python -m venv .venv
    .\.venv\Scripts\Activate.ps1   # Windows PowerShell
    # OR
    source .venv/bin/activate       # Linux/macOS
  3. Install dependencies:

    pip install -r requirements.txt

Running the Analysis

Option 1: Via Notebooks (Interactive)

jupyter notebook notebooks/

Execute in order:

  1. ingestion_db.ipynb - Load raw data into SQLite
  2. eda.ipynb - Data cleaning and feature engineering
  3. vendor_performance_analysis.ipynb - Full diagnostic analysis

Option 2: Via Python Modules (CLI)

cd src
python -m vendor_performance.ingestion_db
python -m vendor_performance.eda
python -m vendor_performance.vendor_performance_analysis

Data Pipeline

data/raw/*.csv → ingestion_db.py → inventory.db → eda.py → vendor_sales_summary → analysis → reports/
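
The first stage of this pipeline (CSV files into SQLite) can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of `ingestion_db.py`: the function names are hypothetical, every column is stored as TEXT, and each table is named after its file stem.

```python
import csv
import sqlite3
from pathlib import Path

def ingest_csv(conn, csv_path, table=None):
    """Load one CSV file into a SQLite table named after the file stem.

    All columns are stored as TEXT; typing is left to later steps.
    """
    csv_path = Path(csv_path)
    table = table or csv_path.stem
    with csv_path.open(newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        cols = ", ".join(f'"{c}" TEXT' for c in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        # Skip blank lines, which csv.reader yields as empty lists.
        rows = (row for row in reader if row)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    conn.commit()
    return table

def ingest_folder(raw_dir, db_path):
    """Ingest every *.csv under raw_dir into the database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        return [ingest_csv(conn, p) for p in sorted(Path(raw_dir).glob("*.csv"))]
    finally:
        conn.close()
```

With the layout above, a call such as `ingest_folder("data/raw", "inventory.db")` would create one table per source file. For large files, chunked loading (for example via Pandas) may be preferable.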

πŸ“ Data Dictionary

Source Files (data/raw/)

| File | Description | Key Fields |
|---|---|---|
| begin_inventory.csv | Opening inventory positions | SKU, Quantity, Value |
| end_inventory.csv | Closing inventory positions | SKU, Quantity, Value |
| purchases.csv | Purchase transactions | Vendor, SKU, Quantity, Amount |
| purchase_prices.csv | Price book reference | SKU, Unit Price |
| sales.csv | Sales transactions | SKU, Quantity, Amount, Date |
| vendor_invoice.csv | Invoice and freight data | Vendor, Invoice, Freight |

Analytical Output (reports/tables/vendor_sales_summary.csv)

| Column | Description |
|---|---|
| VendorNumber | Unique vendor identifier |
| VendorName | Vendor company name |
| Brand | Product brand name |
| Description | SKU description |
| PurchasePrice | Unit purchase cost |
| ActualPrice | Unit selling price |
| TotalPurchaseQuantity | Total units purchased |
| TotalPurchaseDollars | Total purchase spend |
| TotalSalesQuantity | Total units sold |
| TotalSalesDollars | Total sales revenue |
| GrossProfitDollars | Sales - Purchase cost |
| ProfitMargin | Gross Profit / Sales |
| StockTurnover | Sales Qty / Avg Inventory |
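
The three derived columns follow directly from the definitions in this table. A minimal sketch for a single row, assuming the inputs are already aggregated: `derive_metrics` and the `avg_inventory_qty` field are hypothetical, since the README does not state how average inventory is obtained.

```python
def derive_metrics(row):
    """Add GrossProfitDollars, ProfitMargin and StockTurnover to one summary row.

    avg_inventory_qty is a hypothetical input; the README does not say how
    average inventory is computed.
    """
    sales = row["TotalSalesDollars"]
    gross_profit = sales - row["TotalPurchaseDollars"]
    avg_inv = row["avg_inventory_qty"]
    return {
        **row,
        "GrossProfitDollars": gross_profit,
        # Return None instead of dividing by zero sales or zero inventory.
        "ProfitMargin": gross_profit / sales if sales else None,
        "StockTurnover": row["TotalSalesQuantity"] / avg_inv if avg_inv else None,
    }

# Made-up values for illustration only.
example = {
    "TotalPurchaseDollars": 600.0, "TotalSalesDollars": 1000.0,
    "TotalSalesQuantity": 80, "avg_inventory_qty": 100,
}
result = derive_metrics(example)
print(result["GrossProfitDollars"], result["ProfitMargin"], result["StockTurnover"])
```

Averaging ProfitMargin across rows generally differs from dividing total gross profit by total sales. With the headline totals, $134.07M / $441.41M is about 30.4%, which suggests the reported 38.7% is an average over rows rather than the aggregate ratio.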


🎯 Business Problem & Objectives

Effective inventory and sales management are critical for optimizing profitability in the retail and wholesale industry. Companies need to ensure that they are not incurring losses due to inefficient pricing, poor inventory turnover, or vendor dependency.

Primary Objectives

| Objective | Analysis Approach | Key Deliverable |
|---|---|---|
| Identify underperforming brands | Percentile-based segmentation (margin vs. volume) | 198 promotional candidates |
| Determine top vendors | Revenue and profit contribution analysis | Top 10 vendor scorecard |
| Analyze bulk purchasing impact | Order-size tier economics | 72% cost savings validated |
| Assess inventory turnover | Stock turnover ratio analysis | Low-turnover vendor list |
| Investigate profitability variance | Welch's t-test + 95% CI | Statistically significant gap confirmed |
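
Percentile-based segmentation, the approach listed for the first objective, can be sketched as follows. The README does not state which percentile cut-offs were used, so the 25th/75th defaults below are assumptions, as are the function names and sample brands.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def promotional_candidates(brands, sales_q=25, margin_q=75):
    """Brands with sales at or below sales_q and margin at or above margin_q.

    brands: list of (name, total_sales_dollars, profit_margin) tuples.
    The cut-offs are illustrative defaults, not the project's thresholds.
    """
    sales_cut = percentile([s for _, s, _ in brands], sales_q)
    margin_cut = percentile([m for _, _, m in brands], margin_q)
    return [n for n, s, m in brands if s <= sales_cut and m >= margin_cut]

# Made-up brands for illustration only.
brands = [
    ("Brand A", 100.0, 0.60), ("Brand B", 5000.0, 0.25),
    ("Brand C", 8000.0, 0.30), ("Brand D", 200.0, 0.20),
    ("Brand E", 9000.0, 0.35),
]
print(promotional_candidates(brands))
```

Only brands that are both low-volume and high-margin pass the filter, which is the profile targeted for promotion.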

Strategic Recommendations

Based on the analysis, we recommend:

  1. Launch precision promotions on 198 high-margin, low-volume brands ($15-25M revenue potential)
  2. Consolidate procurement orders to leverage 72% unit cost savings at scale
  3. Rebalance inventory with DIAGEO, JIM BEAM, and PERNOD to free $1.5M+ working capital
  4. Diversify vendor base to reduce top-3 concentration below 25%
  5. Embed statistical monitoring in BI dashboards for proactive margin management
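
The order-consolidation recommendation rests on comparing unit costs across order-size tiers. A hedged sketch of that comparison, using the small (≤85 units) and large (>1,500 units) cut-offs quoted in Q5: the "medium" label, the quantity-weighted averaging, and the function names are assumptions, not the project's confirmed method.

```python
from collections import defaultdict

def order_tier(quantity, small_max=85, large_min=1500):
    """Classify an order by size using the cut-offs quoted in Q5."""
    if quantity <= small_max:
        return "small"
    if quantity > large_min:
        return "large"
    return "medium"  # assumed label for everything in between

def unit_cost_by_tier(orders):
    """Quantity-weighted average unit cost per tier.

    orders: iterable of (quantity, total_dollars) pairs with quantity > 0.
    """
    qty = defaultdict(float)
    dollars = defaultdict(float)
    for quantity, total in orders:
        tier = order_tier(quantity)
        qty[tier] += quantity
        dollars[tier] += total
    return {tier: dollars[tier] / qty[tier] for tier in qty if qty[tier]}

# Made-up orders for illustration only.
orders = [(50, 1000.0), (50, 1000.0), (2000, 16000.0), (500, 6000.0)]
costs = unit_cost_by_tier(orders)
saving = 1 - costs["large"] / costs["small"]
print(costs, f"large vs. small saving: {saving:.0%}")
```

The saving is expressed relative to the small-order unit cost, which matches how the 72% figure is described above.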

πŸ› οΈ Technology Stack

Component Technology
Data Processing Python, Pandas, NumPy
Database SQLite
Statistical Analysis SciPy (Welch's t-test, CI)
Visualization Matplotlib, Seaborn
BI Dashboard Power BI
Notebooks Jupyter

📋 Next Steps (TODO)

  • Add pyproject.toml for package installation (pip install -e .)
  • Introduce configuration templates under configs/
  • Stand up automated tests in tests/
  • Consider data versioning (DVC, LakeFS) for evolving raw feeds
  • Deploy dashboard to Power BI Service for stakeholder access
  • Implement automated refresh pipeline with scheduled runs

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👤 Author

Jeet Majumdar

For questions or collaboration, please open an issue or submit a pull request.
