Vendor Performance Data Analysis

A comprehensive data analytics solution for vendor performance evaluation, procurement optimization, and inventory efficiency analysis.

This project delivers actionable insights from vendor transaction data to support supply chain optimization and commercial strategy. The analysis covers $441M+ in sales across 119 vendors and 7,707 SKUs, identifying opportunities for margin improvement, cost reduction, and working capital efficiency.

📊 Key Findings Summary

Metric	Value	Insight
Total Sales	$441.41M	Strong portfolio performance
Gross Profit	$134.07M	38.7% average margin
Top 10 Vendor Concentration	65.7%	Moderate supply chain risk
Hidden Margin Opportunities	198 brands	$15-25M revenue potential
Order Size Savings	72%	Large vs. small order unit costs
Idle Inventory Capital	$2.71M	Working capital optimization target

🏗️ Project Structure

The repository now follows a standard Python data-analysis layout inspired by the Cookiecutter Data Science conventions.

Vendor Performance Data Analytics/
├── configs/                     # YAML/JSON parameter files for pipelines and experiments
├── data/
│   ├── raw/                     # Immutable source data dumps
│   ├── interim/                 # Staging outputs from cleaning notebooks/scripts
│   └── processed/               # Curated datasets ready for modeling & reporting
├── docs/                        # Architecture decisions, meeting notes, specs
├── logs/                        # Runtime and ingestion logs (non-versioned)
├── models/                      # Serialized models, experiment artifacts
├── notebooks/                   # Jupyter notebooks (EDA, ingestion, modeling)
├── reports/
│   ├── dashboards/              # BI artifacts (e.g., Power BI)
│   ├── figures/                 # Static visual exports
│   └── tables/                  # Final KPI tables & summaries
├── src/
│   └── vendor_performance/      # Reusable Python package with data pipelines & utils
├── tests/                       # Pytest suite for regression coverage
├── LICENSE
└── README.md

Key highlights:

All Python modules live under src/vendor_performance, making it easy to install the package with pip install -e . later.
Notebooks are isolated in notebooks/, keeping the repo root clean while preserving exploratory work.
Data is separated into raw, interim, and processed zones to enforce reproducible pipelines.
Final deliverables (dashboards, figures, summary tables) sit in reports/ for quick stakeholder access.

📈 Reports & Visual Assets

Deliverable	Path	Description
Full Analysis Report	`reports/vendor_performance_analysis_report.md`	Comprehensive 5-page industry-ready analysis
Vendor KPI Dashboard	`reports/dashboards/vendor_performance_dashboard.pbix`	Interactive Power BI dashboard
Top Vendors & Brands	`reports/figures/top_vendors_and_brands.png`	Revenue leaders visualization
Inventory at Risk	`reports/figures/unsold_inventory.png`	Capital lock-up analysis
Vendor Sales Summary	`reports/tables/vendor_sales_summary.csv`	Curated dataset (8,564 rows)

Dashboard Preview

Sample Visualizations

Top Vendors & Brands	Unsold Inventory Analysis

💡 Tip: Open the CSV in Power BI/Excel or feed it to downstream ML experiments. The Power BI file already points to this table—simply refresh after regenerating the summary via the notebooks.

🔬 Analysis Highlights

The analysis addresses 8 diagnostic business questions:

Q1 - Hidden Gems: 198 high-margin brands with low sales identified for promotional campaigns
Q2 - Revenue Leaders: DIAGEO, MARTIGNETTI, and PERNOD RICARD lead with $139M combined sales
Q3 - Purchase Concentration: Top 5 vendors account for 45.7% of procurement spend
Q4 - Vendor Dependency: 65.7% concentration in top 10 vendors signals moderate risk
Q5 - Order Economics: Unit costs drop 72% from small (≤85 units) to large (>1,500 units) orders
Q6 - Low Turnover: 10+ vendors with turnover < 1.0 (inventory exceeds annual sales)
Q7 - Idle Capital: $2.71M locked in unsold inventory (DIAGEO, JIM BEAM, PERNOD = 64%)
Q8 - Statistical Validation: Welch's t-test (p < 0.0001) confirms significant margin differences

Statistical Confidence

Vendor Group	Mean Margin	95% CI
Top Performers (≥75th %ile sales)	31.18%	[30.74%, 31.61%]
Low Performers (≤25th %ile sales)	41.57%	[40.50%, 42.64%]

Welch's t-statistic: -17.67 | p-value: < 0.0001

🚀 Getting Started

Prerequisites

Python 3.9+
SQLite (included with Python)
Power BI Desktop (optional, for dashboard)

Installation

Clone the repository:

git clone https://github.com/yourusername/vendor-performance-analytics.git
cd vendor-performance-analytics

Create a virtual environment (recommended):

python -m venv .venv
.\.venv\Scripts\Activate.ps1   # Windows PowerShell
# OR
source .venv/bin/activate       # Linux/macOS

Install dependencies:
```
pip install -r requirements.txt
```

Running the Analysis

Option 1: Via Notebooks (Interactive)

jupyter notebook notebooks/

Execute in order:

ingestion_db.ipynb - Load raw data into SQLite
eda.ipynb - Data cleaning and feature engineering
vendor_performance_analysis.ipynb - Full diagnostic analysis

Option 2: Via Python Modules (CLI)

cd src
python -m vendor_performance.ingestion_db
python -m vendor_performance.eda
python -m vendor_performance.vendor_performance_analysis

Data Pipeline

data/raw/*.csv → ingestion_db.py → inventory.db → eda.py → vendor_sales_summary → analysis → reports/

📁 Data Dictionary

Source Files (`data/raw/`)

File	Description	Key Fields
`begin_inventory.csv`	Opening inventory positions	SKU, Quantity, Value
`end_inventory.csv`	Closing inventory positions	SKU, Quantity, Value
`purchases.csv`	Purchase transactions	Vendor, SKU, Quantity, Amount
`purchase_prices.csv`	Price book reference	SKU, Unit Price
`sales.csv`	Sales transactions	SKU, Quantity, Amount, Date
`vendor_invoice.csv`	Invoice and freight data	Vendor, Invoice, Freight

Analytical Output (`reports/tables/vendor_sales_summary.csv`)

Column	Description
`VendorNumber`	Unique vendor identifier
`VendorName`	Vendor company name
`Brand`	Product brand name
`Description`	SKU description
`PurchasePrice`	Unit purchase cost
`ActualPrice`	Unit selling price
`TotalPurchaseQuantity`	Total units purchased
`TotalPurchaseDollars`	Total purchase spend
`TotalSalesQuantity`	Total units sold
`TotalSalesDollars`	Total sales revenue
`GrossProfitDollars`	Sales - Purchase cost
`ProfitMargin`	Gross Profit / Sales
`StockTurnover`	Sales Qty / Avg Inventory

🎯 Business Problem & Objectives

Effective inventory and sales management are critical for optimizing profitability in the retail and wholesale industry. Companies need to ensure that they are not incurring losses due to inefficient pricing, poor inventory turnover, or vendor dependency.

Primary Objectives

Objective	Analysis Approach	Key Deliverable
Identify underperforming brands	Percentile-based segmentation (margin vs. volume)	198 promotional candidates
Determine top vendors	Revenue and profit contribution analysis	Top 10 vendor scorecard
Analyze bulk purchasing impact	Order-size tier economics	72% cost savings validated
Assess inventory turnover	Stock turnover ratio analysis	Low-turnover vendor list
Investigate profitability variance	Welch's t-test + 95% CI	Statistically significant gap confirmed

Strategic Recommendations

Based on the analysis, we recommend:

Launch precision promotions on 198 high-margin, low-volume brands ($15-25M revenue potential)
Consolidate procurement orders to leverage 72% unit cost savings at scale
Rebalance inventory with DIAGEO, JIM BEAM, and PERNOD to free $1.5M+ working capital
Diversify vendor base to reduce top-3 concentration below 25%
Embed statistical monitoring in BI dashboards for proactive margin management

🛠️ Technology Stack

Component	Technology
Data Processing	Python, Pandas, NumPy
Database	SQLite
Statistical Analysis	SciPy (Welch's t-test, CI)
Visualization	Matplotlib, Seaborn
BI Dashboard	Power BI
Notebooks	Jupyter

📋 Next Steps (TODO)

Add pyproject.toml for package installation (pip install -e .)
Introduce configuration templates under configs/
Stand up automated tests in tests/
Consider data versioning (DVC, LakeFS) for evolving raw feeds
Deploy dashboard to Power BI Service for stakeholder access
Implement automated refresh pipeline with scheduled runs

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Jeet Majumdar

For questions or collaboration, please open an issue or submit a pull request.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Vendor Performance Data Analysis

📊 Key Findings Summary

🏗️ Project Structure

📈 Reports & Visual Assets

Dashboard Preview

Sample Visualizations

🔬 Analysis Highlights

Statistical Confidence

🚀 Getting Started

Prerequisites

Installation

Running the Analysis

Data Pipeline

📁 Data Dictionary

Source Files (`data/raw/`)

Analytical Output (`reports/tables/vendor_sales_summary.csv`)

🎯 Business Problem & Objectives

Primary Objectives

Strategic Recommendations

🛠️ Technology Stack

📋 Next Steps (TODO)

📄 License

👤 Author

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
configs		configs
docs		docs
models		models
notebooks		notebooks
reports		reports
src		src
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

License

JeetMajumdar2003/Vendor-Performance-Analysis

Folders and files

Latest commit

History

Repository files navigation

Vendor Performance Data Analysis

📊 Key Findings Summary

🏗️ Project Structure

📈 Reports & Visual Assets

Dashboard Preview

Sample Visualizations

🔬 Analysis Highlights

Statistical Confidence

🚀 Getting Started

Prerequisites

Installation

Running the Analysis

Data Pipeline

📁 Data Dictionary

Source Files (data/raw/)

Analytical Output (reports/tables/vendor_sales_summary.csv)

🎯 Business Problem & Objectives

Primary Objectives

Strategic Recommendations

🛠️ Technology Stack

📋 Next Steps (TODO)

📄 License

👤 Author

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Source Files (`data/raw/`)

Analytical Output (`reports/tables/vendor_sales_summary.csv`)

Packages