A comprehensive data analytics solution for vendor performance evaluation, procurement optimization, and inventory efficiency analysis.
This project delivers actionable insights from vendor transaction data to support supply chain optimization and commercial strategy. The analysis covers $441M+ in sales across 119 vendors and 7,707 SKUs, identifying opportunities for margin improvement, cost reduction, and working capital efficiency.
| Metric | Value | Insight |
|---|---|---|
| Total Sales | $441.41M | Strong portfolio performance |
| Gross Profit | $134.07M | 38.7% average margin |
| Top 10 Vendor Concentration | 65.7% | Moderate supply chain risk |
| Hidden Margin Opportunities | 198 brands | $15-25M revenue potential |
| Order Size Savings | 72% | Large vs. small order unit costs |
| Idle Inventory Capital | $2.71M | Working capital optimization target |
The repository now follows a standard Python data-analysis layout inspired by the Cookiecutter Data Science conventions.
Vendor Performance Data Analytics/
βββ configs/ # YAML/JSON parameter files for pipelines and experiments
βββ data/
β βββ raw/ # Immutable source data dumps
β βββ interim/ # Staging outputs from cleaning notebooks/scripts
β βββ processed/ # Curated datasets ready for modeling & reporting
βββ docs/ # Architecture decisions, meeting notes, specs
βββ logs/ # Runtime and ingestion logs (non-versioned)
βββ models/ # Serialized models, experiment artifacts
βββ notebooks/ # Jupyter notebooks (EDA, ingestion, modeling)
βββ reports/
β βββ dashboards/ # BI artifacts (e.g., Power BI)
β βββ figures/ # Static visual exports
β βββ tables/ # Final KPI tables & summaries
βββ src/
β βββ vendor_performance/ # Reusable Python package with data pipelines & utils
βββ tests/ # Pytest suite for regression coverage
βββ LICENSE
βββ README.md
Key highlights:
- All Python modules live under
src/vendor_performance, making it easy to install the package withpip install -e .later. - Notebooks are isolated in
notebooks/, keeping the repo root clean while preserving exploratory work. - Data is separated into
raw,interim, andprocessedzones to enforce reproducible pipelines. - Final deliverables (dashboards, figures, summary tables) sit in
reports/for quick stakeholder access.
| Deliverable | Path | Description |
|---|---|---|
| Full Analysis Report | reports/vendor_performance_analysis_report.md |
Comprehensive 5-page industry-ready analysis |
| Vendor KPI Dashboard | reports/dashboards/vendor_performance_dashboard.pbix |
Interactive Power BI dashboard |
| Top Vendors & Brands | reports/figures/top_vendors_and_brands.png |
Revenue leaders visualization |
| Inventory at Risk | reports/figures/unsold_inventory.png |
Capital lock-up analysis |
| Vendor Sales Summary | reports/tables/vendor_sales_summary.csv |
Curated dataset (8,564 rows) |
| Top Vendors & Brands | Unsold Inventory Analysis |
|---|---|
![]() |
![]() |
π‘ Tip: Open the CSV in Power BI/Excel or feed it to downstream ML experiments. The Power BI file already points to this tableβsimply refresh after regenerating the summary via the notebooks.
The analysis addresses 8 diagnostic business questions:
- Q1 - Hidden Gems: 198 high-margin brands with low sales identified for promotional campaigns
- Q2 - Revenue Leaders: DIAGEO, MARTIGNETTI, and PERNOD RICARD lead with $139M combined sales
- Q3 - Purchase Concentration: Top 5 vendors account for 45.7% of procurement spend
- Q4 - Vendor Dependency: 65.7% concentration in top 10 vendors signals moderate risk
- Q5 - Order Economics: Unit costs drop 72% from small (β€85 units) to large (>1,500 units) orders
- Q6 - Low Turnover: 10+ vendors with turnover < 1.0 (inventory exceeds annual sales)
- Q7 - Idle Capital: $2.71M locked in unsold inventory (DIAGEO, JIM BEAM, PERNOD = 64%)
- Q8 - Statistical Validation: Welch's t-test (p < 0.0001) confirms significant margin differences
| Vendor Group | Mean Margin | 95% CI |
|---|---|---|
| Top Performers (β₯75th %ile sales) | 31.18% | [30.74%, 31.61%] |
| Low Performers (β€25th %ile sales) | 41.57% | [40.50%, 42.64%] |
Welch's t-statistic: -17.67 | p-value: < 0.0001
- Python 3.9+
- SQLite (included with Python)
- Power BI Desktop (optional, for dashboard)
-
Clone the repository:
git clone https://github.com/yourusername/vendor-performance-analytics.git cd vendor-performance-analytics -
Create a virtual environment (recommended):
python -m venv .venv .\.venv\Scripts\Activate.ps1 # Windows PowerShell # OR source .venv/bin/activate # Linux/macOS
-
Install dependencies:
pip install -r requirements.txt
Option 1: Via Notebooks (Interactive)
jupyter notebook notebooks/Execute in order:
ingestion_db.ipynb- Load raw data into SQLiteeda.ipynb- Data cleaning and feature engineeringvendor_performance_analysis.ipynb- Full diagnostic analysis
Option 2: Via Python Modules (CLI)
cd src
python -m vendor_performance.ingestion_db
python -m vendor_performance.eda
python -m vendor_performance.vendor_performance_analysisdata/raw/*.csv β ingestion_db.py β inventory.db β eda.py β vendor_sales_summary β analysis β reports/
| File | Description | Key Fields |
|---|---|---|
begin_inventory.csv |
Opening inventory positions | SKU, Quantity, Value |
end_inventory.csv |
Closing inventory positions | SKU, Quantity, Value |
purchases.csv |
Purchase transactions | Vendor, SKU, Quantity, Amount |
purchase_prices.csv |
Price book reference | SKU, Unit Price |
sales.csv |
Sales transactions | SKU, Quantity, Amount, Date |
vendor_invoice.csv |
Invoice and freight data | Vendor, Invoice, Freight |
| Column | Description |
|---|---|
VendorNumber |
Unique vendor identifier |
VendorName |
Vendor company name |
Brand |
Product brand name |
Description |
SKU description |
PurchasePrice |
Unit purchase cost |
ActualPrice |
Unit selling price |
TotalPurchaseQuantity |
Total units purchased |
TotalPurchaseDollars |
Total purchase spend |
TotalSalesQuantity |
Total units sold |
TotalSalesDollars |
Total sales revenue |
GrossProfitDollars |
Sales - Purchase cost |
ProfitMargin |
Gross Profit / Sales |
StockTurnover |
Sales Qty / Avg Inventory |
Effective inventory and sales management are critical for optimizing profitability in the retail and wholesale industry. Companies need to ensure that they are not incurring losses due to inefficient pricing, poor inventory turnover, or vendor dependency.
| Objective | Analysis Approach | Key Deliverable |
|---|---|---|
| Identify underperforming brands | Percentile-based segmentation (margin vs. volume) | 198 promotional candidates |
| Determine top vendors | Revenue and profit contribution analysis | Top 10 vendor scorecard |
| Analyze bulk purchasing impact | Order-size tier economics | 72% cost savings validated |
| Assess inventory turnover | Stock turnover ratio analysis | Low-turnover vendor list |
| Investigate profitability variance | Welch's t-test + 95% CI | Statistically significant gap confirmed |
Based on the analysis, we recommend:
- Launch precision promotions on 198 high-margin, low-volume brands ($15-25M revenue potential)
- Consolidate procurement orders to leverage 72% unit cost savings at scale
- Rebalance inventory with DIAGEO, JIM BEAM, and PERNOD to free $1.5M+ working capital
- Diversify vendor base to reduce top-3 concentration below 25%
- Embed statistical monitoring in BI dashboards for proactive margin management
| Component | Technology |
|---|---|
| Data Processing | Python, Pandas, NumPy |
| Database | SQLite |
| Statistical Analysis | SciPy (Welch's t-test, CI) |
| Visualization | Matplotlib, Seaborn |
| BI Dashboard | Power BI |
| Notebooks | Jupyter |
- Add
pyproject.tomlfor package installation (pip install -e .) - Introduce configuration templates under
configs/ - Stand up automated tests in
tests/ - Consider data versioning (DVC, LakeFS) for evolving raw feeds
- Deploy dashboard to Power BI Service for stakeholder access
- Implement automated refresh pipeline with scheduled runs
This project is licensed under the MIT License - see the LICENSE file for details.
For questions or collaboration, please open an issue or submit a pull request.


