This is a SQL portfolio project based on the classic Sample Superstore dataset.
The goal is to explore and analyze key business questions such as:
- Which customers generate the most profit?
- Which regions and categories perform best?
- How do discounts impact profit margins?
The project demonstrates proficiency in SQL, business analysis, and data storytelling through queries.
- Source: sample_superstore.csv (public Kaggle dataset)
- Size: ~10,000 rows
- Date Range: 2014-10-10 to 2017-12-30
- Target Table: superstore (created by setup_superstore.sql)
- Key Columns: OrderDate (YYYY-MM-DD), Sales (REAL), Profit (REAL), Discount (0-1), Category, SubCategory, Segment, Region, CustomerName, ProductName
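
To make the table layout concrete, here is a minimal sketch of how a superstore table with the key columns above could be created and filled. It is not the contents of setup_superstore.sql: the column types for the text fields, the CHECK constraint, the `load_rows` helper, and the two sample rows are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Assumed schema: only the key columns listed above; TEXT types and the
# CHECK constraint are guesses, not taken from setup_superstore.sql.
SCHEMA = """
CREATE TABLE superstore (
    OrderDate    TEXT,  -- ISO date, YYYY-MM-DD
    Sales        REAL,
    Profit       REAL,
    Discount     REAL CHECK (Discount BETWEEN 0 AND 1),
    Category     TEXT,
    SubCategory  TEXT,
    Segment      TEXT,
    Region       TEXT,
    CustomerName TEXT,
    ProductName  TEXT
);
"""

COLUMNS = ["OrderDate", "Sales", "Profit", "Discount", "Category",
           "SubCategory", "Segment", "Region", "CustomerName", "ProductName"]
NUMERIC = {"Sales", "Profit", "Discount"}

# Two made-up rows standing in for sample_superstore.csv.
SAMPLE_CSV = """OrderDate,Sales,Profit,Discount,Category,SubCategory,Segment,Region,CustomerName,ProductName
2016-11-08,261.96,41.91,0.0,Furniture,Bookcases,Consumer,South,Alice Example,Sample Bookcase
2017-06-12,14.62,6.87,0.2,Office Supplies,Labels,Corporate,West,Bob Example,Sample Labels
"""


def load_rows(conn, csv_text):
    """Create the table, insert every CSV row, and return the row count."""
    conn.executescript(SCHEMA)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Convert numeric fields explicitly so they are stored as REAL values.
    rows = [
        tuple(float(rec[c]) if c in NUMERIC else rec[c] for c in COLUMNS)
        for rec in reader
    ]
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(
        f"INSERT INTO superstore ({', '.join(COLUMNS)}) VALUES ({placeholders})",
        rows,
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_rows(conn, SAMPLE_CSV)
# ISO-formatted dates sort correctly as text, so MIN/MAX give the date range.
date_range = conn.execute(
    "SELECT MIN(OrderDate), MAX(OrderDate) FROM superstore"
).fetchone()
print(loaded, date_range)
```

Storing OrderDate as YYYY-MM-DD text keeps date comparisons and range checks simple, because lexicographic order matches chronological order.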
# 1. Build or refresh the SQLite database and load the CSV
sqlite3 superstore.db ".read setup_superstore.sql"
# 2. Run analysis queries (e.g. queries.sql)
# Inside VS Code with SQLTools: open queries.sql → select a statement → Ctrl+E, Ctrl+E

The analysis includes:
- Top-10 customers by total profit and sales volume.
- AOV by category (Average Order Value).
- Regional and segment-level performance.
- Impact of discounts on margins and profit.
- ABC classification and profitability cohorts.
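
Two of the analyses listed above, the top-10 customer ranking and the ABC classification, can be sketched as follows. This is an illustration rather than the content of queries.sql: the sample rows are invented, the classification is done per customer on sales, and the 80% / 95% cumulative-share cut-offs are a common convention assumed here. The ABC query uses window functions, which need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE superstore (CustomerName TEXT, Sales REAL, Profit REAL)")
# Made-up rows for illustration only; not taken from the real dataset.
conn.executemany(
    "INSERT INTO superstore VALUES (?, ?, ?)",
    [
        ("Ann", 600.0, 120.0),
        ("Ann", 100.0, 30.0),
        ("Ben", 200.0, -20.0),
        ("Cy", 80.0, 16.0),
        ("Dee", 20.0, 4.0),
    ],
)

# Top-10 customers ranked by total profit, with total sales alongside.
TOP_CUSTOMERS = """
SELECT CustomerName,
       ROUND(SUM(Profit), 2) AS total_profit,
       ROUND(SUM(Sales), 2)  AS total_sales
FROM superstore
GROUP BY CustomerName
ORDER BY total_profit DESC
LIMIT 10
"""

# ABC classes from the running share of sales (assumed cut-offs: 80% and 95%).
ABC_CLASSES = """
WITH per_customer AS (
    SELECT CustomerName, SUM(Sales) AS sales
    FROM superstore
    GROUP BY CustomerName
),
ranked AS (
    SELECT CustomerName,
           sales,
           -- CustomerName breaks ties so the running total is deterministic.
           SUM(sales) OVER (ORDER BY sales DESC, CustomerName) * 1.0
               / SUM(sales) OVER () AS cum_share
    FROM per_customer
)
SELECT CustomerName,
       CASE WHEN cum_share <= 0.80 THEN 'A'
            WHEN cum_share <= 0.95 THEN 'B'
            ELSE 'C' END AS abc_class
FROM ranked
ORDER BY sales DESC, CustomerName
"""

top_customers = conn.execute(TOP_CUSTOMERS).fetchall()
abc_classes = conn.execute(ABC_CLASSES).fetchall()
print(top_customers)
print(abc_classes)
```

Ranking by profit rather than sales matters here: in the sample rows, the customer with the second-highest sales lands last because their orders lose money.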
- The Consumer segment accounts for ~50% of total sales but has lower profit margins.
- The Corporate segment is the most profitable overall.
- The Furniture category has the lowest profit-to-sales ratio due to high shipping costs.
- The West and East regions outperform the others in both revenue and profit.
- Discounts above 30% consistently destroy profit margins.
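
The discount finding above can be checked with a query that groups order lines into discount bands and computes the margin (profit divided by sales) per band. The sketch below assumes three bands with a cut-off at 30% and uses invented rows, so its numbers illustrate the technique and are not results from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE superstore (Sales REAL, Profit REAL, Discount REAL)")
# Made-up rows for illustration only; not taken from the real dataset.
conn.executemany(
    "INSERT INTO superstore VALUES (?, ?, ?)",
    [
        (100.0, 30.0, 0.0),
        (100.0, 20.0, 0.0),
        (200.0, 20.0, 0.2),
        (100.0, -40.0, 0.4),
        (100.0, -60.0, 0.5),
    ],
)

# Margin per discount band; band boundaries are an assumption for this sketch.
DISCOUNT_BANDS = """
SELECT CASE WHEN Discount = 0     THEN 'none'
            WHEN Discount <= 0.30 THEN 'up to 30%'
            ELSE 'above 30%' END                     AS discount_band,
       COUNT(*)                                      AS order_lines,
       ROUND(100.0 * SUM(Profit) / SUM(Sales), 1)    AS margin_pct
FROM superstore
GROUP BY discount_band
ORDER BY MIN(Discount)
"""

bands = conn.execute(DISCOUNT_BANDS).fetchall()
for band, order_lines, margin_pct in bands:
    print(f"{band:>10}  lines={order_lines}  margin={margin_pct}%")
```

The margin is computed as SUM(Profit) / SUM(Sales) rather than as an average of per-row ratios, so large orders weigh more than small ones.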
This analysis can help retail and e-commerce managers:
- Identify the most profitable customer segments and regions.
- Optimize discount strategies to avoid margin erosion.
- Focus marketing campaigns on high-value customers.