47 Scraper is a production-ready tool for extracting structured product and pricing data from the 47brand.com apparel store. It helps teams collect reliable e-commerce data for analysis, monitoring, and decision-making without manual effort.
Created by Bitbash to showcase our approach to scraping and automation.
This project extracts detailed apparel product information from an online retail store and converts it into clean, structured datasets. It solves the problem of manually tracking product changes, prices, and catalog updates across a growing inventory. The scraper is built for developers, analysts, and e-commerce teams who need accurate product data at scale.
- Crawls product listing and detail pages with consistency
- Normalizes pricing and product attributes into structured records
- Designed for repeatable runs and long-term data tracking
- Suitable for analytics, reporting, and automation workflows
| Feature | Description |
|---|---|
| Product Catalog Crawling | Collects all available products from category and listing pages. |
| Price Extraction | Captures current prices and sale prices with high accuracy. |
| Product Metadata Parsing | Extracts names, descriptions, variants, and identifiers. |
| Structured Output | Produces clean, analysis-ready data for downstream systems. |
| Scalable Design | Handles large product catalogs efficiently and reliably. |
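As a rough illustration of the price extraction step, the sketch below turns a raw price string into a numeric amount and a currency code. The `parsePrice` function, the `ParsedPrice` type, and the symbol table are hypothetical names chosen for this example; they are not taken from `priceParser.ts`, and the real store markup may need different handling.

```typescript
// Hypothetical helper for illustration; not the repository's priceParser.ts.
interface ParsedPrice {
  amount: number;
  currency: string;
}

// Assumed mapping from a few currency symbols to ISO 4217 codes.
const SYMBOL_TO_CURRENCY: Record<string, string> = {
  "$": "USD",
  "€": "EUR",
  "£": "GBP",
};

// Accepts an optional leading symbol followed by a number such as
// "28.00", "1299" or "1,299.50". Returns null for anything else.
const PRICE_PATTERN =
  /^([$€£])?\s*(\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+(?:\.\d+)?)$/;

function parsePrice(raw: string, fallbackCurrency = "USD"): ParsedPrice | null {
  const match = raw.trim().match(PRICE_PATTERN);
  if (!match) return null;

  // Drop thousands separators before converting to a number.
  const digits = (match[2] ?? "").replace(/,/g, "");
  const amount = Number(digits);
  if (digits === "" || !Number.isFinite(amount)) return null;

  const symbol = match[1];
  const currency = symbol
    ? SYMBOL_TO_CURRENCY[symbol] ?? fallbackCurrency
    : fallbackCurrency;
  return { amount, currency };
}
```

Returning `null` for unparseable text (for example a "Sold out" label where a price was expected) lets the caller decide whether to skip the record or log it, instead of silently storing `NaN`.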
| Field Name | Field Description |
|---|---|
| product_id | Unique identifier for each product. |
| product_name | Official product title as listed in the store. |
| category | Apparel category the product belongs to. |
| price | Current listed price of the product. |
| sale_price | Discounted price when available. |
| currency | Currency used for pricing. |
| availability | Stock or availability status. |
| product_url | Direct URL to the product page. |
| images | List of product image URLs. |
| description | Textual product description. |
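The field table above can be expressed as a TypeScript type, which makes downstream code easier to check. The sketch below is an assumption-based reading of the table: the `ProductRecord` name and the `isProductRecord` guard are hypothetical, and the field types (for example `sale_price` being `number | null`) are inferred from the sample output rather than from the project's source.

```typescript
// Hypothetical type mirroring the documented output fields.
interface ProductRecord {
  product_id: string;
  product_name: string;
  category: string;
  price: number;
  sale_price: number | null; // null when the product is not on sale (assumed)
  currency: string;
  availability: string;
  product_url: string;
  images: string[];
  description: string;
}

const STRING_FIELDS = [
  "product_id",
  "product_name",
  "category",
  "currency",
  "availability",
  "product_url",
  "description",
];

// Runtime check for data loaded from JSON, where types are not guaranteed.
function isProductRecord(value: unknown): value is ProductRecord {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;

  const price = record["price"];
  const salePrice = record["sale_price"];
  const images = record["images"];

  return (
    STRING_FIELDS.every((key) => typeof record[key] === "string") &&
    typeof price === "number" &&
    (salePrice === null || typeof salePrice === "number") &&
    Array.isArray(images) &&
    images.every((url: unknown) => typeof url === "string")
  );
}
```

A guard like this is typically applied right after `JSON.parse`, so that malformed records are filtered out before they reach reports or dashboards.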
[
  {
    "product_id": "MLB-NYY-001",
    "product_name": "New York Yankees Clean Up Cap",
    "category": "Headwear",
    "price": 28.00,
    "sale_price": null,
    "currency": "USD",
    "availability": "In Stock",
    "product_url": "https://www.47brand.com/products/new-york-yankees-clean-up",
    "images": [
      "https://cdn.47brand.com/images/nyy-cap-front.jpg",
      "https://cdn.47brand.com/images/nyy-cap-side.jpg"
    ],
    "description": "A relaxed fit cap featuring the classic Yankees logo."
  }
]
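Because each record carries both `price` and `sale_price`, consumers usually need one "price to pay" value and, optionally, the discount. The following is a minimal sketch under the assumption that `sale_price` is `null` when no sale applies, as in the sample record; `effectivePrice` and `discountPercent` are hypothetical helper names.

```typescript
// Only the two pricing fields are needed for this sketch.
interface PricedRecord {
  price: number;
  sale_price: number | null;
}

// The price a shopper would pay: the sale price when present, else the list price.
function effectivePrice(record: PricedRecord): number {
  return record.sale_price ?? record.price;
}

// Discount as a whole-number percentage of the list price; 0 when not on sale.
function discountPercent(record: PricedRecord): number {
  if (record.sale_price === null || record.price <= 0) return 0;
  const saved = record.price - record.sale_price;
  return Math.round((saved / record.price) * 100);
}
```

For the sample record (`price` 28.00, `sale_price` null), `effectivePrice` returns 28 and `discountPercent` returns 0.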
47 Scraper/
├── src/
│   ├── index.ts
│   ├── crawler/
│   │   ├── productCrawler.ts
│   │   └── listingCrawler.ts
│   ├── parsers/
│   │   ├── productParser.ts
│   │   └── priceParser.ts
│   ├── utils/
│   │   ├── httpClient.ts
│   │   └── normalize.ts
│   └── config/
│       └── settings.example.json
├── data/
│   ├── sample-input.json
│   └── sample-output.json
├── package.json
├── tsconfig.json
└── README.md
- E-commerce analysts use it to track product prices over time, so they can identify pricing trends and promotions.
- Retail researchers use it to collect apparel catalog data, so they can compare brands and market positioning.
- Growth teams use it to monitor stock availability, so they can react quickly to demand changes.
- Developers use it to feed product data into dashboards, so they can automate reporting workflows.
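Several of the use cases above come down to comparing two scraper runs. The sketch below shows one possible way to detect price and availability changes between an earlier and a later snapshot, keyed on `product_id`. The `Snapshot` and `Change` types and the `diffSnapshots` function are hypothetical and not part of the project; products that appear in only one snapshot are ignored here for brevity.

```typescript
// Subset of the output fields needed for change tracking.
interface Snapshot {
  product_id: string;
  price: number;
  availability: string;
}

interface Change {
  product_id: string;
  field: "price" | "availability";
  before: string | number;
  after: string | number;
}

function diffSnapshots(previous: Snapshot[], current: Snapshot[]): Change[] {
  // Index the earlier run by product_id for constant-time lookups.
  const previousById = new Map<string, Snapshot>();
  for (const item of previous) {
    previousById.set(item.product_id, item);
  }

  const changes: Change[] = [];
  for (const item of current) {
    const old = previousById.get(item.product_id);
    if (old === undefined) continue; // new product: skipped in this sketch

    if (old.price !== item.price) {
      changes.push({
        product_id: item.product_id,
        field: "price",
        before: old.price,
        after: item.price,
      });
    }
    if (old.availability !== item.availability) {
      changes.push({
        product_id: item.product_id,
        field: "availability",
        before: old.availability,
        after: item.availability,
      });
    }
  }
  return changes;
}
```

The resulting list of changes can be appended to a log or pushed to a dashboard after each run, which keeps long-term tracking cheap compared with storing and re-scanning full snapshots.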
Is this scraper limited to specific product categories? No. It is designed to work across all apparel categories available on the store, including headwear, clothing, and accessories.
Can it handle large product catalogs? Yes. The architecture is optimized for scalability and can process large catalogs while maintaining stable performance.
Does it capture discounted prices? Yes. When a product is on sale, both the original price and the sale price are extracted.
Is the output suitable for analytics tools? Absolutely. The structured format is designed for direct use in spreadsheets, databases, and BI tools.
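For spreadsheet use, the JSON records can be flattened to CSV. The sketch below is one simple approach, not a feature of the scraper itself: `toCsv` and `toCsvCell` are hypothetical helpers, quoting follows the common RFC 4180 convention, and joining array values such as `images` with `|` is an arbitrary choice made for this example.

```typescript
// Value shapes that appear in the documented output.
type Cell = string | number | null | string[];

// Converts one value to a CSV cell, quoting it when it contains
// a comma, a double quote, or a line break.
function toCsvCell(value: Cell): string {
  const text =
    value === null
      ? ""
      : Array.isArray(value)
        ? value.join("|")
        : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds a CSV document with a header row; missing fields become empty cells.
function toCsv(records: Record<string, Cell>[], columns: string[]): string {
  const header = columns.map((name) => toCsvCell(name)).join(",");
  const rows = records.map((record) =>
    columns.map((name) => toCsvCell(record[name] ?? null)).join(","),
  );
  return [header, ...rows].join("\r\n");
}
```

Passing the column list explicitly keeps the column order stable between runs, which matters when files from different dates are stacked in the same sheet.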
Primary Metric: Average extraction speed of 120–150 products per minute on standard catalog pages.
Reliability Metric: Consistent success rate above 99% across repeated runs.
Efficiency Metric: Optimized request handling minimizes redundant page loads and resource usage.
Quality Metric: High data completeness with accurate pricing and metadata across product variants.
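The stated throughput of 120 to 150 products per minute can be turned into a rough run-time estimate for planning scheduled runs. This is a back-of-the-envelope sketch: `estimateRunRange` is a hypothetical helper, and it assumes throughput stays within the quoted range for the whole run.

```typescript
interface RunEstimate {
  fastestMinutes: number;
  slowestMinutes: number;
}

// Divides the catalog size by the upper and lower throughput bounds.
function estimateRunRange(
  productCount: number,
  minPerMinute = 120,
  maxPerMinute = 150,
): RunEstimate {
  if (minPerMinute <= 0 || maxPerMinute < minPerMinute) {
    throw new RangeError("throughput bounds must be positive and ordered");
  }
  return {
    fastestMinutes: productCount / maxPerMinute,
    slowestMinutes: productCount / minPerMinute,
  };
}
```

For a hypothetical catalog of 3,000 products this gives 3000 / 150 = 20 minutes at best and 3000 / 120 = 25 minutes at worst.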
