Morizon.pl Scraper

A Python scraper for extracting property listings from Morizon.pl (Polish real estate portal) using the ScrapingAnt API.

Features

  • Scrapes apartments, houses, rooms, land, and commercial properties
  • Supports rental and sale listings
  • Covers all major Polish cities and voivodeships
  • Parallel scraping for improved performance
  • Extracts 30+ property attributes including price, area, rooms, location, amenities
  • Exports data to CSV format
  • Rate limiting and retry logic for reliability

Installation

  1. Clone the repository:
git clone https://github.com/kami4ka/MorizonScraper.git
cd MorizonScraper
  2. Create a virtual environment and install dependencies:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Usage

Command Line

# Scrape apartments for rent in Warsaw
python main.py --category rent --location warszawa

# Scrape houses for sale in Krakow
python main.py --category house-buy --location krakow --limit 100

# Scrape rooms for rent in Wroclaw with custom output
python main.py --category room-rent --location wroclaw --output wroclaw_rooms.csv

# Enable verbose logging
python main.py --category rent --location gdansk -v

Available Options

Option               Description
--category, -c       Property category (default: rent)
--location, -l       City/region to search in (optional)
--output, -o         Output CSV file path (default: properties.csv)
--limit              Maximum number of properties to scrape
--max-pages          Maximum number of pages to scrape
--max-workers, -w    Maximum parallel requests (default: 10)
--api-key, -k        ScrapingAnt API key (overrides config)
--verbose, -v        Enable verbose logging
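
The options listed above map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the repository's actual `main.py`: the function name `build_parser` is hypothetical, and the defaults are taken from the option list (options without a documented default are assumed to default to `None`).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real main.py)."""
    parser = argparse.ArgumentParser(description="Morizon.pl scraper (sketch)")
    parser.add_argument("--category", "-c", default="rent",
                        help="Property category")
    parser.add_argument("--location", "-l", default=None,
                        help="City/region to search in (optional)")
    parser.add_argument("--output", "-o", default="properties.csv",
                        help="Output CSV file path")
    parser.add_argument("--limit", type=int, default=None,
                        help="Maximum number of properties to scrape")
    parser.add_argument("--max-pages", type=int, default=None,
                        help="Maximum number of pages to scrape")
    parser.add_argument("--max-workers", "-w", type=int, default=10,
                        help="Maximum parallel requests")
    parser.add_argument("--api-key", "-k", default=None,
                        help="ScrapingAnt API key (overrides config)")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable verbose logging")
    return parser


# Parse the same arguments as the "houses for sale in Krakow" example.
args = build_parser().parse_args(
    ["--category", "house-buy", "--location", "krakow", "--limit", "100"]
)
print(args.category, args.location, args.limit, args.max_workers)
```

Options that are not passed fall back to their defaults, so `args.max_workers` is 10 and `args.output` is `properties.csv` in this call.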

Available Categories

Category                Description
rent, apartment-rent    Apartments for rent
house-rent              Houses for rent
room-rent               Rooms for rent
commercial-rent         Commercial properties for rent
garage-rent             Garages for rent
buy, apartment-buy      Apartments for sale
house-buy               Houses for sale
land-buy                Land for sale
commercial-buy          Commercial properties for sale
garage-buy              Garages for sale
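
Two rows in the category list give a short and a long name for the same category (`rent` / `apartment-rent` and `buy` / `apartment-buy`). One way to handle that is an alias lookup before validation. This is a sketch under that assumption: `CATEGORY_ALIASES`, `CATEGORIES`, and `resolve_category` are hypothetical names, and the scraper itself may resolve categories differently.

```python
# Hypothetical alias table: short names that share a row with a long name.
CATEGORY_ALIASES = {"rent": "apartment-rent", "buy": "apartment-buy"}

# Canonical category names and descriptions, copied from the list above.
CATEGORIES = {
    "apartment-rent": "Apartments for rent",
    "house-rent": "Houses for rent",
    "room-rent": "Rooms for rent",
    "commercial-rent": "Commercial properties for rent",
    "garage-rent": "Garages for rent",
    "apartment-buy": "Apartments for sale",
    "house-buy": "Houses for sale",
    "land-buy": "Land for sale",
    "commercial-buy": "Commercial properties for sale",
    "garage-buy": "Garages for sale",
}


def resolve_category(name: str) -> str:
    """Return the canonical category name, or raise ValueError if unknown."""
    key = name.strip().lower()
    key = CATEGORY_ALIASES.get(key, key)
    if key not in CATEGORIES:
        raise ValueError(f"Unknown category: {name!r}")
    return key


print(resolve_category("rent"))       # apartment-rent
print(resolve_category("house-buy"))  # house-buy
```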

Supported Locations

Major Cities: Warszawa (Warsaw), Krakow, Wroclaw, Poznan, Gdansk, Lodz, Szczecin, Bydgoszcz, Lublin, Katowice, Bialystok, Gdynia, Czestochowa, Radom, Torun, Rzeszow, Kielce, Gliwice, Olsztyn, Zabrze, Bytom, Zielona Gora, Opole

Voivodeships: dolnoslaskie, kujawsko-pomorskie, lubelskie, lubuskie, lodzkie, malopolskie, mazowieckie, opolskie, podkarpackie, podlaskie, pomorskie, slaskie, swietokrzyskie, warminsko-mazurskie, wielkopolskie, zachodniopomorskie
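
The location names above are written without Polish diacritics (`lodz`, `slaskie`, `swietokrzyskie`). If your input comes with diacritics, a small normalizer can produce that form. This is a sketch, not part of the scraper: `to_location_slug` is a hypothetical helper, and joining multi-word names with a hyphen (`zielona-gora`) is an assumption that should be checked against what `--location` actually accepts.

```python
import unicodedata


def to_location_slug(name: str) -> str:
    """Lowercase a place name and strip Polish diacritics (hypothetical helper)."""
    # "ł" has no Unicode decomposition, so it must be mapped explicitly.
    text = name.strip().lower().replace("ł", "l")
    # NFKD splits letters such as "ó" or "ą" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: multi-word names are joined with hyphens.
    return "-".join(ascii_text.split())


print(to_location_slug("Łódź"))                 # lodz
print(to_location_slug("Kraków"))               # krakow
print(to_location_slug("Warmińsko-Mazurskie"))  # warminsko-mazurskie
```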

Output Format

The scraper exports data to CSV with the following fields:

Field               Description
url                 Property listing URL
title               Property title/description
listing_id          Morizon listing ID (mzn...)
price               Listed price in PLN
price_currency      Currency (zl/PLN)
price_per_sqm       Price per square meter
deposit             Security deposit
living_area         Living area in m2
rooms               Number of rooms
floor               Floor level
total_floors        Total floors in building
interior_height     Interior height in cm
condition           Property condition
market_type         Market type (primary/secondary)
ownership           Ownership type
available_from      Availability date
contract_type       Contract type
kitchen_type        Type of kitchen
bathroom_with_wc    Bathroom with WC
balcony             Has balcony
windows             Window type
building_type       Building type
year_built          Year of construction
heating             Type of heating system
address             Full address
street              Street name
district            District/neighborhood
city                City name
voivodeship         Voivodeship (region)
equipment           Available equipment
amenities           Available amenities
media               Media/utilities
date_added          Listing add date
date_updated        Last update date
views               Number of views
advertiser_type     Advertiser type
advertiser_name     Advertiser name
agent_company       Agency name
description         Property description
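
Because the output is plain CSV, it can be post-processed with the standard `csv` module. The sketch below reads rows with `csv.DictReader` and derives a price per square meter from `price` and `living_area` when both are present. The sample rows are invented for illustration, the helper names are hypothetical, and it assumes the numeric columns contain plain numbers (the actual formatting in the scraper's output may differ).

```python
import csv
import io

# Illustrative stand-in for properties.csv; these rows are not real listings.
SAMPLE = io.StringIO(
    "url,title,price,living_area,rooms,city\n"
    "https://example.invalid/a,Sample flat A,3000,50,2,Warszawa\n"
    "https://example.invalid/b,Sample flat B,4500,,3,Krakow\n"
)


def to_float(value):
    """Return value as a float, or None if it is missing or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def price_per_sqm(row):
    """Compute price / living_area, or None when either field is unusable."""
    price = to_float(row.get("price"))
    area = to_float(row.get("living_area"))
    if price is None or not area:  # also guards against a zero area
        return None
    return round(price / area, 2)


rows = list(csv.DictReader(SAMPLE))
results = [price_per_sqm(row) for row in rows]
print(results)  # [60.0, None]
```

To run this against a real export, replace `SAMPLE` with `open("properties.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption.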

API Configuration

This scraper uses the ScrapingAnt API for web scraping. You can provide the API key via:

  1. Environment variable: export SCRAPINGANT_API_KEY=your_key
  2. Command line: --api-key YOUR_KEY
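
With several possible sources for the key, the lookup order matters. The sketch below shows one plausible order: command line first, then the environment variable, then the value in `config.py`. Only "command line overrides config" is stated in this README; the position of the environment variable in that order is an assumption, and `resolve_api_key` is a hypothetical helper.

```python
import os


def resolve_api_key(cli_key=None, config_key=None, env=None):
    """Pick the first non-empty key: CLI, then environment, then config (assumed order)."""
    env = os.environ if env is None else env
    for candidate in (cli_key, env.get("SCRAPINGANT_API_KEY"), config_key):
        if candidate:
            return candidate
    raise RuntimeError("No ScrapingAnt API key provided")


# The env mapping is passed explicitly here so the example does not depend
# on the real process environment.
fake_env = {"SCRAPINGANT_API_KEY": "key-from-env"}
print(resolve_api_key(cli_key="key-from-cli", env=fake_env))     # key-from-cli
print(resolve_api_key(config_key="key-from-cfg", env=fake_env))  # key-from-env
print(resolve_api_key(config_key="key-from-cfg", env={}))        # key-from-cfg
```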

Configuration options in config.py:

  • SCRAPINGANT_API_KEY: Your API key
  • DEFAULT_MAX_WORKERS: Parallel request limit (default: 10)
  • DEFAULT_TIMEOUT: Request timeout in seconds (default: 60)
  • MAX_RETRIES: Number of retry attempts (default: 3)
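
To illustrate how a worker limit and a retry count typically interact, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor`. It is not the scraper's implementation: `fetch_with_retries` and `fetch_all` are hypothetical names, the exponential backoff is an assumption (the README only mentions retry logic), `MAX_RETRIES` is interpreted as the total number of attempts, and the actual HTTP call is left to a `fetch` function supplied by the caller.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DEFAULT_MAX_WORKERS = 10  # documented default
MAX_RETRIES = 3           # documented default, read here as total attempts


def fetch_with_retries(fetch, url, retries=MAX_RETRIES, base_delay=1.0,
                       sleep=time.sleep):
    """Call fetch(url); on an exception, wait and retry up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # Assumed backoff schedule: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** (attempt - 1))


def fetch_all(fetch, urls, max_workers=DEFAULT_MAX_WORKERS, **kwargs):
    """Fetch URLs with at most `max_workers` in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: fetch_with_retries(fetch, u, **kwargs), urls))


# Demo with a fake fetch that fails once per URL, so no network is needed.
calls = {}


def flaky_fetch(url):
    calls[url] = calls.get(url, 0) + 1
    if calls[url] < 2:
        raise ConnectionError("simulated failure")
    return f"<html>{url}</html>"


pages = fetch_all(flaky_fetch, ["page-1", "page-2"], sleep=lambda seconds: None)
print(pages)  # ['<html>page-1</html>', '<html>page-2</html>']
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` applies the delay between attempts.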

License

MIT License
