Partition/nlscraper


housing scraper

scrapes dutch rental websites and notifies you via discord whenever there's a new listing

setup

pip install -r requirements.txt
python run_scrapers.py

also set up a discord channel webhook and enter the url & your discord user id in .env (otherwise lowkey pointless)

oh - you also need to set up a postgres db, preferably before you run the script for the first time:D
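
for reference, a minimal sketch of what the discord side could look like. this is not the repo's actual notifier: the env variable names `DISCORD_WEBHOOK_URL` and `DISCORD_USER_ID` and both helper functions are assumptions made up for illustration. it only relies on the fact that a discord webhook accepts a JSON body with a `content` field and that `<@id>` pings a user.

```python
import json
import os
import urllib.request


def build_payload(title, url, user_id=None):
    """Build the JSON body for a webhook message, pinging user_id if given."""
    mention = f"<@{user_id}> " if user_id else ""
    return {"content": f"{mention}new listing: {title}\n{url}"}


def notify(webhook_url, payload):
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # explicit user agent; the name itself is arbitrary
            "User-Agent": "housing-scraper-example",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # assumed variable names, check your own .env for the real ones
    webhook_url = os.environ.get("DISCORD_WEBHOOK_URL")
    user_id = os.environ.get("DISCORD_USER_ID")
    if webhook_url:
        payload = build_payload("test listing", "https://example.com/listing/1", user_id)
        notify(webhook_url, payload)
```

running it with both variables set should drop a test message in your channel, which is a quick way to check the webhook before the first real scrape.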

config

edit core/config.py or .env or use command line args:

# basic usage
python run_scrapers.py --city amsterdam --max-price 1200

# specific scrapers only
python run_scrapers.py --scrapers kamernet pararius

# see options
python run_scrapers.py --list
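
the flags above map naturally onto argparse. a rough sketch of how they could be parsed, assuming nothing about how run_scrapers.py really does it (the `build_parser` name, the defaults and the help texts are made up here):

```python
import argparse


def build_parser():
    """Parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="run the housing scrapers")
    parser.add_argument("--city", default=None, help="city to search, e.g. amsterdam")
    parser.add_argument("--max-price", type=int, default=None, help="max monthly rent")
    # nargs="+" allows several names: --scrapers kamernet pararius
    parser.add_argument("--scrapers", nargs="+", default=None, help="only run these scrapers")
    parser.add_argument("--list", action="store_true", help="show options and exit")
    return parser


# parse an explicit argument list instead of sys.argv so this runs anywhere
args = build_parser().parse_args(["--city", "amsterdam", "--max-price", "1200"])
print(args.city, args.max_price, args.scrapers, args.list)
```

note that argparse turns `--max-price` into the attribute `max_price`.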

supported sites (so far)

  • kamernet
  • pararius
  • huurwoningen

adding new scrapers (see the other scraper implementations in the scrapers/ dir)

  1. create scrapers/newsite.py:
from core.scraper_utils import scrape_generic

def extract_listings(soup):
    return soup.find_all("div", class_="listing")

def parse_listing(listing):
    return {
        "title": listing.find("h2").text.strip(),
        # select_one takes a CSS selector; find(".price") would look for a tag named ".price"
        "price": listing.select_one(".price").text.strip(),
        "url": listing.find("a")["href"]
    }

def scrape(city=None, max_price=None):
    return scrape_generic("newsite", extract_listings, parse_listing, city, max_price)
  2. add to core/config.py:
SCRAPERS = {
    "newsite": True,
}

SCRAPER_URLS = {
    "newsite": {
        "base": "https://newsite.com",
        "search": "/search/{city}?max_price={max_price}"
    }
}

that's it - the dynamic import system handles the rest
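
if you're curious what "dynamic import" means here, a minimal sketch of the general idea using importlib. this is an assumption about the approach, not a copy of the repo's loader: `load_enabled_scrapers` is a hypothetical name, and the stub module only exists so the example runs outside the repo.

```python
import importlib
import sys
import types


def load_enabled_scrapers(scrapers, package="scrapers"):
    """Import package.<name> for every enabled entry and collect its scrape()."""
    loaded = {}
    for name, enabled in scrapers.items():
        if not enabled:
            continue
        module = importlib.import_module(f"{package}.{name}")
        loaded[name] = module.scrape
    return loaded


# stand-in for scrapers/newsite.py so the sketch runs without the real package
stub = types.ModuleType("scrapers.newsite")
stub.scrape = lambda city=None, max_price=None: [{"title": "demo", "city": city}]
sys.modules["scrapers.newsite"] = stub

SCRAPERS = {"newsite": True, "disabled_site": False}
for name, scrape in load_enabled_scrapers(SCRAPERS).items():
    print(name, scrape(city="amsterdam", max_price=1200))
```

the point is that the name in `SCRAPERS` doubles as the module name, so adding a file plus a config entry is enough and nothing else needs an explicit import.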

About

tired of nl housing crisis
