MaxMayer1991/README.md

👋 Hi, I'm Maksym Plakushko! 🚀


I'm a passionate Data Engineer & Web Scraping Specialist based in Cracow, Poland (originally from Ukraine). I focus on automating data extraction and building scalable tools that turn complex web data into actionable insights. I'm currently diving deep into the Python ecosystem, Docker orchestration, and database automation. Let's build something awesome together!

🛠️ Skills & Tech Stack

  • Languages: Python (expert in scraping, data analysis), SQL
  • Tools/Frameworks: Scrapy, Selenium, Docker, PostgreSQL, APScheduler, Pandas
  • Specialties: Web Scraping with anti-detection (proxies, user-agents, throttling), Data Pipelines, Automation Scripts, Containerization
  • Other: Git, CI/CD basics, Cloud (currently learning AWS/GCP)


📂 Featured Projects

Here are my standout projects, each in its own repo for easy forking and collaboration:

  1. High-Speed Async Web Extractor
    A professional-grade scraping solution designed for speed and efficiency. This project demonstrates the power of asynchronous processing to handle modern, JavaScript-heavy web applications.

    Key Features:

    • Async Engine: Powered by Scrapy-Playwright for non-blocking I/O operations.
    • Modern JS Handling: Seamlessly renders dynamic content that traditional scrapers miss.
    • Robust Infrastructure: Integrated Docker support for easy deployment and Proxy rotation to ensure high availability.
    • Optimized Performance: Fine-tuned concurrency settings for maximum data throughput.
    • ⭐ Stars: 0 (let's boost it!)
    • 🛡️ License: MIT
    • 🔑 Tech: Scrapy, Playwright, Docker, APScheduler
  2. Robust Data Orchestration Pipeline

    A comprehensive data extraction and storage system built for reliability and long-term monitoring of automotive market trends.

    Architectural Highlights:

    • Hybrid Approach: Combines Scrapy's efficiency with Selenium's ability to mimic real human behavior.
    • Advanced Resource Management: Custom Driver Pool and Proxy Rotation middleware to bypass sophisticated anti-bot systems.
    • Data Persistence: Fully automated PostgreSQL integration with duplicate prevention logic.
    • Automation & DevOps:
      • Scheduled execution and daily database dumps via APScheduler.
      • Containerized environment using Docker Compose for seamless "one-command" setup.
    • Data Schema: Tracks comprehensive car attributes, including VIN, odometer readings, and contact information.
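Both projects above mention proxy rotation, so here is a rough illustration of the idea as a Scrapy-style downloader middleware. This is a minimal sketch, not the code from either repo: the class name `RotatingProxyMiddleware`, the `PROXY_LIST` setting, and the example proxy URLs are all hypothetical, and a simple round-robin strategy is assumed.

```python
import itertools
from types import SimpleNamespace


class RotatingProxyMiddleware:
    """Hypothetical downloader middleware that assigns proxies round-robin."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy URL is required")
        # cycle() repeats the list endlessly, so every request gets a proxy.
        self._proxies = itertools.cycle(proxies)

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_LIST is an assumed custom setting, not a built-in Scrapy one.
        return cls(crawler.settings.getlist("PROXY_LIST"))

    def process_request(self, request, spider=None):
        # Scrapy's HttpProxyMiddleware reads the proxy from request.meta["proxy"].
        request.meta["proxy"] = next(self._proxies)
        return None  # returning None lets Scrapy keep processing the request


# Stand-in request objects, so the sketch runs without Scrapy installed.
middleware = RotatingProxyMiddleware(
    ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]
)
requests = [SimpleNamespace(meta={}) for _ in range(3)]
for request in requests:
    middleware.process_request(request)
assigned = [request.meta["proxy"] for request in requests]
print(assigned)
```

In a real Scrapy project, a class like this would be enabled through the `DOWNLOADER_MIDDLEWARES` setting with a priority below 750, so that it runs before the built-in `HttpProxyMiddleware`. A production version would likely also drop proxies that fail repeatedly, which this sketch leaves out.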

📈 Contributions & Activity

  • Active contributor to personal projects, focusing on real-world data tools.
  • Recent: Built a robust scraper with production-ready features.
  • Goal: Contribute to open-source scraping libs like Scrapy.


🤝 Let's Connect!

If you're into data scraping, automation, or need a custom tool, hit me up! Open to feedback on my repos or joint projects.

Thanks for visiting! 🚀

Pinned

  1. high-speed-web-extractor (Public)

    High-performance asynchronous web scraper using Scrapy and Playwright to extract dynamic content from JavaScript-heavy automotive platforms

    Python

  2. robust-scraping-orchestrator (Public)

    A scraper for a major automotive marketplace

    Python