I'm a passionate Data Engineer & Web Scraper Specialist based in Cracow, Poland (originally from Ukraine). With a focus on automating data extraction and building scalable tools, I turn complex web data into actionable insights. Currently diving deep into Python ecosystems, Docker orchestration, and database automation. Let's build something awesome together!
- 📍 Location: Kraków, Poland
- 📫 Email: mplakushko@gmail.com
- 🔗 LinkedIn/Twitter/Portfolio: https://www.linkedin.com/in/maksymplakushko/
- 💼 Open for: Freelance scraping projects, data pipelines, or collaborations on open-source tools.
- Languages: Python (expert in scraping, data analysis), SQL
- Tools/Frameworks: Scrapy, Selenium, Docker, PostgreSQL, APScheduler, Pandas
- Specialties: Web Scraping with anti-detection (proxies, user-agents, throttling), Data Pipelines, Automation Scripts, Containerization
- Other: Git, CI/CD basics, Cloud (AWS/GCP in learning)
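The anti-detection specialties above (proxies, user-agent rotation, throttling) can be sketched in a few lines. This is a minimal illustration, not code from my repos: the `USER_AGENTS` list, the `polite_get` helper, and its parameters are hypothetical placeholders.

```python
import random
import time

import requests

# Hypothetical, abbreviated user-agent pool; a real scraper would use a
# larger list or a library that generates realistic agents.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

def polite_get(url, min_delay=1.0, max_delay=3.0, proxies=None):
    """Fetch a URL with a random user agent and a randomized delay.

    The random sleep is a simple form of throttling; `proxies` accepts the
    standard requests proxy mapping, e.g. {"https": "http://host:port"}.
    """
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    time.sleep(random.uniform(min_delay, max_delay))
    return requests.get(url, headers=headers, proxies=proxies, timeout=10)
```

In a Scrapy project the same ideas map onto built-in settings (`DOWNLOAD_DELAY`, `AUTOTHROTTLE_ENABLED`) and downloader middlewares rather than a hand-rolled helper.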
Here are my standout projects, each in its own repo for easy forking and collaboration:
**High-Speed Async Web Extractor**

A professional-grade scraping solution designed for speed and efficiency. This project demonstrates the power of asynchronous processing to handle modern, JavaScript-heavy web applications.
- Async Engine: Powered by Scrapy-Playwright for non-blocking I/O operations.
- Modern JS Handling: Seamlessly renders dynamic content that traditional scrapers miss.
- Robust Infrastructure: Integrated Docker support for easy deployment and Proxy rotation to ensure high availability.
- Optimized Performance: Fine-tuned concurrency settings for maximum data throughput.
- ⭐ Stars: 0 (let's boost it!)
- 🛡️ License: MIT
- 🛠️ Tech: Scrapy, Playwright, Docker, APScheduler
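The non-blocking, concurrency-tuned design described above can be illustrated with stdlib `asyncio` alone. This is a hedged sketch, not the project's actual code: `fetch` stands in for a real request (which the repo makes via Scrapy-Playwright), and the semaphore plays the role of the tuned concurrency setting.

```python
import asyncio

async def fetch(url, sem):
    # Placeholder for a real non-blocking request (e.g. aiohttp or
    # scrapy-playwright); the sleep simulates network latency.
    async with sem:
        await asyncio.sleep(0.01)
        return f"fetched {url}"

async def crawl(urls, concurrency=8):
    # A semaphore caps the number of in-flight requests; this is the
    # "concurrency knob" that gets fine-tuned for throughput.
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

results = asyncio.run(crawl([f"https://example.com/{i}" for i in range(20)]))
# gather preserves input order, so results[0] corresponds to .../0
```

Because the tasks await I/O rather than blocking a thread, twenty "requests" complete in roughly the time of twenty divided by the concurrency limit, not twenty sequential round trips.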
**Robust Data Orchestration Pipeline**
A comprehensive data extraction and storage system built for reliability and long-term monitoring of automotive market trends.
- Hybrid Approach: Combines Scrapy's efficiency with Selenium's ability to mimic real human behavior.
- Advanced Resource Management: Custom Driver Pool and Proxy Rotation middleware to bypass sophisticated anti-bot systems.
- Data Persistence: Fully automated PostgreSQL integration with duplicate prevention logic.
- Automation & DevOps:
  - Scheduled execution and daily database dumps via APScheduler.
  - Containerized environment using Docker Compose for a seamless "one-command" setup.
- Data Schema: Tracks comprehensive car attributes, including VIN, odometer readings, and contact information.
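The duplicate-prevention logic mentioned above boils down to an upsert keyed on a unique column such as the VIN. A minimal sketch follows; the pipeline itself targets PostgreSQL, which supports the same `ON CONFLICT` clause, but SQLite is used here so the example runs anywhere. Table and column names are illustrative, not the project's actual schema.

```python
import sqlite3

# In-memory database standing in for the pipeline's PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cars (
        vin TEXT PRIMARY KEY,
        odometer INTEGER,
        contact TEXT
    )
""")

def upsert_car(conn, vin, odometer, contact):
    # ON CONFLICT updates the existing row instead of raising a
    # uniqueness error, so re-scraping a listing never duplicates it.
    conn.execute(
        """INSERT INTO cars (vin, odometer, contact)
           VALUES (?, ?, ?)
           ON CONFLICT(vin) DO UPDATE SET
               odometer = excluded.odometer,
               contact = excluded.contact""",
        (vin, odometer, contact),
    )

upsert_car(conn, "1HGCM82633A004352", 120000, "555-0100")
upsert_car(conn, "1HGCM82633A004352", 121500, "555-0100")  # same VIN
rows = conn.execute("SELECT vin, odometer FROM cars").fetchall()
# rows == [("1HGCM82633A004352", 121500)] — one row, latest reading kept
```

Keying the upsert on VIN means each vehicle is tracked as a single evolving record, which is what makes long-term monitoring of odometer and price history possible.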
- Active contributor to personal projects, focusing on real-world data tools.
- Recent: Built a robust scraper with production-ready features.
- Goal: Contribute to open-source scraping libraries like Scrapy.
If you're into data scraping, automation, or need a custom tool, hit me up! Open to feedback on my repos or joint projects.
- Follow me on GitHub
- Email me
- Star a repo if it helps you! ⭐

Thanks for visiting! 👋
