This is my Python repo where I’ve put all my Python-related work.
Selenium Scraper:
I used Selenium to scrape restaurants and cafés from Google Maps, including reviews, addresses, contacts, and pictures. For my main project, I also tried to create an algorithm that ranks cafés/restaurants based on reviews in different categories like ambience, food, and service. That didn't work out because of fake Google reviews. The reviews and the other scraped data counted as weights in the ranking and were pushed with Python. (At that time, the goal of the project was to create more authentic reviews and ratings.) {cafe_putting -> lost the code}
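The original ranking code is lost, so the following is only a minimal sketch of how a category-weighted ranking could look. The weights, the review structure, and the names `WEIGHTS`, `score_place`, and `rank_places` are assumptions for illustration, not the original implementation.

```python
from statistics import mean

# Hypothetical category weights; the original weighting scheme is not documented.
WEIGHTS = {"ambience": 0.3, "food": 0.5, "service": 0.2}


def score_place(reviews, weights=WEIGHTS):
    """Weighted average of the mean rating per category.

    `reviews` is a list of dicts mapping category -> rating (assumed 1-5).
    Categories with no ratings are skipped and the remaining weights
    are renormalized.
    """
    total = 0.0
    used_weight = 0.0
    for category, weight in weights.items():
        ratings = [r[category] for r in reviews if category in r]
        if ratings:
            total += weight * mean(ratings)
            used_weight += weight
    return total / used_weight if used_weight else 0.0


def rank_places(places, weights=WEIGHTS):
    """Return place names sorted from highest to lowest weighted score."""
    return sorted(
        places,
        key=lambda name: score_place(places[name], weights),
        reverse=True,
    )


# Made-up sample data, only to show the call shape.
places = {
    "Cafe A": [{"ambience": 4, "food": 5, "service": 3}, {"food": 4}],
    "Cafe B": [{"ambience": 5, "food": 3, "service": 4}],
}
ranking = rank_places(places)
```

A scheme like this trusts every rating equally, which is exactly why fake reviews break it: a handful of inflated ratings shifts the category means directly.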
Tesseract OCR:
I tested different Page Segmentation Modes (PSM) of Tesseract OCR to scrape restaurant menus and evaluated each mode for OCR accuracy. I often had to zoom into specific parts of an image (like a single dish) to improve results. I also considered switching to deep learning-based OCR solutions like MMOCR, but couldn't use them due to the low processing power of my machine. (This is where I was introduced to over-engineering.)
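A rough sketch of such a PSM comparison, assuming the `pytesseract` wrapper and Pillow are installed along with the Tesseract binary. The chosen modes, the similarity metric, and the names `similarity` and `compare_psm_modes` are assumptions for illustration; the README does not say which modes or metric were actually used.

```python
from difflib import SequenceMatcher

# A few Tesseract page segmentation modes to try (assumed selection):
# 3 = automatic, 4 = single column, 6 = single uniform block, 11 = sparse text.
PSM_MODES = [3, 4, 6, 11]


def similarity(expected, actual):
    """Character-level similarity in [0, 1], ignoring case and extra whitespace."""
    a = " ".join(expected.lower().split())
    b = " ".join(actual.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def compare_psm_modes(image_path, expected_text, crop_box=None, modes=PSM_MODES):
    """Run OCR once per PSM mode and score each result against expected_text.

    crop_box is an optional (left, upper, right, lower) pixel box used to
    zoom into one part of the menu, e.g. a single dish.
    """
    # Imported here so the similarity helper works without OCR dependencies.
    from PIL import Image
    import pytesseract

    image = Image.open(image_path)
    if crop_box is not None:
        image = image.crop(crop_box)

    results = {}
    for psm in modes:
        text = pytesseract.image_to_string(image, config=f"--psm {psm}")
        results[psm] = similarity(expected_text, text)
    return results
```

With a hypothetical file `menu.jpg` and a hand-typed reference string, `compare_psm_modes("menu.jpg", "Paneer Tikka 250", crop_box=(0, 100, 800, 300))` would return a dict mapping each PSM mode to its similarity score, so the best mode for that crop is simply the key with the highest value.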