A chatbot app in which the user asks an AI about tech events taking place in Poland. Created using Python.
This project was assigned to us by the datarabbit.ai company as a recruitment task; we had to complete it in order to do an internship there.
The task was also an opportunity for us to learn new technologies that we were unfamiliar with at the beginning.
- Python version: 3.13, and its libraries:
  - streamlit version: 1.41.1
  - langchain version: 0.3
  - beautifulsoup4 version: 4.12
  - and many other less important modules listed here
- Docker version: 27.4
- Chroma docker image tag: 0.6.4.dev119
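Since beautifulsoup is the scraping workhorse in this stack, here is a minimal sketch of the kind of HTML parsing involved. The function name, sample markup, and `<h3>` selector are hypothetical illustrations, not code taken from this repo:

```python
from bs4 import BeautifulSoup


def extract_event_titles(html: str) -> list[str]:
    """Parse HTML and return the text of every <h3> tag, whitespace-stripped."""
    soup = BeautifulSoup(html, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.find_all("h3")]


# Hypothetical sample markup standing in for a real events page
sample = "<html><body><h3> PyCon PL </h3><h3>Infoshare</h3></body></html>"
print(extract_event_titles(sample))  # ['PyCon PL', 'Infoshare']
```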
To run this app, Docker is required.
If you don't have Docker installed on your computer yet, you can install it here.
Once you have Docker installed, follow these steps:
1. Clone the repo to your local machine. You can do this by running the following command in a terminal:

   ```shell
   git clone https://github.com/Rumeleq/ragapp.git
   ```
2. Prepare the `.env` file. It should be placed in the project's root folder and contain variables like this:

   ```
   OPENAI_API_KEY=your_api_key
   CHROMADB_HOST=chromadb
   CHROMADB_PORT=8000
   CHROMADB_DIR=./chroma
   SCRAPING_OUTPUT_DIR=./data
   SCRAPING_URLS=https://www.eventbrite.com/d/poland/other--events/?page=1, https://www.eventbrite.com/d/poland/all-events/?subcategories=4004&page=1, https://www.eventbrite.com/d/poland/science-and-tech--events/?page=1, https://crossweb.pl/wydarzenia/, https://unikonferencje.pl/konferencje/technologie_informacyjne, https://unikonferencje.pl/konferencje/elektrotechnika, https://unikonferencje.pl/konferencje/automatyka_robotyka, https://unikonferencje.pl/konferencje/informatyka_teoretyczna
   ```
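A note on this configuration: `SCRAPING_URLS` holds several URLs separated by commas, so whatever code reads it has to split the value apart. A minimal sketch of such a helper (hypothetical; the project's actual parsing may differ):

```python
import os


def load_scraping_urls() -> list[str]:
    """Split the comma-separated SCRAPING_URLS variable into a clean list of URLs."""
    raw = os.getenv("SCRAPING_URLS", "")
    return [url.strip() for url in raw.split(",") if url.strip()]


# Simulate the variable being set, as docker compose would do from .env
os.environ["SCRAPING_URLS"] = (
    "https://crossweb.pl/wydarzenia/, "
    "https://unikonferencje.pl/konferencje/informatyka_teoretyczna"
)
print(load_scraping_urls())
```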
3. Make sure you are in the project's root folder and run the command:

   ```shell
   docker compose up
   ```

   There are two versions of this command: `docker-compose up` and `docker compose up`. On Windows you can run either and it will work fine; on Linux, however, it is recommended to pick the second version (without the dash), which forces Docker to use Docker Compose V2, the more stable and reliable implementation.

   By running the above command, Docker should:

   - install the ChromaDB image (unless you already have it)
   - run the etl container after Chroma's healthcheck passes
   - in the etl container, run the `scraper.py` script, which scrapes the data from the websites
   - after `scraper.py` finishes successfully, run the frontend container and expose port 8501

   The whole process can take a few minutes, especially when running for the first time.
4. When you see in the Docker logs that the frontend container is starting, you can visit the web app in your browser.
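The startup order described above (Chroma healthcheck, then etl, then frontend) can be expressed with `depends_on` conditions in `docker-compose.yml`. The fragment below is only a sketch: the service names, build contexts, and healthcheck command are assumptions based on this description, not the repository's actual file.

```yaml
services:
  chromadb:
    image: chromadb/chroma:0.6.4.dev119
    ports:
      - "8000:8000"
    healthcheck:
      # Hypothetical check; the real healthcheck command may differ
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/heartbeat"]
      interval: 10s
      retries: 5

  etl:
    build: ./etl            # hypothetical build context
    env_file: .env
    depends_on:
      chromadb:
        condition: service_healthy

  frontend:
    build: ./frontend       # hypothetical build context
    env_file: .env
    ports:
      - "8501:8501"
    depends_on:
      etl:
        condition: service_completed_successfully
```

The `service_completed_successfully` condition is what makes the frontend wait until the scraper has finished, rather than merely until its container has started.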
A correctly set up and working app looks like this:

Project status: done
Special thanks to the datarabbit team for giving us this interesting challenge!
Many thanks to:
- Programator for providing a strong understanding of Docker
- Docker documentation for vital details about creating the `docker-compose.yml` file
- Streamlit documentation for everything about the streamlit library
- LangChain documentation for information about the langchain library
- LangChain_Chroma documentation for information about the langchain_chroma library
- LangChain_OpenAI documentation for information about the langchain_openai library
- LangChain_Core documentation for information about the langchain_core library
- ChromaDB Cookbook for configuration of ChromaDB with Docker
- Bulldogjob for providing an article on how to write README properly
- pixegami for providing a comprehensive guide on RAG and the logic behind it
People and their roles:
Rumeleq - repository owner, responsible for the ETL scraper
wiktorKycia - repository maintainer, responsible for the frontend (displaying data on the website) and for dockerization
JanTopolewski - responsible for data flow, connecting to the Chroma database and the AI, and prompt templates