Local RAG with Ollama

This project lets you run Retrieval-Augmented Generation (RAG) with a model on your local machine using Ollama. You can also run it from Google Colab, using Ngrok to publish the app at a public URL; details are below. Running from Colab is an easy and free way to test out a RAG setup, but Colab sessions have a limited runtime. For a more permanent solution, deploy to a server (not covered here, since the process depends on the service you use).

Start with the instructions for this project here: video, written. I've added detailed additional instructions below to clarify some of the steps.

Ollama reference: https://github.com/ollama/ollama?tab=readme-ov-file

Installing Ollama on macOS

  • Download Ollama from https://ollama.com/download
  • Unzip the downloaded package and move it to Applications
  • Launch Ollama from Applications - it will prompt you to install the CLI

Downloading a model to work with

  • At a terminal, run ollama pull <model>, replacing <model> with the name of the model you want to use (a quick Python check of what has been downloaded is sketched below).
    • ollama run <model> runs the model for interactive querying in the terminal.
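
To confirm what has been downloaded, you can ask the local Ollama server to list its models via its /api/tags endpoint. A minimal Python sketch, assuming the requests library is installed and the Ollama server is running (see the tips below):

    import requests

    # Ask the local Ollama server which models it has downloaded.
    tags = requests.get("http://localhost:11434/api/tags").json()
    for model in tags["models"]:
        print(model["name"])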

Requirements for this project

Create your virtual environment and add an app.py file

Enter the code as shown in the repository
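
The app.py in the repository is the authoritative version. As a rough orientation only, here is a minimal sketch of the pieces such an app ties together. It assumes the Ollama server is running on the default port, that a chat model (here "mistral") and the "nomic-embed-text" embedding model have been pulled, and it uses the requests library (an extra dependency not in the pip list below); SOURCE_URL is an illustrative placeholder.

    import requests            # extra dependency, used here to call the Ollama HTTP API
    import streamlit as st
    import tiktoken
    import chromadb
    from bs4 import BeautifulSoup

    OLLAMA_URL = "http://localhost:11434"
    CHAT_MODEL = "mistral"          # any chat model you have pulled with ollama pull
    EMBED_MODEL = "nomic-embed-text"
    SOURCE_URL = "https://en.wikipedia.org/wiki/Retrieval-augmented_generation"  # placeholder

    def embed(text):
        # Get an embedding for the text from Ollama's embeddings endpoint.
        r = requests.post(f"{OLLAMA_URL}/api/embeddings",
                          json={"model": EMBED_MODEL, "prompt": text})
        r.raise_for_status()
        return r.json()["embedding"]

    def chunk(text, max_tokens=300):
        # Split the page text into roughly equal token-sized chunks with tiktoken.
        enc = tiktoken.get_encoding("cl100k_base")
        tokens = enc.encode(text)
        return [enc.decode(tokens[i:i + max_tokens])
                for i in range(0, len(tokens), max_tokens)]

    @st.cache_resource
    def build_index():
        # Scrape the source page, chunk it, embed each chunk, and store it in Chroma.
        html = requests.get(SOURCE_URL).text
        text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
        collection = chromadb.Client().create_collection("docs")
        for i, piece in enumerate(chunk(text)):
            collection.add(ids=[str(i)], documents=[piece], embeddings=[embed(piece)])
        return collection

    st.title("Local RAG with Ollama")
    question = st.text_input("Ask a question about the source document")
    if question:
        results = build_index().query(query_embeddings=[embed(question)], n_results=3)
        context = "\n\n".join(results["documents"][0])
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        reply = requests.post(f"{OLLAMA_URL}/api/chat",
                              json={"model": CHAT_MODEL, "stream": False,
                                    "messages": [{"role": "user", "content": prompt}]})
        st.write(reply.json()["message"]["content"])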

Run the following in the terminal

ollama pull nomic-embed-text

pip install BeautifulSoup4

pip install tiktoken

pip install chromadb
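
The app is run with Streamlit (see "Run your Streamlit app" below), so also install it if it isn't already in your environment:

pip install streamlit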

To create a requirements file

pip freeze > requirements.txt

Start the Ollama server and load your model

ollama run <model>

Run your Streamlit app

streamlit run <your app>

Ollama tips

To check whether Ollama is running, open localhost:11434 in a browser. It should respond with "Ollama is running".

Query Ollama via curl in the terminal:

curl http://localhost:11434/api/chat -d '{ "model": "mistral", "messages": [ { "role": "user", "content": "How many countries have nuclear power plants?" } ], "stream": false }'

Leave out the "stream" option or set to true in order to stream output.

Or you can enter ollama run <model> and enter a question at the prompt.

Enter /bye at the prompt or press CTRL-C to exit the interactive Ollama session.

Running on Colab

Upload Ollama.ipynb to Google Colab. The app.py file can be uploaded to the Colab files section under /content. To use the Ngrok service to create a public URL, sign up for an account at ngrok.com and set up an authtoken. Add it as a notebook secret named NGROKTOKEN with the token as its value, and be sure to authorize its use in the notebook. Run each cell in order; the next-to-last cell should output the ngrok endpoint you can use to access your app.
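
The Ollama.ipynb notebook is the authoritative version of these steps. As a rough sketch of what the ngrok cell typically looks like, assuming the pyngrok package was installed in an earlier cell and that the Streamlit app listens on port 8501 (Streamlit's default):

    from google.colab import userdata   # reads Colab notebook secrets
    from pyngrok import ngrok           # assumes pyngrok was installed in an earlier cell

    # Authorize ngrok with the NGROKTOKEN notebook secret, then expose the
    # local Streamlit port on a public URL.
    ngrok.set_auth_token(userdata.get("NGROKTOKEN"))
    public_url = ngrok.connect(8501)
    print(public_url)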
