Chatbot API

The Chatbot API is a standalone service that exposes a REST endpoint for natural-language chat. It runs a quantized Mistral-7B model locally via llama.cpp, which greatly reduces inference cost and lets this large model run efficiently on a CPU, and it is served through FastAPI.

This API is stateless by default: replies are generated from the chat history you send in each request. You can therefore use it as a standalone service by always including the full message history, or integrate it with a custom backend that provides multi-turn memory and persistence. It currently powers the chatbot feature at danlau.live, but it can also be deployed and used in other projects.

Features

  • FastAPI REST API
  • Mistral-7B (quantized .gguf) running locally with llama.cpp
  • Includes a Dockerfile for containerization with Docker + docker-compose
  • Works standalone or behind another backend service
  • Deployable behind NGINX/HTTPS (reverse proxy)

🔧 Setup

1. Clone the repo

git clone https://github.com/your-username/chatbot_API.git
cd chatbot_API

2. Install dependencies (local dev)

python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Download and set up the model

This service requires a quantized Mistral-7B .gguf model, which is not included in this repo.

1. Create a models/ folder:

mkdir -p chatbot_API/models

2. Download a quantized model at this link

This API was designed and tested with mistral-7b-instruct.Q4_K_M.gguf, but other quantizations should work as well. If you choose a different quantized Mistral model, update the model filename wherever it appears in these setup instructions.

3. Place the model file in chatbot_API/models/:

chatbot_API/models/mistral-7b-instruct.Q4_K_M.gguf

4. Configure MODEL_PATH:

Running locally → use a host filesystem path (absolute path recommended) in the .env file

MODEL_PATH=/absolute/host/path/to/chatbot_API/models/mistral-7b-instruct.Q4_K_M.gguf
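
For reference, the service loads the model from MODEL_PATH at startup using llama.cpp's Python bindings. A minimal sketch of that loading pattern is shown below; the variable names and context size are illustrative, and the actual loading code in this repo may differ:

# Sketch: load the quantized model from MODEL_PATH (illustrative, not the repo's exact code)
import os
from llama_cpp import Llama

model_path = os.environ["MODEL_PATH"]           # set via .env locally or docker-compose in containers
llm = Llama(model_path=model_path, n_ctx=2048)  # n_ctx is an example context size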

Running

Option A - Local development

Use the run.py helper script (auto-reload enabled):

python run.py
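
A helper like run.py typically just starts the FastAPI app with uvicorn's auto-reload. A sketch of that kind of script is below; the module path app.main:app and the port are assumptions and may not match this repo exactly:

# Sketch of a run.py-style helper (module path and port are assumptions)
import uvicorn

if __name__ == "__main__":
    uvicorn.run("app.main:app", host="0.0.0.0", port=8000, reload=True)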

Option B — Production Use

In production, you can run the API as a containerized service using Docker or integrate it into a larger deployment stack (docker-compose, Kubernetes, etc.).

Example with docker-compose:

  # This is just an example of the setup in the docker-compose.yml file. Your setup may differ.
  services:
    chatbot-api:
      build: ./chatbot_API
      volumes:
        - ./chatbot_API/models:/models
      environment:
        - MODEL_PATH=/models/mistral-7b-instruct-v0.1.Q4_K_M.gguf
      restart: always

Note: Since production environments vary widely, this project does not include a full deployment configuration. You are encouraged to adapt the container build and runtime environment to your own needs (volume mounts, environment variables, reverse proxy settings, etc.).


API Endpoints

POST /chat

Sends a conversation history and returns the assistant's next response.

Request Body

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" }
  ]
}

Response

{
  "response": "Hi there! How can I help you today?"
}
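
For example, you can call the endpoint from Python like this (the host and port are assumptions for a default local run; adjust them to match your deployment):

# Example client call to POST /chat (localhost:8000 is an assumed local address)
import requests

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
}
resp = requests.post("http://localhost:8000/chat", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])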

Tweaking the Project for Your Own Use

This project is designed to be modular and easy to adapt. You're encouraged to:

  • Modify the system prompt or response-formatting logic in chat_service.py to better fit your use case (see the sketch after this list)
  • Integrate the API into a broader stack (e.g., add a database for chat history, connect to a frontend, or containerize it within your own ecosystem)
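
As an illustration of the first kind of tweak, here is a sketch of how a default system prompt could be injected before messages are passed to the model. The function and constant names are made up for the example and do not reflect the actual contents of chat_service.py:

# Illustrative only: inject a default system prompt when the client doesn't send one
DEFAULT_SYSTEM_PROMPT = "You are the assistant for my personal site. Keep answers concise."

def with_default_system_prompt(messages: list[dict]) -> list[dict]:
    # Prepend a system message unless the conversation already starts with one
    if not messages or messages[0].get("role") != "system":
        return [{"role": "system", "content": DEFAULT_SYSTEM_PROMPT}, *messages]
    return messages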

Found a Bug or Issue?

If you encounter a bug, unexpected behavior, or have a suggestion:

  • Please open an issue describing the problem
  • Include any relevant error messages, sample inputs, or details about your setup

I would appreciate any feedback you have to give!
