This repository was archived by the owner on Nov 25, 2025. It is now read-only.

End-to-end MLOps pipeline for a Recommendation System. Converts raw MovieLens 1M data (SVD) into a fully-tested (Pytest), containerized (Docker) FastAPI and Streamlit application.


enesgulerml/movie-recommendation-api


🎬 End-to-End Movie Recommendation System


📖 Overview

This repository hosts a production-ready MLOps pipeline for movie recommendations based on the MovieLens 1M dataset. It demonstrates a complete lifecycle from raw data processing to model serving via REST API and an interactive dashboard.

Key Features:

  • Inference Engine: SVD Model served via FastAPI (High Performance).
  • Frontend: Decoupled Streamlit dashboard for user interaction.
  • Reproducibility: Fully Dockerized environment using explicit volume mounts.
  • Quality Assurance: Automated E2E and Unit tests via Pytest.
  • Experiment Tracking: MLflow integration for model metrics.

📂 Project Structure

movie-recommendation-api/
│
├── app/                  # API Service (FastAPI)
│   ├── main.py           # Application Entry Point
│   └── schema.py         # Pydantic Data Contracts
│
├── dashboard/            # Frontend (Streamlit)
│   └── app.py            # UI Logic
│
├── notebooks/            # EDA & Experiments (Jupyter)
│   └── 01-Data-Exploration.ipynb
│
├── src/                  # ML Pipeline (Training & Processing)
│   ├── config.py         # Hyperparameters & Paths
│   ├── train.py          # Training Script (SVD + GridSearchCV)
│   └── data_processing.py # ETL & Data Transformation Logic
│
├── tests/                # Automated Test Suite
├── requirements.txt      # Production Dependencies
└── Dockerfile            # Container Configuration
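The contents of `src/data_processing.py` are not shown in this README. As a rough illustration of the ETL step, the sketch below parses rating lines in the `UserID::MovieID::Rating::Timestamp` layout used by the MovieLens 1M `ratings.dat` file. The `Rating` type and `parse_ratings` function are hypothetical names chosen for this example and may not match the repository's code.

```python
from typing import Iterable, Iterator, NamedTuple


class Rating(NamedTuple):
    """One row of a MovieLens 1M ratings file (hypothetical container type)."""
    user_id: int
    movie_id: int
    rating: float
    timestamp: int


def parse_ratings(lines: Iterable[str]) -> Iterator[Rating]:
    """Yield Rating records from 'UserID::MovieID::Rating::Timestamp' lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, movie_id, rating, timestamp = line.split("::")
        yield Rating(int(user_id), int(movie_id), float(rating), int(timestamp))


if __name__ == "__main__":
    # Two sample lines in the ratings.dat layout.
    sample = ["1::1193::5::978300760", "1::661::3::978302109"]
    for record in parse_ratings(sample):
        print(record)
```

In practice the same function could be fed an open file handle, since it accepts any iterable of strings.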

🛠️ Installation & Setup

Prerequisites

1. Environment Setup

We recommend using a fresh virtual environment to avoid dependency conflicts.

# Clone the repository
git clone https://github.com/enesgulerml/movie-recommendation-api.git
cd movie-recommendation-api

# Create Environment
conda create -n movie-rec-sys python=3.10
conda activate movie-rec-sys

# Install Dependencies
pip install -r requirements.txt
pip install -e .

🚀 How to Run

Since the trained model files are not included in the repository (due to size limits), you must train the model locally first.

Option A: Train the Model (Required First Step)

This pipeline processes the raw data, trains the SVD model, and saves the artifacts to the models/ directory.

# Run the training pipeline
python -m src.train

✅ Success: Check that models/recsys_svd_model.pkl has been created.
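To go one step beyond checking that the file exists, you can try loading it. The sketch below assumes the artifact is a standard Python pickle (suggested by the `.pkl` extension, but not confirmed here) and that it is run from the repository root inside the same environment used for training. `load_model` is a hypothetical helper, not a function from this repository.

```python
import pickle
from pathlib import Path

# Path taken from the success message above; adjust if src/config.py differs.
MODEL_PATH = Path("models/recsys_svd_model.pkl")


def load_model(path: Path = MODEL_PATH):
    """Return the unpickled object, or raise a clear error if it is missing."""
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run `python -m src.train` first.")
    with path.open("rb") as fh:
        # Assumption: the artifact is a plain pickle.
        # Only unpickle files you created yourself.
        return pickle.load(fh)


if __name__ == "__main__":
    model = load_model()
    print(f"Loaded {type(model).__name__} from {MODEL_PATH}")
```

If this raises `FileNotFoundError`, the training step has not completed; an import error during unpickling usually means the training dependencies are not installed in the active environment.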

Option B: Run API with Docker

Once the model is trained, use Docker to serve the API. We mount your local models/ folder so the container can access the model you just created.

  1. Build the Image:
docker build -t recsys-api:latest .
  2. Run the Container:
docker run -d --rm -p 8000:80 \
  -v "$(pwd)/models:/app/models" \
  -v "$(pwd)/data:/app/data" \
  recsys-api:latest

👉 Access API Docs: http://localhost:8000/docs
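The endpoint names are not listed in this README, so rather than guessing a route, the sketch below discovers them from the OpenAPI schema that FastAPI serves at `/openapi.json` by default. It assumes the container from the previous step is running on port 8000 and that the default schema URL has not been changed in `app/main.py`. `list_routes` and `fetch_spec` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # host port from the `docker run` command above

# Keys in an OpenAPI path item that denote HTTP operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_routes(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI document."""
    routes = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key in HTTP_METHODS:  # ignore non-operation keys such as "parameters"
                routes.append(f"{key.upper()} {path}")
    return sorted(routes)


def fetch_spec(base_url: str = BASE_URL) -> dict:
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for route in list_routes(fetch_spec()):
        print(route)
```

The printed routes can then be tried interactively from the `/docs` page.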

Option C: User Dashboard (Frontend)

To launch the interactive frontend, make sure the API is running first, then start Streamlit:

streamlit run dashboard/app.py

👉 Access Dashboard: http://localhost:8501

🧪 Testing

The test suite checks data integrity and API availability.

# Run all tests
pytest

# Run only fast tests (skip integration)
pytest -m "not slow"
