YOLO Dataset Annotation Service

🚀 Production-ready FastAPI service for managing and annotating YOLO format datasets, with a MongoDB backend, chunked upload support, and comprehensive test coverage.

✨ Key Features

  • 🔥 YOLO11 Integration: Latest YOLO models with auto-annotation
  • 📊 MongoDB Backend: Scalable document storage with Beanie ODM
  • 📤 Chunked Upload: Support for datasets up to 100GB
  • 🧪 90%+ Test Coverage: Comprehensive unit and integration tests
  • 🔄 Batch Processing: Efficient handling of large datasets
  • 🌐 RESTful API: Complete CRUD operations with OpenAPI docs
  • 🐳 Docker Support: Containerized MongoDB with web interface

🚀 Quick Start

# 1. Clone and setup environment
git clone <repository-url>
cd ultra-assesment
conda env create -f environment.yml
conda activate ultralytics-annotation

# 2. Configure environment
cp .env.example .env
# Edit .env with your settings

# 3. Start services
docker-compose up -d  # MongoDB + mongo-express

# 4. Run the application
cd backend
python -m app.main

# 5. Access services
# API: http://localhost:8000
# Docs: http://localhost:8000/docs
# MongoDB UI: http://localhost:8081

πŸ› οΈ Installation & Setup

Prerequisites

  • Python 3.12+
  • Conda (Anaconda/Miniconda)
  • Docker & Docker Compose
  • Git

1. Environment Setup

Create and activate conda environment:

# Create environment from file
conda env create -f environment.yml

# Activate environment
conda activate ultralytics-annotation

# Verify installation
python --version  # Should be 3.12+
pip list | grep fastapi  # Verify FastAPI installed

Dependency Management

For Conda (Recommended for Development):

  • Use environment.yml - contains all dependencies with proper channels
  • Includes development tools (pytest, black, mypy, etc.)
  • Better handling of scientific packages (numpy, pytorch)
  • No need for requirements.txt

For Docker/Production:

  • Uses requirements.txt for pip-based installation
  • Optimized for production deployment
  • Smaller image size with only runtime dependencies

For pip-only environments:

# If you must use pip instead of conda
pip install -r requirements.txt

2. Docker Services Setup

Start MongoDB and mongo-express:

# Start services in background
docker-compose up -d

# Check services are running
docker-compose ps

# View logs if needed
docker-compose logs mongodb
docker-compose logs mongo-express

3. Environment Configuration

Copy and configure environment file:

cp .env.example .env

Edit .env file with your settings:

# Database Configuration
DATABASE_URL=mongodb://admin:password@localhost:27017
MONGO_DB=dataset_annotation

# Google Cloud Configuration (optional for basic usage)
GCP_PROJECT_ID=your-project-id
GCP_STORAGE_BUCKET=your-bucket-name
GOOGLE_APPLICATION_CREDENTIALS=./service-account-key.json

# YOLO Configuration
YOLO11_MODEL_PATH=yolo11n.pt
YOLO11_DEFAULT_MODEL=yolo11n.pt
YOLO11_PRODUCTION_MODEL=yolo11m.pt

# API Configuration
API_V1_STR=/api/v1
PROJECT_NAME="YOLO Dataset Annotation Service"
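
These settings are read at application startup. A minimal sketch of that idea using only the standard library (the actual service likely uses a dedicated settings class, e.g. via Pydantic; the field names here simply mirror the .env keys above):

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Defaults mirror the sample .env above; environment variables win.
    database_url: str = field(default_factory=lambda: os.environ.get(
        "DATABASE_URL", "mongodb://admin:password@localhost:27017"))
    mongo_db: str = field(default_factory=lambda: os.environ.get(
        "MONGO_DB", "dataset_annotation"))
    api_v1_str: str = field(default_factory=lambda: os.environ.get(
        "API_V1_STR", "/api/v1"))


settings = Settings()
print(settings.api_v1_str)
```

Using `default_factory` (rather than baking the value in at import time) means the environment is consulted each time a `Settings` instance is created, which keeps tests that monkeypatch `os.environ` honest.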

πŸƒβ€β™‚οΈ Running the Application

Development Mode

# Activate environment
conda activate ultralytics-annotation

# Start MongoDB services
docker-compose up -d

# Run FastAPI development server
cd backend
python -m app.main

# Or with uvicorn directly
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Production Mode

# Run with production settings
cd backend
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Access Points

  • API: http://localhost:8000
  • Interactive docs (Swagger UI): http://localhost:8000/docs
  • MongoDB UI (mongo-express): http://localhost:8081

🧪 Testing

Unit Tests

Run all tests:

# Activate environment
conda activate ultralytics-annotation

# Run all tests with coverage
pytest --cov=backend/app --cov-report=html --cov-report=term

# Run specific test file
pytest tests/test_final_coverage_push.py -v

# Run tests with detailed output
pytest tests/ -v --tb=short

Test Coverage Analysis:

# Generate coverage report
pytest --cov=backend/app --cov-report=html tests/

# View HTML coverage report
open htmlcov/index.html  # macOS
# or
xdg-open htmlcov/index.html  # Linux

Integration Tests

Run integration tests:

# Start services first
docker-compose up -d

# Wait for MongoDB to be ready
sleep 10

# Run integration tests
pytest tests/test_complete_coverage.py -v
pytest tests/test_90_percent_coverage.py -v

Test Suites Overview

Test File                    Purpose               Tests      Coverage
test_final_coverage_push.py  Core functionality    22 tests   Services, API routes
test_coverage_final.py       Utility functions     12 tests   Utils, config
test_complete_coverage.py    Comprehensive         25+ tests  Full application
test_90_percent_coverage.py  Coverage improvement  15+ tests  Zero-coverage modules
test_focused_coverage.py     Module-specific       20+ tests  Individual modules
test_targeted_coverage.py    Targeted coverage     18+ tests  Specific functions
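
The 90%+ coverage target can also be enforced automatically so regressions fail the run. A possible configuration, assuming pytest-cov is installed (the addopts values mirror the commands used above):

```toml
# pyproject.toml (illustrative fragment)
[tool.pytest.ini_options]
addopts = "--cov=backend/app --cov-report=term --cov-fail-under=90"
testpaths = ["tests"]
```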

Run specific test categories:

# Core functionality tests
pytest tests/test_final_coverage_push.py::TestChunkedUploadServiceDetailed -v

# API endpoint tests
pytest tests/test_final_coverage_push.py::TestAPIRoutesComprehensive -v

# Service layer tests
pytest tests/test_final_coverage_push.py::TestServiceMethodsCoverage -v

# Configuration tests
pytest tests/test_coverage_final.py::TestConfigurationCoverageFinal -v

🌐 API Usage

Core Endpoints

Dataset Management:

# List all datasets
curl -X GET "http://localhost:8000/api/v1/datasets/"

# Create new dataset
curl -X POST "http://localhost:8000/api/v1/datasets/" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Dataset", "description": "Test dataset"}'

# Get dataset by ID
curl -X GET "http://localhost:8000/api/v1/datasets/{dataset_id}"

Dataset Import:

# Import YOLO dataset (small files)
curl -X POST "http://localhost:8000/api/v1/datasets/import/yolo" \
  -F "file=@dataset.zip" \
  -F "dataset_name=My Dataset"

# Chunked upload (large files)
curl -X POST "http://localhost:8000/api/v1/datasets/import/chunked" \
  -F "file=@large_dataset.zip" \
  -F "dataset_name=Large Dataset"

Image Management:

# List images in dataset
curl -X GET "http://localhost:8000/api/v1/datasets/{dataset_id}/images"

# Get image details
curl -X GET "http://localhost:8000/api/v1/images/{image_id}"

Interactive API Documentation

Visit http://localhost:8000/docs for interactive Swagger UI documentation with:

  • Complete API reference
  • Request/response examples
  • Try-it-out functionality
  • Schema definitions

πŸ“ Dataset Management

Supported Formats

  • YOLO Format: .txt annotation files with classes.txt
  • Archive Types: .zip, .tar, .tar.gz
  • Image Types: .jpg, .jpeg, .png, .bmp
  • Dataset Structure: Standard YOLO directory layout
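
Each YOLO .txt annotation file holds one object per line in the form `class_id x_center y_center width height`, with all coordinates normalized to [0, 1]. A minimal stand-alone parser sketch (names are illustrative, not the service's actual code):

```python
from dataclasses import dataclass


@dataclass
class YoloBox:
    class_id: int
    x_center: float  # all four coordinates are normalized to [0, 1]
    y_center: float
    width: float
    height: float


def parse_yolo_label(text: str) -> list[YoloBox]:
    """Parse the contents of one YOLO .txt annotation file."""
    boxes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        boxes.append(YoloBox(int(parts[0]), *map(float, parts[1:])))
    return boxes


# Example: one box of class 0 centered in the image
print(parse_yolo_label("0 0.5 0.5 0.25 0.25"))
```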

Upload Methods

1. Script-based Upload:

# Upload single dataset
python scripts/import_dataset.py --path /path/to/dataset --name "Dataset Name"

# Bulk upload from directory
python scripts/upload_all_datasets.py --directory /path/to/datasets

# Chunked upload for large files
python scripts/upload_coco_chunked.py --file large_dataset.zip

2. API Upload:

# Direct API upload
curl -X POST "http://localhost:8000/api/v1/datasets/import/yolo" \
  -F "file=@dataset.zip" \
  -F "dataset_name=My Dataset"

3. Programmatic Upload:

import requests

with open('dataset.zip', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/api/v1/datasets/import/yolo',
        files={'file': f},
        data={'dataset_name': 'My Dataset'}
    )
print(response.json())
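
For the chunked endpoint, the client splits the archive before uploading. The exact chunk protocol isn't documented here, so this sketch shows only the splitting step; the chunk size is an assumption:

```python
from pathlib import Path
from typing import Iterator

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk (an assumption; tune as needed)


def iter_chunks(path: Path, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield a large file as fixed-size chunks for sequential upload."""
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk


# Each chunk would then be POSTed to the chunked-import endpoint,
# typically along with its index and a total chunk count.
```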

πŸ› οΈ Development

Project Structure

ultra-assesment/
├── backend/
│   └── app/
│       ├── api/routes/         # API endpoints
│       ├── core/               # Configuration, database
│       ├── models/             # Data models
│       ├── schemas/            # Pydantic schemas
│       ├── services/           # Business logic
│       └── main.py             # FastAPI application
├── scripts/                    # Utility scripts
├── tests/                      # Test suites
├── docker-compose.yml          # Services configuration
├── environment.yml             # Conda dependencies
└── requirements.txt            # Pip dependencies

Code Quality

Linting and formatting:

# Format code
black backend/app/
isort backend/app/

# Lint code
flake8 backend/app/
mypy backend/app/

Pre-commit hooks:

# Install pre-commit
conda install pre-commit

# Setup hooks
pre-commit install

# Run on all files
pre-commit run --all-files

Adding New Features

  1. Create feature branch:

    git checkout -b feature/new-feature
  2. Add tests first (TDD):

    # Create test file
    touch tests/test_new_feature.py
    
    # Write failing tests
    pytest tests/test_new_feature.py
  3. Implement feature:

    • Add service logic in backend/app/services/
    • Add API routes in backend/app/api/routes/
    • Add schemas in backend/app/schemas/
  4. Verify tests pass:

    pytest tests/test_new_feature.py -v
    pytest --cov=backend/app
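
Step 2's test-first loop can start from a skeleton like this (both names are hypothetical placeholders for your feature):

```python
# tests/test_new_feature.py (illustrative skeleton)

def normalize_name(name: str) -> str:
    # Stand-in for the feature under test: trim and lowercase a dataset name.
    # In the real project this would live in backend/app/services/.
    return name.strip().lower()


def test_normalize_name():
    # Written first (red), then the implementation makes it pass (green).
    assert normalize_name("  My Dataset ") == "my dataset"
```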

🚀 Deployment

Docker Deployment

Build application image:

# Build from backend directory (contains Dockerfile)
cd backend
docker build -t yolo-annotation-service .

# Or build from root directory with context
docker build -f backend/Dockerfile -t yolo-annotation-service .

Run with Docker:

# Run single container (requires external MongoDB)
docker run -p 8000:8000 \
  -e DATABASE_URL=mongodb://host.docker.internal:27017 \
  yolo-annotation-service

# Or use docker-compose for full stack
docker-compose up -d

Multi-stage production build:

# Create production Dockerfile if needed
docker build -f backend/Dockerfile.prod -t yolo-annotation-service:prod .

# Run production container
docker run -d -p 8000:8000 \
  --name yolo-api \
  -e DATABASE_URL=mongodb://prod-mongo:27017 \
  -e ENVIRONMENT=production \
  yolo-annotation-service:prod

Cloud Deployment

Google Cloud Run:

# Build and push to GCR
gcloud builds submit --tag gcr.io/PROJECT_ID/yolo-annotation

# Deploy to Cloud Run
gcloud run deploy yolo-annotation \
  --image gcr.io/PROJECT_ID/yolo-annotation \
  --platform managed \
  --region us-central1

Environment Variables for Production

# Production environment variables
DATABASE_URL=mongodb://user:pass@prod-mongodb:27017
GCP_PROJECT_ID=production-project
GCP_STORAGE_BUCKET=prod-datasets-bucket
ENVIRONMENT=production
LOG_LEVEL=INFO

🔧 Troubleshooting

Common Issues

1. Conda Environment Issues:

# Remove and recreate environment
conda env remove -n ultralytics-annotation
conda env create -f environment.yml
conda activate ultralytics-annotation

2. MongoDB Connection Issues:

# Check MongoDB status
docker-compose ps
docker-compose logs mongodb

# Restart MongoDB
docker-compose restart mongodb

# Reset MongoDB data
docker-compose down -v
docker-compose up -d

3. Port Conflicts:

# Check what's using port 8000
lsof -i :8000

# Kill process if needed
kill -9 <PID>

# Or use different port
uvicorn app.main:app --port 8001

4. Test Failures:

# Clear pytest cache
pytest --cache-clear

# Run tests with verbose output
pytest -v --tb=long

# Run specific failing test
pytest tests/test_file.py::test_function -v

5. Import Errors:

# Verify Python path
echo $PYTHONPATH

# Add current directory to path
export PYTHONPATH="${PYTHONPATH}:$(pwd)"

# Or run from correct directory
cd backend
python -m app.main

Performance Optimization

Database Indexing:

# MongoDB indexes are created automatically
# Check index status in mongo-express at localhost:8081

Memory Usage:

# Monitor memory usage
docker stats

# Limit container memory: set mem_limit (or deploy.resources.limits) in
# docker-compose.yml, or cap a standalone container directly:
docker run -d -m 2g -p 8000:8000 yolo-annotation-service

Logging and Monitoring

Application logs:

# View application logs
tail -f logs/app.log

# Docker logs
docker-compose logs -f api

Health checks:

# API health check
curl http://localhost:8000/health

# Database health check
curl http://localhost:8000/api/v1/health/db
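
Scripts that depend on the API (e.g. integration tests right after docker-compose up) often need to wait for /health to start responding. A small retry-helper sketch; the probe callable is injected, so any HTTP client works:

```python
import time
from typing import Callable


def wait_for_healthy(probe: Callable[[], bool],
                     retries: int = 10,
                     delay: float = 0.5) -> bool:
    """Call `probe` until it returns True or retries are exhausted."""
    for _ in range(retries):
        try:
            if probe():
                return True
        except OSError:
            pass  # e.g. connection refused while the service is still starting
        time.sleep(delay)
    return False


# Example probe against the /health endpoint (stdlib only):
# import urllib.request
# ok = wait_for_healthy(
#     lambda: urllib.request.urlopen("http://localhost:8000/health").status == 200
# )
```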

🤝 Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Add tests for new functionality
  4. Ensure all tests pass (pytest)
  5. Commit changes (git commit -m 'Add amazing feature')
  6. Push to branch (git push origin feature/amazing-feature)
  7. Open Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🎯 Ready to get started? Follow the Quick Start guide above!
