
Deployment

Anup Ghatage edited this page Feb 12, 2026 · 1 revision

Docker (Recommended)

Development with MinIO

docker compose up

This starts:

  • Zeppelin on port 8080
  • MinIO on port 9000 (API) and 9001 (console)
  • A one-shot minio-init container that creates the zeppelin bucket

MinIO credentials: minioadmin / minioadmin
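
The stack above corresponds to a compose file along these lines (a sketch for orientation, not the repository's actual docker-compose.yml; image names, the build context, and the init command are assumptions):

```yaml
services:
  zeppelin:
    build: .
    ports: ["8080:8080"]
    environment:
      STORAGE_BACKEND: s3
      S3_BUCKET: zeppelin
      S3_ENDPOINT: http://minio:9000
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
      S3_ALLOW_HTTP: "true"
    depends_on: [minio]
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports: ["9000:9000", "9001:9001"]
  minio-init:
    image: minio/mc
    depends_on: [minio]
    entrypoint: >
      sh -c "mc alias set local http://minio:9000 minioadmin minioadmin
      && mc mb --ignore-existing local/zeppelin"
```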

Dockerfile Details

The Dockerfile uses a multi-stage build:

  • Build stage: rust:1.84-bookworm — compiles with static SSL linking
  • Runtime stage: debian:bookworm-slim — minimal image with ca-certificates and curl
  • Non-root user: zeppelin
  • Cache directory: /var/cache/zeppelin
  • Health check: curl -f http://localhost:8080/healthz (10s interval, 5 retries)
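
Put together, the stages look roughly like this (an illustrative sketch consistent with the bullets above, not the repository's exact Dockerfile; the binary path is an assumption):

```dockerfile
FROM rust:1.84-bookworm AS build
WORKDIR /src
COPY . .
RUN cargo build --release

FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/* \
    && useradd --system zeppelin \
    && mkdir -p /var/cache/zeppelin && chown zeppelin /var/cache/zeppelin
COPY --from=build /src/target/release/zeppelin /usr/local/bin/zeppelin
USER zeppelin
EXPOSE 8080
HEALTHCHECK --interval=10s --retries=5 \
    CMD curl -f http://localhost:8080/healthz || exit 1
ENTRYPOINT ["zeppelin"]
```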

Production S3

Override the MinIO defaults:

docker run -d \
  -p 8080:8080 \
  -e STORAGE_BACKEND=s3 \
  -e S3_BUCKET=my-production-bucket \
  -e AWS_ACCESS_KEY_ID=AKIA... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  -e RUST_LOG=info \
  -v zeppelin-cache:/var/cache/zeppelin \
  --name zeppelin \
  zeppelin:latest

EC2 / Bare Metal

Using the deploy script

# Set up credentials
cp deploy/.env.example deploy/.env
# Edit deploy/.env with your AWS credentials

# Deploy to EC2
./deploy/ec2-bench.sh launch

The ec2-bench.sh script:

  1. Provisions an EC2 instance (default: c7i.xlarge — 4 vCPU, 8 GB RAM)
  2. Installs Docker via cloud-init
  3. Builds and transfers the Docker image
  4. Starts Zeppelin with S3 backend

Commands:

./deploy/ec2-bench.sh launch     # Provision + deploy
./deploy/ec2-bench.sh status     # Check instance and health
./deploy/ec2-bench.sh ssh        # SSH into instance
./deploy/ec2-bench.sh deploy     # Rebuild + redeploy
./deploy/ec2-bench.sh teardown   # Terminate + cleanup

Manual deployment

# On the server
export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1

./deploy/run-server.sh

The run-server.sh script:

  • Sources .env if present
  • Stops any existing container
  • Runs with --restart unless-stopped
  • Waits up to 60 seconds for health check
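
The health-wait step can be sketched as a small retry loop (illustrative; run-server.sh's actual implementation may differ):

```shell
# Retry a check command until it succeeds or `timeout` seconds elapse.
# In run-server.sh terms the check would look something like:
#   wait_for "curl -fsS http://localhost:8080/healthz" 60
wait_for() {
  check_cmd="$1"; timeout="${2:-60}"; elapsed=0
  until eval "$check_cmd" > /dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}
```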

Multi-Cloud Storage

Zeppelin supports multiple storage backends via the object_store crate. All backends reuse the S3_BUCKET variable to name the target bucket, container, or path:

AWS S3

STORAGE_BACKEND=s3
S3_BUCKET=my-bucket
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...

Google Cloud Storage

STORAGE_BACKEND=gcs
S3_BUCKET=my-gcs-bucket
GCS_SERVICE_ACCOUNT_PATH=/path/to/service-account.json

Azure Blob Storage

STORAGE_BACKEND=azure
S3_BUCKET=my-container
AZURE_ACCOUNT=mystorageaccount
AZURE_ACCESS_KEY=...

MinIO / S3-Compatible

STORAGE_BACKEND=s3
S3_BUCKET=zeppelin
S3_ENDPOINT=http://minio:9000
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
S3_ALLOW_HTTP=true

Local Filesystem

STORAGE_BACKEND=local
S3_BUCKET=/var/data/zeppelin

Performance Tuning

Centroids and nprobe

The nprobe / num_centroids ratio controls the accuracy-speed tradeoff:

Vectors    Centroids  nprobe  Notes
< 10K      16-32      4-8     Small dataset
10K-100K   64-256     8-32    Medium dataset
100K-1M    256-1024   16-64   Large dataset
> 1M       1024+      32-128  Consider hierarchical

Higher nprobe = better recall, slower queries. Start with nprobe = sqrt(num_centroids).
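
The starting-point rule is easy to compute; a tiny helper (hypothetical, for illustration only):

```shell
# nprobe = sqrt(num_centroids), truncated to an integer
suggest_nprobe() { awk -v c="$1" 'BEGIN { printf "%d\n", sqrt(c) }'; }

suggest_nprobe 256    # → 16
suggest_nprobe 1024   # → 32
```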

Quantization

Method  Compression  Accuracy Loss     Use Case
None    1x           None              Small datasets, highest accuracy
SQ8     4x           < 1% recall loss  Good default for memory savings
PQ      16-32x       2-5% recall loss  Very large datasets
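
To make the compression column concrete, per-vector storage is roughly dimensions × bytes per component. A back-of-envelope helper (illustrative, assuming float32 base vectors):

```shell
# bytes per vector = dimensions * bytes per component
vec_bytes() { awk -v d="$1" -v b="$2" 'BEGIN { printf "%d\n", d * b }'; }

vec_bytes 768 4   # float32: 3072 bytes/vector
vec_bytes 768 1   # SQ8:      768 bytes/vector (4x)
# PQ at 16-32x would store roughly 96-192 bytes of codes per 768-dim vector
```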

Cache sizing

The disk cache stores centroids and frequently-accessed clusters:

ZEPPELIN_CACHE_MAX_SIZE_GB=50  # Adjust based on working set

Rule of thumb: cache should hold all centroids + the hottest 10-20% of clusters.
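
Applied to concrete numbers, the rule of thumb works out like this (a sketch; the centroid and cluster sizes below are made-up inputs, not measurements):

```shell
# cache_gb ≈ (centroid_bytes + hot_fraction * total_cluster_bytes) in GiB
cache_size_gb() {
  awk -v cent="$1" -v clusters="$2" -v frac="$3" \
    'BEGIN { printf "%.1f\n", (cent + frac * clusters) / (1024^3) }'
}

# e.g. 1 GiB of centroids, 200 GiB of clusters, hottest 15% cached:
cache_size_gb 1073741824 214748364800 0.15   # → 31.0
```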

Compaction interval

ZEPPELIN_COMPACTION_INTERVAL_SECS=30   # Default
ZEPPELIN_MAX_WAL_FRAGMENTS=1000        # Trigger threshold

  • Lower interval = fresher indexes, more S3 operations
  • Higher interval = staler indexes, fewer S3 operations
  • For write-heavy workloads, lower the interval. For read-heavy, raise it.

Monitoring

Prometheus Metrics

Zeppelin exposes metrics at GET /metrics in Prometheus text format. See Metrics Reference for the full list.

Suggested Grafana Panels

Panel                        Metric                                             Description
Query Latency (p50/p95/p99)  zeppelin_query_duration_seconds                    Query response time distribution
QPS                          rate(zeppelin_queries_total[5m])                   Queries per second by namespace
S3 Latency                   zeppelin_s3_operation_duration_seconds             S3 operation time by type
S3 Error Rate                rate(zeppelin_s3_errors_total[5m])                 S3 errors by operation
Cache Hit Rate               rate(zeppelin_cache_hits_total{result="hit"}[5m])  Cache effectiveness
Active Queries               zeppelin_active_queries                            Current in-flight queries
Compaction Duration          zeppelin_compaction_duration_seconds               Time per compaction
WAL Appends                  rate(zeppelin_wal_appends_total[5m])               Write throughput
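
A minimal Prometheus scrape config for the /metrics endpoint (illustrative; point the target at your own deployment):

```yaml
scrape_configs:
  - job_name: zeppelin
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]
```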

Health Checks

# Liveness (is the process running?)
curl http://localhost:8080/healthz

# Readiness (can it serve traffic? checks S3)
curl http://localhost:8080/readyz
