# Deployment

_Anup Ghatage edited this page Feb 12, 2026_
## Docker Compose

```bash
docker compose up
```

This starts:

- Zeppelin on port `8080`
- MinIO on port `9000` (API) and `9001` (console)
- `minio-init`, which bootstraps the `zeppelin` bucket

MinIO credentials: `minioadmin` / `minioadmin`.
## Docker Image

The Dockerfile uses a multi-stage build:

- Build stage: `rust:1.84-bookworm`, compiles with static SSL linking
- Runtime stage: `debian:bookworm-slim`, a minimal image with `ca-certificates` and `curl`
- Non-root user: `zeppelin`
- Cache directory: `/var/cache/zeppelin`
- Health check: `curl -f http://localhost:8080/healthz` (10s interval, 5 retries)
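The layout above can be sketched as a minimal Dockerfile. This is illustrative only: the binary name, build command, and paths are assumptions, and the repository's actual Dockerfile is authoritative.

```dockerfile
# Build stage: compile the release binary (static SSL linking assumed
# to be configured in the project's Cargo setup).
FROM rust:1.84-bookworm AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage: slim base, CA certs and curl for the health check.
FROM debian:bookworm-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*
RUN useradd -m zeppelin \
    && mkdir -p /var/cache/zeppelin \
    && chown zeppelin /var/cache/zeppelin
COPY --from=builder /app/target/release/zeppelin /usr/local/bin/zeppelin
USER zeppelin
EXPOSE 8080
HEALTHCHECK --interval=10s --retries=5 \
    CMD curl -f http://localhost:8080/healthz || exit 1
CMD ["zeppelin"]
```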
## Production Docker Run

Override the MinIO defaults:

```bash
docker run -d \
  -p 8080:8080 \
  -e STORAGE_BACKEND=s3 \
  -e S3_BUCKET=my-production-bucket \
  -e AWS_ACCESS_KEY_ID=AKIA... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  -e RUST_LOG=info \
  -v zeppelin-cache:/var/cache/zeppelin \
  --name zeppelin \
  zeppelin:latest
```

## EC2 Deployment

```bash
# Set up credentials
cp deploy/.env.example deploy/.env
# Edit deploy/.env with your AWS credentials

# Deploy to EC2
./deploy/ec2-bench.sh launch
```

The ec2-bench.sh script:
- Provisions an EC2 instance (default: `c7i.xlarge`, 4 vCPU, 8 GB RAM)
- Installs Docker via cloud-init
- Builds and transfers the Docker image
- Starts Zeppelin with S3 backend
Commands:

```bash
./deploy/ec2-bench.sh launch    # Provision + deploy
./deploy/ec2-bench.sh status    # Check instance and health
./deploy/ec2-bench.sh ssh       # SSH into instance
./deploy/ec2-bench.sh deploy    # Rebuild + redeploy
./deploy/ec2-bench.sh teardown  # Terminate + cleanup
```

## Running Directly on a Server

```bash
# On the server
export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
./deploy/run-server.sh
```

The run-server.sh script:
- Sources `.env` if present
- Stops any existing container
- Runs with `--restart unless-stopped`
- Waits up to 60 seconds for the health check
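The health-wait step follows a common polling pattern, sketched below. This is not the script's actual code; the endpoint and 60-second timeout simply mirror the description above.

```bash
# Poll a health endpoint until it responds or the timeout expires.
# Returns 0 once healthy, 1 on timeout.
wait_healthy() {
  local url=$1 timeout=${2:-60}
  local deadline=$((SECONDS + timeout))
  until curl -fsS "$url" >/dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      echo "health check timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 2
  done
  echo "healthy: $url"
}

# Usage: wait_healthy http://localhost:8080/healthz 60
```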
## Storage Backends

Zeppelin supports multiple storage backends via the object_store crate.

S3:

```bash
STORAGE_BACKEND=s3
S3_BUCKET=my-bucket
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
```

GCS:

```bash
STORAGE_BACKEND=gcs
S3_BUCKET=my-gcs-bucket
GCS_SERVICE_ACCOUNT_PATH=/path/to/service-account.json
```

Azure:

```bash
STORAGE_BACKEND=azure
S3_BUCKET=my-container
AZURE_ACCOUNT=mystorageaccount
AZURE_ACCESS_KEY=...
```

MinIO (S3-compatible):

```bash
STORAGE_BACKEND=s3
S3_BUCKET=zeppelin
S3_ENDPOINT=http://minio:9000
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
S3_ALLOW_HTTP=true
```

Local filesystem:

```bash
STORAGE_BACKEND=local
S3_BUCKET=/var/data/zeppelin
```

## Index Tuning

The nprobe / num_centroids ratio controls the accuracy-speed tradeoff:
| Vectors | Centroids | nprobe | Notes |
|---|---|---|---|
| < 10K | 16-32 | 4-8 | Small dataset |
| 10K-100K | 64-256 | 8-32 | Medium dataset |
| 100K-1M | 256-1024 | 16-64 | Large dataset |
| > 1M | 1024+ | 32-128 | Consider hierarchical |
Higher `nprobe` gives better recall at the cost of slower queries. Start with `nprobe = sqrt(num_centroids)`.
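The sqrt heuristic is easy to compute up front. A quick sketch (variable names are illustrative):

```bash
# Starting point: nprobe = sqrt(num_centroids), truncated to an integer.
num_centroids=256
nprobe=$(awk -v c="$num_centroids" 'BEGIN { printf "%d", sqrt(c) }')
echo "nprobe=$nprobe"   # nprobe=16
```

From there, raise `nprobe` toward the upper end of the table's range if recall is too low, or lower it if latency matters more.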
Quantization trades memory for accuracy:

| Method | Compression | Accuracy Loss | Use Case |
|---|---|---|---|
| None | 1x | None | Small datasets, highest accuracy |
| SQ8 | 4x | < 1% recall loss | Good default for memory savings |
| PQ | 16-32x | 2-5% recall loss | Very large datasets |
## Disk Cache

The disk cache stores centroids and frequently-accessed clusters:

```bash
ZEPPELIN_CACHE_MAX_SIZE_GB=50  # Adjust based on working set
```

Rule of thumb: the cache should hold all centroids plus the hottest 10-20% of clusters.
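Applied to concrete numbers, the rule of thumb looks like this (all values below are illustrative, not measurements):

```bash
# Back-of-envelope cache sizing: all centroids + hottest ~15% of clusters.
total_clusters_gb=300   # total cluster data in object storage
centroids_gb=2          # centroid index size
hot_pct=15              # middle of the 10-20% hot-set range
cache_gb=$(( centroids_gb + total_clusters_gb * hot_pct / 100 ))
echo "ZEPPELIN_CACHE_MAX_SIZE_GB=${cache_gb}"   # ZEPPELIN_CACHE_MAX_SIZE_GB=47
```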
## Compaction

```bash
ZEPPELIN_COMPACTION_INTERVAL_SECS=30  # Default
ZEPPELIN_MAX_WAL_FRAGMENTS=1000       # Trigger threshold
```

- Lower interval: fresher indexes, more S3 operations
- Higher interval: staler indexes, fewer S3 operations
- For write-heavy workloads, lower the interval; for read-heavy workloads, raise it.
## Monitoring

Zeppelin exposes metrics at `GET /metrics` in Prometheus text format. See the Metrics Reference for the full list.
| Panel | Metric | Description |
|---|---|---|
| Query Latency (p50/p95/p99) | `zeppelin_query_duration_seconds` | Query response time distribution |
| QPS | `rate(zeppelin_queries_total[5m])` | Queries per second by namespace |
| S3 Latency | `zeppelin_s3_operation_duration_seconds` | S3 operation time by type |
| S3 Error Rate | `rate(zeppelin_s3_errors_total[5m])` | S3 errors by operation |
| Cache Hit Rate | `rate(zeppelin_cache_hits_total{result="hit"}[5m])` | Cache effectiveness |
| Active Queries | `zeppelin_active_queries` | Current in-flight queries |
| Compaction Duration | `zeppelin_compaction_duration_seconds` | Time per compaction |
| WAL Appends | `rate(zeppelin_wal_appends_total[5m])` | Write throughput |
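Because the endpoint serves plain Prometheus text, quick spot checks need nothing beyond curl and grep. The sample payload below is fabricated for offline illustration; against a live server you would pipe `curl -s http://localhost:8080/metrics` instead.

```bash
# Spot-check a single metric family from Prometheus text output.
metrics='# TYPE zeppelin_active_queries gauge
zeppelin_active_queries 3
zeppelin_queries_total{namespace="default"} 1200'
printf '%s\n' "$metrics" | grep '^zeppelin_active_queries'   # zeppelin_active_queries 3
```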
## Health Checks

```bash
# Liveness (is the process running?)
curl http://localhost:8080/healthz

# Readiness (can it serve traffic? checks S3)
curl http://localhost:8080/readyz
```