# Deployment

_Anup Ghatage edited this page Feb 12, 2026_
## Docker Compose

```bash
docker compose up
```

This starts:

- Zeppelin on port `8080`
- MinIO on port `9000` (API) and `9001` (console)
- `minio-init`, which bootstraps the `zeppelin` bucket

MinIO credentials: `minioadmin` / `minioadmin`.
## Docker Image

The Dockerfile uses a multi-stage build:

- Build stage: `rust:1.84-bookworm`, compiles with static SSL linking
- Runtime stage: `debian:bookworm-slim`, a minimal image with `ca-certificates` and `curl`
- Non-root user: `zeppelin`
- Cache directory: `/var/cache/zeppelin`
- Health check: `curl -f http://localhost:8080/healthz` (10s interval, 5 retries)
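The layout above can be sketched as a minimal Dockerfile. This is illustrative only: the binary name, build command, and paths are assumptions, and the repository's actual Dockerfile is authoritative.

```dockerfile
# Build stage: compile the release binary (static SSL linking assumed
# to be configured in the project's Cargo setup).
FROM rust:1.84-bookworm AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage: slim base, CA certs and curl for the health check.
FROM debian:bookworm-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*
RUN useradd -m zeppelin \
    && mkdir -p /var/cache/zeppelin \
    && chown zeppelin /var/cache/zeppelin
COPY --from=builder /app/target/release/zeppelin /usr/local/bin/zeppelin
USER zeppelin
EXPOSE 8080
HEALTHCHECK --interval=10s --retries=5 \
    CMD curl -f http://localhost:8080/healthz || exit 1
CMD ["zeppelin"]
```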
## Production Docker Run

Override the MinIO defaults:

```bash
docker run -d \
  -p 8080:8080 \
  -e STORAGE_BACKEND=s3 \
  -e S3_BUCKET=my-production-bucket \
  -e AWS_ACCESS_KEY_ID=AKIA... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  -e RUST_LOG=info \
  -v zeppelin-cache:/var/cache/zeppelin \
  --name zeppelin \
  zeppelin:latest
```

## EC2 Deployment

```bash
# Set up credentials
cp deploy/.env.example deploy/.env
# Edit deploy/.env with your AWS credentials

# Deploy to EC2
./deploy/ec2-bench.sh launch
```

The ec2-bench.sh script:
- Provisions an EC2 instance (default: `c7i.xlarge`, 4 vCPU, 8 GB RAM)
- Installs Docker via cloud-init
- Builds and transfers the Docker image
- Starts Zeppelin with S3 backend
Commands:

```bash
./deploy/ec2-bench.sh launch    # Provision + deploy
./deploy/ec2-bench.sh status    # Check instance and health
./deploy/ec2-bench.sh ssh       # SSH into instance
./deploy/ec2-bench.sh deploy    # Rebuild + redeploy
./deploy/ec2-bench.sh teardown  # Terminate + cleanup
```

## Running Directly on a Server

```bash
# On the server
export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
./deploy/run-server.sh
```

The run-server.sh script:
- Sources `.env` if present
- Stops any existing container
- Runs with `--restart unless-stopped`
- Waits up to 60 seconds for the health check
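The health-wait step follows a common polling pattern, sketched below. This is not the script's actual code; the endpoint and 60-second timeout simply mirror the description above.

```bash
# Poll a health endpoint until it responds or the timeout expires.
# Returns 0 once healthy, 1 on timeout.
wait_healthy() {
  local url=$1 timeout=${2:-60}
  local deadline=$((SECONDS + timeout))
  until curl -fsS "$url" >/dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      echo "health check timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 2
  done
  echo "healthy: $url"
}

# Usage: wait_healthy http://localhost:8080/healthz 60
```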
## Storage Backends

Zeppelin supports multiple storage backends via the object_store crate.

S3:

```bash
STORAGE_BACKEND=s3
S3_BUCKET=my-bucket
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
```

GCS:

```bash
STORAGE_BACKEND=gcs
S3_BUCKET=my-gcs-bucket
GCS_SERVICE_ACCOUNT_PATH=/path/to/service-account.json
```

Azure:

```bash
STORAGE_BACKEND=azure
S3_BUCKET=my-container
AZURE_ACCOUNT=mystorageaccount
AZURE_ACCESS_KEY=...
```

MinIO (S3-compatible):

```bash
STORAGE_BACKEND=s3
S3_BUCKET=zeppelin
S3_ENDPOINT=http://minio:9000
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
S3_ALLOW_HTTP=true
```

Local filesystem:

```bash
STORAGE_BACKEND=local
S3_BUCKET=/var/data/zeppelin
```

## Index Tuning

The nprobe / num_centroids ratio controls the accuracy-speed tradeoff:
| Vectors | Centroids | nprobe | Notes |
|---|---|---|---|
| < 10K | 16-32 | 4-8 | Small dataset |
| 10K-100K | 64-256 | 8-32 | Medium dataset |
| 100K-1M | 256-1024 | 16-64 | Large dataset |
| > 1M | 1024+ | 32-128 | Consider hierarchical |
Higher `nprobe` gives better recall at the cost of slower queries. Start with `nprobe = sqrt(num_centroids)`.
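The sqrt heuristic is easy to compute up front. A quick sketch (variable names are illustrative):

```bash
# Starting point: nprobe = sqrt(num_centroids), truncated to an integer.
num_centroids=256
nprobe=$(awk -v c="$num_centroids" 'BEGIN { printf "%d", sqrt(c) }')
echo "nprobe=$nprobe"   # nprobe=16
```

From there, raise `nprobe` toward the upper end of the table's range if recall is too low, or lower it if latency matters more.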
Quantization trades memory for accuracy:

| Method | Compression | Accuracy Loss | Use Case |
|---|---|---|---|
| None | 1x | None | Small datasets, highest accuracy |
| SQ8 | 4x | < 1% recall loss | Good default for memory savings |
| PQ | 16-32x | 2-5% recall loss | Very large datasets |
## Disk Cache

The disk cache stores centroids and frequently-accessed clusters:

```bash
ZEPPELIN_CACHE_MAX_SIZE_GB=50  # Adjust based on working set
```

Rule of thumb: the cache should hold all centroids plus the hottest 10-20% of clusters.
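Applied to concrete numbers, the rule of thumb looks like this (all values below are illustrative, not measurements):

```bash
# Back-of-envelope cache sizing: all centroids + hottest ~15% of clusters.
total_clusters_gb=300   # total cluster data in object storage
centroids_gb=2          # centroid index size
hot_pct=15              # middle of the 10-20% hot-set range
cache_gb=$(( centroids_gb + total_clusters_gb * hot_pct / 100 ))
echo "ZEPPELIN_CACHE_MAX_SIZE_GB=${cache_gb}"   # ZEPPELIN_CACHE_MAX_SIZE_GB=47
```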
## Compaction

```bash
ZEPPELIN_COMPACTION_INTERVAL_SECS=30  # Default
ZEPPELIN_MAX_WAL_FRAGMENTS=1000       # Trigger threshold
```

- Lower interval: fresher indexes, more S3 operations
- Higher interval: staler indexes, fewer S3 operations
- For write-heavy workloads, lower the interval; for read-heavy workloads, raise it.
## Monitoring

Zeppelin exposes metrics at `GET /metrics` in Prometheus text format. See the Metrics Reference for the full list.
| Panel | Metric | Description |
|---|---|---|
| Query Latency (p50/p95/p99) | `zeppelin_query_duration_seconds` | Query response time distribution |
| QPS | `rate(zeppelin_queries_total[5m])` | Queries per second by namespace |
| S3 Latency | `zeppelin_s3_operation_duration_seconds` | S3 operation time by type |
| S3 Error Rate | `rate(zeppelin_s3_errors_total[5m])` | S3 errors by operation |
| Cache Hit Rate | `rate(zeppelin_cache_hits_total{result="hit"}[5m])` | Cache effectiveness |
| Active Queries | `zeppelin_active_queries` | Current in-flight queries |
| Compaction Duration | `zeppelin_compaction_duration_seconds` | Time per compaction |
| WAL Appends | `rate(zeppelin_wal_appends_total[5m])` | Write throughput |
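Because the endpoint serves plain Prometheus text, quick spot checks need nothing beyond curl and grep. The sample payload below is fabricated for offline illustration; against a live server you would pipe `curl -s http://localhost:8080/metrics` instead.

```bash
# Spot-check a single metric family from Prometheus text output.
metrics='# TYPE zeppelin_active_queries gauge
zeppelin_active_queries 3
zeppelin_queries_total{namespace="default"} 1200'
printf '%s\n' "$metrics" | grep '^zeppelin_active_queries'   # zeppelin_active_queries 3
```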
## Health Checks

```bash
# Liveness (is the process running?)
curl http://localhost:8080/healthz

# Readiness (can it serve traffic? checks S3)
curl http://localhost:8080/readyz
```