Metrics Reference

Anup Ghatage edited this page Feb 12, 2026 · 1 revision

Zeppelin exposes Prometheus metrics at GET /metrics. All metrics use the zeppelin_ prefix.
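As a rough illustration of consuming this endpoint outside Prometheus, the sketch below filters a scrape body down to the `zeppelin_` series. The sample text, its values, and the function name `parse_zeppelin_metrics` are hypothetical; the parser assumes the standard Prometheus text exposition format with no trailing timestamps.

```python
from typing import Dict

# Hypothetical sample of a /metrics response body; the values are made up.
# The last line is a non-Zeppelin series, included only to show the filter.
SAMPLE = """\
# HELP zeppelin_queries_total Total vector/BM25 queries
# TYPE zeppelin_queries_total counter
zeppelin_queries_total{namespace="docs"} 42
zeppelin_active_queries 3
process_cpu_seconds_total 1.5
"""


def parse_zeppelin_metrics(text: str) -> Dict[str, float]:
    """Map each zeppelin_-prefixed series (name plus labels) to its value."""
    series_values: Dict[str, float] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumes "<series> <value>" with no timestamp after the value.
        series, _, value = line.rpartition(" ")
        if series.startswith("zeppelin_"):
            series_values[series] = float(value)
    return series_values
```

In practice the text would come from an HTTP GET against the server's `/metrics` path, for example via `urllib.request.urlopen`; the host and port depend on the deployment and are not specified here.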

Counters

| Metric | Labels | Description |
|---|---|---|
| zeppelin_http_requests_total | method, path, status | Total HTTP requests |
| zeppelin_queries_total | namespace | Total vector/BM25 queries |
| zeppelin_wal_appends_total | namespace | WAL append operations (upsert/delete) |
| zeppelin_cache_hits_total | result (hit/miss) | Cache lookups by outcome |
| zeppelin_cache_evictions_total | (none) | Total cache evictions |
| zeppelin_compactions_total | namespace, status | Compaction completions |
| zeppelin_s3_errors_total | operation | S3 operation errors |
| zeppelin_bitmap_vectors_skipped_total | namespace | Vectors filtered out by bitmap pre-filter |
| zeppelin_bitmap_fields_built_total | namespace | Bitmap index fields built per segment |
| zeppelin_bitmap_prefilter_used_total | namespace | Times bitmap pre-filter was applied |
| zeppelin_bitmap_fallback_postfilter_total | namespace | Times bitmap fell back to post-filter |
| zeppelin_fts_queries_total | namespace | Total FTS queries executed |

Histograms

| Metric | Labels | Bucket range | Description |
|---|---|---|---|
| zeppelin_query_duration_seconds | namespace | 1ms–10s | Query latency |
| zeppelin_s3_operation_duration_seconds | operation | 5ms–10s | S3 operation latency |
| zeppelin_compaction_duration_seconds | namespace | 100ms–300s | Compaction duration |
| zeppelin_index_build_duration_seconds | namespace, index_type | 100ms–300s | Index build time |
| zeppelin_bitmap_build_duration_seconds | namespace | 1ms–5s | Bitmap construction time |
| zeppelin_bitmap_eval_duration_seconds | namespace | 0.1ms–100ms | Bitmap filter evaluation |
| zeppelin_fts_query_duration_seconds | namespace | 1ms–5s | FTS query latency |
| zeppelin_fts_index_build_duration_seconds | namespace | 10ms–60s | Inverted index build time |

Histogram Buckets

Bucket upper bounds are in seconds.

Query/FTS latency:

0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0

S3 operations:

0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0

Compaction/index build:

0.1, 0.5, 1.0, 5.0, 10.0, 30.0, 60.0, 120.0, 300.0

Bitmap build:

0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 5.0

Bitmap evaluation:

0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1
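The bucket boundaries bound the precision of any quantile computed from these histograms, because an estimator can only interpolate inside the bucket that contains the target rank. The sketch below mimics the linear interpolation that `histogram_quantile` performs, using the query latency boundaries listed above. The function name `estimate_quantile` and the sample counts are hypothetical and serve only as illustration.

```python
import math
from typing import Sequence

# Query latency bucket upper bounds (seconds), copied from the list above.
QUERY_BUCKETS = [0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]


def estimate_quantile(q: float, bounds: Sequence[float],
                      cumulative: Sequence[int], total: int) -> float:
    """Estimate the q-quantile from cumulative histogram bucket counts.

    bounds:     finite bucket upper bounds, ascending
    cumulative: observations <= each bound (same length as bounds)
    total:      count of all observations (the +Inf bucket)
    """
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0  # the first bucket is assumed to start at 0
    for bound, count in zip(bounds, cumulative):
        if count >= rank:
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation inside the bucket that holds the rank.
            fraction = (rank - prev_count) / in_bucket
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    # Rank falls in the +Inf bucket: report the highest finite bound.
    return bounds[-1]


# Hypothetical cumulative counts for 100 observed queries.
SAMPLE_COUNTS = [10, 40, 60, 90, 98, 99, 100, 100, 100, 100, 100]
p50 = estimate_quantile(0.50, QUERY_BUCKETS, SAMPLE_COUNTS, 100)
p99 = estimate_quantile(0.99, QUERY_BUCKETS, SAMPLE_COUNTS, 100)
```

With these sample counts the p99 lands in the 0.1s to 0.25s bucket, so the estimate cannot be more precise than that interval. A latency above 10s would be reported as 10s, the top finite bound.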

Gauges

| Metric | Labels | Description |
|---|---|---|
| zeppelin_cache_entries | (none) | Current entries in disk cache |
| zeppelin_active_queries | (none) | In-flight queries (uses RAII guard for decrement-on-drop) |
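The decrement-on-drop guard guarantees that the gauge returns to its previous value even when a query exits early or fails. Zeppelin's own guard is not shown on this page; the sketch below is a Python analogue of the same pattern using a context manager. The `Gauge` class, `track_active`, and `run_query` are all hypothetical names.

```python
import threading
from contextlib import contextmanager


class Gauge:
    """Minimal thread-safe gauge (a stand-in for a real metrics library)."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def inc(self) -> None:
        with self._lock:
            self._value += 1

    def dec(self) -> None:
        with self._lock:
            self._value -= 1

    def value(self) -> int:
        with self._lock:
            return self._value


ACTIVE_QUERIES = Gauge()


@contextmanager
def track_active(gauge: Gauge):
    """Increment on entry; always decrement on exit, including on exceptions."""
    gauge.inc()
    try:
        yield
    finally:
        gauge.dec()


def run_query(fail: bool = False) -> int:
    """Hypothetical query handler; returns the in-flight count seen mid-query."""
    with track_active(ACTIVE_QUERIES):
        if fail:
            raise RuntimeError("query failed")
        return ACTIVE_QUERIES.value()
```

Tying the decrement to scope exit rather than to an explicit call at each return site is what keeps the gauge from drifting upward when error paths are added later.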

Example Prometheus Queries

Query latency p99

histogram_quantile(0.99, rate(zeppelin_query_duration_seconds_bucket[5m]))

Queries per second by namespace

sum by (namespace) (rate(zeppelin_queries_total[5m]))

Cache hit rate

sum(rate(zeppelin_cache_hits_total{result="hit"}[5m]))
/
sum(rate(zeppelin_cache_hits_total[5m]))
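The quantity being computed is hits divided by all lookups (hits plus misses) over the window. The same arithmetic can be done by hand from two scrapes of the counter, as in the sketch below. The function names and sample numbers are hypothetical, and the reset handling is a simplified version of how Prometheus treats counters that restart from zero.

```python
from typing import Dict, Optional


def _increase(prev: float, curr: float) -> float:
    """Counter increase between two samples, tolerating a reset to zero."""
    # If the counter went backwards, assume the process restarted and the
    # current value is everything accumulated since the restart.
    return curr - prev if curr >= prev else curr


def cache_hit_ratio(prev: Dict[str, float],
                    curr: Dict[str, float]) -> Optional[float]:
    """Hit ratio between two scrapes of zeppelin_cache_hits_total.

    prev / curr map the `result` label value ("hit" or "miss") to the
    cumulative counter value at each scrape.
    """
    hits = _increase(prev["hit"], curr["hit"])
    misses = _increase(prev["miss"], curr["miss"])
    lookups = hits + misses
    if lookups == 0:
        return None  # no cache traffic in the window
    return hits / lookups


# Hypothetical samples taken a few minutes apart.
ratio = cache_hit_ratio({"hit": 100, "miss": 50}, {"hit": 190, "miss": 60})
```

Here 90 new hits out of 100 new lookups give a ratio of 0.9. The denominator has to include both `result` values, which is why the ratio is taken over totals rather than series by series.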

S3 error rate by operation

sum by (operation) (rate(zeppelin_s3_errors_total[5m]))

Average compaction duration

rate(zeppelin_compaction_duration_seconds_sum[5m])
/
rate(zeppelin_compaction_duration_seconds_count[5m])

Bitmap filter effectiveness

rate(zeppelin_bitmap_vectors_skipped_total[5m])

FTS query latency p95

histogram_quantile(0.95, rate(zeppelin_fts_query_duration_seconds_bucket[5m]))

Active queries (current)

zeppelin_active_queries

Suggested Grafana Dashboard

Row 1: Overview

  • Query rate (rate(zeppelin_queries_total[5m]))
  • Query latency heatmap (zeppelin_query_duration_seconds_bucket)
  • Active queries gauge (zeppelin_active_queries)

Row 2: Storage

  • S3 operation latency by type (zeppelin_s3_operation_duration_seconds)
  • S3 error rate (rate(zeppelin_s3_errors_total[5m]))
  • Cache hit rate (%)

Row 3: Writes

  • WAL append rate (rate(zeppelin_wal_appends_total[5m]))
  • Compaction duration (zeppelin_compaction_duration_seconds)
  • Compaction status (rate(zeppelin_compactions_total[5m]))

Row 4: Indexes

  • Index build duration (zeppelin_index_build_duration_seconds)
  • Bitmap effectiveness (zeppelin_bitmap_vectors_skipped_total)
  • FTS query latency (zeppelin_fts_query_duration_seconds)