Open
Labels
- component:metrics (Metrics and observability)
- component:rust-engine (Rust load test engine)
- effort:m (Medium - 3-5 days)
- priority:p2-medium (Medium priority)
- type:feature (New feature or functionality)
Description
Problem
At very high RPS (50k+), recording every single request to HDR histograms becomes:
- Memory intensive: Each recording adds data to histogram
- CPU intensive: Mutex contention and histogram updates
- Unnecessary: Statistical accuracy only needs a sample
Current behavior:
- 100% of requests tracked in histograms
- At 50k RPS, that's 50,000 histogram updates/sec
- Causes both memory and performance issues
Proposed Solution
Add sampling to only track a percentage of requests:
PERCENTILE_SAMPLING_RATE=10 # Track 10% of requests (1 in 10)
Implementation Details
Files to modify:
- src/config.rs - Add sampling rate config
- src/percentiles.rs - Implement sampling logic
- src/worker.rs - Apply sampling before recording
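A minimal sketch of what the config side could look like, assuming the rate is read from the `PERCENTILE_SAMPLING_RATE` environment variable as a whole percentage. The function names, the default of 100, and the rejection of values outside 1 to 100 are illustrative assumptions, not the existing `src/config.rs` API.

```rust
use std::env;

/// Assumed default: track every request when the variable is unset.
const DEFAULT_SAMPLING_RATE: u64 = 100;

/// Parses a raw PERCENTILE_SAMPLING_RATE value into a percentage in 1..=100.
/// Returns an error message for non-numeric or out-of-range input.
pub fn parse_sampling_rate(raw: Option<&str>) -> Result<u64, String> {
    let Some(text) = raw else {
        return Ok(DEFAULT_SAMPLING_RATE);
    };
    let rate: u64 = text
        .trim()
        .parse()
        .map_err(|e| format!("PERCENTILE_SAMPLING_RATE must be an integer: {e}"))?;
    if (1..=100).contains(&rate) {
        Ok(rate)
    } else {
        Err(format!(
            "PERCENTILE_SAMPLING_RATE must be between 1 and 100, got {rate}"
        ))
    }
}

/// Reads the rate from the process environment (hypothetical helper).
pub fn sampling_rate_from_env() -> Result<u64, String> {
    parse_sampling_rate(env::var("PERCENTILE_SAMPLING_RATE").ok().as_deref())
}
```

Validating once at startup keeps the per-request check free of range handling, and rejecting 0 here avoids a configuration that silently records nothing.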
Approach 1: Deterministic sampling (Recommended)
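use std::sync::atomic::{AtomicU64, Ordering};

// Global request counter shared by all workers.
static SAMPLE_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Returns true for `rate` out of every 100 requests (`rate` is a percentage).
pub fn should_sample(rate: u64) -> bool {
    if rate >= 100 {
        return true; // Fast path: sample everything
    }
    if rate == 0 {
        return false; // 0% samples nothing
    }
    let counter = SAMPLE_COUNTER.fetch_add(1, Ordering::Relaxed);
    // Evenly spaced and exact for any rate from 1 to 99
    // (10 gives every 10th request, 30 gives 30 per 100).
    (counter % 100) * rate % 100 < rate
}

// In src/worker.rs
let latency_ms = request_start_time.elapsed().as_millis() as u64;
if should_sample(config.percentile_sampling_rate) {
    GLOBAL_REQUEST_PERCENTILES.record_ms(latency_ms);
}
Approach 2: Random sampling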
use rand::Rng;

pub fn should_sample(rate: u64) -> bool {
    if rate == 100 {
        return true;
    }
    let mut rng = rand::thread_rng();
    rng.gen_range(0..100) < rate
}
Configuration Examples
Based on RPS:
# Low RPS (< 5k) - track everything
PERCENTILE_SAMPLING_RATE=100 # 100%
# Medium RPS (5k-25k) - track half
PERCENTILE_SAMPLING_RATE=50 # 50%
# High RPS (25k-50k) - track 10%
PERCENTILE_SAMPLING_RATE=10 # 10%
# Extreme RPS (50k+) - track 1%
PERCENTILE_SAMPLING_RATE=1 # 1%
Statistical Validity
Sampling 10% of requests is statistically valid for percentile calculation:
- Sample size: At 50k RPS for 1h (180M requests), 10% = 18M samples
- Accuracy: More than sufficient for P99.9 calculation
- Standard: Industry practice (DataDog, New Relic sample heavily)
Literature:
- Netflix: Uses 1% sampling for tail latencies
- Google: Samples requests for distributed tracing
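- Rule of thumb: a sample of 1,000 is enough for central percentiles such as P50, but it leaves only about one observation above P99.9, so tail percentiles need far more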
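One way to sanity-check a chosen rate is to count how many recorded samples are expected to land above the percentile of interest, since that tail count is what limits accuracy. The helper below is a hypothetical illustration of that arithmetic, not part of the codebase.

```rust
/// Expected number of recorded samples that fall above `percentile`
/// (e.g. 99.9) for a run at `rps` lasting `duration_secs`, when
/// `rate_percent` of requests are recorded.
pub fn expected_tail_samples(
    rps: u64,
    duration_secs: u64,
    rate_percent: u64,
    percentile: f64,
) -> f64 {
    let total_requests = (rps * duration_secs) as f64;
    let recorded = total_requests * rate_percent as f64 / 100.0;
    // Fraction of observations above the percentile, e.g. 0.001 for P99.9.
    recorded * (100.0 - percentile) / 100.0
}
```

For 50k RPS over one hour, `expected_tail_samples(50_000, 3_600, 10, 99.9)` comes to roughly 18,000 observations above P99.9, and a 1% rate still leaves roughly 1,800.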
Memory Savings
Example at 50k RPS for 1 hour:
Sampling Rate    Recordings      Memory Impact
─────────────────────────────────────────────────
100% (current)   180M samples    ~100% (baseline)
50%              90M samples     ~50% reduction
10%              18M samples     ~90% reduction
1%               1.8M samples    ~99% reduction
Benefits
- Lower memory: Fewer data points stored
- Better performance: Less mutex contention
- Scalability: Support higher RPS
- Flexibility: Users tune accuracy vs resources
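The contention benefit only holds if the sampling decision itself stays cheap. A shared atomic counter is still touched by every worker on every request, so one possible variant keeps the counter per thread. This is a sketch under the assumption that each worker thread handles many requests; `should_sample_local` is a hypothetical name.

```rust
use std::cell::Cell;

thread_local! {
    // Per-thread request counter: no cross-thread synchronization.
    static LOCAL_COUNTER: Cell<u64> = Cell::new(0);
}

/// Counter-based sampling like Approach 1, but with a thread-local counter.
/// Returns true for `rate` out of every 100 calls made on the same thread.
pub fn should_sample_local(rate: u64) -> bool {
    if rate >= 100 {
        return true; // Sample everything
    }
    if rate == 0 {
        return false; // Sample nothing
    }
    LOCAL_COUNTER.with(|counter| {
        let n = counter.get();
        counter.set(n.wrapping_add(1));
        // Evenly spaced selection of `rate` slots in each block of 100.
        (n % 100) * rate % 100 < rate
    })
}
```

The trade-off is that the rate is enforced per thread rather than globally, so a thread that handles only a handful of requests may contribute slightly more or fewer samples than its share.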
Output Clarity
Mark sampled percentiles clearly in output:
## Single Request Latencies (sampled at 10%)
count=1800000, min=5.23ms, max=1234.56ms, mean=45.67ms
p50=42.31ms, p90=89.23ms, p95=112.45ms, p99=234.56ms, p99.9=567.89ms
Note: Percentiles calculated from 10% sample of 18M requests (1.8M samples)
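The labelling above could be produced by two small formatting helpers. This is only a sketch: the function names are hypothetical, and counts are printed as plain integers rather than the abbreviated `18M` form shown in the example.

```rust
/// Section header, annotated only when sampling is active.
pub fn sampling_header(title: &str, rate_percent: u64) -> String {
    if rate_percent >= 100 {
        format!("## {title}")
    } else {
        format!("## {title} (sampled at {rate_percent}%)")
    }
}

/// Trailing note describing the sample; None when every request is recorded.
pub fn sampling_note(rate_percent: u64, total_requests: u64, recorded: u64) -> Option<String> {
    if rate_percent >= 100 {
        return None;
    }
    Some(format!(
        "Note: Percentiles calculated from {rate_percent}% sample of \
         {total_requests} requests ({recorded} samples)"
    ))
}
```

Omitting the annotation at 100% keeps existing output unchanged for users who never enable sampling.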
Auto-tuning (Future enhancement)
Could auto-adjust sampling based on RPS:
fn auto_sampling_rate(current_rps: f64) -> u64 {
    match current_rps {
        rps if rps < 5000.0 => 100,  // Track all
        rps if rps < 25000.0 => 50,  // Track half
        rps if rps < 50000.0 => 10,  // Track 10%
        _ => 1,                      // Track 1%
    }
}
Testing
- Run test with 100% sampling, get baseline percentiles
- Run same test with 10% sampling
- Compare percentiles - should be within 5% of baseline
- Verify memory usage reduced
- Performance test: Compare CPU usage with/without sampling
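The percentile comparison step could be automated along these lines. This is a sketch with hypothetical helpers: it uses a plain nearest-rank percentile over a sorted slice instead of the HDR histogram, and `sample_every_nth` stands in for a second run at a lower sampling rate.

```rust
/// Nearest-rank percentile of an ascending-sorted, non-empty slice.
/// `p` is in 0.0..=100.0; values are latencies in milliseconds.
pub fn percentile(sorted: &[u64], p: f64) -> u64 {
    assert!(!sorted.is_empty(), "percentile of empty data");
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// True when `sampled` differs from `baseline` by at most `tolerance`
/// expressed as a fraction of the baseline (0.05 means 5%).
pub fn within_tolerance(baseline: u64, sampled: u64, tolerance: f64) -> bool {
    let diff = (baseline as f64 - sampled as f64).abs();
    diff <= baseline as f64 * tolerance
}

/// Keeps every `n`-th value, emulating deterministic 1-in-n sampling.
pub fn sample_every_nth(values: &[u64], n: usize) -> Vec<u64> {
    values.iter().copied().step_by(n.max(1)).collect()
}
```

In a real check the two inputs would be the recorded latencies from the 100% and 10% runs, with each percentile of interest (P50, P90, P99, P99.9) passed through `within_tolerance` using 0.05.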
Documentation
Update:
- MEMORY_OPTIMIZATION.md - Add sampling recommendations
- README.md - Document sampling configuration
- LOAD_TEST_SCENARIOS.md - Recommend sampling rates by scenario
Related
- Issue #66 - [Memory] Add PERCENTILE_TRACKING_ENABLED configuration flag
- Issue #67 - [Memory] Implement periodic histogram reset/rotation
- Issue #68 - [Memory] Limit maximum unique histogram labels
- Issue #69 - [Memory] Add process memory usage metrics
- See MEMORY_OPTIMIZATION.md for full analysis