Current Issue
Problem
The Maglev GTFS server hits severe performance bottlenecks under high concurrent load because access to its static GTFS data is guarded by coarse-grained locks, i.e. pessimistic concurrency control (PCC):
Current Bottlenecks
1. API Blocking During GTFS Updates
// ForceUpdate() blocks ALL API requests for 2-5 minutes
manager.staticMutex.Lock() // Blocks 1000+ concurrent requests
// Rebuilding GTFS database, spatial index...
manager.staticMutex.Unlock() // Users experience timeouts
2. Lock Contention on Every Read
// Every API request competes for the same RWMutex
func GetRoutes() {
    manager.RLock() // Goroutines waiting in line
    routes := manager.gtfsData.Routes
    manager.RUnlock() // CPU cycles wasted on synchronization
}
Performance Impact
API Response: 2-5 minute spikes during GTFS updates
Throughput: 90% drop during background operations
User Experience: Timeouts and service unavailability
Proposed Solution: Optimistic Concurrency Control (OCC)
Why OCC Boosts Performance
Lock-Free Reads
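// Read the current snapshot through the atomic pointer, no blocking
func GetRoutes() []Route {
    // dataPtr is the pointer that ForceUpdate swaps; a plain read of
    // manager.gtfsData here would race with that swap.
    // GTFSData and Route are assumed type names.
    data := (*GTFSData)(atomic.LoadPointer(&manager.dataPtr))
    return data.Routes // No locks = consistent performance
}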
Non-Blocking Updates
// Prepare data separately, atomic swap
func ForceUpdate() {
    newData := buildNewGTFS() // No locks during expensive work
    // Atomic pointer swap - microsecond operation
    atomic.StorePointer(&manager.dataPtr, unsafe.Pointer(newData))
    atomic.AddInt64(&manager.version, 1)
    // API requests never blocked!
}
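The two snippets above can be combined into one type-safe sketch using `atomic.Pointer[T]` (Go 1.19+) instead of `unsafe.Pointer`. This is only an illustration, not Maglev's actual code: `Route`, `Snapshot`, `Store`, and `NewStore` are hypothetical names. The sketch keeps the version inside the snapshot, so a reader can never observe new data paired with an old version number, which can happen when the pointer and the counter are updated in two separate atomic operations.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Route is a hypothetical stand-in for Maglev's real route type.
type Route struct {
	ID   string
	Name string
}

// Snapshot is an immutable view of the static data (hypothetical type).
// Once published it must never be modified; updates build a new Snapshot.
type Snapshot struct {
	Version int64
	Routes  []Route
}

// Store publishes snapshots through a single atomic pointer.
type Store struct {
	current  atomic.Pointer[Snapshot]
	updateMu sync.Mutex // serializes writers only; readers never take it
}

// NewStore publishes the initial snapshot as version 1.
func NewStore(routes []Route) *Store {
	s := &Store{}
	s.current.Store(&Snapshot{Version: 1, Routes: routes})
	return s
}

// Routes returns the routes of whichever snapshot is current at call time.
func (s *Store) Routes() []Route {
	return s.current.Load().Routes
}

// Version returns the version of the current snapshot.
func (s *Store) Version() int64 {
	return s.current.Load().Version
}

// ForceUpdate runs build while readers keep using the old snapshot,
// then swaps in the new one with a single atomic store.
func (s *Store) ForceUpdate(build func() []Route) {
	s.updateMu.Lock()
	defer s.updateMu.Unlock()
	routes := build() // expensive work happens off the read path
	next := &Snapshot{Version: s.current.Load().Version + 1, Routes: routes}
	s.current.Store(next)
}

func main() {
	store := NewStore([]Route{{ID: "1", Name: "Downtown"}})
	store.ForceUpdate(func() []Route {
		return []Route{{ID: "1", Name: "Downtown"}, {ID: "2", Name: "Airport"}}
	})
	fmt.Println(len(store.Routes()), store.Version()) // prints "2 2"
}
```

A request that needs several fields should call `current.Load()` once and read everything from that one snapshot; loading twice may straddle an update. The writer mutex only keeps two concurrent `ForceUpdate` calls from rebuilding at the same time.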
Expected Performance Gains
Before: API requests block 2-5 minutes during updates
After: Consistent sub-100ms responses even during updates
Scalability: Near-linear scaling with CPU cores
Throughput: 95%+ maintained during background operations
Implementation Strategy
Start with an atomic version-tracking foundation, then gradually replace the existing mutex-guarded patterns with lock-free alternatives.
Benefits
Eliminates reader blocking during GTFS updates
Removes lock contention bottlenecks
Improves scalability under high concurrent load
Better resource utilization