VenomEngine is a high-performance, C++23-based matching engine and quantitative risk suite. It is designed to simulate the environment of a High-Frequency Trading (HFT) firm, focusing on nanosecond-level execution and hardware-aware mathematical computations.
- Deterministic Latency: Sub-microsecond order matching via cache-optimized data structures.
- Quantitative Signal Generation: Real-time calculation of VPIN (Volume-synchronized Probability of Informed Trading) using SIMD.
- HPC Integration: Lock-free SPSC queues, CPU pinning, and AVX-512 intrinsics.
The name VenomEngine is a play on the project's primary mathematical focus: Order Flow Toxicity.
- The Concept: In high-frequency markets, "Toxic Flow" occurs when liquidity providers are adversely selected by informed traders.
- The "Venom": This engine is specifically architected to detect the "venom" (toxic informed flow) in real-time using the VPIN metric.
- The Engine: Just as an antivenom must act instantly, this engine utilizes AVX-512 vectorization to process volume buckets at hardware speeds, allowing the system to identify toxic shifts in the limit order book before price discovery "bites" the market maker.
- Zero-Allocation Policy: No `std::malloc`, `new`, or smart pointers (`std::shared_ptr`) are permitted during the matching loop.
- Memory Pooling: All `Order` objects must be retrieved from a pre-allocated `OrderPool` (Arena).
- Data-Oriented Design (DOD): Structures are designed to fit within a 64-byte L1 cache line to prevent cache misses and false sharing.
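
The three rules above can be sketched together as a fixed-capacity pool that hands out indices instead of pointers. This is a minimal illustration, not the project's actual `OrderPool`: the `Order` field names and widths, the `acquire`/`release` methods, and the `kNull` sentinel are all assumptions made for the example. The only heap allocation happens in the constructor, which is assumed to run at startup, before the matching loop begins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical order layout; field names and widths are illustrative assumptions.
// alignas(64) pads each order to exactly one cache line.
struct alignas(64) Order {
    std::uint64_t order_id;
    std::int64_t  price_ticks;
    std::uint32_t quantity;
    std::int32_t  next_idx;  // doubles as the free-list link while the slot is unused
    std::int32_t  prev_idx;
    std::uint8_t  side;
};
static_assert(sizeof(Order) == 64, "Order must occupy exactly one cache line");

class OrderPool {
public:
    static constexpr std::int32_t kNull = -1;

    // All memory is reserved here, once; acquire/release never allocate.
    explicit OrderPool(std::int32_t capacity)
        : slots_(static_cast<std::size_t>(capacity)), free_head_(kNull) {
        // Thread every slot onto the free list so that slot 0 is handed out first.
        for (std::int32_t i = capacity - 1; i >= 0; --i) {
            slots_[static_cast<std::size_t>(i)].next_idx = free_head_;
            free_head_ = i;
        }
    }

    // Pops a free slot in O(1); returns kNull when the pool is exhausted.
    std::int32_t acquire() noexcept {
        if (free_head_ == kNull) return kNull;
        const std::int32_t idx = free_head_;
        free_head_ = slots_[static_cast<std::size_t>(idx)].next_idx;
        return idx;
    }

    // Returns a slot to the free list in O(1).
    void release(std::int32_t idx) noexcept {
        slots_[static_cast<std::size_t>(idx)].next_idx = free_head_;
        free_head_ = idx;
    }

    Order& operator[](std::int32_t idx) noexcept {
        return slots_[static_cast<std::size_t>(idx)];
    }

private:
    std::vector<Order> slots_;
    std::int32_t free_head_;
};
```

Returning `kNull` on exhaustion, rather than growing the pool, keeps the hot path free of allocation; how the engine should react to a full pool is left open here.
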
- Intrusive Doubly-Linked List: Price levels maintain `head` and `tail` indices. Orders store `next_idx` and `prev_idx` as `int32_t` to save space and avoid pointer chasing.
- Sparse Array Price Ladder: Indexing price levels via $O(1)$ direct mapping rather than binary search trees.
- Bitset Optimization: Uses `std::bitset` combined with `std::countr_zero` (hardware-accelerated TZCNT) to identify the best bid/ask instantly.
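
A rough sketch of how these three structures might fit together for the ask side is shown below. It is an illustration under stated assumptions, not the engine's code: `AskLadder`, `Node`, `Level`, `ctz64`, and the method names are hypothetical. Because `std::countr_zero` takes an unsigned integer rather than a `std::bitset`, the sketch stores level occupancy as raw 64-bit words; a pre-C++20 fallback to a GCC/Clang builtin is included only so the example compiles under older standards.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#if __cplusplus >= 202002L
#include <bit>
inline int ctz64(std::uint64_t w) { return std::countr_zero(w); }
#else
// Pre-C++20 fallback (GCC/Clang only); w must be non-zero.
inline int ctz64(std::uint64_t w) { return __builtin_ctzll(w); }
#endif

constexpr std::int32_t kNull = -1;

// Only the link fields of an order are modelled here.
struct Node {
    std::int32_t next_idx = kNull;
    std::int32_t prev_idx = kNull;
};

// One price level: a FIFO of order indices (time priority).
struct Level {
    std::int32_t head = kNull;
    std::int32_t tail = kNull;
};

template <std::size_t NumLevels>
class AskLadder {
    static_assert(NumLevels % 64 == 0, "NumLevels must be a multiple of 64");

public:
    // Appends order `idx` at price tick `tick`; the tick is a direct array index.
    void push_back(Node* nodes, std::int32_t idx, std::size_t tick) {
        Level& lvl = levels_[tick];
        nodes[idx].prev_idx = lvl.tail;
        nodes[idx].next_idx = kNull;
        if (lvl.tail != kNull) nodes[lvl.tail].next_idx = idx;
        else lvl.head = idx;
        lvl.tail = idx;
        words_[tick / 64] |= std::uint64_t{1} << (tick % 64);  // mark level occupied
    }

    // Unlinks order `idx` in O(1); clears the occupancy bit if the level empties.
    void remove(Node* nodes, std::int32_t idx, std::size_t tick) {
        Level& lvl = levels_[tick];
        Node& n = nodes[idx];
        if (n.prev_idx != kNull) nodes[n.prev_idx].next_idx = n.next_idx;
        else lvl.head = n.next_idx;
        if (n.next_idx != kNull) nodes[n.next_idx].prev_idx = n.prev_idx;
        else lvl.tail = n.prev_idx;
        n.next_idx = kNull;
        n.prev_idx = kNull;
        if (lvl.head == kNull) words_[tick / 64] &= ~(std::uint64_t{1} << (tick % 64));
    }

    // Lowest occupied tick (best ask), or -1 if the ladder is empty.
    std::int64_t best_tick() const {
        for (std::size_t w = 0; w < words_.size(); ++w) {
            if (words_[w] != 0) {
                return static_cast<std::int64_t>(w * 64) + ctz64(words_[w]);
            }
        }
        return -1;
    }

    // Oldest order resting at `tick`, or kNull.
    std::int32_t front(std::size_t tick) const { return levels_[tick].head; }

private:
    std::array<Level, NumLevels> levels_{};
    std::array<std::uint64_t, NumLevels / 64> words_{};
};
```

The bid side would mirror this by scanning words from the top and using `std::countl_zero` to find the highest occupied tick. The word scan is linear in the number of 64-level words, so a very wide ladder might add a second-level summary bitmap; that refinement is not shown.
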
The risk engine calculates VPIN to measure order flow toxicity.
The engine groups trade flow into volume buckets of equal size.
- SIMD Batching: Volume imbalances across 16+ buckets are calculated in parallel using AVX-512 registers.
- Floating Point Precision: Use `float` for SIMD throughput unless `double` is strictly required for precision.
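
As a hedged illustration of the batching idea, the sketch below assumes the standard VPIN definition: the sum of absolute buy/sell volume imbalances over `n` equal-volume buckets, divided by `n` times the bucket volume. The function name `vpin`, its parameters, and the assumption that trades have already been classified into per-bucket buy and sell volumes are all hypothetical. The AVX-512 path processes 16 `float` buckets per iteration and is compiled only when the compiler targets AVX-512F; otherwise the scalar loop handles every bucket.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// VPIN over n equal-volume buckets:
//   sum_i |buy[i] - sell[i]| / (n * bucket_volume)
// buy[i] and sell[i] are the classified volumes of bucket i (assumed inputs).
inline float vpin(const float* buy, const float* sell, std::size_t n,
                  float bucket_volume) {
    assert(n > 0 && bucket_volume > 0.0f);
    float imbalance = 0.0f;
    std::size_t i = 0;

#if defined(__AVX512F__)
    // 16 buckets per iteration: one 512-bit register holds 16 floats.
    __m512 acc = _mm512_setzero_ps();
    for (; i + 16 <= n; i += 16) {
        const __m512 b = _mm512_loadu_ps(buy + i);
        const __m512 s = _mm512_loadu_ps(sell + i);
        acc = _mm512_add_ps(acc, _mm512_abs_ps(_mm512_sub_ps(b, s)));
    }
    imbalance = _mm512_reduce_add_ps(acc);  // horizontal sum of the 16 lanes
#endif

    // Scalar tail (or the whole range when AVX-512 is unavailable).
    for (; i < n; ++i) {
        imbalance += std::fabs(buy[i] - sell[i]);
    }
    return imbalance / (static_cast<float>(n) * bucket_volume);
}
```

For example, four buckets of volume 100 with buy/sell splits of 75/25, 50/50, 25/75, and 100/0 have imbalances of 50, 0, 50, and 100, giving a VPIN of 200 / 400 = 0.5. Summation order differs between the SIMD and scalar paths, so results may differ in the last few bits of `float` precision.
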
The system is partitioned into isolated execution cores to prevent OS context switching.
| Component | Responsibility | Technical Requirement |
|---|---|---|
| Matching Engine | Order entry, Price-Time priority | Core Pinning, `std::atomic` |
| SPSC Queue | Inter-thread message passing | Lock-free, Cache-line padding |
| Risk Engine | VPIN, Micro-price, Greeks | AVX-512, SIMD Intrinsics |
| Market Feed | UDP/TCP Simulation | `mmap` / Shared Memory |
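
The SPSC Queue row can be illustrated with a bounded ring buffer. This is a minimal sketch, assuming a power-of-two capacity and trivially copyable messages; the class name `SpscQueue` and the `try_push`/`try_pop` interface are hypothetical rather than the project's API. The head and tail counters sit on separate 64-byte lines so the producer and consumer cores do not invalidate each other's cache line on every update.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <type_traits>

template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    static_assert(std::is_trivially_copyable<T>::value,
                  "messages are copied by value without allocation");

public:
    // Call from the single producer thread only.
    bool try_push(const T& value) noexcept {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        // Acquire pairs with the consumer's release store to head_.
        if (tail - head_.load(std::memory_order_acquire) == Capacity) {
            return false;  // full
        }
        buffer_[tail & (Capacity - 1)] = value;
        // Release publishes the written slot to the consumer.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Call from the single consumer thread only.
    bool try_pop(T& out) noexcept {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        // Acquire pairs with the producer's release store to tail_.
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = buffer_[head & (Capacity - 1)];
        // Release tells the producer the slot may be reused.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    // Each counter gets its own cache line to avoid false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the consumer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the producer
    alignas(64) std::array<T, Capacity> buffer_{};
};
```

The counters increase monotonically and are masked only when indexing, which lets all `Capacity` slots be used and keeps the full/empty checks to a single subtraction or comparison. Both operations return immediately instead of blocking, so a pinned thread can spin or do other work when the queue is full or empty.
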