-
Notifications
You must be signed in to change notification settings - Fork 332
Parallel encoding with Rayon threadpool #661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
|
||
| // Run geyser message loop | ||
| let (messages_tx, messages_rx) = mpsc::unbounded_channel(); | ||
| let parallel_encoder = ParallelEncoder::new(4); // number should be in a config not hard coded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you said, make this configurable please.
| let (tx, rx) = mpsc::unbounded_channel(); | ||
|
|
||
| std::thread::Builder::new() | ||
| .name("encoder-bridge".into()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you name it geyser-encoding-bridge instead so when we list thread can grep all geyser related threads using geyser* please.
| }); | ||
|
|
||
| let _ = response.send(batch); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a log::info!("existing encoder bridge loop) when you leave please to facilitate troubelshooting.
|
|
||
| let (tx, rx) = mpsc::unbounded_channel(); | ||
|
|
||
| std::thread::Builder::new() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spawning thread without every joining the handle is typically a code smell.
In the case of the geyser plugin we must make sure we close all threads before leaving the "uninstall" plugin method, otherwise it the .so binary will still be mapped in memory.
|
|
||
| // Run geyser message loop | ||
| let (messages_tx, messages_rx) = mpsc::unbounded_channel(); | ||
| let parallel_encoder = ParallelEncoder::new(4); // number should be in a config not hard coded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure you can get a hold of the parallel_encoder so we can gracefully shut it down when unloading a geyser plugin.
| } | ||
|
|
||
| impl ParallelEncoder { | ||
| pub fn new(num_threads: usize) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure you also return JoinHandle so code elsewhere can join it during unload procedure.
Parallel encoding with Rayon threadpool
Problem
Pre-encoding in
from_geyser()blocks validator threads during serialization. Even though encoding happens once per message, the validator must wait for each encode to complete before processing the next notification.Solution
Move encoding from
from_geyser()togeyser_loop()and parallelize using a dedicated Rayon threadpool. Messages arrive un-encoded, get batched (up to 31), then parallel encoded before broadcast. This moves CPU-intensive work off validator threads onto dedicated encoder threads.Changes
New module: parallel.rs
ParallelEncoderstruct with dedicated Rayon threadpoolblocking_recvyellowstone-grpc-proto/src/plugin/message.rs
TransactionEncoder::pre_encode()call fromMessageTransactionInfo::from_geyser()AccountEncoder::pre_encode()call fromMessageAccountInfo::from_geyser()yellowstone-grpc-geyser/src/grpc.rs
parallel_encoderparameter togeyser_loop()ParallelEncoder::new(4)at startupparallel_encoder.encode(batch).awaitbefore broadcastyellowstone-grpc-geyser/src/metrics.rs
GEYSER_BATCH_SIZEhistogram to track batch size distributionWhy this works
Arc::get_mut()succeeds because we have sole ownership of messages ingeyser_loopbefore broadcastGEYSER_BATCH_SIZEmetric), justifying parallelization overheadnotify_transaction/notify_accountTesting
test_parallel_encoder_transactions: verifies transactions get encodedtest_parallel_encoder_accounts: verifies accounts get encodedtest_small_batch_uses_sync: verifies batches < 4 skip threadpool overheadtest_mixed_batch: verifies mixed transaction/account batches workExpected impact
GEYSER_BATCH_SIZEhistogram