Make parallel apply compatible with RUN_STANDALONE #5146

Open
SirTyson wants to merge 1 commit into stellar:master from SirTyson:apply-background-ledger-test-fix

@SirTyson
Contributor

Description

This change makes RUN_STANDALONE compatible with parallel ledger apply, for two reasons. First, many tests enable RUN_STANDALONE, which silently disables parallel apply in those unit tests. Making the two compatible increases test coverage (and we'll need to do it eventually anyway if we make parallel apply the unconditional default). Second, it lets us use parallel apply in apply-load tests, which significantly reduces the variance of tests like max-sac-tps.

This touches a few tests I'm not very familiar with, and the change was AI assisted. It looks reasonable to me and CI passes, but I'd like someone more familiar with these unit tests to take a look.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

Copilot AI left a comment

Pull request overview

This PR makes RUN_STANDALONE compatible with parallel/background ledger apply so that more unit tests (and apply-load runs) can exercise the parallel apply path.

Changes:

  • Allow RUN_STANDALONE to use PARALLEL_LEDGER_APPLY by removing the standalone restriction and updating manual-close/test flows to wait for background apply completion.
  • Prevent virtual time from advancing while background-thread actions are queued (to avoid time jumps that can delay processing apply results).
  • Adjust several tests/configs to be robust under background apply (timeouts, draining apply before shutdown, disabling parallel apply in tests that mutate headers).
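
To illustrate the second bullet, here is a minimal sketch of a virtual clock that refuses to fast-forward while background-thread actions are still queued. `ToyVirtualClock` and all of its members are hypothetical names invented for this example; this is not the `VirtualClock` in src/util/Timer.cpp, only an assumption about the general shape of the idea.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for a virtual clock; NOT stellar-core's VirtualClock.
class ToyVirtualClock
{
  public:
    using Duration = std::chrono::milliseconds;

    // Record that a background thread posted work for the main thread.
    void
    postBackgroundAction()
    {
        ++mPendingBackgroundActions;
    }

    // Schedule a single timer at an absolute virtual time.
    void
    setNextTimer(Duration when)
    {
        mNextTimer = when;
        mHasTimer = true;
    }

    // One crank. Pending background actions are drained first; virtual time
    // only jumps to the next timer when nothing is pending.
    // Returns the number of events processed (0 or 1).
    std::size_t
    crank()
    {
        if (mPendingBackgroundActions > 0)
        {
            --mPendingBackgroundActions;
            return 1;
        }
        if (mHasTimer && mNextTimer > mNow)
        {
            mNow = mNextTimer; // fast-forward virtual time
            mHasTimer = false;
            return 1;
        }
        return 0;
    }

    Duration
    now() const
    {
        return mNow;
    }

  private:
    Duration mNow{0};
    Duration mNextTimer{0};
    bool mHasTimer{false};
    std::size_t mPendingBackgroundActions{0};
};
```

With one queued action and a timer five seconds out, the first crank runs the action and leaves `now()` unchanged; only the second crank advances virtual time to the timer.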

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/util/Timer.cpp | Avoid virtual-time fast-forward when background-thread actions are pending. |
| src/overlay/test/TCPPeerTests.cpp | Increase timeouts to account for extra main-thread callbacks under background apply. |
| src/main/Config.cpp | Stop auto-disabling parallel apply in standalone; disable only for in-memory SQLite. |
| src/main/CommandLine.cpp | Force parallel apply in apply-load max-sac-tps mode. |
| src/main/ApplicationImpl.cpp | In standalone manual-close, crank until background apply finishes and LCL advances. |
| src/ledger/test/LedgerCloseMetaStreamTests.cpp | Drain background apply before shutdown for consistent debug txset/LCL reads. |
| src/ledger/LedgerManagerImpl.cpp | Improve test-only error for txset mismatch on background apply thread. |
| src/invariant/test/ConservationOfLumensTests.cpp | Disable parallel apply in tests that directly mutate the ledger header. |
| src/history/test/HistoryTests.cpp | Wait for background apply to catch up after externalization in catchup simulation. |
| src/herder/test/TransactionQueueTests.cpp | Wait for background apply to finish before asserting queue state. |
| src/herder/test/HerderTests.cpp | Wait for background apply to finish after externalization helper. |
| docs/apply-load-max-sac-tps.cfg | Enable parallel apply for the max-sac-tps apply-load config. |

Comment on lines +990 to +997
// With background apply, triggerNextLedger posts work to the
// apply thread. Crank until the ledger is fully applied and
// LCL has advanced.
while (getLedgerManager().getLastClosedLedgerNum() <
       targetLedgerSeq)
{
    getClock().crank(true);
}

Copilot AI Feb 18, 2026

This introduces an unbounded while loop that will crank forever if the ledger never advances (e.g., apply thread stuck or an exception prevented completion). To avoid hanging the manualclose command, add a timeout/deadline and throw a descriptive error if targetLedgerSeq isn’t reached in time.
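
A minimal sketch of a bounded wait, assuming the ledger number and the crank step can be passed in as callables. `crankUntilLedger`, its parameters, and the error text are hypothetical and exist only for this illustration; none of them is an existing stellar-core API. The bound here is a crank count rather than a wall-clock deadline, which is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a stellar-core API): calls crankOnce until
// getLcl() reaches targetLedgerSeq, and throws instead of spinning forever
// if that does not happen within maxCranks iterations.
inline void
crankUntilLedger(std::function<std::uint32_t()> const& getLcl,
                 std::function<void()> const& crankOnce,
                 std::uint32_t targetLedgerSeq, std::uint32_t maxCranks)
{
    for (std::uint32_t i = 0; getLcl() < targetLedgerSeq; ++i)
    {
        if (i >= maxCranks)
        {
            throw std::runtime_error(
                "ledger " + std::to_string(targetLedgerSeq) +
                " not reached after " + std::to_string(maxCranks) +
                " cranks (last closed ledger: " +
                std::to_string(getLcl()) + ")");
        }
        crankOnce();
    }
}
```

A caller would pass lambdas wrapping the ledger manager and the clock. A production version would more likely compare against a real-time deadline, since a blocking crank can wait indefinitely inside a single iteration.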

Copilot uses AI. Check for mistakes.
Comment on lines 1967 to +1972
  if (PARALLEL_LEDGER_APPLY && !parallelLedgerClose())
  {
-     if (RUN_STANDALONE)
-     {
-         LOG_WARNING(DEFAULT_LOG, "RUN_STANDALONE is enabled, disabling "
-                                  "PARALLEL_LEDGER_APPLY");
-         PARALLEL_LEDGER_APPLY = false;
-     }
-     else
-     {
-         std::string msg =
-             "Invalid configuration: PARALLEL_LEDGER_APPLY "
-             "does not support in-memory database modes.";
-         throw std::runtime_error(msg);
-     }
+     LOG_WARNING(DEFAULT_LOG,
+                 "PARALLEL_LEDGER_APPLY is not supported with "
+                 "in-memory SQLite, disabling.");
+     PARALLEL_LEDGER_APPLY = false;

Copilot AI Feb 18, 2026

Behavior change: previously, enabling PARALLEL_LEDGER_APPLY with an in-memory DB would throw unless RUN_STANDALONE was set; now it silently disables parallel apply for all modes. This can mask misconfiguration outside tests (users may think parallel apply is active). Consider preserving the previous behavior (throw when not RUN_STANDALONE, or at least when not BUILD_TESTS) and only auto-disable in standalone/testing contexts.
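
A minimal sketch of the behavior this comment asks to preserve, using a hypothetical `ToyConfig` struct in place of the real `Config` class. The field names and the `adjustParallelApply` function are assumptions made for illustration, and the `inMemoryDb` flag stands in for the real `!parallelLedgerClose()` check.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, simplified stand-in for the relevant Config fields.
struct ToyConfig
{
    bool parallelLedgerApply = false;
    bool runStandalone = false;
    bool inMemoryDb = false; // stands in for !parallelLedgerClose()
};

// Returns true if parallel apply was auto-disabled. Throws when the
// unsupported combination is requested outside standalone mode, so a
// misconfigured non-test node fails loudly instead of silently downgrading.
inline bool
adjustParallelApply(ToyConfig& cfg)
{
    if (!cfg.parallelLedgerApply || !cfg.inMemoryDb)
    {
        return false; // nothing to adjust
    }
    if (!cfg.runStandalone)
    {
        throw std::runtime_error(
            "Invalid configuration: PARALLEL_LEDGER_APPLY "
            "does not support in-memory database modes.");
    }
    cfg.parallelLedgerApply = false; // standalone: downgrade quietly
    return true;
}
```

The design choice is that only the standalone path downgrades; every other combination either leaves the setting alone or rejects the configuration.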

Comment on lines +1979 to +1983
// With background apply, crank until all queued ledgers are applied
while (lm.getLastClosedLedgerNum() < nextLedger + 5)
{
    app->getClock().crank(true);
}

Copilot AI Feb 18, 2026

This unbounded crank loop can hang the test indefinitely if something goes wrong and LCL never advances. Since this file already uses testutil::crankUntil elsewhere, consider switching to a bounded crankUntil/deadline-based wait that fails with a clear message on timeout.

Suggested change
- // With background apply, crank until all queued ledgers are applied
- while (lm.getLastClosedLedgerNum() < nextLedger + 5)
- {
-     app->getClock().crank(true);
- }
+ // With background apply, crank until all queued ledgers are applied.
+ REQUIRE(testutil::crankUntil(app->getClock(), [&]() {
+     return lm.getLastClosedLedgerNum() >= nextLedger + 5;
+ }));

Comment on lines +3073 to +3075
// With background apply, crank until the ledger is fully applied
while (lm.getLastClosedLedgerNum() < ledgerSeq)
{

Copilot AI Feb 18, 2026

This unbounded crank loop can hang the test indefinitely if the ledger apply never completes. Consider adding a deadline (e.g., auto timeout = clock.now() + ...; REQUIRE(clock.now() < timeout);) or using a helper that waits with a bounded timeout so failures don’t wedge the test runner.

Suggested change
- // With background apply, crank until the ledger is fully applied
- while (lm.getLastClosedLedgerNum() < ledgerSeq)
- {
+ // With background apply, crank until the ledger is fully applied
+ auto const timeout =
+     clock.now() + std::chrono::seconds(10);
+ while (lm.getLastClosedLedgerNum() < ledgerSeq)
+ {
+     REQUIRE(clock.now() < timeout);

Comment on lines +5285 to +5288
// With background apply, crank until the ledger is fully applied
while (lm.getLastClosedLedgerNum() < ledgerSeq)
{
    app.getClock().crank(true);

Copilot AI Feb 18, 2026

This helper now contains an unbounded while loop that will crank forever if ledger apply stalls. Since this is used by multiple tests, it’s especially important to add a deadline/timeout and fail fast with a clear error when ledgerSeq isn’t reached.

Suggested change
- // With background apply, crank until the ledger is fully applied
- while (lm.getLastClosedLedgerNum() < ledgerSeq)
- {
-     app.getClock().crank(true);
+ // With background apply, crank until the ledger is fully applied,
+ // but fail fast if it does not close within a reasonable time.
+ auto timeout = app.getClock().now() + std::chrono::seconds(5);
+ while (lm.getLastClosedLedgerNum() < ledgerSeq)
+ {
+     app.getClock().crank(true);
+     REQUIRE(app.getClock().now() < timeout);

@SirTyson SirTyson force-pushed the apply-background-ledger-test-fix branch from 790b206 to 576c1d6 Compare February 18, 2026 23:48
Contributor

@marta-lokhova marta-lokhova left a comment

Thanks, this is a helpful change! Looks good overall, I just had a couple of suggestions and a question.

@SirTyson SirTyson force-pushed the apply-background-ledger-test-fix branch from 576c1d6 to b601bbc Compare February 19, 2026 01:29
marta-lokhova previously approved these changes Feb 19, 2026