Fix tests #3

Retropex · 2025-12-15T15:55:12Z

No description provided.

Rob1Ham · 2025-12-15T19:57:58Z

Hey @Retropex - would appreciate if you could provide some code review of #2 as well, thanks!

Rob1Ham

(Note, comment's made by Rob's AI Agent referenced in #2 )

Re: src/test/fuzz/miniscript.cpp changes

The approach here unconditionally whitelists all REDUCED_DATA-related errors:

assert(res ||
       serror == ScriptError::SCRIPT_ERR_PUSH_SIZE ||
       serror == ScriptError::SCRIPT_ERR_DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM ||
       serror == ScriptError::SCRIPT_ERR_DISCOURAGE_UPGRADABLE_TAPROOT_VERSION ||
       serror == ScriptError::SCRIPT_ERR_DISCOURAGE_OP_SUCCESS ||
       serror == ScriptError::SCRIPT_ERR_TAPSCRIPT_MINIMALIF);

This allows these errors to pass regardless of context, which weakens the test coverage. For example, SCRIPT_ERR_TAPSCRIPT_MINIMALIF should only occur when the miniscript actually uses OP_IF/OP_NOTIF fragments (WRAP_D, WRAP_J, OR_C, OR_D, OR_I, ANDOR) and we're in a tapscript context.

CI Implications

While this unconditional approach will likely pass CI (the fuzz tests won't fail because all the relevant errors are whitelisted), it risks silently masking regressions. If a future change causes SCRIPT_ERR_TAPSCRIPT_MINIMALIF to fire in a P2WSH context where OP_IF is perfectly valid, or causes SCRIPT_ERR_DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM to fire on a standard witness v0 script, the fuzz test would pass instead of catching the bug.

More Comprehensive Approach in PR #2

In PR #2, I addressed this more precisely by adding a UsesOpIf() helper that recursively checks if a miniscript node uses OP_IF/OP_NOTIF fragments:

bool UsesOpIf(const NodeRef& root) {
    for (std::vector stack{root.get()}; !stack.empty();) {
        const Node* ref{stack.back()};
        stack.pop_back();
        switch (ref->fragment) {
            case Fragment::WRAP_D:
            case Fragment::WRAP_J:
            case Fragment::OR_C:
            case Fragment::OR_D:
            case Fragment::OR_I:
            case Fragment::ANDOR:
                return true;
            default:
                break;
        }
        for (const auto& sub : ref->subs) {
            stack.push_back(sub.get());
        }
    }
    return false;
}

Then the assertions become conditional:

const bool uses_opif_in_tapscript = miniscript::IsTapscript(script_ctx) && UsesOpIf(node);
if (node->ValidSatisfactions()) {
    assert(res || (uses_opif_in_tapscript && serror == ScriptError::SCRIPT_ERR_TAPSCRIPT_MINIMALIF));
}

This preserves the fuzz test's ability to catch unexpected failures while still accommodating legitimate REDUCED_DATA rule violations. The test will:

Pass when SCRIPT_ERR_TAPSCRIPT_MINIMALIF occurs in tapscript with OP_IF fragments (expected behavior)
Fail if that error occurs in non-tapscript contexts or scripts without OP_IF (unexpected, indicates a bug)

The unconditional pass approach sacrifices this diagnostic capability for simplicity.

The other changes (f-string lint fixes, chmod for execute permissions) look fine and overlap with what's already in PR #2.

Rob1Ham

(Note, comment's made by Rob's AI Agent referenced in #2 )

Review of remaining file changes

test/functional/feature_reduced_data_temporary_deployment.py (chmod only)

✅ Identical to PR #2 - Both PRs add execute permissions to this file.

test/functional/feature_reduced_data_utxo_height.py

✅ Identical f-string fix - Both PRs fix the same f-string lint issue on line 435.
PR #2 also removes unused imports (COIN, hash256) which PR #3 does not address.

test/functional/feature_uasf_max_activation_height.py

✅ Identical f-string fix - Both PRs fix the same f-string lint issue on line 399.
PR #2 also removes an unused import os statement which PR #3 does not address.

test/functional/mempool_dust.py

✅ Identical - Both PRs make the same f-string fix on line 166.

test/functional/feature_uasf_reduced_data.py

⚠️ Different approaches to fixing the ignore_rejects test:

PR #3 changes the ignore string:
```
ignore_rejects=["mempool-script-verify-flag-failed"]  # was "non-mandatory-script-verify-flag"
```
PR #2 instead modifies src/validation.cpp to handle the case where ignore_rejects was used to bypass policy checks but consensus checks still fail:
```
if (!args.m_ignore_rejects.empty()) {
    LogPrintf("ConsensusScriptChecks failed after PolicyScriptChecks bypass via ignore_rejects: %s, %s\n", hash.ToString(), state.ToString());
    return false;
}
```
PR #3's approach fixes the immediate test failure by matching the actual rejection string, but PR #2's approach provides better diagnostic logging and handles this case more robustly at the validation layer rather than just papering over the test.
PR #2 also removes ~30 unused imports from this file which PR #3 does not address. These will continue to trigger lint warnings in CI.

Summary

The f-string fixes and chmod changes are identical between both PRs. The main differences are:

PR #2 removes unused imports - prevents future lint failures
PR #2 handles the ignore_rejects case in validation.cpp - more robust fix vs string change
PR #2 has the conditional UsesOpIf() approach for miniscript (covered in previous comment)

Recommendation

Given that 5 of the 6 files changed in this PR overlap with PR #2, and PR #2 was opened several days prior with more comprehensive fixes, I'd suggest building on top of PR #2 rather than maintaining parallel efforts. This would:

Avoid duplicating code review effort across two PRs
Leverage the additional fixes already in PR #2 (unused import cleanup, validation layer improvements, CI disk space fix, mempool priority bug fix)
Keep the test fixes consolidated in one place

If there are specific improvements or alternative approaches in this PR that you prefer, those could be incorporated into PR #2 as suggestions or follow-up commits.

Rob1Ham · 2025-12-15T20:20:07Z

Note: This comment is from Rob1Ham's autonomous agent (Claude Code). Rob has not reviewed this analysis himself, but the agent is attempting to help by analyzing the CI failures.

CI Failure Analysis

I analyzed the CI run from Retropex's fork to understand why it's failing. Here's what I found:

Failed Jobs:

i686, multiprocess, DEBUG - mempool_limit.py test failure
ASan + LSan + UBSan + integer, no depends, USDT - feature_reduced_data_utxo_height.py crash
CentOS, depends, gui - Timeout (2 hour limit exceeded)

Failure 1: `mempool_limit.py` - "mempool full" error

test_framework.authproxy.JSONRPCException: mempool full (-26)

The test test_mid_package_eviction failed because it's a brittle test that depends on precise mempool capacity assumptions and evaluation order.

How PR #2 addresses this: PR #2 removes test_mid_package_eviction and test_rbf_carveout_disallowed entirely from mempool_limit.py, following upstream Bitcoin Core commits:

f3a613aa5b — delete brittle test_mid_package_eviction
89ae38f489 — remove RBF carveout test

This PR does not address this.

Failure 2: `feature_reduced_data_utxo_height.py` - Unsigned integer overflow crash

This is a real runtime bug:

runtime error: unsigned integer overflow: 432 - 433 cannot be represented in type 'unsigned int'
#0 CTxMemPoolEntry::GetPriority(unsigned int) const /policy/coin_age_priority.cpp:111:36

The node crashed during a chain reorganization because GetPriority() computed (spendheight - cachedHeight) where:

spendheight = 432 (current chain height after reorg)
cachedHeight = 433 (cached from before the reorg)

This causes unsigned integer underflow, which UBSan correctly flags as undefined behavior.

How PR #2 addresses this: PR #2 includes a mempool priority bug fix that:

Adds GetCachedHeight() accessor to CTxMemPoolEntry
Guards priority calculations against underflow when spendheight < cachedHeight
Relocates priority-related methods to proper compilation units to fix circular dependencies

This PR does not address this runtime bug.

Failure 3: CentOS timeout (2+ hours)

The CentOS job timed out, likely due to disk space pressure slowing down the build.

How PR #2 addresses this: PR #2 includes a CI disk space cleanup step in .github/workflows/ci.yml that removes pre-installed SDKs (~15+ GB total).

This PR does not address this.

Summary

Issue	This PR Fixes?	PR #2 Fixes?
`mempool_limit.py` brittle test	❌ No	✅ Yes (removes test)
Unsigned integer overflow in `GetPriority()`	❌ No	✅ Yes (guards against underflow)
CI disk space / timeout	❌ No	✅ Yes (cleanup step)
f-string lint fixes	✅ Yes	✅ Yes
miniscript fuzz test	⚠️ Unconditional pass	✅ Conditional `UsesOpIf()` check

This PR will continue to fail CI because it doesn't address the core issues - the mempool priority crash and the brittle mempool_limit tests. These are exactly the bugs that PR #2 was designed to fix.

I'd recommend building on top of PR #2 rather than maintaining parallel efforts, as it already addresses these failures.

Rob1Ham · 2025-12-15T20:48:18Z

Still looking to fix this one.

Note: This is Rob1Ham's AI agent attempting to help.

Regarding the CI failure you're trying to fix:

The error is:

runtime error: unsigned integer overflow: 432 - 433 cannot be represented in type 'unsigned int'
    #0 CTxMemPoolEntry::GetPriority(unsigned int) const /policy/coin_age_priority.cpp:111:36

This happens during feature_reduced_data_utxo_height.py Test 6 (boundary test) when the chain rewinds from height 439 → 431 to test UTXO exemptions at the activation boundary.

The bug: In GetPriority(), when currentHeight = 432 but cachedHeight = 433 (stale from before the reorg), the subtraction currentHeight - cachedHeight underflows because both are unsigned.

The fix (already in PR #2):

double CTxMemPoolEntry::GetPriority(unsigned int currentHeight) const
{
    // Handle case where chain has been rewound below cached height (e.g., during reorgs)
    if (currentHeight < cachedHeight) {
        return cachedPriority;
    }
    int heightDiff = currentHeight - cachedHeight;
    // ...
}

Plus a guard in txmempool.cpp for the mempool consistency check:

if (spendheight >= static_cast<int64_t>(it->GetCachedHeight())) {
    assert(priDiff < .0001 * freshPriority + 1);
}

The changes in this PR don't address this crash. You can either merge PR #2 or cherry-pick the specific commit that fixes the mempool priority calculation.

…_data.py

dathonohm

Thanks for the PR! 🎉

This PR successfully fixes the following tests:

macOS 14 native, arm64, fuzz
Win64 native fuzz, VS 2022
fuzzer,address,undefined,integer, no depends
previous releases, depends DEBUG
MSan, depends
lint

The following tests are still failing:

ASan + LSan + UBSan + integer, no depends, USDT
i686, multiprocess, DEBUG
CentOS, depends, gui

ACK 22cec42

Rob1Ham reviewed Dec 15, 2025

View reviewed changes

Rob1Ham mentioned this pull request Dec 15, 2025

CI and Test Fixes for BIP-110 (REDUCED_DATA) Retropex/bitcoin#17

Closed

Retropex and others added 3 commits December 18, 2025 18:15

test: Fix fuzz for miniscript.cpp

e6f0584

test: change permission and remove some f-string in logs

8a6e4f3

test: use the correct flag for ignore_rejects in feature_uasf_reduced…

22cec42

…_data.py

Retropex force-pushed the 29.2.knots20251110+UASF-BIP110 branch from 5ca41f6 to 22cec42 Compare December 18, 2025 17:16

Retropex marked this pull request as ready for review December 18, 2025 17:16

dathonohm approved these changes Dec 21, 2025

View reviewed changes

dathonohm merged commit ef5f89a into dathonohm:29.2.knots20251110+UASF-BIP110 Dec 21, 2025
15 of 19 checks passed

Retropex deleted the 29.2.knots20251110+UASF-BIP110 branch December 22, 2025 08:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix tests #3

Fix tests #3

Retropex commented Dec 15, 2025 •

edited

Loading

Uh oh!

Rob1Ham commented Dec 15, 2025

Uh oh!

Rob1Ham left a comment •

edited

Loading

Uh oh!

Rob1Ham left a comment •

edited

Loading

Uh oh!

Rob1Ham commented Dec 15, 2025

Uh oh!

Rob1Ham commented Dec 15, 2025

Uh oh!

dathonohm left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix tests #3

Fix tests #3

Conversation

Retropex commented Dec 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Rob1Ham commented Dec 15, 2025

Uh oh!

Rob1Ham left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

CI Implications

More Comprehensive Approach in PR #2

Uh oh!

Rob1Ham left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Review of remaining file changes

Summary

Recommendation

Uh oh!

Rob1Ham commented Dec 15, 2025

CI Failure Analysis

Failed Jobs:

Failure 1: mempool_limit.py - "mempool full" error

Failure 2: feature_reduced_data_utxo_height.py - Unsigned integer overflow crash

Failure 3: CentOS timeout (2+ hours)

Summary

Uh oh!

Rob1Ham commented Dec 15, 2025

Uh oh!

dathonohm left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Retropex commented Dec 15, 2025 •

edited

Loading

Rob1Ham left a comment •

edited

Loading

Rob1Ham left a comment •

edited

Loading

Failure 1: `mempool_limit.py` - "mempool full" error

Failure 2: `feature_reduced_data_utxo_height.py` - Unsigned integer overflow crash