Fix VM panic when function parameter shadows outer local variable #124

Open
CJHwong wants to merge 1 commit into pydantic:main from CJHwong:fix/closure-param-shadows-outer-local

CJHwong commented Feb 8, 2026

Summary

I hit a Rust panic in the Monty VM while running an implementation of the Chudnovsky algorithm that computes pi to 100 digits. The panic occurred at vm/mod.rs:1678 with:

index out of bounds: the len is 1 but the index is 1

The code exercises nested function definitions where a parameter name collides with an outer local variable — valid Python that triggers a name resolution bug in the bytecode compiler.

Reproducer

This valid Python causes a VM panic on v0.0.4:

def compute_pi_chudnovsky(precision=100):
    num_terms = max(10, precision // 14 + 2)
    scale = 10 ** (precision + 15)
    K3 = 640320 ** 3

    def int_sqrt(n, scale):
        x = n * scale
        while True:
            y = (x + n * scale * scale // x) // 2
            if y >= x:
                return x
            x = y

    sqrt_10005 = int_sqrt(10005, scale)
    series_sum = 0
    sign = 1
    fact_6k = 1
    fact_k = 1
    fact_3k = 1

    for k in range(num_terms):
        if k > 0:
            for i in range(6*k - 5, 6*k + 1):
                if i > 0:
                    fact_6k *= i
            fact_k *= k
            for i in range(3*k - 2, 3*k + 1):
                if i > 0:
                    fact_3k *= i
        term_num = sign * fact_6k * (13591409 + 545140134 * k)
        term_den = (fact_k ** 3) * fact_3k * (K3 ** k)
        term = term_num // term_den
        series_sum += term
        sign = -sign

    numerator = sqrt_10005 * 426880
    denominator = 12 * series_sum
    pi_scaled = (numerator * scale) // denominator
    pi_str = str(pi_scaled)
    while len(pi_str) < precision + 1:
        pi_str = "0" + pi_str
    integer_part = pi_str[0]
    decimal_part = pi_str[1:precision+1]
    return integer_part + "." + decimal_part

result = compute_pi_chudnovsky(100)
print(result)

The trigger: int_sqrt(n, scale) has a parameter scale that shadows the outer local scale = 10 ** (precision + 15).

Before fix:

thread 'main' panicked at crates/monty/src/bytecode/vm/mod.rs:1678:49:
index out of bounds: the len is 1 but the index is 1

After fix:

2.6179948410979343891252201980420619945592623594116964664355749645666116832287031210009527536335753466

(Matches CPython output.)

Root Cause

The bug is in prepare.rs:get_id(), which resolves variable names to their scope during bytecode compilation. The method checks a priority list and returns on the first match.

The resolution order before this fix was:

Step 1: global declaration
Step 2: free_var_map    → explicit nonlocal / previously resolved captures
Step 3: cell_var_map    → variables captured by nested functions
Step 4: assigned_names  → locally assigned variables (x = ...)
Step 5: enclosing_locals → implicit closure capture from outer scope  ← CHECKED TOO EARLY
Step 6: name_map         → function parameters (pre-populated)        ← CHECKED TOO LATE
Step 7: global namespace → implicit global read
Step 8: fallthrough      → allocate new local (NameError at runtime)

When compiling int_sqrt's body and encountering a reference to scale:

  1. Steps 1–4: no match — scale is not global, not nonlocal, not captured by a nested function, and not assigned in int_sqrt's body (it's a parameter, not an assignment)
  2. Step 5 (enclosing_locals): finds scale in compute_pi_chudnovsky's locals → incorrectly treats it as an implicit closure capture → adds it to free_var_map → emits LoadCell bytecode (closure cell load)
  3. Step 6 (name_map): would have correctly found scale as parameter index 1 and emitted LoadLocal — but this step is never reached

At runtime, the VM executes LoadCell with a cell index that doesn't exist in the frame's cells array (which was sized based on actual captures, not the bogus ones), causing the index-out-of-bounds panic.
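
The walk through the old order can be mimicked with a small Python model. This is only a sketch of the lookup order described above, not Monty's Rust code: the function name `resolve_before_fix`, its keyword arguments, and the returned labels are all made up for illustration, and steps 1, 3, 7 and 8 are collapsed or omitted.

```python
def resolve_before_fix(name, *, params, assigned_names, enclosing_locals, free_vars):
    """Toy model of the pre-fix lookup order (hypothetical, simplified)."""
    if name in free_vars:             # step 2: already known capture
        return "cell"
    if name in assigned_names:        # step 4: assigned in this function body
        return "local"
    if name in enclosing_locals:      # step 5: checked too early
        free_vars.append(name)        # recorded as an implicit capture
        return "cell"                 # corresponds to emitting LoadCell
    if name in params:                # step 6: checked too late
        return "local"                # corresponds to emitting LoadLocal
    return "global"                   # step 7, simplified


# Names taken from the reproducer: int_sqrt(n, scale) nested in compute_pi_chudnovsky.
free_vars = []
scope = dict(
    params=["n", "scale"],
    assigned_names={"x", "y"},
    enclosing_locals={"precision", "num_terms", "scale", "K3"},
    free_vars=free_vars,
)
print(resolve_before_fix("n", **scope))      # local
print(resolve_before_fix("scale", **scope))  # cell
print(free_vars)                             # ['scale']
```

In this model `scale` is classified as a cell even though it is a parameter, which is the misclassification the PR describes.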

How it got there

Two commits introduced this, neither wrong in isolation:

  1. 0038788 (Dec 2025) — Original closure implementation. Added the enclosing_locals check. No parameter check existed at all at this point.
  2. c7c196a (Jan 2026, PR #4), "fix function parameter binding": added the name_map check to fix parameters shadowing globals, but appended it after enclosing_locals. The fix focused on the parameter-vs-global conflict and did not consider parameter-vs-enclosing-local.

No test ever combined parameter shadowing with closures, so the interaction went undetected.

Fix

Moved the name_map check (step 6 → step 5) before the enclosing_locals check (step 5 → step 6):

Step 4: assigned_names   → locally assigned variables
Step 5: name_map         → function parameters  ← NOW CHECKED FIRST
Step 6: enclosing_locals → implicit closure capture
Step 7: global namespace
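
The reordered lookup can be sketched as a toy Python model. As before, this is an illustration under assumptions rather than the actual get_id() code: `resolve_after_fix`, its arguments, and the string labels are hypothetical, and the global declaration and fallthrough steps are left out.

```python
def resolve_after_fix(name, *, params, assigned_names, cell_vars,
                      enclosing_locals, free_vars):
    """Toy model of the post-fix lookup order (hypothetical, simplified)."""
    if name in free_vars:             # step 2: already known capture
        return "cell"
    if name in cell_vars:             # step 3: captured by a nested function
        return "cell"
    if name in assigned_names:        # step 4: assigned in this function body
        return "local"
    if name in params:                # step 5: parameters now win over enclosing locals
        return "local"
    if name in enclosing_locals:      # step 6: implicit closure capture
        free_vars.append(name)
        return "cell"
    return "global"                   # step 7, simplified


# A nested function inner(n, scale) that also reads K3 from the enclosing scope.
free_vars = []
scope = dict(
    params=["n", "scale"],
    assigned_names=set(),
    cell_vars=set(),
    enclosing_locals={"scale", "K3"},
    free_vars=free_vars,
)
print(resolve_after_fix("scale", **scope))  # local
print(resolve_after_fix("K3", **scope))     # cell
print(free_vars)                            # ['K3']
```

The parameter resolves as a local, while a name that really does live only in the enclosing scope is still captured.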

The name_map check cannot move any higher than this, because:

  • Step 3 (cell_var_map) must still take priority: a parameter captured by a deeper nested function needs Cell scope, not Local
  • Step 4 (assigned_names) must still take priority: a parameter reassigned in the body is already tracked there
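
The first constraint mirrors what CPython does. As a quick check (plain CPython introspection, independent of Monty), a parameter that is captured by a deeper nested function shows up as a cell variable of its own function and as a free variable of the inner one. The function names here are illustrative only.

```python
def outer():
    scale = 100                 # outer local with the same name

    def middle(scale):          # parameter shadows the outer local...
        def innermost():
            return scale        # ...and is itself captured one level down
        return innermost

    return middle


middle = outer()
innermost = middle(7)

print(middle.__code__.co_cellvars)     # ('scale',)
print(middle.__code__.co_freevars)     # ()
print(innermost.__code__.co_freevars)  # ('scale',)
print(innermost())                     # 7
```

So `scale` in `middle` needs cell storage because of `innermost`, yet it is still not a capture from `outer`, which is why the parameter check sits below cell_var_map but above enclosing_locals.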

LEGB Compliance

I verified this matches Python's LEGB (Local → Enclosing → Global → Built-in) scoping rules using CPython's dis module and code object introspection:

def outer():
    scale = 100
    def inner(n, scale):
        return n * scale
    return inner

inner_fn = outer()
code = inner_fn.__code__

print(code.co_varnames)   # ('n', 'scale') — parameters are local variables
print(code.co_freevars)   # ()             — NOT captured from enclosing
print(code.co_cellvars)   # ()             — NOT cell variables

CPython compiles inner's reference to scale as LOAD_FAST (local variable load), not LOAD_DEREF (closure cell load). Parameters are definitively in the L (Local) tier of LEGB and must shadow enclosing scope.
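
The bytecode claim can be checked directly with dis. The sketch below compares a function whose parameter shadows the outer name with one that really captures it. Exact opcode names differ between CPython versions (newer releases use fused or specialized variants of LOAD_FAST), so it only looks at whether a DEREF-style load is present. The helper `load_opnames` is an illustrative name, not part of any library.

```python
import dis


def outer():
    scale = 100

    def shadowing(n, scale):    # parameter hides the outer scale
        return n * scale

    def capturing(n):           # no parameter named scale: real closure capture
        return n * scale

    return shadowing, capturing


def load_opnames(fn):
    """Collect the names of the LOAD_* instructions in fn's bytecode."""
    return sorted({ins.opname for ins in dis.get_instructions(fn)
                   if ins.opname.startswith("LOAD_")})


shadowing, capturing = outer()
print("shadowing:", load_opnames(shadowing))
print("capturing:", load_opnames(capturing))
```

Only `capturing` should list LOAD_DEREF. `shadowing` should show fast-local loads only.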

Test Plan

  • Added closure__param_shadows_outer.py with 6 test scenarios:
    • Basic parameter shadows outer local
    • Multiple parameters both shadow outer locals
    • Mixed: one parameter shadows, another variable legitimately captures
    • Parameter with default value shadows outer local
    • Deeply nested: parameter shadows grandparent local
    • Complex expression matching the reproducer pattern (scale param)
  • All 6 scenarios pass on both Monty and CPython
  • Full test suite: 746 tests passed, 0 failed, 0 regressions
  • Clippy + Python lint clean
  • Pre-commit hooks pass
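
For readers without the test file at hand, a few of the listed scenarios might look roughly like the following. This is a guess at their shape, written to run on CPython. The function names and values are invented and are not the contents of closure__param_shadows_outer.py.

```python
def basic():
    scale = 100
    def inner(n, scale):
        return n * scale
    return inner(3, 2)            # uses the argument 2, not the outer 100


def mixed():
    scale = 100
    offset = 5
    def inner(scale):
        return scale + offset     # scale is the parameter, offset is captured
    return inner(1)


def with_default():
    scale = 100
    def inner(n, scale=7):
        return n * scale
    return inner(2), inner(2, 3)


def grandparent():
    scale = 100
    def middle():
        def inner(scale):         # shadows a local two levels up
            return scale + 1
        return inner(1)
    return middle()


assert basic() == 6
assert mixed() == 6
assert with_default() == (14, 6)
assert grandparent() == 2
```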

AI Disclosure: This fix was fully implemented using Claude Code, including root cause analysis, code change, test cases, and PR description. All changes were 100% reviewed and verified by a human before submission.

In get_id(), the name_map check (function parameters) was ordered after
the enclosing_locals check (implicit closure capture). When a nested
function parameter shadowed an outer local (e.g. `def inner(scale)` where
outer scope also has `scale`), the parameter was incorrectly compiled as
a closure cell reference instead of a local variable, causing a VM panic:
"index out of bounds: the len is 1 but the index is 1" at vm/mod.rs:1678.

Move the name_map check before enclosing_locals so parameters correctly
shadow enclosing locals, matching Python's LEGB scoping rules.

codecov bot commented Feb 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

codspeed-hq bot commented Feb 8, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing CJHwong:fix/closure-param-shadows-outer-local (2ab9849) with main (6c277bc)

Summary

✅ 13 untouched benchmarks
