SE v2 codegen optimization #11

Merged
rafael84 merged 6 commits into main from
cursor/se-v2-codegen-optimization-41ea
Feb 7, 2026
rafael84 (Owner) commented Feb 7, 2026

Significantly reduce SE program code size by expanding AST optimizations and improving codegen efficiency.

This PR adds optimizations to both the AST optimizer and the code generator to reduce the size of compiled SE programs.

In the optimizer: identity elimination, strength reduction (e.g., (/ x 2^n) to (>> x n)), constant folding, and inlining of small, non-recursive leaf functions.

In the codegen: comparison fusion with conditional jumps, immediate-operand optimization for binary operations, reduced function call overhead, and cheaper set! and cast handling.

Together these changes cut the code size of floppy.se by ~29% (from ~3550 to 2512 instructions). That leaves more room for larger games and moves toward the goal of supporting games 10x bigger than floppy.se.



cursoragent and others added 6 commits February 7, 2026 06:04
…tant fold when/unless/not, redundant cast elimination

- Identity rules: (+ x 0) -> x, (- x 0) -> x, (* x 1) -> x, (* x 0) -> 0, (/ x 1) -> x
- Bitwise identity: (& x 0xFF) -> x, (& x 0) -> 0, (| x 0) -> x, (^ x 0) -> x
- Shift identity: (<< x 0) -> x, (>> x 0) -> x
- Strength reduction: (/ x 2^n) -> (>> x n), (% x 2^n) -> (& x mask), (+ x x) -> (<< x 1)
- Constant fold when/unless with known conditions
- Double negation elimination: (not (not x)) -> x
- Redundant cast elimination: (u8 (u8 x)) -> (u8 x), (i8 (i8 x)) -> (i8 x)

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
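
The identity and strength-reduction rules listed in this commit can be illustrated with a small bottom-up rewriter. This is only a sketch under stated assumptions, not the optimizer from this PR: the tuple-based AST, the `simplify` and `is_pow2` names, and the restriction to unsigned, side-effect-free operands are all hypothetical. The (& x 0xFF) rule is left out because it depends on knowing that x is 8-bit.

```python
def is_pow2(n):
    """True for positive integer powers of two (1, 2, 4, ...)."""
    return isinstance(n, int) and n > 0 and n & (n - 1) == 0


def simplify(node):
    """Apply identity and strength-reduction rules bottom-up.

    Hypothetical AST: nodes are tuples ("op", lhs, rhs); leaves are ints or
    symbol names (str). Assumes unsigned operands with no side effects.
    """
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [simplify(a) for a in args]
    if len(args) != 2:
        return (op, *args)
    x, c = args
    # Identity rules: the operation leaves x unchanged.
    if op in ("+", "-", "|", "^", "<<", ">>") and c == 0:
        return x
    if op in ("*", "/") and c == 1:
        return x
    # Absorbing element: the result is always 0.
    if op in ("*", "&") and c == 0:
        return 0
    # Strength reduction for power-of-two constants.
    if op == "/" and is_pow2(c):
        return (">>", x, c.bit_length() - 1)
    if op == "%" and is_pow2(c):
        return ("&", x, c - 1)
    # (+ x x) -> (<< x 1), only for a plain symbol so nothing is duplicated.
    if op == "+" and isinstance(x, str) and x == c:
        return ("<<", x, 1)
    return (op, x, c)
```

Because children are simplified before their parent, a nested expression such as (* (+ y 0) 1) collapses to y in a single pass.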
Major codegen optimizations that reduce generated code size by ~10%:

- Comparison fusion: if/while/when/unless/cond now fuse comparisons
  directly with conditional jumps (CMP+Jcc) instead of materializing
  boolean values (0/1) and then using JFALSE/JTRUE macros.
  Saves ~7 instructions per comparison used in control flow.

- Binary op immediate optimization: ADD, SUB, AND, OR, XOR with a
  constant or simple operand avoid PUSH/POP overhead by loading the
  simple operand directly into R1.
  Saves 3 instructions per operation (PUSH + MOV + POP eliminated).

- Optimized u8/i8 casts: when operand is already 8-bit, the AND 0xFF
  mask is eliminated as a no-op.

- Optimized set! for globals: simple values avoid PUSH/POP by setting
  up the address registers first.

- emit_branch_false/emit_branch_true: new helper functions that handle
  fused branching for comparisons, logic ops (and/or/not), and
  predicates (nil?/zero?).

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
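
A rough sketch of the comparison-fusion idea, for illustration only. The `emit_branch_false` and `emit_branch_true` names come from the commit message, but their signatures, the shared `emit_branch` helper, the tuple AST, and every mnemonic other than CMP (LOAD, JEQ, JNZ, and so on) are assumptions. Operands are assumed to be simple (a constant or a variable name), and and/or handling is omitted.

```python
# Hypothetical jump mnemonics, keyed by the comparison that must hold to jump.
JUMP_IF_TRUE = {"=": "JEQ", "!=": "JNE", "<": "JLT", ">=": "JGE", ">": "JGT", "<=": "JLE"}
# Logical negation of each comparison operator.
NEGATE = {"=": "!=", "!=": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">"}


def emit_branch(cond, label, out, jump_if):
    """Append code to `out` that jumps to `label` when `cond` equals `jump_if`."""
    op = cond[0] if isinstance(cond, tuple) else None
    if op in JUMP_IF_TRUE:
        # Fused path: one CMP plus one conditional jump, no 0/1 materialized.
        cmp_op = op if jump_if else NEGATE[op]
        out.extend([
            f"LOAD R0, {cond[1]}",
            f"LOAD R1, {cond[2]}",
            "CMP R0, R1",
            f"{JUMP_IF_TRUE[cmp_op]} {label}",
        ])
    elif op == "not":
        # (not c) just flips the sense of the branch.
        emit_branch(cond[1], label, out, not jump_if)
    else:
        # Fallback: load the value and test it against zero.
        out.extend([f"LOAD R0, {cond}", f"{'JNZ' if jump_if else 'JZ'} {label}"])


def emit_branch_false(cond, label, out):
    emit_branch(cond, label, out, False)


def emit_branch_true(cond, label, out):
    emit_branch(cond, label, out, True)
```

For an `if` whose condition is (< x 10), `emit_branch_false` produces a compare followed by a single jump on the negated condition, which is where the per-comparison savings described above would come from.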
…d set!

- Function call overhead: only save/restore R2:R3 (let-local base)
  when inside a let scope. Saves 2 PUSH + 2 POP = 12 bytes per call
  when outside let blocks (common for top-level game loop calls).

- Comparison expressions (EQ/NE/LT/GT/LE/GE): use emit_cmp_operands
  helper with simple operand optimization, avoiding PUSH/POP when one
  operand is a constant.

- Field set! optimization: for simple values on non-16bit fields,
  set up address first to avoid PUSH/POP overhead.

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
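
The simple-operand optimization used for binary operations and comparisons might look roughly like the sketch below. It assumes a hypothetical tuple AST and a convention where every expression leaves its result in R0; `emit_expr`, `is_simple`, and the LOAD mnemonic are illustrative names, not taken from the PR.

```python
# Hypothetical operator-to-mnemonic table for the ops named in the commit.
OPCODES = {"+": "ADD", "-": "SUB", "&": "AND", "|": "OR", "^": "XOR"}


def is_simple(node):
    """Constants and plain variable names can be loaded without touching R0."""
    return isinstance(node, (int, str))


def emit_expr(node, out):
    """Append code to `out` that leaves the value of `node` in R0."""
    if is_simple(node):
        out.append(f"LOAD R0, {node}")
        return
    op, lhs, rhs = node
    emit_expr(lhs, out)
    if is_simple(rhs):
        # Fast path: load the right operand straight into R1.
        out.append(f"LOAD R1, {rhs}")
    else:
        # General path: evaluating rhs clobbers R0, so save and restore lhs.
        out.append("PUSH R0")
        emit_expr(rhs, out)
        out.append("MOV R1, R0")
        out.append("POP R0")
    out.append(f"{OPCODES[op]} R0, R1")
```

In this sketch the fast path replaces PUSH + evaluate + MOV + POP with a single load, which lines up with the "3 instructions per operation" figure quoted earlier for binary ops.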
Co-authored-by: Rafael Lopes <rafael84@gmail.com>
Converts addition/subtraction with literal 1 into INC/DEC operations,
which are single-instruction (vs LOADI+ADD = 2 instructions).

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
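
As a minimal illustration of the INC/DEC selection, assuming the left operand is already in R0 and the right operand is an integer literal. The `select_add_sub` helper and its register convention are hypothetical; only the INC, DEC, LOADI, and ADD names appear in the commit message.

```python
def select_add_sub(op, literal):
    """Return instructions for `R0 = R0 op literal`, where op is "+" or "-"."""
    if literal == 1:
        # Single-instruction form for +1 / -1.
        return ["INC R0" if op == "+" else "DEC R0"]
    # General form: load the literal, then add or subtract.
    return [f"LOADI R1, {literal}", f"{'ADD' if op == '+' else 'SUB'} R0, R1"]
```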
Inline small, non-recursive, leaf functions (no calls, body size <= 5
AST nodes, single expression body) at their call sites. This eliminates
the entire function call overhead (PUSH/POP register saves, argument
passing, CALL/RET) for trivial helpers.

Also fixes count_nodes() to properly count all AST node types (let,
while, for, cond, when, etc.) to avoid inlining large functions.

After inlining, re-runs constant folding and dead code/function
elimination to clean up inlined-then-dead function definitions.

Impact on floppy.se: 2665 -> 2512 instructions (-5.7%)

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
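
The inlining criteria described in this commit (leaf function, single-expression body, at most 5 AST nodes) can be sketched as follows. This is a simplified model rather than the PR's implementation: the tuple AST, the `functions` table mapping a name to (params, body), and the `is_leaf`, `substitute`, and `inline_calls` helpers are assumptions. `count_nodes` borrows its name from the commit message, but its body here is a guess. The sketch also ignores variable shadowing, and it duplicates an argument expression when its parameter is used more than once, which is only safe for side-effect-free arguments.

```python
def count_nodes(node):
    """Count every AST node, including leaves."""
    if not isinstance(node, tuple):
        return 1
    return 1 + sum(count_nodes(child) for child in node[1:])


def is_leaf(body, functions):
    """True if `body` contains no call to any user-defined function."""
    if not isinstance(body, tuple):
        return True
    if body[0] in functions:
        return False
    return all(is_leaf(child, functions) for child in body[1:])


def substitute(node, bindings):
    """Replace parameter names in `node` with the bound argument expressions."""
    if isinstance(node, str):
        return bindings.get(node, node)
    if isinstance(node, tuple):
        return (node[0], *(substitute(child, bindings) for child in node[1:]))
    return node


def inline_calls(node, functions, max_nodes=5):
    """Inline calls to small leaf functions; `functions` maps name -> (params, body)."""
    if not isinstance(node, tuple):
        return node
    head, *args = node
    args = [inline_calls(a, functions, max_nodes) for a in args]
    if head in functions:
        params, body = functions[head]
        small = count_nodes(body) <= max_nodes
        if len(params) == len(args) and small and is_leaf(body, functions):
            return substitute(body, dict(zip(params, args)))
    return (head, *args)
```

A leaf function cannot call itself, so the leaf check also rules out recursion. Once every call site has been inlined, the original definition is unreferenced, which is why a dead-function elimination pass afterwards is useful.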

@rafael84 rafael84 marked this pull request as ready for review February 7, 2026 11:01
@rafael84 rafael84 merged commit b837034 into main Feb 7, 2026
3 checks passed