Merged
…tant fold when/unless/not, redundant cast elimination

- Identity rules: (+ x 0) -> x, (- x 0) -> x, (* x 1) -> x, (* x 0) -> 0, (/ x 1) -> x
- Bitwise identity: (& x 0xFF) -> x, (& x 0) -> 0, (| x 0) -> x, (^ x 0) -> x
- Shift identity: (<< x 0) -> x, (>> x 0) -> x
- Strength reduction: (/ x 2^n) -> (>> x n), (% x 2^n) -> (& x mask), (+ x x) -> (<< x 1)
- Constant fold when/unless with known conditions
- Double negation elimination: (not (not x)) -> x
- Redundant cast elimination: (u8 (u8 x)) -> (u8 x), (i8 (i8 x)) -> (i8 x)

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
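The rewrite rules listed in this commit can be sketched as a small bottom-up pass. This is an illustrative Python sketch, not the optimizer from the PR: the tuple-based AST, the `simplify` and `is_pow2` names, and the restriction to unsigned operands are all assumptions made for the example.

```python
# Hypothetical AST: nested tuples such as ("+", "x", 0).
# Strings stand for variables, ints for constants.

def is_pow2(n):
    """True for positive integer powers of two (1, 2, 4, ...)."""
    return isinstance(n, int) and n > 0 and (n & (n - 1)) == 0

def simplify(expr):
    """Apply identity and strength-reduction rewrites bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(a) for a in args]

    if len(args) == 2:
        a, b = args
        # Identity rules: the operation leaves the left operand unchanged.
        if op in ("+", "-", "|", "^", "<<", ">>") and b == 0:
            return a
        if op in ("*", "/") and b == 1:
            return a
        # Folding to 0 discards the left operand, so this sketch only does
        # it when that operand is a plain variable or constant.
        if op in ("*", "&") and b == 0 and not isinstance(a, tuple):
            return 0
        # Strength reduction (assumes unsigned operands).
        if op == "/" and is_pow2(b):
            return (">>", a, b.bit_length() - 1)
        if op == "%" and is_pow2(b):
            return ("&", a, b - 1)
        if op == "+" and a == b and not isinstance(a, tuple):
            return ("<<", a, 1)

    # Double negation elimination, as listed in the commit message.
    if op == "not" and len(args) == 1:
        inner = args[0]
        if isinstance(inner, tuple) and inner[0] == "not":
            return inner[1]

    return (op, *args)
```

For example, `simplify(("/", "x", 8))` yields `(">>", "x", 3)`, and nested identities such as `("*", ("+", "y", 0), 1)` collapse to `"y"` because children are simplified before their parent.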
Major codegen optimizations that reduce generated code size by ~10%:

- Comparison fusion: if/while/when/unless/cond now fuse comparisons directly with conditional jumps (CMP+Jcc) instead of materializing boolean values (0/1) and then using JFALSE/JTRUE macros. Saves ~7 instructions per comparison used in control flow.
- Binary op immediate optimization: ADD, SUB, AND, OR, XOR with a constant or simple operand avoid PUSH/POP overhead by loading the simple operand directly into R1. Saves 3 instructions per operation (PUSH + MOV + POP eliminated).
- Optimized u8/i8 casts: when operand is already 8-bit, the AND 0xFF mask is eliminated as a no-op.
- Optimized set! for globals: simple values avoid PUSH/POP by setting up the address registers first.
- emit_branch_false/emit_branch_true: new helper functions that handle fused branching for comparisons, logic ops (and/or/not), and predicates (nil?/zero?).

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
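To illustrate the comparison fusion described above, here is a hedged Python sketch of how a pair of branch helpers might emit CMP plus a conditional jump instead of materializing a 0/1 value. Only `CMP` and the helper names `emit_branch_false`/`emit_branch_true` come from the commit message. The other mnemonics (`LOAD`, `LOADI`, `JEQ`, `JNE`, and so on), the register choice, the tuple AST, and the restriction to simple operands are assumptions for the example, and predicates such as nil?/zero? are omitted.

```python
# Hypothetical jump mnemonics for each comparison operator.
JUMP_IF_TRUE = {"=": "JEQ", "!=": "JNE", "<": "JLT", ">=": "JGE", ">": "JGT", "<=": "JLE"}
JUMP_IF_FALSE = {"=": "JNE", "!=": "JEQ", "<": "JGE", ">=": "JLT", ">": "JLE", "<=": "JGT"}

def emit_value(expr, reg, out):
    """Load a constant or a variable into a register (simple operands only)."""
    if isinstance(expr, int):
        out.append(f"LOADI {reg}, {expr}")
    else:
        out.append(f"LOAD {reg}, {expr}")

def emit_compare(args, out):
    """Load both operands and compare them; flags feed the following jump."""
    emit_value(args[0], "R0", out)
    emit_value(args[1], "R1", out)
    out.append("CMP R0, R1")

def emit_branch_false(cond, label, out):
    """Jump to `label` when `cond` is false."""
    op, *args = cond
    if op in JUMP_IF_FALSE:
        emit_compare(args, out)
        out.append(f"{JUMP_IF_FALSE[op]} {label}")
    elif op == "not":
        emit_branch_true(args[0], label, out)
    elif op == "and":
        # The conjunction is false as soon as any operand is false.
        for sub in args:
            emit_branch_false(sub, label, out)
    else:
        raise NotImplementedError(op)

def emit_branch_true(cond, label, out):
    """Jump to `label` when `cond` is true."""
    op, *args = cond
    if op in JUMP_IF_TRUE:
        emit_compare(args, out)
        out.append(f"{JUMP_IF_TRUE[op]} {label}")
    elif op == "not":
        emit_branch_false(args[0], label, out)
    elif op == "or":
        # The disjunction is true as soon as any operand is true.
        for sub in args:
            emit_branch_true(sub, label, out)
    else:
        raise NotImplementedError(op)
```

With this sketch, the condition of `(if (< x 10) ...)` compiles to four instructions ending in `JGE else_1`, with no intermediate boolean in a register.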
…d set!

- Function call overhead: only save/restore R2:R3 (let-local base) when inside a let scope. Saves 2 PUSH + 2 POP = 12 bytes per call when outside let blocks (common for top-level game loop calls).
- Comparison expressions (EQ/NE/LT/GT/LE/GE): use emit_cmp_operands helper with simple operand optimization, avoiding PUSH/POP when one operand is a constant.
- Field set! optimization: for simple values on non-16bit fields, set up address first to avoid PUSH/POP overhead.

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
Co-authored-by: Rafael Lopes <rafael84@gmail.com>
Converts addition/subtraction with literal 1 into INC/DEC operations, which are single-instruction (vs LOADI+ADD = 2 instructions).

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
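A minimal Python sketch of this rewrite at the AST level follows. The tuple AST and the `inc`/`dec` node names are hypothetical, and the function only inspects the top-level node. Note that `(- 1 x)` is deliberately left alone, since subtraction is not commutative.

```python
def to_inc_dec(expr):
    """Rewrite (+ x 1), (+ 1 x) and (- x 1) into hypothetical inc/dec nodes."""
    if isinstance(expr, tuple) and len(expr) == 3:
        op, a, b = expr
        if op == "+" and b == 1:
            return ("inc", a)
        if op == "+" and a == 1:
            return ("inc", b)
        if op == "-" and b == 1:
            return ("dec", a)
    # Anything else, including (- 1 x), is returned unchanged.
    return expr
```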
Inline small, non-recursive, leaf functions (no calls, body size <= 5 AST nodes, single expression body) at their call sites. This eliminates the entire function call overhead (PUSH/POP register saves, argument passing, CALL/RET) for trivial helpers.

Also fixes count_nodes() to properly count all AST node types (let, while, for, cond, when, etc.) to avoid inlining large functions.

After inlining, re-runs constant folding and dead code/function elimination to clean up inlined-then-dead function definitions.

Impact on floppy.se: 2665 -> 2512 instructions (-5.7%)

Co-authored-by: Rafael Lopes <rafael84@gmail.com>
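The inlining criteria above (leaf function, body of at most 5 AST nodes) can be sketched in Python as follows. This is an assumption-laden illustration, not the PR's code: the tuple AST, the `BUILTINS` set, and the `is_leaf`, `substitute`, and `inline_calls` names are hypothetical, and only `count_nodes` borrows its name from the commit message.

```python
# Operators treated as built-ins rather than user function calls (assumed set).
BUILTINS = {"+", "-", "*", "/", "&", "|", "^", "<<", ">>", "not"}
MAX_INLINE_NODES = 5  # size threshold taken from the commit message

def count_nodes(expr):
    """Count every AST node: one per operator, variable, or constant."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + sum(count_nodes(a) for a in expr[1:])

def is_leaf(expr):
    """True when the body uses only built-in operators, so it calls nothing
    and therefore cannot be recursive."""
    if not isinstance(expr, tuple):
        return True
    return expr[0] in BUILTINS and all(is_leaf(a) for a in expr[1:])

def substitute(expr, env):
    """Replace parameter names in a body with the call's argument expressions."""
    if isinstance(expr, str):
        return env.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0], *(substitute(a, env) for a in expr[1:]))
    return expr

def inline_calls(expr, functions):
    """Inline eligible calls; `functions` maps name -> (params, body)."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [inline_calls(a, functions) for a in args]
    if op in functions:
        params, body = functions[op]
        small = count_nodes(body) <= MAX_INLINE_NODES
        if len(params) == len(args) and is_leaf(body) and small:
            # Caveat: an argument used twice in the body is duplicated, which
            # is only safe when the argument has no side effects.
            return substitute(body, dict(zip(params, args)))
    return (op, *args)
```

Given `functions = {"double": (("n",), ("+", "n", "n"))}`, the call `("double", "x")` becomes `("+", "x", "x")`, which a later simplification pass could reduce further. A function whose body exceeds the node threshold is left as a call.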
Significantly reduce SE program code size by expanding AST optimizations and improving codegen efficiency.
This PR implements a suite of optimizations across the AST optimizer and code generator to reduce the size of compiled SE programs. Key changes in the optimizer: identity elimination, strength reduction (e.g., `(/ x 2^n)` to `(>> x n)`), constant folding, and inlining of small, non-recursive leaf functions. Key changes in the codegen: comparison fusion with conditional jumps, immediate-operand optimization for binary operations, reduced function call overhead, and more efficient `set!` and cast handling. Together these reduce the code size of `floppy.se` by ~29% (from ~3550 to 2512 instructions), leaving more room for larger games and moving toward the goal of supporting games 10x bigger than `floppy.se`.