Skip to content

Conversation

@kant777
Copy link
Contributor

@kant777 kant777 commented Jan 26, 2026

Adds support for using standard Ethereum RPC providers (Alchemy, Erigon) for transaction simulation via debug_traceCall, as an alternative to the external rethsim endpoint.

  • Add InlineSimulator using debug_traceCall with callTracer

  • Add SwapDetector for detecting swaps via event signatures and known routers

  • Auto-detect simulation mode based on URL pattern (contains /rethsim or not)

  • Support all transaction types including EIP-4844 blob transactions

  • Add Prometheus metrics and proper resource cleanup

  • I have added tests that prove my fix is effective or that my feature works

@kant777 kant777 force-pushed the feat/inline-simulation branch from d362dac to e638981 Compare January 26, 2026 11:42
@iowar
Copy link
Contributor

iowar commented Jan 26, 2026

  • swap detection still relies on topic scanning.. please confirm this matches the previous behavior
  • rethsim used wrappers for output stability.. we should be prepared to preserve the same schema on the mev-commit side
    (should test and document the exact json output of the target RPC endpoints)
  • I’m assuming the RPC supports latest and pending simulations, but you need to be sure and handle unsupported tags deterministically
  • define expected behavior when pending is inconsistent or unavailable
  • the new path changes failure modes and observability.. make sure errors still surface the root cause (revert reason and inner call errors)
  • it might be safer to put this behind a feature flag and keep the old path as a fallback until we’ve proven parity in prod

@kant777 kant777 force-pushed the feat/inline-simulation branch 9 times, most recently from 36965f5 to b1e9636 Compare January 26, 2026 17:19
@kant777
Copy link
Contributor Author

kant777 commented Jan 26, 2026

@iowar

  1. "swap detection still relies on topic scanning.. please confirm this matches the previous behavior"
Yes, matches exactly. swap_detector.go scans topic[0] for the same event signatures rethsim uses: Uniswap V2/V3, Curve (both events), Balancer V2, 0x, Bancor. Also checks destination  
 against known router addresses (Uniswap, 1inch, SushiSwap, Curve, 0x, Paraswap, CoW).  
  1. "rethsim used wrappers for output stability.. we should be prepared to preserve the same schema on the mev-commit side"

Both simulators return identical ([]*types.Log, bool, error) - the sender.Simulator interface. Callers see no difference. The types.Log struct is go-ethereum's standard type.

  1. "I'm assuming the RPC supports latest and pending simulations, but you need to be sure and handle unsupported tags deterministically"

Verified. Alchemy and QuickNode both support pending for debug_traceCall. If unsupported, the RPC returns a JSON-RPC error which we surface directly - no silent fallback to latest.

  1. "define expected behavior when pending is inconsistent or unavailable"

Fails fast with error: "debug_traceCall failed (state=pending): <rpc error>". No automatic degradation. Operator must provide RPC with pending support or use external rethsim.

  1. "the new path changes failure modes and observability.. make sure errors still surface the root cause (revert reason and inner call errors)"
  Yes. decodeRevertFromTrace() decodes Error(string) and Panic(uint256) selectors. findInnerCallError() recursively checks nested calls. Errors surface as "reverted: <reason>" or "inner 
  call reverted: <reason>".           
  1. "it might be safer to put this behind a feature flag and keep the old path as a fallback until we've proven parity in prod"

Done. --use-inline-simulation flag (default: false). Old rethsim path remains default. Both support multiple URLs for fallback.

@kant777 kant777 force-pushed the feat/inline-simulation branch 2 times, most recently from cecf50c to 0e15f77 Compare January 26, 2026 17:32
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI failure due to protobuf parity. Not sure why this is happening since you havent changed the protobuf generated files. Best to generate them once more to see if there are any changes.

cd p2p
make bufgen

@iowar iowar self-requested a review January 26, 2026 22:41
Copy link
Contributor

@iowar iowar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible.. I’d prefer to offload most of the call load to eth_simulateV1, if the RPC supports it, because debug_traceCall puts a significant load on the RPC.. especially with a high number of swaps.. and increases response times.. additionally within rethsim, debug_traceCall was being invoked with separate logic for some edge cases.. however, for the first working version, we can stick with debug_traceCall if performance won’t be a concern.. also, I’d suggest testing this PR in real time with actual fastrpc transactions for a while


// Validate trace response - a valid trace always has non-zero GasUsed
// (at minimum, intrinsic gas of 21000 is consumed)
if result.GasUsed == "" || result.GasUsed == "0x" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: use hexutil.DecodeUint64(result.GasUsed) and validate non zero

}

// Get sender
signer := types.LatestSignerForChainID(tx.ChainId())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly correct.. but it sometimes fails.. rethsim does its own transaction parsing (from raw data to a call object)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looked into rethsim code and I see that your are doing the manual decoding but It looks like all of those cases are handled in types.LatestSignerForChainID(tx.ChainId()). If not, highlight the ones it doesnt handle?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even with the right signer.. sender recovery may still fail for some raw tx edge cases. rethsim avoided this by parsing tx type + recovering from from raw bytes and building the call deterministically

]
}`

func TestInlineSimulator(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s important to include at least one real signed transaction

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should include test vectors built from real raw tx hex and assert e2e behavior.. so we can observe parsing and sender addr recovery failures early and verify the full simulation pipeline

@kant777 kant777 force-pushed the feat/inline-simulation branch 2 times, most recently from 6d0898d to 758160b Compare January 27, 2026 04:25
@kant777
Copy link
Contributor Author

kant777 commented Jan 27, 2026

If possible.. I’d prefer to offload most of the call load to eth_simulateV1, if the RPC supports it, because debug_traceCall puts a significant load on the RPC.. especially with a high number of swaps.. and increases response times.. additionally within rethsim, debug_traceCall was being invoked with separate logic for some edge cases.. however, for the first working version, we can stick with debug_traceCall if performance won’t be a concern.. also, I’d suggest testing this PR in real time with actual fastrpc transactions for a while

yes debug_traceCall is heavy so changed the code to use that only as a fallback. eth_simulateV1 will be the first choice

Copy link
Collaborator

@aloknerurkar aloknerurkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a go perspective the changes look good. But I havent used the eth_simulateV1 endpoint. So as long as that and the topics needed are verified we can merge. I am approving from my side.

kant777 and others added 4 commits January 27, 2026 10:45
- Add InlineSimulator using debug_traceCall with callTracer
- Support pending state simulation (required for preconf-rpc)
- Add enableReturnData for better revert capture
- Port all 15 swap signatures from rethsim
- Detect swaps via event topic signatures only
- Support multiple RPC endpoints with fallback on 5xx/429/network errors
- Add --use-inline-simulation feature flag (default: false)
- Include call path (to, type) in inner revert errors
- Use eth_simulateV1 as the primary simulation method for better performance
- Fall back to debug_traceCall when eth_simulateV1 is not supported by the RPC
- eth_simulateV1 is lighter and reduces load on RPC providers
- debug_traceCall is still used for edge cases or when eth_simulateV1 is unavailable
- Add SimulateV1CallResult, SimulateError, SimulateV1Block structs for eth_simulateV1 response
- Add isMethodNotSupported() to detect unsupported method errors
- Update tests to cover both eth_simulateV1 and fallback scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ndpoint fallback

Transaction reverts and invalid response errors were not wrapped in
NonRetryableError, causing the code to unnecessarily try all fallback
endpoints when a transaction reverts. This wasted resources and delayed
error responses.

Changes:
- Wrap revert errors in NonRetryableError in both eth_simulateV1 and debug_traceCall
- Wrap empty/invalid response errors in NonRetryableError
- Update shouldFallback to check for NonRetryableError first
- Add clarifying comments about fallback behavior

This ensures reverts fail fast instead of trying all endpoints.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove redundant comments that just repeat what the code does
- Keep comments where they explain why, not what
- Make comments more concise and natural
- Simplify code structure in a few places

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@kant777 kant777 force-pushed the feat/inline-simulation branch from bdaba00 to 3810595 Compare January 27, 2026 18:45
@iowar iowar self-requested a review January 28, 2026 04:03
@kant777 kant777 merged commit 12a662e into main Jan 28, 2026
10 of 11 checks passed
@kant777 kant777 deleted the feat/inline-simulation branch January 28, 2026 04:21
kant777 added a commit that referenced this pull request Jan 28, 2026
* feat: add inline simulation using debug_traceCall

- Add InlineSimulator using debug_traceCall with callTracer
- Support pending state simulation (required for preconf-rpc)
- Add enableReturnData for better revert capture
- Port all 15 swap signatures from rethsim
- Detect swaps via event topic signatures only
- Support multiple RPC endpoints with fallback on 5xx/429/network errors
- Add --use-inline-simulation feature flag (default: false)
- Include call path (to, type) in inner revert errors

* feat: use eth_simulateV1 as primary method with debug_traceCall fallback

- Use eth_simulateV1 as the primary simulation method for better performance
- Fall back to debug_traceCall when eth_simulateV1 is not supported by the RPC
- eth_simulateV1 is lighter and reduces load on RPC providers
- debug_traceCall is still used for edge cases or when eth_simulateV1 is unavailable
- Add SimulateV1CallResult, SimulateError, SimulateV1Block structs for eth_simulateV1 response
- Add isMethodNotSupported() to detect unsupported method errors
- Update tests to cover both eth_simulateV1 and fallback scenarios

* fix: wrap revert errors in NonRetryableError to prevent unnecessary endpoint fallback

Transaction reverts and invalid response errors were not wrapped in
NonRetryableError, causing the code to unnecessarily try all fallback
endpoints when a transaction reverts. This wasted resources and delayed
error responses.

Changes:
- Wrap revert errors in NonRetryableError in both eth_simulateV1 and debug_traceCall
- Wrap empty/invalid response errors in NonRetryableError
- Update shouldFallback to check for NonRetryableError first
- Add clarifying comments about fallback behavior

This ensures reverts fail fast instead of trying all endpoints.

* refactor: clean up comments for readability

- Remove redundant comments that just repeat what the code does
- Keep comments where they explain why, not what
- Make comments more concise and natural
- Simplify code structure in a few places
}

// findInnerCallError recursively searches for errors in nested calls.
func findInnerCallError(call *TraceCallResult) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t see the failing call path in the error message yet.. since you marked this as done, I’m assuming it was tested..this is necessary because it can make us blame the wrong call.. we ran into this issue on the rethsim side with Definitive

}

// Get sender
signer := types.LatestSignerForChainID(tx.ChainId())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even with the right signer.. sender recovery may still fail for some raw tx edge cases. rethsim avoided this by parsing tx type + recovering from from raw bytes and building the call deterministically

]
}`

func TestInlineSimulator(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should include test vectors built from real raw tx hex and assert e2e behavior.. so we can observe parsing and sender addr recovery failures early and verify the full simulation pipeline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants