Skip to content

Conversation

@heifner
Copy link
Contributor

@heifner heifner commented Dec 19, 2025

See Wire-Network/eos-vm#7 for description of read-only deadlock issue.

Update read_only_trx_test.py so that it is much more likely to fail without the fix. I was able to get the test to fail every run on my machine without the fix of: Wire-Network/eos-vm#7. Also added another contract with an infinite transaction so that more than one wasm module is active for the test.

Depends on: Wire-Network/eos-vm#7

Updated platform_timer to have an explicit set_expired to call from wire-sys-vm timeout_exception since that can be thrown on a thread where the thread's platform_timer has not timed out with wire-sys-vm 1.1.0.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a read-only transaction deadlock issue by updating the wire-sys-vm dependency and adding an explicit set_expired() method to platform_timer. The fix addresses a multi-threaded scenario where timeout_exception can be thrown on threads whose timers haven't actually expired yet, requiring explicit state transitions to ensure proper timeout handling.

Key changes:

  • Updated wire-sys-vm from version 1.0.1-rc1 to 1.1.0 to include the deadlock fix
  • Added set_expired() method to platform_timer for explicit timeout state setting
  • Enhanced read_only_trx_test.py to be more likely to trigger the deadlock without the fix
  • Added new "infinite" test contract to provide multiple active WASM modules during testing

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
vcpkg.json Updates wire-sys-vm dependency from 1.0.1-rc1 to 1.1.0
vcpkg-configuration.json Updates vcpkg registry baseline to support new wire-sys-vm version
unittests/test-contracts/infinite/infinite.cpp Implements infinite contract with slow, forever, and segfault actions for testing
unittests/test-contracts/infinite/infinite.hpp Header file for infinite contract defining action interfaces
unittests/test-contracts/infinite/infinite.abi ABI definition for infinite contract actions
unittests/test-contracts/infinite/infinite.wasm Compiled WASM binary for infinite contract
unittests/test-contracts/infinite/CMakeLists.txt Build configuration for infinite contract
unittests/test-contracts/CMakeLists.txt Adds infinite subdirectory to test contracts build
tests/read_only_trx_test.py Enhances test with infinite contract, additional threads, and relaxed assertions for timeout scenarios
plugins/producer_plugin/producer_plugin.cpp Adds debug logging for read-only transaction tracking
libraries/chain/webassembly/runtimes/sys-vm.cpp Calls set_expired() when handling timeout_exception to ensure checktime() sees timeout
libraries/chain/platform_timer_posix.cpp Implements set_expired() method for POSIX platform timer
libraries/chain/platform_timer_kqueue.cpp Implements set_expired() method for kqueue platform timer
libraries/chain/platform_timer_asio_fallback.cpp Implements set_expired() method for ASIO fallback platform timer
libraries/chain/include/sysio/chain/transaction_context.hpp Adds set_expired() wrapper method to transaction_checktime_timer
libraries/chain/include/sysio/chain/platform_timer.hpp Declares set_expired() method in platform_timer interface

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


void infinite::segv() {
volatile int* p = nullptr;
int x = *p; // cause segfault
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable 'x' is assigned but never used after the initial dereference. The dereference of nullptr on line 65 will cause the segfault as intended, but the subsequent assignments to 'x' in the while loop (line 67) serve no functional purpose. While this may be intentional to prevent compiler optimizations, consider adding a comment explaining why 'x' is being assigned in the loop, or use the value in some way to make the intent clearer (e.g., volatile reads or a side effect).

Suggested change
int x = *p; // cause segfault
volatile int x = *p; // cause segfault (and keep dereferences from being optimized out)

Copilot uses AI. Check for mistakes.
threadList.append(threading.Thread(target = foreverTransactions))
threadList.append(threading.Thread(target = infiniteTransactions))
threadList.append(threading.Thread(target = foreverTransactions))
Print("Sending different speeds of read only transactions simutaneously")
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has a typo: "simutaneously" should be "simultaneously".

Copilot uses AI. Check for mistakes.
def sendReadOnlySEGV():
return sendTransaction('infinite', action='segv', data={}, auth=[], opts='--read')

# Send read-only trxs from mutltiple threads to bump load
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has a typo: "mutltiple" should be "multiple".

Copilot uses AI. Check for mistakes.
Comment on lines 105 to 108
void platform_timer::set_expired() {
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation});
}
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set_expired() implementation uses the non-atomic 'generation' member variable, while interrupt_timer() loads 'generation_running' from the atomic state before using it. For consistency and to avoid potential race conditions, consider following the same pattern as interrupt_timer() by loading generation_running from the state:

const generation_t generation_running = _state.load().generation_running;
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation_running};

This ensures the generation value used in compare_exchange matches the actual running generation in the atomic state, which is more robust in multi-threaded scenarios where timeout_exception can be thrown on a thread before its timer has actually expired.

Copilot uses AI. Check for mistakes.
Comment on lines 92 to 93
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation});
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set_expired() implementation uses the non-atomic 'generation' member variable, while interrupt_timer() loads 'generation_running' from the atomic state before using it. For consistency and to avoid potential race conditions, consider following the same pattern as interrupt_timer() by loading generation_running from the state:

const generation_t generation_running = _state.load().generation_running;
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation_running};

This ensures the generation value used in compare_exchange matches the actual running generation in the atomic state, which is more robust in multi-threaded scenarios where timeout_exception can be thrown on a thread before its timer has actually expired.

Suggested change
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation});
const generation_t generation_running = _state.load().generation_running;
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation_running};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation_running});

Copilot uses AI. Check for mistakes.
Comment on lines 128 to 129
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation});
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set_expired() implementation uses the non-atomic 'generation' member variable, while interrupt_timer() loads 'generation_running' from the atomic state before using it. For consistency and to avoid potential race conditions, consider following the same pattern as interrupt_timer() by loading generation_running from the state:

const generation_t generation_running = _state.load().generation_running;
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation_running};

This ensures the generation value used in compare_exchange matches the actual running generation in the atomic state, which is more robust in multi-threaded scenarios where timeout_exception can be thrown on a thread before its timer has actually expired.

Suggested change
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation});
const generation_t generation_running = _state.load().generation_running;
timer_state_t expected{.state = state_t::running, .callback_in_flight = false, .generation_running = generation_running};
_state.compare_exchange_strong(expected, timer_state_t{state_t::timed_out, false, generation_running});

Copilot uses AI. Check for mistakes.
void stop();
//interrupt_timer() can be called from any thread
void interrupt_timer();
//explicit set to expired
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment for set_expired() should provide more context about its purpose and thread safety guarantees, similar to the comment for interrupt_timer(). Consider documenting:

  1. When this method should be called (e.g., when handling timeout_exception in sys-vm contexts)
  2. Thread safety characteristics (whether it can be called from any thread)
  3. What happens if the timer is not in the running state

This would help API consumers understand when and how to use this method safely.

Suggested change
//explicit set to expired
/* Explicitly force the timer into an "expired" state.
This is typically used by higher‑level code when a timeout has already been detected
(for example, when handling a timeout_exception in a system VM or similar context) and
the platform_timer needs to be updated to reflect that expiration.
Thread safety: like interrupt_timer(), this function is intended to be callable from
any thread and may be invoked even while the timer is running.
If the timer is not currently in the running state, calling set_expired() has no harmful
effect; it will not start the timer or schedule a new timeout, and may simply update the
internal state to reflect that no further waiting is required.
*/

Copilot uses AI. Check for mistakes.
@heifner heifner merged commit 77c4531 into master Dec 29, 2025
52 of 54 checks passed
@heifner heifner deleted the feature/fix-deadlock-read-only branch December 29, 2025 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants