@thowell
Collaborator

@thowell thowell commented Jan 2, 2026

this pr aims to improve testing for batched Model fields. hopefully we can prevent bugs like the ones fixed in #865, #932, #821, #819, #807, #800

notes:

  • --debug_mode does not work with --cpu so the changes to ci.yml will not currently work with the github actions cpu-based ci

@thowell force-pushed the test_model_batched_fields branch from 394fff4 to d5c6a14 on January 3, 2026 12:26
@thowell
Collaborator Author

thowell commented Jan 3, 2026

@Kenny-Vilella @adenzler-nvidia is there a recommended approach for checking if any out of bounds reads occur on cpu? thanks!

@Kenny-Vilella
Collaborator

Kenny-Vilella commented Jan 5, 2026

It would be great to have a GPU runner to run compute-sanitizer.
Currently, I run compute-sanitizer weekly on a set of benchmarks and a subset of the unit tests (not all of them, due to memory limitations).
If we could automate that as a nightly run, it would probably be the most helpful.

To answer your question, I think valgrind should work.
I tried running it on the unit tests and got a lot of noise, but it does detect invalid reads and writes.
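
One way to cut through that noise might be to post-process the valgrind log and keep only the memcheck lines that report invalid accesses. The sketch below is an illustration, not something from this PR: the helper name `find_invalid_accesses` and the sample log are made up, and it assumes the log was saved to a text file (for example with valgrind's `--log-file` option) in memcheck's usual `==PID== Invalid read/write of size N` format.

```python
import re

# Matches memcheck lines such as "==1234== Invalid write of size 4".
INVALID_ACCESS = re.compile(r"^==\d+== Invalid (read|write) of size (\d+)")


def find_invalid_accesses(log_text):
  """Return (kind, size) tuples for each invalid read/write in a memcheck log."""
  hits = []
  for line in log_text.splitlines():
    match = INVALID_ACCESS.match(line)
    if match:
      hits.append((match.group(1), int(match.group(2))))
  return hits


# Made-up sample log for illustration only, not output from the actual tests.
SAMPLE_LOG = """\
==1234== Memcheck, a memory error detector
==1234== Conditional jump or move depends on uninitialised value(s)
==1234== Invalid write of size 4
==1234==    at 0x401136: kernel (example.c:7)
==1234== Invalid read of size 8
"""

print(find_invalid_accesses(SAMPLE_LOG))
```

Everything other than invalid reads and writes (such as the uninitialised-value warning in the sample) is dropped, so a CI step could fail only when the returned list is non-empty.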

@adenzler-nvidia
Collaborator

I'm curious why debug_mode doesn't work with CPU for you? On my machine the tests pass, they are just a lot slower.

@thowell
Collaborator Author

thowell commented Jan 5, 2026

@adenzler-nvidia

was noticing a case where --debug_mode would produce cuda error 710, but with --cpu no errors were raised.

to reproduce:

  1. undo the fix here https://github.com/google-deepmind/mujoco_warp/pull/865/changes to induce out of bounds access
  2. pytest -k io_test --debug_mode -> should raise cuda error 710
  3. pytest -k io_test --debug_mode --cpu -> no errors (?)

however, this seems to work

import warp as wp

wp.set_device("cpu")
wp.config.mode = "debug"

@wp.kernel
def kernel(arr: wp.array(dtype=int)):
  arr[1] = 0  # intentional out-of-bounds write: arr has a single element

arr = wp.zeros(1, dtype=int)
wp.launch(kernel, dim=1, inputs=[arr])

print(arr.numpy())

Module __main__ 7cb498a load on device 'cpu' took 1.15 ms  (cached)
Assertion failed: 'i >= -arr.shape[0] && i < arr.shape[0]'
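
For reference, the condition printed in the assertion message can be mirrored in plain Python. This is only a restatement of the expression in the message above, not warp's actual implementation, and the helper name `index_in_bounds` is hypothetical. The lower bound of `-arr.shape[0]` suggests negative indices are accepted by the check, though that reading is an inference from the message alone.

```python
def index_in_bounds(i, size):
  # Mirrors the printed condition: i >= -arr.shape[0] && i < arr.shape[0]
  return -size <= i < size


size = 1  # matches arr = wp.zeros(1, dtype=int) in the snippet above

print(index_in_bounds(0, size))   # the only non-negative index the check accepts
print(index_in_bounds(-1, size))  # also satisfies the condition
print(index_in_bounds(1, size))   # arr[1]: violates i < arr.shape[0]
```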

@adenzler-nvidia
Collaborator

interesting, if you use -s here you can see the assert is being printed to stderr, so in theory it works. The problem seems to be connected to the warp assert handler and how assert(false) somehow isn't being picked up by the python side properly.
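
Until the assert is surfaced as a Python error, one possible stopgap is to run the code in a child process and scan its stderr for the message. The sketch below only illustrates that idea: `run_and_check` is a hypothetical helper, and the child script is a stand-in that prints a similar message and exits normally, not real warp code.

```python
import subprocess
import sys


def run_and_check(script_source):
  """Run a Python snippet in a child process and look for an assert message on stderr."""
  result = subprocess.run(
    [sys.executable, "-c", script_source], capture_output=True, text=True
  )
  saw_assert = "Assertion failed" in result.stderr
  return result.returncode, saw_assert


# Stand-in child: writes an assert-style message to stderr but exits normally.
CHILD = 'import sys; print("Assertion failed", file=sys.stderr)'

returncode, saw_assert = run_and_check(CHILD)
print(returncode, saw_assert)
```

The exit code alone reports success here, which is the same symptom as the passing `--cpu` test run; only the stderr check notices the failed assertion.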

@adenzler-nvidia
Collaborator

I found the issue and made a PR to warp. The CPU-side assert handler calls assert(false), but that code is part of the warp library rather than the kernel code, so it doesn't respect NDEBUG.
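
As background, C's `assert` expands to nothing when `NDEBUG` is defined at compile time, so whether `assert(false)` does anything depends on how the file containing it was built. Python has a loosely analogous switch: `assert` statements are stripped when the interpreter runs with `-O`. The sketch below demonstrates only that Python analogy (the helper name `exit_code` is hypothetical) and says nothing about how warp itself is compiled.

```python
import subprocess
import sys


def exit_code(*flags):
  """Run `assert False` in a child interpreter and return its exit code."""
  cmd = [sys.executable, *flags, "-c", "assert False"]
  return subprocess.run(cmd, capture_output=True).returncode


print(exit_code())      # asserts enabled: the child exits with a non-zero code
print(exit_code("-O"))  # asserts stripped: the child exits with 0
```

The same source line either aborts or silently does nothing depending on a build-time or launch-time switch, which is why an assert placed in separately built library code can behave differently from one in the kernel.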

@thowell
Collaborator Author

thowell commented Jan 5, 2026

nice! thanks @adenzler-nvidia

@thowell
Collaborator Author

thowell commented Jan 5, 2026

NVIDIA/warp#1148
