Enhance reward shaping and add tail/arm actuators for T-Rex and Raptor#75

Merged
kuds merged 1 commit into main from claude/dinosaur-feet-sizing-8Mze8
Feb 23, 2026

kuds (Owner) commented on Feb 23, 2026

This pull request improves the biomechanical realism and reward shaping of the T-Rex and Velociraptor reinforcement learning environments. It adds actuators for tail and forelimb control, expands and normalizes the reward components, and updates the tests to match.

T-Rex Environment Improvements

  • Added three new tail actuators (tail_1_pitch_act, tail_1_yaw_act, tail_2_pitch_act) for active balance control, increasing the action space from 18 to 21. [1] [2]
  • Introduced six new reward components: posture, nosedive, gait symmetry, action smoothness, heading alignment, and a lateral velocity penalty. All reward components are normalized to [-1, 1], matching the Velociraptor environment, and the total reward now includes the new components. [1] [2] [3] [4] [5] [6]
  • Added a termination condition for excessive forward pitch (a "nosedive") beyond the natural lean.
  • Updated tests to check for expanded action space, new reward keys, and new termination reasons. [1] [2] [3]
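The description names the new shaping terms but not their formulas. As a rough illustration only, three of them (action smoothness, heading alignment, lateral velocity penalty) could be written as below. The function names, argument shapes, and the `scale` parameter are hypothetical and are not taken from the repository.

```python
import math

def smoothness_penalty(action, prev_action):
    """Mean squared change between consecutive actions (0.0 when identical)."""
    return sum((a - b) ** 2 for a, b in zip(action, prev_action)) / len(action)

def heading_alignment(forward_xy, target_xy):
    """Cosine of the angle between the facing direction and the target direction."""
    fx, fy = forward_xy
    tx, ty = target_xy
    norm = math.hypot(fx, fy) * math.hypot(tx, ty)
    if norm == 0.0:
        return 0.0  # undefined direction: treat as neutral
    return (fx * tx + fy * ty) / norm

def lateral_velocity_penalty(velocity_xy, forward_xy, scale=1.0):
    """Sideways speed relative to the heading, scaled and clipped to [0, 1]."""
    fx, fy = forward_xy
    vx, vy = velocity_xy
    n = math.hypot(fx, fy)
    if n == 0.0:
        return 0.0
    # Project the velocity onto the unit vector pointing to the body's left.
    lateral = abs(vx * (-fy / n) + vy * (fx / n))
    return min(lateral / scale, 1.0)
```

Each term is bounded when actions lie in [-1, 1], so it can be weighted and summed with the other normalized components without one term dominating by scale alone.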

Velociraptor Environment Improvements

  • Replaced each forelimb's passive ball joint with two actuated hinge joints (shoulder pitch and roll) for more realistic arm movement, adding four forelimb actuators. A third tail actuator (tail_3_pitch) brings the action space from 17 to 22. [1] [2]
  • Updated keyframe and actuator initialization to match new joint structure and actuator count.
  • Revised tests to validate new action space and actuator count, and updated reward component checks. [1] [2] [3]

Simulation Consistency

  • Updated keyframe and actuator initialization in XML files to reflect new actuators and ensure correct simulation resets. [1] [2]

These changes collectively improve biomechanical fidelity and reward shaping, enabling more robust and realistic reinforcement learning experiments.

Velociraptor:
- Replace passive ball-joint forelimbs with actuated hinge joints
  (shoulder pitch + roll, kp=15) for future grappling/balance use
- Add tail_3_pitch actuator (kp=20) for active distal tail control
- Update keyframe qpos (ball→hinge reduces each arm from 4 qpos
  entries to 2; velocity DOF drop from 3 to 2) and ctrl for the
  5 new actuators (22 total, was 17)
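The keyframe bookkeeping can be checked with a little arithmetic. In MuJoCo a ball joint stores a 4-element quaternion in qpos (3 velocity DOF), while a hinge stores a single angle. The sketch below only counts entries; the joint lists are a simplified stand-in for one forelimb, not the actual model.

```python
# qpos entries and velocity DOF per MuJoCo joint type.
QPOS_PER_JOINT = {"free": 7, "ball": 4, "hinge": 1, "slide": 1}
DOF_PER_JOINT = {"free": 6, "ball": 3, "hinge": 1, "slide": 1}

def qpos_size(joint_types):
    """Number of qpos entries a list of joints contributes to a keyframe."""
    return sum(QPOS_PER_JOINT[t] for t in joint_types)

def dof_size(joint_types):
    """Number of velocity degrees of freedom for the same joints."""
    return sum(DOF_PER_JOINT[t] for t in joint_types)

old_arm = ["ball"]            # passive ball-joint shoulder
new_arm = ["hinge", "hinge"]  # shoulder pitch + roll

qpos_saved_per_arm = qpos_size(old_arm) - qpos_size(new_arm)  # 4 - 2
dof_saved_per_arm = dof_size(old_arm) - dof_size(new_arm)     # 3 - 2

# ctrl grows by two actuators per arm plus tail_3_pitch.
old_ctrl = 17
new_ctrl = old_ctrl + 2 * len(new_arm) + 1
```

With two arms, the keyframe's qpos vector shrinks by 4 entries in total while its ctrl vector grows from 17 to 22, which is why both attributes need updating together.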

T-Rex:
- Add tail actuators (tail_1_pitch kp=100, tail_1_yaw kp=80,
  tail_2_pitch kp=80) to counterbalance the 16.7 kg head assembly
  (21 actuators, was 18)
- Normalize all rewards to [-1, 1] matching the raptor's pattern:
  forward_vel /8.0, approach_delta /max_delta, tail /10.0, energy /n_act
- Port 6 missing reward components from raptor: posture penalty,
  nosedive penalty, gait symmetry, action smoothness, heading
  alignment, and lateral velocity penalty
- Add nosedive termination (forward_z threshold matching raptor logic)
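A minimal sketch of the normalization and the nosedive check, using the divisors listed above. The clipping, the squared-action form of the energy term, the sign conventions, and the -0.5 threshold are all assumptions made for illustration; the commit message does not state them.

```python
def clip(x, lo=-1.0, hi=1.0):
    """Clamp x to [lo, hi]."""
    return max(lo, min(hi, x))

def normalized_terms(forward_vel, approach_delta, max_delta, tail_raw, action):
    """Scale raw reward terms by the divisors from the commit message."""
    n_act = len(action)
    return {
        "forward_vel": clip(forward_vel / 8.0),
        "approach": clip(approach_delta / max_delta),
        "tail": clip(tail_raw / 10.0),
        # Assumed form: mean squared action, applied as a penalty.
        "energy": -clip(sum(a * a for a in action) / n_act, 0.0, 1.0),
    }

def is_nosedive(forward_z, threshold=-0.5):
    """True when the body's forward axis points too far downward.

    forward_z is taken to be the world-frame z component of the forward
    unit vector; the threshold value here is hypothetical.
    """
    return forward_z < threshold
```

For example, with 21 actuators all saturated at 1.0 the energy term is -1.0, and a forward speed of 4.0 contributes 0.5, so each term stays on the same [-1, 1] scale before any weights are applied.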

Tests updated for new action/observation space dimensions.

https://claude.ai/code/session_01CK5BXeF1FvQbRJjjRvZnvM
kuds merged commit 214bf9c into main on Feb 23, 2026
6 checks passed