This repository contains an implementation of a learning-based motion planning and control system for a ground robot with independently steered wheels. The project was developed as a technical assignment and focuses on correct modeling, stable learning, and clear visualization of results.
The solution demonstrates:
- A kinematic robot model with curvature–velocity control
- Training of a policy from scratch using reinforcement learning
- Visualization of robot behavior using MCAP
- Quantitative evaluation of navigation performance
The robot is modeled as:
- A rectangular rigid body
- Four independently steered and driven wheels
- Bicycle-style kinematic abstraction
- Instantaneous Center of Rotation (ICR) constrained to the robot’s y-axis
- Explicit limits on steering rate and wheel angular acceleration
The kinematic model is implemented in model.py.
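For orientation, the sketch below shows what such a model can look like. It is illustrative only: the names (KinematicState, step, wheel_steering_angles) are invented for this example rather than taken from model.py. It integrates the curvature–velocity body update and derives per-wheel steering angles from an ICR constrained to the body y-axis; the steering-rate and wheel-acceleration limits are omitted for brevity.

```python
import math
from dataclasses import dataclass

@dataclass
class KinematicState:
    x: float      # world position [m]
    y: float      # world position [m]
    theta: float  # heading [rad]

def step(state: KinematicState, kappa: float, v: float, dt: float) -> KinematicState:
    """One Euler step of the bicycle-style curvature-velocity model.

    Yaw rate is v * kappa; v >= 0 because reverse motion is disabled.
    """
    x = state.x + v * math.cos(state.theta) * dt
    y = state.y + v * math.sin(state.theta) * dt
    theta = state.theta + v * kappa * dt
    return KinematicState(x, y, theta)

def wheel_steering_angles(kappa: float, half_length: float, half_width: float) -> list[float]:
    """Per-wheel steering angles for an ICR on the robot's y-axis.

    The ICR sits at (0, 1/kappa) in the body frame; each wheel is steered
    perpendicular to the line joining it to the ICR. Wheel order:
    front-left, front-right, rear-left, rear-right.
    """
    if abs(kappa) < 1e-9:
        return [0.0, 0.0, 0.0, 0.0]  # straight-line motion
    r = 1.0 / kappa  # signed lateral ICR offset
    mounts = [( half_length,  half_width), ( half_length, -half_width),
              (-half_length,  half_width), (-half_length, -half_width)]
    return [math.atan2(lx, r - ly) for lx, ly in mounts]
```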
Two environments are provided.

env_single.py:
- One robot navigating to a target from random initial poses
- Continuous state and continuous action spaces
- Used to validate dynamics and learning stability
env_multi.py:
- Three robots, each with its own target
- A single shared policy controls all robots
- Observations include relative target information and inter-robot geometry (see the sketch after this list)
- Designed to test coordination and scalable control
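As a rough illustration of this observation layout, the snippet below assembles one robot's observation in its body frame. The function name, feature ordering, and inclusion of distances are assumptions for this example; the exact vector built in env_multi.py may differ.

```python
import numpy as np

def robot_observation(pose, target, other_poses):
    """Build one robot's observation (illustrative layout).

    pose: (x, y, theta); target: (x, y); other_poses: list of (x, y, theta).
    """
    x, y, theta = pose
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, s], [-s, c]])  # world -> body rotation

    # Relative target information, expressed in the robot's body frame.
    to_target = rot @ (np.asarray(target) - np.array([x, y]))
    obs = [to_target[0], to_target[1], np.linalg.norm(to_target)]

    # Inter-robot geometry: body-frame offsets to the other robots.
    for ox, oy, _ in other_poses:
        rel = rot @ np.array([ox - x, oy - y])
        obs.extend([rel[0], rel[1], np.linalg.norm(rel)])
    return np.asarray(obs, dtype=np.float32)
```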
Each robot is controlled by:
- curvature κ
- forward velocity v
The action space is continuous. Reverse motion is disabled to simplify learning and improve stability.
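A minimal sketch of what this action space could look like in a Gymnasium-style environment (the framework choice is an assumption, and the bounds KAPPA_MAX and V_MAX are placeholders, not values from this repository):

```python
import numpy as np
from gymnasium import spaces

# Illustrative bounds; KAPPA_MAX and V_MAX are placeholders.
KAPPA_MAX = 2.0   # [1/m] maximum path curvature
V_MAX = 1.5       # [m/s] maximum forward velocity

# Continuous (kappa, v) actions; v >= 0 since reverse motion is disabled.
action_space = spaces.Box(
    low=np.array([-KAPPA_MAX, 0.0], dtype=np.float32),
    high=np.array([KAPPA_MAX, V_MAX], dtype=np.float32),
)
```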
The reward function encourages:
- Continuous progress toward the target
- Smooth steering behavior
- Successful goal reaching
Each robot receives an individual reward upon reaching its target, and a terminal bonus is given when all robots reach their goals.
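One plausible shaping that matches this description, with invented coefficients and signature (the actual terms in env_single.py and env_multi.py may differ):

```python
def reward(prev_dist, dist, kappa, prev_kappa, reached, all_reached):
    """Illustrative reward shaping; all coefficients are placeholders.

    - progress:   positive when the robot gets closer to its target
    - smoothness: penalize rapid changes in commanded curvature
    - bonuses:    per-robot goal bonus plus a terminal team bonus
    """
    r = 1.0 * (prev_dist - dist)         # continuous progress toward the target
    r -= 0.1 * abs(kappa - prev_kappa)   # smooth steering behavior
    if reached:
        r += 10.0                        # individual goal bonus
    if all_reached:
        r += 50.0                        # terminal bonus: all robots at their goals
    return r
```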
Single-robot training:

```
python train_single.py
```

Multi-robot training:

```
python train_multi.py
```

Evaluation runs multiple episodes and reports the following metrics (a short aggregation sketch follows the list):
- Success rate
- Average number of steps per episode
- Final distance to targets
- Diagnostic spacing metrics in the multi-robot case
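A small sketch of how such per-episode results might be aggregated; the record layout is assumed for illustration and is not necessarily the format produced by the evaluation scripts:

```python
import numpy as np

def summarize(episodes):
    """Aggregate per-episode records into summary metrics.

    episodes: list of dicts with keys 'success' (bool), 'steps' (int),
    and 'final_dist' (float) -- an assumed layout for this example.
    """
    success = np.array([e["success"] for e in episodes])
    steps = np.array([e["steps"] for e in episodes])
    final = np.array([e["final_dist"] for e in episodes])
    return {
        "success_rate": success.mean(),
        "avg_steps": steps.mean(),
        "avg_final_dist": final.mean(),
    }
```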
Single-robot evaluation:

```
python eval_single.py
```

Multi-robot evaluation:

```
python eval_multi.py
```

Evaluation scripts can optionally generate performance plots.
Robot behavior is visualized using MCAP, compatible with tools such as Foxglove Studio.
The visualizations show:
- Robot body and wheel geometry
- Steering configuration
- Target locations
- Executed trajectories
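For reference, here is a minimal example of writing a JSON-encoded pose stream with the mcap Python package; the topic name, schema, and message layout are illustrative and need not match what log_mcap.py produces:

```python
import json
import time

from mcap.writer import Writer

with open("rollout.mcap", "wb") as f:
    writer = Writer(f)
    writer.start()
    # Register a simple JSON schema for a 2D pose.
    schema_id = writer.register_schema(
        name="RobotPose",
        encoding="jsonschema",
        data=json.dumps({
            "type": "object",
            "properties": {
                "x": {"type": "number"},
                "y": {"type": "number"},
                "theta": {"type": "number"},
            },
        }).encode(),
    )
    channel_id = writer.register_channel(
        schema_id=schema_id,
        topic="/robot/pose",          # illustrative topic name
        message_encoding="json",
    )
    # Log a single pose message; in a rollout this runs once per step.
    t_ns = time.time_ns()
    writer.add_message(
        channel_id=channel_id,
        log_time=t_ns,
        publish_time=t_ns,
        data=json.dumps({"x": 0.0, "y": 0.0, "theta": 0.0}).encode(),
    )
    writer.finish()
```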
Single-robot rollout:

```
python rollout_single_to_mcap.py
```

Multiple single-robot episodes:

```
python rollout_single_10eps_to_mcap.py
```

Multi-robot rollout:

```
python rollout_multi_to_mcap.py
```

The generated .mcap files can be opened directly in Foxglove Studio.
Install dependencies:

```
pip install -r requirements.txt
```

Train a policy:

```
python train_single.py
# or
python train_multi.py
```

Evaluate the trained policy:

```
python eval_single.py
# or
python eval_multi.py
```

Generate visualization:

```
python rollout_multi_to_mcap.py
```

All experiments start from random initialization and do not rely on pre-trained models.
```
.
├── callbacks.py
├── env_single.py
├── env_multi.py
├── eval_single.py
├── eval_multi.py
├── log_mcap.py
├── model.py
├── rollout_single_to_mcap.py
├── rollout_single_10eps_to_mcap.py
├── rollout_multi_to_mcap.py
├── train_single.py
├── train_multi.py
├── requirements.txt
└── README.md
```
- MCAP visualizations demonstrate learned behaviors.
- Hyperparameters and reward shaping were chosen to prioritize stability and interpretability.