Conversation


@michaelmohamed commented Dec 30, 2025

Description

For technical implementation details, see the docs.

This PR adds optional keypoint/pose estimation support to RF-DETR, following YOLOv11's approach for pose estimation. The implementation outputs (x, y, visibility) triplets per keypoint per detection, with full support for COCO-style 17-keypoint annotations.

Key features:

  • Fully configurable number of keypoints, names, and skeleton connections
  • Each keypoint outputs (x, y, visibility) like YOLOv11
  • Supports COCO-style keypoint annotations for training
  • Optional/configurable (like existing segmentation head) - no impact on existing detection/segmentation workflows
  • Uses detection weights (rf-detr-medium.pth) as default starting point for training
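The (x, y, visibility) output layout described above can be sketched with NumPy. This is a shape illustration only; the array contents are invented for the example, not actual model output:

```python
import numpy as np

# One detection (N = 1) with K = 17 COCO keypoints, each an
# (x, y, visibility) triplet, giving a [N, K, 3] array.
K = 17
keypoints = np.zeros((1, K, 3))
keypoints[0, 0] = [320.5, 140.2, 2.0]  # e.g. "nose": x, y, visibility (2 = visible)

xy = keypoints[..., :2]   # pixel coordinates, shape [N, K, 2]
vis = keypoints[..., 2]   # COCO visibility flags, shape [N, K]
print(xy.shape, vis.shape)  # (1, 17, 2) (1, 17)
```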

Related issues:

Addresses community requests for pose estimation support in RF-DETR.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested? Please provide a test case or example of how you tested the change.

29 unit tests covering both training and inference pipelines:

```shell
pytest tests/test_keypoint_head.py -v
```

29 passed in 2.28s

Test coverage includes:

  • KeypointHead module (forward pass, gradient flow, reference boxes, custom keypoints)
  • COCO constants validation (names, skeleton, sigmas, flip pairs)
  • Config classes (RFDETRPoseConfig, KeypointTrainConfig)
  • Training data pipeline (keypoint extraction, hflip/crop transforms)
  • Loss functions (L1, BCE visibility, OKS)
  • Inference pipeline (predict() returns keypoints, PostProcess, coordinate scaling, visibility sigmoid)

Any specific deployment considerations

  • No pretrained pose weights yet - Users must fine-tune on a keypoint dataset (e.g., COCO-Pose). The model loads detection weights by default and the keypoint head is learned during training.
  • Fully optional - The KeypointHead import is conditional; existing detection/segmentation workflows are unaffected.
  • Memory - Adds minimal overhead (~1-2% parameters) when keypoint_head=True.

Docs

  • Added docs/learn/run/pose.md - Complete usage guide for pose estimation
  • Added docs/reference/pose.md - API reference for RFDETRPose
  • Updated docs/learn/train/index.md - Added pose tabs to all training examples
  • Updated mkdocs.yaml - Added navigation entries for pose documentation


Files Changed

| File | Change |
| --- | --- |
| `rfdetr/models/keypoint_head.py` | New - `KeypointHead` class with coordinate/visibility MLPs, COCO constants |
| `rfdetr/config.py` | Added `RFDETRPoseConfig`, `KeypointTrainConfig`, keypoint fields to `ModelConfig` |
| `rfdetr/models/lwdetr.py` | Integrated keypoint head, added `loss_keypoints()`, updated `PostProcess` |
| `rfdetr/datasets/coco.py` | Added keypoint annotation parsing in `ConvertCoco` |
| `rfdetr/datasets/transforms.py` | Updated `crop()` and `hflip()` for keypoint transformations |
| `rfdetr/detr.py` | Added `RFDETRPose` class, updated `predict()` for keypoints |
| `rfdetr/__init__.py` | Exported `RFDETRPose` |
| `rfdetr/engine.py` | Added COCO keypoint evaluation support |
| `tests/test_keypoint_head.py` | New - 29 tests for training and inference |
| `tests/conftest.py` | New - pytest configuration |
| `pyproject.toml` | Added pytest configuration |
| `docs/learn/run/pose.md` | New - pose usage documentation |
| `docs/reference/pose.md` | New - `RFDETRPose` API reference |
| `docs/learn/train/index.md` | Added pose training examples |
| `mkdocs.yaml` | Added pose navigation entries |

Usage / Import Example

```python
from rfdetr import RFDETRPose
```

Training

```python
# Select a model variant
model = RFDETRPose()
model = RFDETRPoseNano()
model = RFDETRPoseSmall()
model = RFDETRPoseMedium()
model = RFDETRPoseLarge()

model.train(dataset_dir="path/to/coco-pose", epochs=50)
```
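For reference, the COCO-style keypoint annotations the training pipeline parses follow the standard COCO format. A trimmed sketch (showing 2 of the usual 17 keypoints, with illustrative values):

```python
# Minimal COCO-keypoints annotation entry (standard COCO format; values
# are illustrative). "keypoints" is a flat [x1, y1, v1, x2, y2, v2, ...]
# list with v in {0: not labeled, 1: labeled but occluded, 2: visible}.
# A full COCO-Pose entry would carry 17 keypoints (51 numbers).
annotation = {
    "id": 1,
    "image_id": 42,
    "category_id": 1,                 # "person" in COCO
    "bbox": [100.0, 50.0, 80.0, 200.0],
    "num_keypoints": 2,
    "keypoints": [120.0, 60.0, 2,     # nose: visible
                  130.0, 80.0, 1],    # left_eye: labeled but occluded
}
assert len(annotation["keypoints"]) % 3 == 0
```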

Inference

```python
model = RFDETRPose(pretrain_weights="output/checkpoint_best_total.pth")
detections = model.predict("image.jpg", threshold=0.5)

# Access keypoints: [N, K, 3] where K = num_keypoints, 3 = (x, y, visibility)
keypoints = detections.data["keypoints"]

# Visibility follows COCO format: 0 = not visible, 2 = visible
# For raw confidence scores (0.0-1.0):
confidence = detections.data["keypoints_confidence"]
```
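A common follow-up step is to suppress low-confidence keypoints before drawing or exporting them. A minimal NumPy sketch, assuming the `[N, K, 3]` layout above; the `filter_keypoints` helper and the 0.5 threshold are illustrative choices, not part of the PR's API:

```python
import numpy as np

def filter_keypoints(keypoints, confidence, thresh=0.5):
    """Zero out keypoints whose confidence is below thresh.

    keypoints:  [N, K, 3] array of (x, y, visibility) triplets
    confidence: [N, K] array of raw scores in [0, 1]
    """
    out = keypoints.copy()
    low = confidence < thresh   # [N, K] boolean mask
    out[low] = 0.0              # drop x, y, and visibility together
    return out

# Toy input: one detection with two keypoints, the second low-confidence.
kps = np.array([[[10.0, 20.0, 2.0], [30.0, 40.0, 2.0]]])  # [1, 2, 3]
conf = np.array([[0.9, 0.1]])
filtered = filter_keypoints(kps, conf)
print(filtered[0, 1])  # [0. 0. 0.]
```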

Fix: Category ID Mapping for COCO Datasets

Problem

#330, #349, #413

When training on Roboflow or custom COCO datasets where category_id starts at 1 (or has gaps), the model would crash with a CUDA index out of bounds error. This happened because:

  • Roboflow exports use 1-indexed category IDs (e.g., [1, 2, 3])
  • The model expects 0-indexed class labels (e.g., [0, 1, 2])
  • With num_classes=3 and category_id=3, accessing index 3 in a size-3 tensor fails

Additionally, COCO evaluation was returning near-zero mAP scores because predictions used 0-indexed labels but COCO evaluation expected original category IDs.
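The failure mode is easy to reproduce outside the model. A minimal sketch with NumPy (on CPU, NumPy raises an `IndexError` where CUDA reports an index-out-of-bounds assertion):

```python
import numpy as np

# num_classes = 3 gives valid class indices 0..2, but a 1-indexed
# Roboflow category_id of 3 falls outside that range.
num_classes = 3
class_scores = np.zeros(num_classes)

try:
    class_scores[3]            # raw category_id = 3 -> out of bounds
except IndexError as e:
    print("IndexError:", e)

# After remapping {1: 0, 2: 1, 3: 2}, the same category indexes safely.
print(class_scores[2])
```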

Solution

Added automatic bidirectional category ID mapping:

Training (coco.py):

```python
cat_ids = sorted(self.coco.getCatIds())
cat_id_to_continuous = {cat_id: i for i, cat_id in enumerate(cat_ids)}
# [1, 2, 3] → {1: 0, 2: 1, 3: 2}
```

Evaluation (coco_eval.py):

```python
continuous_to_cat_id = {i: cat_id for i, cat_id in enumerate(cat_ids)}
# {0: 1, 1: 2, 2: 3} → converts predictions back for COCO metrics
```

How It Works

| Dataset `category_id`s | Training mapping | Eval reverse mapping |
| --- | --- | --- |
| `[0, 1, 2]` | `{0:0, 1:1, 2:2}` (identity) | `{0:0, 1:1, 2:2}` (identity) |
| `[1, 2, 3]` | `{1:0, 2:1, 3:2}` | `{0:1, 1:2, 2:3}` |
| `[1, 5, 10]` | `{1:0, 5:1, 10:2}` | `{0:1, 1:5, 2:10}` |
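The table rows can be verified with the same dict comprehensions the fix uses in `coco.py` and `coco_eval.py`. A self-contained round-trip check for the sparse `[1, 5, 10]` case:

```python
# Bidirectional category ID mapping for a sparse, 1-indexed dataset.
cat_ids = sorted([10, 1, 5])  # sorted() makes the mapping deterministic
cat_id_to_continuous = {cat_id: i for i, cat_id in enumerate(cat_ids)}
continuous_to_cat_id = {i: cat_id for i, cat_id in enumerate(cat_ids)}

print(cat_id_to_continuous)  # {1: 0, 5: 1, 10: 2}
print(continuous_to_cat_id)  # {0: 1, 1: 5, 2: 10}

# Round trip: every original category_id survives training + eval remapping.
assert all(continuous_to_cat_id[cat_id_to_continuous[c]] == c for c in cat_ids)
```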

Backwards Compatibility

  • 0-indexed datasets: Identity mapping, behaves exactly as before
  • 1-indexed datasets: Now works correctly instead of crashing
  • Prediction: Returns 0-indexed labels for direct class_names indexing

Files Changed

  • rfdetr/datasets/coco.py - Added cat_id_to_continuous mapping in data loading
  • rfdetr/datasets/coco_eval.py - Added continuous_to_cat_id reverse mapping for evaluation
  • docs/learn/train/index.md - Added documentation for category ID handling

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Michael Mohamed does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Already signed the CLA but the status is still pending? Let us recheck it.

@michaelmohamed michaelmohamed marked this pull request as draft December 31, 2025 15:48
@michaelmohamed michaelmohamed marked this pull request as ready for review January 2, 2026 05:57