@xlyk commented Jan 27, 2026

Summary

  • Fix image handling for OpenAI reasoning models (o1, o3, o4, gpt-5)
  • Images were being silently dropped when sending messages to reasoning models
  • Add convert_user_content() function to properly handle multimodal content

Problem

The extract_user_text() function in rig-openai-responses only handled Text and ToolResult content types, ignoring Image content entirely. This meant images sent to reasoning models were silently dropped.
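For illustration, the old code had roughly this shape; the enum below is a simplified stand-in for rig's actual message types, not the real definition:

```rust
// Simplified stand-in for rig's user content variants (illustrative only).
enum UserContent {
    Text(String),
    ToolResult(String),
    Image(Vec<u8>),
}

// Old behavior: only Text and ToolResult are collected into a plain string,
// so Image content falls through the catch-all arm and never reaches the API.
fn extract_user_text(content: &[UserContent]) -> String {
    content
        .iter()
        .filter_map(|c| match c {
            UserContent::Text(t) => Some(t.clone()),
            UserContent::ToolResult(r) => Some(r.clone()),
            _ => None, // images are silently dropped here
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```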

Solution

  • Added a new convert_user_content() function that handles all content types, including images (sketched after this list)
  • Converts rig's UserContent::Image to async-openai's InputImageContent format
  • Uses EasyInputContent::ContentList when images are present (required by the OpenAI Responses API for multimodal content)
  • Added the base64 dependency for encoding raw image bytes
  • Added unit tests for text-only and image content conversion
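A minimal sketch of the new conversion path, using simplified stand-in types rather than the actual rig and async-openai definitions (the real code targets async-openai's EasyInputContent and InputImageContent):

```rust
// Simplified stand-ins, not the real rig / async-openai types (illustrative only).
enum UserContent {
    Text(String),
    ToolResult(String),
    Image { url: String }, // already resolved to a remote URL or data: URL
}

enum InputContent {
    InputText { text: String },
    InputImage { image_url: String },
}

enum EasyInputContent {
    Text(String),
    ContentList(Vec<InputContent>),
}

fn convert_user_content(content: &[UserContent]) -> EasyInputContent {
    // Fast path: no images, keep the simple string form used before.
    if !content.iter().any(|c| matches!(c, UserContent::Image { .. })) {
        let text = content
            .iter()
            .filter_map(|c| match c {
                UserContent::Text(t) | UserContent::ToolResult(t) => Some(t.as_str()),
                _ => None,
            })
            .collect::<Vec<_>>()
            .join("\n");
        return EasyInputContent::Text(text);
    }

    // Multimodal path: the Responses API expects a content list when images are present.
    let parts = content
        .iter()
        .map(|c| match c {
            UserContent::Text(t) | UserContent::ToolResult(t) => {
                InputContent::InputText { text: t.clone() }
            }
            UserContent::Image { url } => InputContent::InputImage {
                image_url: url.clone(),
            },
        })
        .collect();
    EasyInputContent::ContentList(parts)
}
```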

Affected Models

All OpenAI reasoning models now properly support images:

  • o1, o1-preview, o1-mini
  • o3, o3-mini
  • o4-mini
  • gpt-5, gpt-5.1, gpt-5.2, gpt-5-mini, gpt-5-nano

Test Plan

  • Unit tests pass (cargo test -p rig-openai-responses); a representative case is sketched after this list
  • Clippy passes with no warnings
  • E2E tests pass (119 tests)
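Building on the stand-in types from the conversion sketch above, the text-only unit test might take roughly this shape (illustrative only, not the actual test code):

```rust
#[cfg(test)]
mod tests {
    use super::*;

    // Text-only input should stay in the simple string form, not a content list.
    #[test]
    fn text_only_content_stays_a_plain_string() {
        let content = vec![UserContent::Text("hello".into())];
        match convert_user_content(&content) {
            EasyInputContent::Text(t) => assert_eq!(t, "hello"),
            _ => panic!("expected the simple text form for text-only input"),
        }
    }
}
```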

Convert rig Image types to OpenAI InputImageContent format, handling base64, URL, and raw byte sources with proper media type and detail level mapping.
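As a sketch of that mapping, under a hypothetical image-source shape rather than rig's real one, the three source kinds and the detail level could be handled like this:

```rust
use base64::Engine as _;
use base64::engine::general_purpose::STANDARD as BASE64;

// Hypothetical source and detail shapes for the sketch; rig's real types differ.
enum ImageSource {
    Base64 { data: String, media_type: String },
    Url(String),
    Raw { bytes: Vec<u8>, media_type: String },
}

enum Detail {
    Low,
    High,
    Auto,
}

// Remote URLs pass through unchanged; base64 and raw-byte sources become data: URLs,
// with raw bytes encoded via the base64 crate added in this PR.
fn image_source_to_url(source: &ImageSource) -> String {
    match source {
        ImageSource::Url(url) => url.clone(),
        ImageSource::Base64 { data, media_type } => {
            format!("data:{media_type};base64,{data}")
        }
        ImageSource::Raw { bytes, media_type } => {
            format!("data:{media_type};base64,{}", BASE64.encode(bytes))
        }
    }
}

// Detail levels map to the strings the API accepts.
fn detail_to_str(detail: Detail) -> &'static str {
    match detail {
        Detail::Low => "low",
        Detail::High => "high",
        Detail::Auto => "auto",
    }
}
```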
@xlyk merged commit bff0596 into main Jan 28, 2026
3 checks passed