feat: XML tool call fallback parser for non-native tool calling models#11288
feat: XML tool call fallback parser for non-native tool calling models#11288roomote[bot] wants to merge 1 commit intomainfrom
Conversation
…odels When a model does not support native function/tool calling (common with some OpenAI-compatible proxies), it outputs tool calls as XML text instead of structured tool_call events. The extension previously rejected these with a "Model Response Incomplete" error. This commit adds a fallback parser that detects XML-formatted tool calls in text responses (e.g. <read_file><path>...</path></read_file>) and converts them to ToolUse blocks, allowing the tools to execute normally. The fallback only activates when no native tool_use blocks are found in the assistant message, ensuring zero impact on providers that already support native function calling. Closes #11187
The XML fallback parser itself is well-implemented and tested, but the integration in
Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues. |
| // XML tool call fallback: When the model doesn't support native function calling | ||
| // (common with some OpenAI-compatible proxies), it may output tool calls as XML | ||
| // text. If we have text content but no native tool uses, try to parse XML tool | ||
| // calls from the text as a fallback. (See: GitHub issue #11187) | ||
| if (hasTextContent && !hasToolUses) { | ||
| const fallbackResult = parseXmlToolCalls(assistantMessage) | ||
| if (fallbackResult.found) { | ||
| console.log( | ||
| `[Task#${this.taskId}] XML tool call fallback: parsed ${fallbackResult.toolUses.length} tool(s) from text`, | ||
| ) | ||
|
|
||
| // Replace the text block(s) with the parsed tool uses | ||
| // Keep only non-text blocks (if any) from the original content | ||
| this.assistantMessageContent = this.assistantMessageContent.filter( | ||
| (block) => block.type !== "text", | ||
| ) | ||
|
|
||
| // Add the parsed tool uses | ||
| for (const toolUse of fallbackResult.toolUses) { | ||
| this.assistantMessageContent.push(toolUse) | ||
| } | ||
|
|
||
| // Present each tool call so they are processed by presentAssistantMessage | ||
| this.userMessageContentReady = false | ||
| for (const toolUse of fallbackResult.toolUses) { | ||
| presentAssistantMessage(this) | ||
| } | ||
|
|
||
| // Re-check hasToolUses now that we've added fallback-parsed tools | ||
| hasToolUses = this.assistantMessageContent.some( | ||
| (block) => block.type === "tool_use" || block.type === "mcp_tool_use", | ||
| ) | ||
| } | ||
| } |
There was a problem hiding this comment.
This fallback code runs after didCompleteReadingStream = true, but during streaming, presentAssistantMessage is called for each text chunk (line 3050). Inside presentAssistantMessage, the containsXmlToolMarkup check (line 297 of presentAssistantMessage.ts) will detect XML tool tags in the text content and immediately reject them with an error ("XML tool calls are no longer supported"), incrementing consecutiveMistakeCount and setting didAlreadyUseTool = true. This all fires before the stream completes and before this fallback code ever runs, so the fallback parser will never get a chance to intercept the XML tool calls. The containsXmlToolMarkup guard needs to be disabled or bypassed when the fallback path is intended to handle these cases.
Fix it with Roo Code or mention @roomote and request a fix.
| // Replace the text block(s) with the parsed tool uses | ||
| // Keep only non-text blocks (if any) from the original content | ||
| this.assistantMessageContent = this.assistantMessageContent.filter( | ||
| (block) => block.type !== "text", | ||
| ) | ||
|
|
||
| // Add the parsed tool uses | ||
| for (const toolUse of fallbackResult.toolUses) { | ||
| this.assistantMessageContent.push(toolUse) | ||
| } |
There was a problem hiding this comment.
After replacing assistantMessageContent (filtering out text blocks and pushing new tool_use blocks), currentStreamingContentIndex is not reset to 0. This index was already incremented past all original blocks during streaming. When presentAssistantMessage is called at line 3458, it checks currentStreamingContentIndex >= assistantMessageContent.length (line 75 of presentAssistantMessage.ts) and early-returns without processing any of the injected tool_use blocks. A this.currentStreamingContentIndex = 0 is needed before the presentAssistantMessage calls.
Fix it with Roo Code or mention @roomote and request a fix.
| // Present each tool call so they are processed by presentAssistantMessage | ||
| this.userMessageContentReady = false | ||
| for (const toolUse of fallbackResult.toolUses) { | ||
| presentAssistantMessage(this) | ||
| } |
There was a problem hiding this comment.
presentAssistantMessage is async and uses a lock (presentAssistantMessageLocked). Calling it in a loop without await means the first call acquires the lock, and all subsequent calls in the same synchronous iteration see the lock held, set pendingUpdates = true, and return immediately. Additionally, presentAssistantMessage already chains through all content blocks internally (it increments currentStreamingContentIndex and calls itself recursively at line 1010). A single call would be sufficient here; the loop is both redundant and broken due to the missing await.
| // Present each tool call so they are processed by presentAssistantMessage | |
| this.userMessageContentReady = false | |
| for (const toolUse of fallbackResult.toolUses) { | |
| presentAssistantMessage(this) | |
| } | |
| presentAssistantMessage(this) |
Fix it with Roo Code or mention @roomote and request a fix.
Related GitHub Issue
Closes: #11187
Description
This PR attempts to address Issue #11187, where models that do not support native function/tool calling (common with some OpenAI-compatible proxies) fail with "Model Response Incomplete" errors because they output tool calls as XML text instead of structured tool_call events.
Implementation approach:
New fallback parser (
XmlToolCallFallbackParser.ts): Scans text responses for XML-formatted tool calls (e.g.,<read_file><path>...</path></read_file>) and converts them toToolUseblocks that the existing tool execution infrastructure can process.Integration in Task.ts: After the API stream completes, if no native
tool_useblocks are found but text content exists, the fallback parser is invoked. If it finds valid XML tool calls, they are injected intoassistantMessageContentand presented for execution.Key design decisions:
usedLegacyFormat: truefor telemetry tracking.write_file->write_to_file).<environment_details>,<thinking>).Test Procedure
XmlToolCallFallbackParsercovering:cd src && npx vitest run core/assistant-message/__tests__/XmlToolCallFallbackParser.spec.tsPre-Submission Checklist
Documentation Updates
Additional Notes
Feedback and guidance are welcome. This is an initial attempt at addressing the issue -- the approach favors minimal footprint (a standalone parser with zero impact on native tool calling paths) over re-introducing a full "Tool Call Protocol" setting.
Important
Introduces
XmlToolCallFallbackParserto handle XML-formatted tool calls for non-native models, integrated intoTask.tswith comprehensive testing.XmlToolCallFallbackParserto parse XML-formatted tool calls intoToolUseblocks.Task.tsto activate when no nativetool_useblocks are found.usedLegacyFormat: true.XmlToolCallFallbackParser.spec.tsfor various tool types, multi-line content, aliases, and edge cases.Task.ts.This description was created by
for d12f787. You can customize this summary. It will automatically update as commits are pushed.