fix: resolve Windows UTF-8 encoding issue for international characters by eLyiN · Pull Request #3 · eLyiN/codex-bridge

eLyiN · 2025-12-28T10:54:43Z

Summary

Fixes Windows UTF-8 encoding issue where international characters (Chinese, etc.) were corrupted
Improves Windows CLI detection with additional paths and extensions
Enhances error diagnostics with platform-specific messages

Problem

Issue #1 reported that UTF-8 characters passed through MCP were corrupted on Windows. This was caused by the subprocess module using the system's default code page (e.g., cp1252, cp936) instead of UTF-8.
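The corruption described above can be reproduced without Windows. The following is a minimal sketch, not code from this PR: it decodes UTF-8 bytes with cp1252 to show the kind of mojibake a code-page reader produces. The sample string and variable names are illustrative assumptions.

```python
# Minimal sketch (not from the PR): what happens when UTF-8 bytes
# are decoded with a Windows code page instead of UTF-8.

text = "你好"                        # illustrative sample input
raw = text.encode("utf-8")           # bytes a UTF-8 writer puts on the pipe

# A reader that assumes cp1252 maps each byte to its own character,
# so the 2-character string turns into 6 unrelated characters.
garbled = raw.decode("cp1252")

# A reader that decodes explicitly as UTF-8 recovers the original text.
restored = raw.decode("utf-8")
```

The fix therefore has to make both ends of the pipe agree on UTF-8 rather than rely on the locale default.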

Solution

UTF-8 Encoding Fix

Use explicit UTF-8 encoding for subprocess I/O on Windows
Set PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 environment variables
Encode input as UTF-8 bytes and decode output with errors='replace'
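The three steps above could look roughly like the sketch below. It is an assumption-laden illustration, not the body of `_run_codex_command()`: the function name `run_with_utf8`, its parameters, and the echo child process are hypothetical, and the sketch applies the UTF-8 handling on every platform rather than only on Windows.

```python
import os
import subprocess
import sys


def run_with_utf8(cmd, text, timeout=60):
    """Run cmd, sending text as UTF-8 bytes and decoding stdout as UTF-8."""
    env = os.environ.copy()
    # Ask Python-based child processes to use UTF-8 for their own I/O.
    env["PYTHONUTF8"] = "1"
    env["PYTHONIOENCODING"] = "utf-8"

    completed = subprocess.run(
        cmd,
        input=text.encode("utf-8"),   # bytes in, so no locale encoding is applied
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
        timeout=timeout,
    )
    # errors="replace" keeps the call from raising on stray invalid bytes.
    return completed.stdout.decode("utf-8", errors="replace")


# Hypothetical stand-in for the codex CLI: a child that echoes stdin to stdout.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
```

Calling `run_with_utf8(ECHO_CMD, "你好, codex")` should return the input unchanged. Passing bytes and decoding manually is one option; `subprocess.run(..., encoding="utf-8", errors="replace")` is an equivalent alternative.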

Enhanced Windows CLI Detection

Added .ps1 to Windows executable extensions
Added fallback checks for common Windows installation paths:
- %LOCALAPPDATA%\Programs\codex\codex.exe
- %APPDATA%\npm\codex.cmd
- %USERPROFILE%\.cargo\bin\codex.exe
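A fallback lookup over these three locations might be sketched as follows. This is not the PR's `_get_codex_command()`; the function names, the injectable `exists` callback, and the lookup order are assumptions made for illustration. Only the three paths come from the list above.

```python
import os
from pathlib import PureWindowsPath

# (environment variable, path relative to it), taken from the PR description.
FALLBACK_TEMPLATES = [
    ("LOCALAPPDATA", r"Programs\codex\codex.exe"),
    ("APPDATA", r"npm\codex.cmd"),
    ("USERPROFILE", r".cargo\bin\codex.exe"),
]


def candidate_codex_paths(env):
    """Expand the fallback templates against an environment mapping."""
    paths = []
    for var, relative in FALLBACK_TEMPLATES:
        base = env.get(var)
        if base:  # skip variables that are unset or empty
            paths.append(str(PureWindowsPath(base) / relative))
    return paths


def find_codex(env=None, exists=os.path.isfile):
    """Return the first fallback path that exists, or None."""
    env = os.environ if env is None else env
    for path in candidate_codex_paths(env):
        if exists(path):
            return path
    return None
```

In practice a check like this would presumably run only after a normal PATH lookup (for example `shutil.which("codex")`) has failed.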

Improved Error Diagnostics

Added FileNotFoundError specific handling
Error responses now include platform and exception_type fields
Clear instructions for Windows users to verify installation
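As a rough illustration of the diagnostics described above, the sketch below builds an error response that carries `platform` and `exception_type` and adds a Windows-specific hint for `FileNotFoundError`. The function name and the message wording are hypothetical. Only the two metadata field names come from this PR.

```python
import platform


def build_error_response(exc):
    """Turn an exception into an error dict with platform diagnostics."""
    system = platform.system()
    if isinstance(exc, FileNotFoundError):
        message = "Codex CLI not found."
        if system == "Windows":
            # Hypothetical guidance text; the PR's actual wording may differ.
            message += " Verify the installation, e.g. by running `where codex`."
    else:
        message = f"Error executing query: {exc}"

    return {
        "status": "error",
        "error": message,
        "metadata": {
            "platform": system,
            "exception_type": type(exc).__name__,
        },
    }
```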

Test Plan

Python syntax validation passes (python -m py_compile)
All version numbers updated to 1.2.3
CHANGELOG.md updated with v1.2.2 and v1.2.3 entries
Windows user verification (requires feedback from @bianoengpeng)

Files Changed

src/mcp_server.py - Core fix for UTF-8 encoding and error handling
src/__init__.py - Version bump to 1.2.3
pyproject.toml - Version bump to 1.2.3
CHANGELOG.md - Added v1.2.2 and v1.2.3 entries

Closes #1

Fixes issue #1 where UTF-8 characters (Chinese, etc.) were corrupted when passed through MCP on Windows systems.

Changes:
- Use explicit UTF-8 encoding for subprocess I/O on Windows instead of system default code page (cp1252, cp936, etc.)
- Set PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 environment variables
- Encode input as UTF-8 bytes and decode output with error handling
- Add .ps1 to Windows executable extensions
- Add fallback checks for common Windows installation paths
- Improve error diagnostics with platform-specific messages
- Add FileNotFoundError handling with actionable guidance

Tested:
- Python syntax validation passes
- All version numbers updated to 1.2.3
- Changelog updated with v1.2.2 and v1.2.3 entries

Resolves: #1

Addresses PR review feedback: the detected codex path (.ps1, .exe, etc.) was not being used in the actual subprocess execution; it was only used for the pre-flight check.

Changes:
- Add _build_codex_exec_command() helper that returns the proper command list based on the detected executable type
- PowerShell scripts (.ps1) are now executed via: powershell -ExecutionPolicy Bypass -File <path> exec
- Windows .exe/.bat/.cmd files use the resolved absolute path
- All three MCP tools now use _build_codex_exec_command() instead of hardcoded ["codex", "exec"]
- Updated CHANGELOG with the new helper function

This ensures that when Codex is installed as a PowerShell script, it will actually be executed correctly instead of failing with FileNotFoundError.
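The dispatch this commit describes could be sketched as below. The function is a guess at the shape of `_build_codex_exec_command()`, not its actual body: the signature, the `None` fallback, and the use of `PureWindowsPath` for suffix detection are assumptions. The PowerShell argument list is the one quoted in the commit message.

```python
from pathlib import PureWindowsPath


def build_codex_exec_command(codex_path):
    """Return the argv prefix for `codex exec` given the detected executable."""
    if codex_path is None:
        # Nothing was resolved: fall back to a bare PATH lookup.
        return ["codex", "exec"]

    suffix = PureWindowsPath(codex_path).suffix.lower()
    if suffix == ".ps1":
        # PowerShell scripts cannot be launched directly as executables.
        return [
            "powershell", "-ExecutionPolicy", "Bypass",
            "-File", codex_path, "exec",
        ]

    # .exe, .bat, .cmd, or an extensionless binary: run the resolved path.
    return [codex_path, "exec"]
```

Each tool would then append its own arguments to the returned list before passing it to `subprocess.run`.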

Copilot

Pull request overview

This pull request fixes a Windows UTF-8 encoding issue where international characters (Chinese, etc.) were corrupted when using the MCP server on Windows. The fix explicitly sets UTF-8 encoding for subprocess I/O and adds enhanced Windows CLI detection with additional installation paths and PowerShell support.

Implements UTF-8 encoding for subprocess I/O on Windows to replace system default code page handling
Enhances Windows CLI detection by adding .ps1 extension support and checking common installation paths
Improves error diagnostics with FileNotFoundError handling and platform-specific metadata

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File	Description
src/mcp_server.py	Core UTF-8 encoding fix in `_run_codex_command()`, enhanced Windows CLI detection in `_get_codex_command()`, new `_build_codex_exec_command()` helper, and improved FileNotFoundError handling
src/__init__.py	Version bump from 1.2.2 to 1.2.3 with updated description
pyproject.toml	Version bump from 1.2.2 to 1.2.3
CHANGELOG.md	Added entries for v1.2.3 and v1.2.2 with detailed changelog of UTF-8 fixes and Windows compatibility improvements

Comments suppressed due to low confidence (1)

src/mcp_server.py:699

The general Exception handler in the batch function is missing the exception_type and platform fields in its metadata, which are included in the same handler for consult_codex (line 420) and consult_codex_with_stdin (line 555). For consistency and better debugging, the metadata should include these fields: "platform": platform.system() and "exception_type": type(e).__name__.

        except Exception as e:
            results.append({
                "status": "error",
                "index": i,
                "query": query[:100] + "..." if len(query) > 100 else query,
                "error": f"Error executing query: {str(e)}",
                "metadata": {}
            })
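One way to apply this suggestion is sketched below. The helper name `batch_error_entry` is hypothetical and the entry is built outside the loop for clarity. The dict keys and the 100-character query preview mirror the snippet above, with the two metadata fields from the review comment added.

```python
import platform


def batch_error_entry(index, query, exc):
    """Build the per-query error entry with the suggested metadata fields."""
    # Same truncation as the original handler: first 100 characters plus "...".
    preview = query[:100] + "..." if len(query) > 100 else query
    return {
        "status": "error",
        "index": index,
        "query": preview,
        "error": f"Error executing query: {exc}",
        "metadata": {
            "platform": platform.system(),
            "exception_type": type(exc).__name__,
        },
    }
```

Inside the batch loop, the handler would then reduce to `results.append(batch_error_entry(i, query, e))`.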

eLyiN added 2 commits December 28, 2025 11:52

eLyiN requested a review from Copilot December 28, 2025 11:05

Copilot started reviewing on behalf of eLyiN December 28, 2025 11:05

Copilot AI reviewed Dec 28, 2025

eLyiN merged commit 0128dee into main Dec 28, 2025
14 checks passed

eLyiN deleted the fix/windows-utf8-encoding branch December 28, 2025 11:10
