feat(codec): integrate SafeCodec into Python bridge for NaN rejection #159

bbopen · 2026-01-21T18:09:06Z

Summary

Integrate SafeCodec into python_bridge.py for NaN/Infinity rejection at encoding time
Handle numpy scalars via .item() extraction for proper JSON serialization
Handle pandas NaT, Timestamp, and Timedelta types correctly
Convert Python sets to lists for JSON compatibility (improves serialization coverage)
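
The behaviours listed above can be sketched with the standard library alone. This is a minimal illustration and not the code in `safe_codec.py`: the helper names `_to_jsonable` and `encode_strict` are hypothetical, numpy scalars are detected by duck-typing on `.item()` so the sketch runs without numpy installed, sorting the set is an added choice for stable output, and pandas types are left out.

```python
import json


def _to_jsonable(obj):
    """Fallback hook for json.dumps; called only for objects json cannot encode natively."""
    # Sets have no JSON equivalent, so emit a list (sorted when the items are comparable).
    if isinstance(obj, (set, frozenset)):
        try:
            return sorted(obj)
        except TypeError:
            return list(obj)
    # numpy scalars expose .item(), which returns the equivalent Python scalar.
    item = getattr(obj, "item", None)
    if callable(item):
        return item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def encode_strict(value):
    """Serialize to JSON, rejecting NaN and +/-Infinity instead of emitting invalid JSON."""
    # allow_nan=False makes json.dumps raise ValueError on non-finite floats.
    return json.dumps(value, default=_to_jsonable, allow_nan=False)
```

With `allow_nan=True` (the `json.dumps` default) Python would emit the bare tokens `NaN` and `Infinity`, which are not valid JSON; strict mode turns that into an error at the point of encoding.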

Changes

Import SafeCodec and CodecError in python_bridge.py
Create _response_codec instance with allow_nan=False
Replace json.dumps with _response_codec.encode in encode_response()
Update adversarial_module.py to use lambda (truly non-serializable) instead of set (now serializable)
Update test regex to accept new NaN rejection error message
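
A rough sketch of how the changes above fit together. The `SafeCodec` and `CodecError` classes here are simplified stand-ins written for illustration (the real ones live in `runtime/safe_codec.py` and do more); only the names `_response_codec`, `encode_response`, and the `allow_nan=False` setting come from this PR.

```python
import json


class CodecError(Exception):
    """Stand-in for the CodecError defined in runtime/safe_codec.py (assumed shape)."""


class SafeCodec:
    """Simplified stand-in; the real codec also handles numpy and pandas values."""

    def __init__(self, allow_nan=False):
        self.allow_nan = allow_nan

    def encode(self, value):
        try:
            return json.dumps(value, allow_nan=self.allow_nan)
        except (TypeError, ValueError) as exc:
            # Wrap both "not serializable" and "non-finite float" failures.
            raise CodecError(f"Cannot serialize value: {exc}") from exc


# Module-level codec instance, configured to reject NaN/Infinity.
_response_codec = SafeCodec(allow_nan=False)


def encode_response(out):
    """Encode a response, surfacing codec failures as ValueError."""
    try:
        return _response_codec.encode(out)
    except CodecError as exc:
        raise ValueError(str(exc)) from exc
```

In this sketch a `lambda` still fails to encode, which is why the adversarial fixture can use one as its "truly non-serializable" value once sets are accepted.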

Test plan

All 1225 tests pass
Adversarial playground tests pass (40/40)
Build and typecheck pass
Python integration test verifies NaN rejection and numpy scalar handling

Fixes #95, #45, #41

🤖 Generated with Claude Code

Integrate SafeCodec from safe_codec.py into python_bridge.py to:
- Reject NaN/Infinity values at encoding time with clear error messages
- Handle numpy scalars via .item() extraction for JSON serialization
- Handle pandas NaT, Timestamp, and Timedelta types properly
- Convert Python sets to lists for JSON serialization

Changes:
- Import SafeCodec and CodecError in python_bridge.py
- Create _response_codec instance with allow_nan=False
- Replace json.dumps with _response_codec.encode in encode_response()
- Update adversarial_module.py to use lambda (truly non-serializable) instead of set (now serializable as list)
- Update test regex to accept new NaN rejection error message

Fixes #95, #45, #41

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai · 2026-01-21T18:09:46Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The Python bridge now encodes responses using a SafeCodec instance configured with allow_nan=False, replacing direct json.dumps usage and converting codec errors into ValueError. Tests and fixtures were updated to reflect new error messages and a non-serializable fixture value.

Changes

Cohort / File(s)	Summary
Core codec integration `runtime/python_bridge.py`	Added `SafeCodec`/`CodecError` import and module-level `_response_codec` (with `allow_nan=False` and large `max_payload_bytes`); replaced `json.dumps(out)` with `_response_codec.encode(out)` and mapped `CodecError` → `ValueError`.
Tests & fixtures `test/adversarial_playground.test.ts`, `test/fixtures/python/adversarial_module.py`	Expanded test assertion regex to accept SafeCodec error messages about NaN; changed `return_unserializable()` return from a set to a `lambda` to ensure non-JSON-serializable behavior.

Sequence Diagram(s)

(Skipped — changes are localized to encoding logic and tests and do not introduce a new multi-component sequential flow that requires visualization.)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

feat(codec): envelope validation + size cap #35: Modifies runtime/python_bridge.py response encoding path; likely overlaps with the SafeCodec integration and payload handling changes.

Poem

🐰 Hop, hop — I guard the stream so neat,
No NaN or Infinity will sneak or bleat.
Bytes get checked, errors politely told,
JSON stays tidy, robust, and bold. 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: integrating SafeCodec into Python bridge for NaN rejection, which is the core objective of the PR.
Description check	✅ Passed	The description is well-related to the changeset, detailing the SafeCodec integration, specific changes made to files, test results, and linked issues.
Linked Issues check	✅ Passed	The PR addresses all requirements from issue `#95`: it integrates SafeCodec to reject NaN/Infinity at encoding time, surfaces clear errors, and includes test updates to validate the new error handling.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the linked issues: SafeCodec integration in python_bridge.py, test updates for NaN rejection, and adversarial module updates for proper test coverage.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e51c0f7 and 09557ce.

📒 Files selected for processing (1)

runtime/python_bridge.py

🧰 Additional context used

🧠 Learnings (5)

📓 Common learnings

Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:168-172
Timestamp: 2026-01-20T16:00:49.738Z
Learning: In the tywrap project's BridgeProtocol SafeCodec implementation, Arrow format decoders can produce NaN/Infinity values from binary representations even when the raw JSON payload doesn't contain them. This is why validation for special floats must occur both before encoding (to reject invalid inputs) and after applying decoders (to catch values introduced during Arrow deserialization), protecting downstream consumers from unexpected NaN/Infinity values.

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:375-385
Timestamp: 2026-01-19T21:14:37.032Z
Learning: In tywrap (src/runtime/bridge-core.ts and similar), environment variable parsing follows a tolerant/best-effort policy. For example, `TYWRAP_CODEC_MAX_BYTES=1024abc` should be accepted as 1024. Only reject clearly invalid values (non-numeric start or <=0). This avoids surprising failures from minor typos.

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:40.872Z
Learning: In `src/runtime/bridge-core.ts` and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (e.g., runtime shape checks on error payloads) in tight loops unless the protocol boundary is untrusted or there's a concrete bug report. The Python bridge protocol is controlled and validated via tests, so defensive checks would add unnecessary branching overhead without meaningful benefit.

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/fixtures/out_of_order_bridge.py:29-48
Timestamp: 2026-01-19T21:00:52.689Z
Learning: In `test/fixtures/out_of_order_bridge.py`, the fixture intentionally leaves a pending request unanswered at EOF to simulate missing/out-of-order responses and validate bridge behavior when requests never complete; this is the exact failure mode being tested and must be preserved.

📚 Learning: 2026-01-19T21:48:45.693Z

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.

Applied to files:

runtime/python_bridge.py

📚 Learning: 2026-01-19T21:49:05.612Z

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: runtime/python_bridge.py:99-123
Timestamp: 2026-01-19T21:49:05.612Z
Learning: In the tywrap repository, TYWRAP_REQUEST_MAX_BYTES uses strict integer parsing that rejects values with trailing characters (e.g., "1024abc"). This differs from TYWRAP_CODEC_MAX_BYTES, which uses tolerant/best-effort parsing that accepts numeric prefixes. The strict policy for REQUEST_MAX_BYTES ensures explicit integer values and consistent parse behavior across Node/Python implementations.

Applied to files:

runtime/python_bridge.py

📚 Learning: 2026-01-19T21:14:37.032Z

Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:375-385
Timestamp: 2026-01-19T21:14:37.032Z
Learning: In tywrap (src/runtime/bridge-core.ts and similar), environment variable parsing follows a tolerant/best-effort policy. For example, `TYWRAP_CODEC_MAX_BYTES=1024abc` should be accepted as 1024. Only reject clearly invalid values (non-numeric start or <=0). This avoids surprising failures from minor typos.

Applied to files:

runtime/python_bridge.py
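
The two parsing policies described in the learnings above can be contrasted in a few lines. This is a sketch under assumptions: the function names are hypothetical, and returning `None` to signal a rejected value is an illustrative choice, not necessarily what tywrap does.

```python
import re


def parse_tolerant(raw):
    """Best-effort policy: accept a leading run of digits, e.g. '1024abc' -> 1024."""
    match = re.match(r"\s*([0-9]+)", raw or "")
    if not match:
        return None  # non-numeric start
    value = int(match.group(1))
    return value if value > 0 else None  # reject <= 0


def parse_strict(raw):
    """Strict policy: the whole string must be a positive integer."""
    text = (raw or "").strip()
    if not re.fullmatch(r"[0-9]+", text):
        return None  # trailing characters such as '1024abc' are rejected
    value = int(text)
    return value if value > 0 else None
```

Under these assumptions, `parse_tolerant` models the `TYWRAP_CODEC_MAX_BYTES` behaviour and `parse_strict` models `TYWRAP_REQUEST_MAX_BYTES`.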

📚 Learning: 2026-01-20T16:00:49.738Z

Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:168-172
Timestamp: 2026-01-20T16:00:49.738Z
Learning: In the tywrap project's BridgeProtocol SafeCodec implementation, Arrow format decoders can produce NaN/Infinity values from binary representations even when the raw JSON payload doesn't contain them. This is why validation for special floats must occur both before encoding (to reject invalid inputs) and after applying decoders (to catch values introduced during Arrow deserialization), protecting downstream consumers from unexpected NaN/Infinity values.

Applied to files:

runtime/python_bridge.py
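
The post-decode validation mentioned in the learning above amounts to walking the decoded structure and rejecting any non-finite float. A minimal sketch, written in Python for consistency with this PR; `assert_finite` is a hypothetical helper, not a function from the tywrap codebase.

```python
import math


def assert_finite(value, path="$"):
    """Raise ValueError if any float nested inside value is NaN or +/-Infinity."""
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError(f"Non-finite float at {path}: {value!r}")
    if isinstance(value, dict):
        for key, item in value.items():
            assert_finite(item, f"{path}.{key}")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            assert_finite(item, f"{path}[{index}]")
```

Running the same check before encoding and again after decoders have run covers both cases the learning describes: invalid inputs, and values introduced during deserialization.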

🧬 Code graph analysis (1)

runtime/python_bridge.py (2)

src/index.ts (1)

SafeCodec (17-17)

runtime/safe_codec.py (1)

CodecError (19-22)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: codec-suite

🔇 Additional comments (2)

runtime/python_bridge.py (2)

14-15: LGTM!

The import of SafeCodec and CodecError from the local safe_codec module is appropriate for the NaN rejection feature.

100-108: LGTM!

Good design: using sys.maxsize for SafeCodec's internal limit preserves the original "no limit unless env var" behavior, while the explicit CODEC_MAX_BYTES check in encode_response() provides the environment-variable-specific error message. The allow_nan=False configuration correctly addresses the NaN/Infinity rejection requirement from Issue #95.
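
The two-layer size handling described in this comment might look roughly like the following. The helper name `enforce_env_limit` and the wording of the error message are assumptions for illustration; only `sys.maxsize` and the `TYWRAP_CODEC_MAX_BYTES` variable name come from the PR.

```python
import sys

# With sys.maxsize as the codec's internal cap, the codec itself never rejects on size
# in practice, so the default stays "no limit unless the env var is set".
INTERNAL_CODEC_LIMIT = sys.maxsize


def enforce_env_limit(payload, env_limit):
    """Return payload unchanged, or raise ValueError naming the env var if it is too large.

    env_limit is the parsed TYWRAP_CODEC_MAX_BYTES value, or None when it is unset.
    """
    if env_limit is None:
        return payload
    size = len(payload.encode("utf-8"))
    if size > env_limit:
        raise ValueError(
            f"Response is {size} bytes, which exceeds TYWRAP_CODEC_MAX_BYTES ({env_limit})"
        )
    return payload
```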

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e51c0f7d92

runtime/python_bridge.py

Address Codex review feedback: The original python_bridge.py had no size limit unless TYWRAP_CODEC_MAX_BYTES was set. Using sys.maxsize preserves this behavior while letting the explicit size check in encode_response() provide the specific error message mentioning the env var name.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

runtime/python_bridge.py (1)

796-804: Consider consolidating UTF-8 encoding to avoid redundant work in the response loop.

SafeCodec.encode() already calls .encode('utf-8') internally to measure payload size (line 162 of safe_codec.py). The second call to payload.encode('utf-8') on line 801 of python_bridge.py repeats this work unnecessarily in the request/response handler loop. Since _response_codec is initialized with max_payload_bytes=sys.maxsize (effectively disabling SafeCodec's internal check), consider refactoring to avoid the redundant encoding—either by exposing the byte length from SafeCodec.encode() or by restructuring the size check to reuse the already-encoded bytes.
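
One way to act on this suggestion is to have the encoder hand back the byte length it already computed. The sketch below is an assumption about how that could look, not the actual `SafeCodec` API; `encode_with_size` is a hypothetical name, and `ensure_ascii=False` is used only so that character count and byte count can differ.

```python
import json


def encode_with_size(value):
    """Return (text, byte_length), performing the UTF-8 encoding exactly once."""
    text = json.dumps(value, allow_nan=False, ensure_ascii=False)
    # Measure once here so callers do not need a second .encode('utf-8') pass.
    byte_length = len(text.encode("utf-8"))
    return text, byte_length
```

A caller such as `encode_response()` could then compare `byte_length` against the configured limit directly instead of re-encoding the payload.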

coderabbitai bot added the `area:codec` (Area: codecs and serialization) and `enhancement` (New feature or request) labels Jan 21, 2026

coderabbitai bot previously approved these changes Jan 21, 2026

chatgpt-codex-connector bot reviewed Jan 21, 2026

bbopen dismissed coderabbitai[bot]’s stale review via 09557ce January 21, 2026 18:21

coderabbitai bot reviewed Jan 21, 2026

coderabbitai bot approved these changes Jan 21, 2026

bbopen merged commit cee80c3 into main Jan 21, 2026
19 of 20 checks passed

bbopen deleted the feat/python-codec-integration branch January 21, 2026 18:32

This was referenced Jan 21, 2026

chore: remove ADR-002 documentation #164

Merged

chore: bump version to 0.2.1 #165

Merged
