Refactoring the Demuxers for chunked Tensor data#181
Open
Conversation
…nput and updating the corresponding tests.
Here's a summary of what I've done:
This commit refactors `TensorDemuxer` and `SmoothedTensorDemuxer` to process
`SerializableTensorChunk` objects, aligning with a chunk-based data model.
Unit tests have been carefully updated to ensure minimal diffs for existing
test cases while adding new tests for chunk-specific scenarios.
**Core Refactoring:**
1. **`TensorDemuxer` (`tensor_demuxer.py`):**
* I renamed `on_update_received` to `on_chunk_received`, accepting
`SerializableTensorChunk` (using `chunk.tensor` for data and
`chunk.timestamp.as_datetime()` for timestamp conversion).
* I preserved internal keyframe storage and cascading update logic.
* I ensured `_on_keyframe_updated` is called at most once per affected
timestamp after a chunk is processed and causes a state change.
2. **`SmoothedTensorDemuxer` (`smoothed_tensor_demuxer.py`):**
* I updated `on_chunk_received` to delegate to the parent class's
method for chunk processing.
**Unit Test Updates (Iterative Approach):**
* **Initial Broad Refactor & Subsequent Refinement:** I iteratively refined test updates based on feedback to ensure minimal changes to
existing test cases.
* **`tensor_demuxer_unittest.py`:**
* Existing tests: Calls to `on_update_received(idx, val, ts)` were
transformed to `on_chunk_received(Chunk(ts, idx, [val]))`.
* Assertions for mock client notifications (`call_count`) were
carefully adjusted to reflect the rule that notifications occur
only if a keyframe's final calculated value changes, for both
direct updates and cascades.
* I added new tests to specifically cover scenarios with
multi-update chunks.
* **`smoothed_tensor_demuxer_unittest.py`:**
* Minimal changes were required as these tests primarily verify
smoothing logic by directly manipulating parent keyframe history,
not by calling the data input method.
* **Test Helpers:** I added `MockSynchronizedTimestamp` and an inlined
`SerializableTensorChunk` (with `tensor` field) to test
files to ensure compatibility with `TensorDemuxer`'s expectation of
`timestamp.as_datetime()` and field naming.
* **E2E Test Adjustments:** I applied necessary fixes to
`tensor_e2etest.py` for compatibility with `SerializableTensorChunk`
(field name and timestamp object).
**Code Quality and Verification:**
* **Comments:** I reviewed and refined application code comments to
explain "why" not just "what," and removed placeholder/meta-comments.
* **Placeholder File:** I deleted the temporary
`serializable_tensor_chunk_placeholder.py`.
* **Static Analysis:** All changes passed Black, Ruff, MyPy, and Pylint.
* **Testing:** The full `pytest` suite (895 tests) passed consistently
across two runs, confirming the correctness and integration of changes.
This work completes the transition of the demuxer components to the new
chunk-based data transport model while maintaining test coverage and adhering
to project standards.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This commit refactors
TensorDemuxerandSmoothedTensorDemuxerto processSerializableTensorChunkobjects, aligning with a chunk-based data model. Unit tests have been carefully updated to ensure minimal diffs for existing test cases while adding new tests for chunk-specific scenarios.Core Refactoring:
TensorDemuxer(tensor_demuxer.py):on_update_receivedtoon_chunk_received, acceptingSerializableTensorChunk(usingchunk.tensorfor data andchunk.timestamp.as_datetime()for timestamp conversion)._on_keyframe_updatedis called at most once per affectedtimestamp after a chunk is processed and causes a state change.
SmoothedTensorDemuxer(smoothed_tensor_demuxer.py):on_chunk_receivedto delegate to the parent class'smethod for chunk processing.
Unit Test Updates (Iterative Approach):
existing test cases.
tensor_demuxer_unittest.py:on_update_received(idx, val, ts)weretransformed to
on_chunk_received(Chunk(ts, idx, [val])).call_count) werecarefully adjusted to reflect the rule that notifications occur
only if a keyframe's final calculated value changes, for both
direct updates and cascades.
multi-update chunks.
smoothed_tensor_demuxer_unittest.py:smoothing logic by directly manipulating parent keyframe history,
not by calling the data input method.
MockSynchronizedTimestampand an inlinedSerializableTensorChunk(withtensorfield) to testfiles to ensure compatibility with
TensorDemuxer's expectation oftimestamp.as_datetime()and field naming.tensor_e2etest.pyfor compatibility withSerializableTensorChunk(field name and timestamp object).
Code Quality and Verification:
explain "why" not just "what," and removed placeholder/meta-comments.
serializable_tensor_chunk_placeholder.py.pytestsuite (895 tests) passed consistentlyacross two runs, confirming the correctness and integration of changes.
This work completes the transition of the demuxer components to the new chunk-based data transport model while maintaining test coverage and adhering to project standards.