feat: Implement core context system abstractions (ANS-411)#25
feat: Implement core context system abstractions (ANS-411)#25jharris1679 merged 3 commits intomainfrom
Conversation
- Add minimal ContextProvider protocol with get_context() method - Implement intervention functions (remove_percentage, shuffle, truncate) - Support composable interventions for SIT experiments - Add comprehensive test coverage (9 tests, all passing) - Include documentation for contributors Design prioritizes simplicity: - Protocol/duck typing instead of inheritance - Interventions as functions for easy composition - No artificial abstractions, each system uses natural representation
WalkthroughAdds a ContextProvider protocol, a composable interventions framework (remove/shuffle/truncate/apply/compose), package exports and README documentation, plus tests validating protocol duck-typing, interventions, composition, metadata, and edge cases. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
semiosis/contexts/__init__.py (1)
17-24: Sort__all__per isort guidelines.The
__all__list should be alphabetically sorted to comply with isort configuration.As per coding guidelines, apply this diff:
__all__ = [ + "ContextProvider", "apply_intervention", - "remove_percentage", - "shuffle_content", - "truncate_context", "compose_interventions", + "remove_percentage", + "shuffle_content", + "truncate_context", ]semiosis/contexts/interventions.py (3)
12-46: Document or reconsider in-place mutation behavior.The function mutates the provider by reassigning
get_context(line 45), which modifies the original object rather than creating a wrapper. This behavior may be surprising to users who expect interventions to return a new provider instance.Consider either:
- Documenting this mutation behavior prominently in the docstring, or
- Creating a wrapper class instead of mutating the original provider
If mutation is intentional, enhance the docstring:
def apply_intervention( provider: Any, intervention_fn: Callable[[str, Dict[str, Any]], Tuple[str, Dict[str, Any]]], name: str, noise_level: float, ) -> Any: """ Apply an intervention function to a context provider. + + Note: This function mutates the provider in-place by replacing its + get_context method. The same provider object is returned. Args: provider: Any object with a get_context method
39-41: Consider iterable unpacking for metadata.The static analysis tool suggests using iterable unpacking instead of list concatenation for better performance and readability.
Apply this diff:
- modified_metadata["interventions"] = metadata.get("interventions", []) + [ - {"name": name, "noise_level": noise_level} - ] + modified_metadata["interventions"] = [ + *metadata.get("interventions", []), + {"name": name, "noise_level": noise_level}, + ]
110-137: Consider parameterizing the full context size assumption.Line 134 assumes 10,000 characters represents "full" context for noise calculation. While this is a reasonable default for the current use case, different context providers might have very different typical sizes.
Consider making this configurable if you anticipate diverse context sizes:
-def truncate_context(provider: Any, max_chars: int) -> Any: +def truncate_context(provider: Any, max_chars: int, full_size: int = 10000) -> Any: """ Truncate context to a maximum character length. Args: provider: Context provider to modify max_chars: Maximum characters to keep + full_size: Expected "full" context size for noise calculation Returns: Modified provider with truncated content """ def intervention_fn( context: str, metadata: Dict[str, Any] ) -> Tuple[str, Dict[str, Any]]: original_length = len(context) truncated = context[:max_chars] metadata = metadata.copy() metadata["original_chars"] = original_length metadata["truncated_chars"] = len(truncated) return truncated, metadata - noise = 1.0 - (max_chars / 10000) # Assume 10k chars is "full" context + noise = 1.0 - (max_chars / full_size) return apply_intervention( provider, intervention_fn, f"truncate_{max_chars}", max(0, min(1, noise)) )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
semiosis/contexts/README.md(1 hunks)semiosis/contexts/__init__.py(1 hunks)semiosis/contexts/interventions.py(1 hunks)semiosis/contexts/protocol.py(1 hunks)tests/contexts/test_protocol.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Google-style docstrings for all public methods and classes
Comprehensive type hints required throughout the codebase
Use Black for code formatting with 88 character line length
Use isort with Black profile for import sorting
Use flake8 for linting Python code
Use mypy for type checking in Python
Use deprecation warnings before removing functionality
Files:
semiosis/contexts/__init__.pytests/contexts/test_protocol.pysemiosis/contexts/interventions.pysemiosis/contexts/protocol.py
tests/**/test_*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/test_*.py: Test files should usetest_*.pynaming convention and be located intests/directory
Mark tests requiring API keys with@pytest.mark.requires_apidecorator
Tests should gracefully skip if API keys are unavailable
Provide fixture data for reproducible tests
Files:
tests/contexts/test_protocol.py
🧬 Code graph analysis (3)
semiosis/contexts/__init__.py (2)
semiosis/contexts/interventions.py (5)
apply_intervention(12-46)compose_interventions(140-157)remove_percentage(49-76)shuffle_content(79-107)truncate_context(110-137)semiosis/contexts/protocol.py (1)
ContextProvider(11-45)
tests/contexts/test_protocol.py (2)
semiosis/contexts/interventions.py (6)
apply_intervention(12-46)compose_interventions(140-157)remove_percentage(49-76)shuffle_content(79-107)truncate_context(110-137)composed(151-155)semiosis/contexts/protocol.py (1)
get_context(27-45)
semiosis/contexts/protocol.py (1)
tests/contexts/test_protocol.py (2)
get_context(20-21)get_context(40-41)
🪛 Ruff (0.14.6)
semiosis/contexts/__init__.py
17-24: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
semiosis/contexts/interventions.py
39-41: Consider iterable unpacking instead of concatenation
Replace with iterable unpacking
(RUF005)
🔇 Additional comments (5)
semiosis/contexts/protocol.py (1)
11-45: LGTM! Well-designed minimal protocol.The protocol definition is clean, well-documented with Google-style docstrings, and includes comprehensive type hints. The example usage in the docstring effectively demonstrates the expected implementation pattern.
tests/contexts/test_protocol.py (1)
24-48: LGTM! Protocol tests validate duck typing correctly.The tests effectively demonstrate that any class with a
get_contextmethod satisfies the protocol, which aligns with the design philosophy of avoiding inheritance.semiosis/contexts/interventions.py (3)
49-76: LGTM! Intervention correctly implements random removal.The function appropriately uses
random.samplefor stochastic information reduction, which aligns with the SIT experiment design. The defensive check forkeep_count > 0prevents errors when removing 100% of content.
79-107: LGTM! Shuffle intervention is well-implemented.The function correctly shuffles content while preserving all lines. The noise level of 0.3 is appropriately justified with a comment explaining that content is intact but order is lost.
140-157: LGTM! Clean composition implementation.The function correctly composes interventions in sequence. Note that since interventions mutate providers in-place, this will sequentially mutate the same provider object, which is consistent with the design of
apply_intervention.
- Remove incorrect reference to providers/ directory - Add note that DBT and file system implementations coming in ANS-412, ANS-413 - Addresses PR review comment
- Rename test to test_intervention_mutates_provider to reflect actual behavior - Add assertions to verify provider1 is mutated by the intervention - Verify returned object is the same as provider1 (mutation, not copy) - Keep assertion that provider2 remains unaffected - Addresses PR review comment about test accuracy
Design prioritizes simplicity:
Pull Request
What does this PR do?
Brief description of the changes and why they're needed.
Related Issue
Lightweight Checklist
pytest), or rationale provided if not applicableblack,isort,flake8)mypy)Additional Notes (optional)
Any breaking changes, performance considerations, or other context for reviewers.
Summary by CodeRabbit
New Features
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.