feat: Implement core context system abstractions (ANS-411) by jharris1679 · Pull Request #25 · AnswerLayer/semiosis

jharris1679 · 2025-11-30T16:06:16Z

Add minimal ContextProvider protocol with get_context() method
Implement intervention functions (remove_percentage, shuffle, truncate)
Support composable interventions for SIT experiments
Add comprehensive test coverage (9 tests, all passing)
Include documentation for contributors

Design prioritizes simplicity:

Protocol/duck typing instead of inheritance
Interventions as functions for easy composition
No artificial abstractions, each system uses natural representation

Pull Request

What does this PR do?

Brief description of the changes and why they're needed.

Related Issue

Closes #XXX (if applicable)

Lightweight Checklist

Changes are scoped and documented above
Tests pass locally (pytest), or rationale provided if not applicable
Code formatted and linted (black, isort, flake8)
Types checked where applicable (mypy)

Additional Notes (optional)

Any breaking changes, performance considerations, or other context for reviewers.

Summary by CodeRabbit

New Features
- Added a context system framework with composable interventions (remove percentage, shuffle content, truncate) for experiments.
Documentation
- Added a comprehensive README covering design, usage, metadata tracking, and how to implement new context providers.
Tests
- Added tests validating provider protocol compliance, intervention behaviors, composition, and metadata integrity.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

- Add minimal ContextProvider protocol with get_context() method - Implement intervention functions (remove_percentage, shuffle, truncate) - Support composable interventions for SIT experiments - Add comprehensive test coverage (9 tests, all passing) - Include documentation for contributors Design prioritizes simplicity: - Protocol/duck typing instead of inheritance - Interventions as functions for easy composition - No artificial abstractions, each system uses natural representation

coderabbitai · 2025-11-30T16:06:25Z

Walkthrough

Adds a ContextProvider protocol, a composable interventions framework (remove/shuffle/truncate/apply/compose), package exports and README documentation, plus tests validating protocol duck-typing, interventions, composition, metadata, and edge cases.

Changes

Cohort / File(s)	Summary
Protocol & Package Init `semiosis/contexts/protocol.py`, `semiosis/contexts/__init__.py`, `semiosis/contexts/README.md`	New `ContextProvider` Protocol (method `get_context(query: str) -> (str, dict)`), package-level exports re-exporting protocol and interventions, and README documenting design, usage, metadata, and provider implementation guidance.
Interventions Framework `semiosis/contexts/interventions.py`	New intervention utilities: `apply_intervention`, `remove_percentage`, `shuffle_content`, `truncate_context`, and `compose_interventions`. Wrappers return modified providers whose `get_context` applies transformations and appends intervention metadata.
Tests `tests/contexts/test_protocol.py`	Comprehensive tests for protocol compliance, each intervention, composition, edge cases (100% removal, truncation), metadata integrity, and ensuring interventions do not mutate original providers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Pay attention to:
- apply_intervention wrapping semantics and that original provider instances remain unmodified.
- Metadata shape and ordering when multiple interventions are composed.
- Noise level calculations and recorded metrics in truncate_context, remove_percentage, and shuffle_content.
- Test assertions for edge cases (100% removal, exact truncation boundaries).

Poem

🐇 I hopped in to wrap the context with care,
Lines shuffled and trimmed in the brisk autumn air,
Metadata breadcrumbs where changes were made,
Compose us together — a tidy parade,
Semiosis hums on, experiments prepared.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Description check	❓ Inconclusive	The description partially fills the template with key changes and design principles, but checklist items are unchecked and some sections remain incomplete or use placeholder text.	Complete the template sections: fill in 'What does this PR do' with detailed rationale, add the related issue number, check completed checklist items, and provide notes on any breaking changes or performance considerations if applicable.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: implementing core context system abstractions, matching the substantial protocol, interventions, and documentation additions in the changeset.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/context-system-foundation

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0eef3d and a25ca18.

📒 Files selected for processing (2)

semiosis/contexts/README.md (1 hunks)
tests/contexts/test_protocol.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

semiosis/contexts/README.md
tests/contexts/test_protocol.py

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

semiosis/contexts/__init__.py (1)
17-24: Sort __all__ per isort guidelines.

The __all__ list should be alphabetically sorted to comply with isort configuration.

As per coding guidelines, apply this diff:
 __all__ = [
+    "ContextProvider",
     "apply_intervention",
-    "remove_percentage",
-    "shuffle_content",
-    "truncate_context",
     "compose_interventions",
+    "remove_percentage",
+    "shuffle_content",
+    "truncate_context",
 ]
semiosis/contexts/interventions.py (3)
12-46: Document or reconsider in-place mutation behavior.

The function mutates the provider by reassigning get_context (line 45), which modifies the original object rather than creating a wrapper. This behavior may be surprising to users who expect interventions to return a new provider instance.

Consider either:

Documenting this mutation behavior prominently in the docstring, or

Creating a wrapper class instead of mutating the original provider

If mutation is intentional, enhance the docstring:
 def apply_intervention(
     provider: Any,
     intervention_fn: Callable[[str, Dict[str, Any]], Tuple[str, Dict[str, Any]]],
     name: str,
     noise_level: float,
 ) -> Any:
     """
     Apply an intervention function to a context provider.
+    
+    Note: This function mutates the provider in-place by replacing its
+    get_context method. The same provider object is returned.

     Args:
         provider: Any object with a get_context method
39-41: Consider iterable unpacking for metadata.

The static analysis tool suggests using iterable unpacking instead of list concatenation for better performance and readability.

Apply this diff:
-        modified_metadata["interventions"] = metadata.get("interventions", []) + [
-            {"name": name, "noise_level": noise_level}
-        ]
+        modified_metadata["interventions"] = [
+            *metadata.get("interventions", []),
+            {"name": name, "noise_level": noise_level},
+        ]
110-137: Consider parameterizing the full context size assumption.

Line 134 assumes 10,000 characters represents "full" context for noise calculation. While this is a reasonable default for the current use case, different context providers might have very different typical sizes.

Consider making this configurable if you anticipate diverse context sizes:
-def truncate_context(provider: Any, max_chars: int) -> Any:
+def truncate_context(provider: Any, max_chars: int, full_size: int = 10000) -> Any:
     """
     Truncate context to a maximum character length.

     Args:
         provider: Context provider to modify
         max_chars: Maximum characters to keep
+        full_size: Expected "full" context size for noise calculation

     Returns:
         Modified provider with truncated content
     """

     def intervention_fn(
         context: str, metadata: Dict[str, Any]
     ) -> Tuple[str, Dict[str, Any]]:
         original_length = len(context)
         truncated = context[:max_chars]

         metadata = metadata.copy()
         metadata["original_chars"] = original_length
         metadata["truncated_chars"] = len(truncated)

         return truncated, metadata

-    noise = 1.0 - (max_chars / 10000)  # Assume 10k chars is "full" context
+    noise = 1.0 - (max_chars / full_size)
     return apply_intervention(
         provider, intervention_fn, f"truncate_{max_chars}", max(0, min(1, noise))
     )

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c493439 and f0eef3d.

📒 Files selected for processing (5)

semiosis/contexts/README.md (1 hunks)
semiosis/contexts/__init__.py (1 hunks)
semiosis/contexts/interventions.py (1 hunks)
semiosis/contexts/protocol.py (1 hunks)
tests/contexts/test_protocol.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Google-style docstrings for all public methods and classes
Comprehensive type hints required throughout the codebase
Use Black for code formatting with 88 character line length
Use isort with Black profile for import sorting
Use flake8 for linting Python code
Use mypy for type checking in Python
Use deprecation warnings before removing functionality

Files:

semiosis/contexts/__init__.py
tests/contexts/test_protocol.py
semiosis/contexts/interventions.py
semiosis/contexts/protocol.py

tests/**/test_*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/test_*.py: Test files should use test_*.py naming convention and be located in tests/ directory
Mark tests requiring API keys with @pytest.mark.requires_api decorator
Tests should gracefully skip if API keys are unavailable
Provide fixture data for reproducible tests

Files:

tests/contexts/test_protocol.py

🧬 Code graph analysis (3)

semiosis/contexts/__init__.py (2)

semiosis/contexts/interventions.py (5)

apply_intervention (12-46)

compose_interventions (140-157)

remove_percentage (49-76)

shuffle_content (79-107)

truncate_context (110-137)

semiosis/contexts/protocol.py (1)

ContextProvider (11-45)

tests/contexts/test_protocol.py (2)

semiosis/contexts/interventions.py (6)

apply_intervention (12-46)

compose_interventions (140-157)

remove_percentage (49-76)

shuffle_content (79-107)

truncate_context (110-137)

composed (151-155)

semiosis/contexts/protocol.py (1)

get_context (27-45)

semiosis/contexts/protocol.py (1)

tests/contexts/test_protocol.py (2)

get_context (20-21)

get_context (40-41)

🪛 Ruff (0.14.6)

semiosis/contexts/__init__.py

17-24: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

semiosis/contexts/interventions.py

39-41: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

🔇 Additional comments (5)

semiosis/contexts/protocol.py (1)

11-45: LGTM! Well-designed minimal protocol.

The protocol definition is clean, well-documented with Google-style docstrings, and includes comprehensive type hints. The example usage in the docstring effectively demonstrates the expected implementation pattern.

tests/contexts/test_protocol.py (1)

24-48: LGTM! Protocol tests validate duck typing correctly.

The tests effectively demonstrate that any class with a get_context method satisfies the protocol, which aligns with the design philosophy of avoiding inheritance.

semiosis/contexts/interventions.py (3)

49-76: LGTM! Intervention correctly implements random removal.

The function appropriately uses random.sample for stochastic information reduction, which aligns with the SIT experiment design. The defensive check for keep_count > 0 prevents errors when removing 100% of content.

79-107: LGTM! Shuffle intervention is well-implemented.

The function correctly shuffles content while preserving all lines. The noise level of 0.3 is appropriately justified with a comment explaining that content is intact but order is lost.

140-157: LGTM! Clean composition implementation.

The function correctly composes interventions in sequence. Note that since interventions mutate providers in-place, this will sequentially mutate the same provider object, which is consistent with the design of apply_intervention.

semiosis/contexts/README.md

tests/contexts/test_protocol.py

- Remove incorrect reference to providers/ directory - Add note that DBT and file system implementations coming in ANS-412, ANS-413 - Addresses PR review comment

- Rename test to test_intervention_mutates_provider to reflect actual behavior - Add assertions to verify provider1 is mutated by the intervention - Verify returned object is the same as provider1 (mutation, not copy) - Keep assertion that provider2 remains unaffected - Addresses PR review comment about test accuracy

coderabbitai bot reviewed Nov 30, 2025

View reviewed changes

semiosis/contexts/README.md Outdated Show resolved Hide resolved

tests/contexts/test_protocol.py Outdated Show resolved Hide resolved

jharris1679 added 2 commits November 30, 2025 11:29

docs: Fix README reference to non-existent providers directory

beac296

- Remove incorrect reference to providers/ directory - Add note that DBT and file system implementations coming in ANS-412, ANS-413 - Addresses PR review comment

jharris1679 merged commit 891071c into main Nov 30, 2025
5 checks passed

coderabbitai bot mentioned this pull request Dec 1, 2025

feat: Implement DBT Context System (ANS-412) #26

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

feat: Implement core context system abstractions (ANS-411)#25

feat: Implement core context system abstractions (ANS-411)#25
jharris1679 merged 3 commits intomainfrom
feature/context-system-foundation

jharris1679 commented Nov 30, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Nov 30, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

jharris1679 commented Nov 30, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request

What does this PR do?

Related Issue

Lightweight Checklist

Additional Notes (optional)

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jharris1679 commented Nov 30, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 30, 2025 •

edited

Loading