Feat/prompt caching #17

Closed

Unshure wants to merge 3 commits into main from feat/prompt-caching

Conversation


Unshure (Owner) commented Jan 26, 2026

Summary

  • Add CacheConfig with strategy="auto" for automatic prompt caching in BedrockModel
  • Cache points are injected at the end of the last assistant message before each model call (see the sketch after this list)
  • Supports all Claude models on Bedrock that have prompt caching capability
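
For reference, a request payload after injection looks roughly like the following. The cachePoint content block is the Bedrock Converse API cache marker; the surrounding messages are purely illustrative.

# Illustrative conversation after auto injection: the marker is appended to
# the last assistant message before the request is sent to Bedrock.
messages = [
    {"role": "user", "content": [{"text": "What is prompt caching?"}]},
    {
        "role": "assistant",
        "content": [
            {"text": "Prompt caching lets Bedrock reuse an already-processed prefix."},
            {"cachePoint": {"type": "default"}},  # injected automatically
        ],
    },
    {"role": "user", "content": [{"text": "How do I enable it?"}]},
]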

Usage

from strands import Agent
from strands.models import BedrockModel, CacheConfig

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    cache_config=CacheConfig(strategy="auto")
)
agent = Agent(model=model)

Test plan

  • Unit tests for cache point injection logic
  • Integration test with Claude models on Bedrock confirming cache hits

Add CacheConfig with strategy="auto" for BedrockModel to automatically
inject cache points at the end of assistant messages in multi-turn
conversations.

- Add CacheConfig dataclass in model.py with strategy field
- Add supports_caching property to check Claude model compatibility
- Implement _inject_cache_point() for automatic cache point management (sketched below)
- Export CacheConfig from models/__init__.py

Closes strands-agents#1432

- Add warning when cache_config enabled but model doesn't support caching
- Make supports_caching private (_supports_caching)
- Fix log formatting to follow style guide
- Clean up tests and imports
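
For context on the mechanism described above, a minimal standalone sketch of the auto-injection step follows. The real _inject_cache_point in BedrockModel is not shown in this excerpt, so the names and details here are illustrative only.

from typing import Any


def inject_cache_point(messages: list[dict[str, Any]]) -> None:
    """Append a cachePoint block to the last assistant message, if present.

    Standalone sketch of the behavior described in the commit message; the
    actual BedrockModel._inject_cache_point implementation may differ.
    """
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        content = message.setdefault("content", [])
        # Skip injection if a cache point is already present to avoid duplicates.
        if not any("cachePoint" in block for block in content):
            content.append({"cachePoint": {"type": "default"}})
        return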

Unshure commented Jan 26, 2026

/strands review

cache_config = self.config.get("cache_config")
if cache_config and cache_config.strategy == "auto":
    if self._supports_caching:
        self._inject_cache_point(messages)

Issue: _inject_cache_point mutates the original messages list in-place, which can lead to unexpected side effects on agent.messages.

Suggestion: Make a deep copy of the messages before injecting cache points:

import copy

# In _format_bedrock_messages:
if cache_config and cache_config.strategy == "auto":
    if self._supports_caching:
        messages = copy.deepcopy(messages)  # Work on a copy
        self._inject_cache_point(messages)

This ensures that agent.messages remains unmodified and cache points are only in the request payload sent to Bedrock.

Returns True for Claude models on Bedrock.
"""
model_id = self.config.get("model_id", "").lower()
return "claude" in model_id or "anthropic" in model_id

Suggestion: Consider making this check more robust. It currently relies on substring matching, which has edge cases:

  1. A model ID containing "claude" in a different context
  2. Future Anthropic models with different naming

Consider adding a comment explaining the heuristic, or checking against a known list of supported model patterns.
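
For illustration, a pattern-list variant of the check might look like the sketch below. The model ID prefixes shown are placeholders, not a vetted list of caching-capable models; the real list should come from the Bedrock documentation.

# Known Bedrock model ID patterns with prompt caching support (placeholders).
_CACHING_MODEL_PATTERNS = (
    "anthropic.claude-3-7-sonnet",
    "anthropic.claude-sonnet-4",
    "anthropic.claude-haiku-4",
)


def supports_caching(model_id: str) -> bool:
    """Return True if the model ID matches a known caching-capable pattern."""
    model_id = model_id.lower()
    return any(pattern in model_id for pattern in _CACHING_MODEL_PATTERNS)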

)

# Simulate a mock BedrockModel with cache_config
mock_model.get_config = unittest.mock.MagicMock(

Issue: This test uses mock_model which doesn't have the _inject_cache_point method, so it doesn't actually test whether BedrockModel mutates messages.

Suggestion: Move this test to tests/strands/models/test_bedrock.py and use a real BedrockModel instance to verify the mutation behavior. For example:

def test_format_bedrock_messages_does_not_mutate_original(bedrock_client):
    """Test that _format_bedrock_messages doesn't mutate the original messages."""
    model = BedrockModel(
        model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
        cache_config=CacheConfig(strategy="auto")
    )
    
    messages = [
        {"role": "user", "content": [{"text": "Hello"}]},
        {"role": "assistant", "content": [{"text": "Hi there!"}]},
    ]
    original_content_len = len(messages[1]["content"])
    
    model._format_bedrock_messages(messages)
    
    # Original messages should be unchanged
    assert len(messages[1]["content"]) == original_content_len

strategy: Caching strategy to use.
- "auto": Automatically inject cachePoint at optimal positions
"""

Good docstring! One minor addition - consider documenting that this config is currently only supported by BedrockModel.
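
For example, the amended docstring could read like the sketch below. The default value and exact class shape are assumptions; the PR only shows that CacheConfig is a dataclass with a strategy field.

from dataclasses import dataclass


@dataclass
class CacheConfig:
    """Configuration for automatic prompt caching.

    Note: currently only supported by BedrockModel; other model providers
    ignore this configuration.

    Attributes:
        strategy: Caching strategy to use.
            - "auto": Automatically inject cachePoint at optimal positions.
    """

    strategy: str = "auto"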

@github-actions

Review Summary

Assessment: Request Changes

Key Themes

The implementation approach is well thought out - automatically moving cache points to the last assistant message is a clean way to maximize cache hits. The code is well-structured and follows the repository's coding patterns.

Critical Issue:

  • Message Mutation: The _inject_cache_point method modifies the original messages list in-place. This means agent.messages gets mutated with cache points, which could cause unexpected behavior. A deep copy should be made before injection.

Testing Gaps:

  • The test test_cache_config_does_not_mutate_original_messages uses a mock model that doesn't have the injection logic, so it doesn't actually verify the mutation behavior
  • No integration test was added for the new CacheConfig feature (existing tests cover the deprecated cache_prompt); a sketch of such a test follows below
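
A minimal sketch of such an integration test is shown below. It assumes live Bedrock access with AWS credentials; the function name and placement are hypothetical, and it stops short of asserting a cache hit because the PR does not surface the relevant usage metrics.

from strands import Agent
from strands.models import BedrockModel, CacheConfig


def test_cache_config_auto_multi_turn():
    model = BedrockModel(
        model_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        cache_config=CacheConfig(strategy="auto"),
    )
    agent = Agent(model=model)

    # Two turns so the second request contains an assistant message into
    # which a cache point can be injected.
    first = agent("Explain prompt caching in one sentence.")
    second = agent("Now give one concrete example.")

    assert first is not None
    assert second is not None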

Minor Suggestions:

  • The _supports_caching property could benefit from documentation explaining the heuristic
  • CacheConfig docstring could mention it's currently only supported by BedrockModel

Nice work on the feature design! The cache point auto-injection is a useful addition. Once the mutation issue is addressed, this will be a solid contribution. 🚀
