aws-samples · otamaryx · Dec 10, 2025 · Dec 10, 2025 · Dec 22, 2025
diff --git a/...-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/README.md b/...-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/README.md
@@ -0,0 +1,157 @@
+# 🎙️ Nova 2 Sonic Multi-Agent System
+
+A speech-to-speech multi-agent system that switches the configuration of Amazon Bedrock's Nova 2 Sonic model (system prompt, tools, and voice) during a live conversation.
+
+## ⚠️ The Problem
+
+Speech-to-speech models face a critical limitation: **static configuration**. Once a conversation starts, you're locked into:
+- A single system prompt that can't adapt to different use cases
+- One fixed set of tools
+- Static voice characteristics
+
+When different use cases need different prompts and tools, specialized agents work better: each one focuses on a single task with its own optimized setup. This gives you more control and precision than one generalist agent trying to handle everything.
+
+## 💡 The Solution
+
+**Dynamic agent switching using tool triggers** - enabling real-time configuration changes mid-conversation without losing context.
+
+Instead of one overloaded agent, you get:
+- Multiple specialized agents, each with focused tools and optimized prompts
+- Seamless transitions between agents based on user intent
+- Preserved conversation history across switches
+- High accuracy maintained through agent specialization
+
+## 🌟 Why This Matters
+
+✅ **Specialization without compromise** - Each agent excels at its domain  
+✅ **Seamless user experience** - No jarring resets or context loss  
+✅ **Better accuracy** - Fewer tools per agent = better performance  
+✅ **New use cases unlocked** - Enterprise support escalation, healthcare triage, financial services routing, and more
+
+## 🚀 Implementation
+
+This demo showcases three specialized agents that switch dynamically based on conversation flow:
+
+- **Support Agent (Matthew)**: Handles customer issues, creates support tickets
+- **Sales Agent (Amy)**: Processes orders, provides product information
+- **Tracking Agent (Tiffany)**: Checks order status and delivery updates
+
+Each agent brings its own system prompt, tools, and voice - switching happens transparently when the user's intent changes.
+
+## 📁 Project Structure
+
+```
+conversation-transfer/
+├── main.py                      # Entry point
+├── src/
+│   ├── multi_agent.py          # Agent orchestration
+│   ├── core/                   # Core functionality
+│   │   ├── stream_manager.py  # Bedrock streaming
+│   │   ├── event_templates.py # Event generation
+│   │   ├── tool_processor.py  # Tool execution
+│   │   ├── config.py          # Configuration
+│   │   └── utils.py           # Utilities
+│   ├── agents/                 # Agent definitions
+│   │   ├── agent_config.py    # Agent configs
+│   │   └── tools.py           # Tool implementations
+│   └── audio/                  # Audio handling
+│       └── audio_streamer.py  # Audio I/O
+├── docs/                       # Documentation
+│   └── STRUCTURE.md           # System design
+└── requirements.txt           # Dependencies
+```
+
+## ⚙️ Setup
+
+1. **Install dependencies**:
+```bash
+pip install -r requirements.txt
+```
+
+2. **Configure AWS credentials**:
+```bash
+export AWS_ACCESS_KEY_ID="your_key"
+export AWS_SECRET_ACCESS_KEY="your_secret"
+export AWS_REGION="us-east-1"
+```
+
+3. **Run**:
+```bash
+python main.py
+```
+
+## 🎮 Usage
+
+```bash
+# Normal mode
+python main.py
+
+# Debug mode
+python main.py --debug
+```
+
+## 🔧 Configuration
+
+Edit `src/core/config.py` to modify:
+- Audio settings (sample rates, chunk size)
+- Model parameters (temperature, top_p, max_tokens)
+- AWS region and model ID
+
+## 📋 Requirements
+
+- Python 3.12+
+- AWS Bedrock access
+- Microphone and speakers
+- PyAudio dependencies (portaudio)
+
+## Data Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant MultiAgentSonic
+    participant StreamManager
+    participant Bedrock
+    participant ToolProcessor
+
+    User->>MultiAgentSonic: Speak (microphone)
+    MultiAgentSonic->>StreamManager: Audio chunks
+    StreamManager->>Bedrock: Audio events
+    Bedrock->>StreamManager: Response events
+    StreamManager->>MultiAgentSonic: Audio chunks
+    MultiAgentSonic->>User: Play audio (speakers)
+
+
+    opt Switch Agent Tool Use
+        User->>MultiAgentSonic: Speak (microphone)
+        MultiAgentSonic->>StreamManager: Audio chunks
+        StreamManager->>Bedrock: Audio events
+        Bedrock->>StreamManager: Switch Agent tool use detected
+        StreamManager->>ToolProcessor: Execute Switch Agent
+        ToolProcessor->>MultiAgentSonic: Start new Session
+        MultiAgentSonic->>Bedrock: Send text input to invoke conversation
+        Bedrock->>StreamManager: Response events
+        StreamManager->>MultiAgentSonic: Audio chunks
+        MultiAgentSonic->>User: Play audio (speakers)
+    end
+```
+
+## Agent Switching Flow
+
+```mermaid
+stateDiagram-v2
+    [*] --> ActiveConversation
+    ActiveConversation --> DetectSwitch: User requests agent change
+    DetectSwitch --> SetSwitchFlag: Bedrock invokes switch_agent tool
+    SetSwitchFlag --> StopStreaming: StreamManager sets switch_requested = True
+    StopStreaming --> PlayMusic: AudioStreamer stops
+    PlayMusic --> CloseStream: MultiAgentSonic plays transition
+    CloseStream --> SwitchAgent: Close current stream
+    SwitchAgent --> RestartStream: Load new agent config
+    RestartStream --> ActiveConversation: Resume with new agent
+```
+
+## Credits
+Music by <a href="https://pixabay.com/users/hitslab-47305729/?utm_source=link-attribution&utm_medium=referral&utm_campaign=music&utm_content=324902">Ievgen Poltavskyi</a> from <a href="https://pixabay.com//?utm_source=link-attribution&utm_medium=referral&utm_campaign=music&utm_content=324902">Pixabay</a>
+
+
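The agent switching flow described in the README can be sketched without any audio or Bedrock calls. This is a minimal simulation under stated assumptions: `Session`, `converse`, the `switch:` text convention, and the placeholder prompts are hypothetical names invented for illustration, and only the `switch_requested` flag, the `switch_agent` tool name, and the agent/voice pairings come from the README. The point it shows is that the history list outlives each session, so the next agent starts with the full context.

```python
import asyncio
from dataclasses import dataclass

# Hypothetical agent table; voices mirror the README, prompts are placeholders.
AGENTS = {
    "support": {"voice_id": "matthew", "instruction": "You handle support issues."},
    "sales": {"voice_id": "amy", "instruction": "You handle orders."},
    "tracking": {"voice_id": "tiffany", "instruction": "You track deliveries."},
}


@dataclass
class Session:
    """Stand-in for one Bedrock stream; a real one would carry audio events."""
    agent: str
    history: list
    switch_requested: bool = False
    next_agent: str = ""

    async def run(self, user_turns):
        """Consume turns until they run out or a switch is requested."""
        while user_turns and not self.switch_requested:
            text = user_turns.pop(0)
            self.history.append((self.agent, text))
            # Assumption: pretend the model called switch_agent for this turn.
            if text.startswith("switch:"):
                self.switch_requested = True
                self.next_agent = text.split(":", 1)[1]


async def converse(user_turns, start="support"):
    """Restart the session with a new agent whenever the flag is set."""
    history, active, visited = [], start, [start]
    while True:
        session = Session(agent=active, history=history)
        await session.run(user_turns)
        if not session.switch_requested:
            return visited, history
        # The old stream is done; load the next agent's configuration.
        if session.next_agent not in AGENTS:
            raise ValueError(f"unknown agent: {session.next_agent}")
        active = session.next_agent
        visited.append(active)
```

Calling `asyncio.run(converse(["my laptop is broken", "switch:sales", "I want two computers"]))` walks through one support session and one sales session while appending to a single shared history.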
diff --git a/...amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/docs/STRUCTURE.md b/...amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/docs/STRUCTURE.md
@@ -0,0 +1,199 @@
+# Project Structure
+
+## Directory Layout
+
+```
+conversation-transfer/
+├── main.py                      # Application entry point
+├── README.md                    # Project overview
+├── requirements.txt             # Python dependencies
+├── music.mp3                    # Transition music for agent switches
+├── .gitignore                   # Git ignore patterns
+│
+├── src/                         # Source code
+│   ├── __init__.py
+│   ├── multi_agent.py          # Multi-agent orchestrator
+│   │
+│   ├── core/                   # Core functionality
+│   │   ├── __init__.py
+│   │   ├── stream_manager.py  # Bedrock bidirectional streaming
+│   │   ├── event_templates.py # Bedrock event JSON generators
+│   │   ├── tool_processor.py  # Async tool executor
+│   │   ├── config.py          # Configuration constants
+│   │   └── utils.py           # Debug logging & timing utilities
+│   │
+│   ├── agents/                 # Agent definitions
+│   │   ├── __init__.py
+│   │   ├── agent_config.py    # Agent configurations (Support, Sales, Tracking)
+│   │   └── tools.py           # Tool implementations
+│   │
+│   └── audio/                  # Audio handling
+│       ├── __init__.py
+│       └── audio_streamer.py  # PyAudio I/O manager
+│
+└── docs/                       # Documentation
+    └── STRUCTURE.md            # This file
+```
+
+## Module Responsibilities
+
+### Root Level
+
+**main.py**
+- Entry point with argument parsing (`--debug` flag)
+- Initializes MultiAgentSonic with model and region
+- Handles keyboard interrupts and errors gracefully
+
+### src/multi_agent.py
+
+**MultiAgentSonic** - Orchestrates multi-agent conversations
+- Manages active agent state and conversation history
+- Handles agent switching with transition music (pygame)
+- Creates and coordinates StreamManager and AudioStreamer
+- Maintains conversation context across agent switches
+
+### src/core/
+
+**stream_manager.py** - BedrockStreamManager
+- Manages bidirectional streaming with AWS Bedrock Nova 2 Sonic
+- Handles audio input/output queues
+- Processes response events (text, audio, tool calls)
+- Coordinates tool execution via ToolProcessor
+- Manages conversation state and barge-in detection
+- Tracks agent switching requests
+
+**event_templates.py** - EventTemplates
+- Generates Bedrock-compatible JSON events:
+  - Session events (start/end)
+  - Content events (audio/text/tool results)
+  - Prompt configuration with system instructions
+  - Tool schemas for agent capabilities
+
+**tool_processor.py** - ToolProcessor
+- Executes tools asynchronously
+- Maps tool names to implementations
+- Manages concurrent tool tasks
+- Handles tool errors and results
+
+**config.py**
+- Audio configuration (sample rates, chunk size, channels)
+- AWS configuration (model ID, region)
+- Model parameters (max tokens, temperature, top_p)
+- Debug settings
+
+**utils.py**
+- Debug logging with timestamps (`debug_print`)
+- Performance timing decorators (`time_it`, `time_it_async`)
+
+### src/agents/
+
+**agent_config.py**
+- Agent dataclass with voice_id, instruction, and tools
+- AGENTS dictionary with three specialized agents:
+  - **Support (Matthew)**: Customer support with ticket creation
+  - **Sales (Amy)**: Product sales and ordering
+  - **Tracking (Tiffany)**: Order status and delivery tracking
+- Each agent has a unique system prompt and tool set
+
+**tools.py**
+- Tool implementations:
+  - `open_ticket_tool`: Creates support tickets
+  - `order_computers_tool`: Processes computer orders
+  - `check_order_location_tool`: Checks order delivery status
+
+### src/audio/
+
+**audio_streamer.py** - AudioStreamer
+- Manages PyAudio streams for input/output
+- Captures microphone input via callback
+- Plays audio output to speakers
+- Handles barge-in detection
+- Buffers audio and manages queues
+
+## Data Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant AudioStreamer
+    participant StreamManager
+    participant Bedrock
+    participant ToolProcessor
+    participant Output
+
+    User->>AudioStreamer: Speak (microphone)
+    AudioStreamer->>StreamManager: Audio chunks
+    StreamManager->>Bedrock: Audio events
+    Bedrock->>StreamManager: Response events
+
+    opt Text Response
+        StreamManager->>Output: Display text
+    end
+
+    opt Audio Response
+        StreamManager->>AudioStreamer: Audio chunks
+        AudioStreamer->>User: Play audio (speakers)
+    end
+
+    opt Tool Use
+        StreamManager->>ToolProcessor: Execute tool
+        ToolProcessor->>StreamManager: Tool result
+        StreamManager->>Bedrock: Tool result event
+    end
+```
+
+## Agent Switching Flow
+
+```mermaid
+stateDiagram-v2
+    [*] --> ActiveConversation
+    ActiveConversation --> DetectSwitch: User requests agent change
+    DetectSwitch --> SetSwitchFlag: Bedrock invokes switch_agent tool
+    SetSwitchFlag --> StopStreaming: StreamManager sets flag
+    StopStreaming --> PlayMusic: AudioStreamer stops
+    PlayMusic --> CloseStream: MultiAgentSonic plays transition
+    CloseStream --> SwitchAgent: Close current stream
+    SwitchAgent --> RestartStream: Load new agent config
+    RestartStream --> ActiveConversation: Resume with new agent
+```
+
+## Key Design Patterns
+
+1. **Separation of Concerns**: Each module has a single, well-defined responsibility
+2. **Queue-based Communication**: Async queues decouple audio processing from streaming
+3. **Event-driven Architecture**: Response handling via Bedrock events
+4. **Factory Pattern**: EventTemplates generates configuration-specific events
+5. **Strategy Pattern**: Different agents share the same interface
+6. **Dependency Injection**: Components receive dependencies at initialization
+
+## Architecture Benefits
+
+- **Modularity**: Components can be tested and modified independently
+- **Scalability**: Easy to add new agents, tools, or audio features
+- **Maintainability**: Clear structure makes debugging straightforward
+- **Flexibility**: Agent switching without losing conversation context
+- **Performance**: Async operations prevent blocking
+
+## Adding New Components
+
+### New Agent
+1. Add the agent configuration to the `AGENTS` dict in `src/agents/agent_config.py`
+2. Define its voice_id, instruction (system prompt), and tools list
+3. The agent is then automatically available for switching
+
+### New Tool
+1. Implement the function in `src/agents/tools.py`
+2. Add it to the agent's tools list in `src/agents/agent_config.py`
+3. The tool is then automatically registered in ToolProcessor
+
+### New Audio Feature
+- Modify `src/audio/audio_streamer.py`
+- Update audio configuration in `src/core/config.py` if needed
+
+### New Event Type
+- Add template method to `src/core/event_templates.py`
+- Use in `src/core/stream_manager.py` for sending events
+
+### New Configuration
+- Add constants to `src/core/config.py`
+- Import where needed across modules
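The "New Agent" and "New Tool" steps above can be sketched as follows. This is an illustration under assumptions, not the code in `src/agents/`: the field names `voice_id`, `instruction`, and `tools`, the `AGENTS` dictionary, and `open_ticket_tool` come from STRUCTURE.md, while the tool signatures, the `check_warranty_tool` function, the `warranty` agent, and the registry built from `__name__` are hypothetical.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict, List

# Assumed tool shape: an async function taking and returning a dict.
ToolFn = Callable[[dict], Awaitable[dict]]


async def open_ticket_tool(args: dict) -> dict:
    """Placeholder body; the real tool lives in src/agents/tools.py."""
    return {"ticket_id": "T-0001", "issue": args.get("issue", "")}


async def check_warranty_tool(args: dict) -> dict:
    """Hypothetical new tool added for this example."""
    return {"serial": args.get("serial", ""), "in_warranty": True}


@dataclass
class Agent:
    voice_id: str
    instruction: str
    tools: List[ToolFn] = field(default_factory=list)


AGENTS: Dict[str, Agent] = {
    "support": Agent("matthew", "You are a support agent.", [open_ticket_tool]),
    # New agent: one more dictionary entry with its own prompt and tools.
    "warranty": Agent("tiffany", "You answer warranty questions.", [check_warranty_tool]),
}


class ToolProcessor:
    """Builds a name -> function map from the active agent's tool list."""

    def __init__(self, agent: Agent):
        self.registry = {fn.__name__: fn for fn in agent.tools}

    async def execute(self, name: str, args: dict) -> dict:
        fn = self.registry.get(name)
        if fn is None:
            return {"error": f"unknown tool: {name}"}
        try:
            return await fn(args)
        except Exception as exc:  # report failures as a result, not a crash
            return {"error": str(exc)}
```

Because the registry is derived from the agent's own tool list, a tool that belongs to another agent is reported as unknown, which is one way to keep each agent's tool set small.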
diff --git a/speech-to-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/main.py b/speech-to-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/main.py
@@ -0,0 +1,36 @@
+"""Main entry point for Nova 2 Sonic multi-agent system."""
+import asyncio
+import argparse
+from src.multi_agent import MultiAgentSonic
+from src.core.config import DEFAULT_MODEL_ID, DEFAULT_REGION
+from src.core import config
+
+
+async def main(debug: bool = False):
+    """Run multi-agent conversation."""
+    config.DEBUG = debug
+
+    sonic = MultiAgentSonic(
+        model_id=DEFAULT_MODEL_ID,
+        region=DEFAULT_REGION,
+        debug=debug
+    )
+
+    await sonic.start_conversation()
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description='Nova 2 Sonic Multi-Agent System')
+    parser.add_argument('--debug', action='store_true', help='Enable debug mode')
+    args = parser.parse_args()
+
+    try:
+        asyncio.run(main(debug=args.debug))
+    except KeyboardInterrupt:
+        print("\n👋 Goodbye!")
+    except Exception as e:
+        print(f"Error: {e}")
+        if args.debug:
+            import traceback
+            traceback.print_exc()
+
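A note on `config.DEBUG = debug` in `main.py`: assigning to the module attribute only helps if other code reads `config.DEBUG` at call time. The sketch below shows the difference with a stand-in module. The `debug_print` body here is an assumption (the real helper in `src/core/utils.py` is not shown in this diff), and `DEBUG_COPY` is a hypothetical name that mimics what `from src.core.config import DEBUG` would capture.

```python
import types
from datetime import datetime
from typing import Optional

# Stand-in for src/core/config.py (hypothetical contents).
config = types.ModuleType("config")
config.DEBUG = False


def debug_print(message: str) -> Optional[str]:
    """Print a timestamped message only when config.DEBUG is on.

    The flag is looked up on the module at call time, so a later
    `config.DEBUG = True` takes effect immediately.
    """
    if not config.DEBUG:
        return None
    line = f"[{datetime.now():%H:%M:%S.%f}] {message}"
    print(line)
    return line


# A name copied out of the module does not follow later changes.
DEBUG_COPY = config.DEBUG

silent = debug_print("before the flag is set")  # returns None, prints nothing

config.DEBUG = True  # what main.py does when --debug is passed
```

After the last line, `debug_print` starts emitting output while `DEBUG_COPY` still holds the old value, which is why reading the flag through the module object is the safer pattern here.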
diff --git a/speech-to-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/music.mp3 b/speech-to-speech/amazon-nova-2-sonic/repeatable-patterns/conversation-transfer/music.mp3