273 changes: 273 additions & 0 deletions nova-omni/getting-started/00_setup.ipynb
@@ -0,0 +1,273 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "setup-title",
"metadata": {},
"source": [
"# Amazon Nova 2 Omni - Setup and Configuration\n",
"\n",
"This notebook helps you set up your environment for working with Amazon Nova 2 Omni model.\n",
"\n",
"## What is Amazon Nova 2 Omni?\n",
"\n",
"Amazon Nova 2 Omni is a multimodal foundation model that can understand and generate content across text, images, and audio. Key capabilities include:\n",
"\n",
"- **Speech Understanding**: Transcribe, summarize, analyze, and answer questions about audio content\n",
"- **Image Generation**: Create high-quality images from text descriptions\n",
"- **Multimodal Reasoning**: Process and understand multiple input modalities simultaneously\n",
"\n",
"**Supported Audio Formats:** mp3, opus, wav, aac, flac, mp4, ogg, mkv\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "prerequisites",
"metadata": {},
"source": [
"## Prerequisites Check\n",
"\n",
"Let's verify that your environment is properly configured."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-python",
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"print(f\"Python version: {sys.version}\")\n",
"\n",
"# Check Python version\n",
"if sys.version_info >= (3, 12):\n",
" print(\"✅ Python 3.12+ is installed\")\n",
"else:\n",
" print(\"❌ Python 3.12+ is required. Please upgrade your Python version.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "install-dependencies",
"metadata": {},
"outputs": [],
"source": [
"# Install required boto3/botocore versions\n",
"!pip install boto3==1.42.4 botocore==1.42.4 --force-reinstall --no-cache-dir -q"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-dependencies",
"metadata": {},
"outputs": [],
"source": [
"# Verify installed versions\n",
"import boto3\n",
"import botocore\n",
"\n",
"print(f\"boto3 version: {boto3.__version__}\")\n",
"print(f\"botocore version: {botocore.__version__}\")\n",
"\n",
"if boto3.__version__ == '1.42.4' and botocore.__version__ == '1.42.4':\n",
" print(\"✅ Correct boto3/botocore versions installed\")\n",
"else:\n",
" print(\"⚠️ Version mismatch detected\")"
]
},
{
"cell_type": "markdown",
"id": "aws-setup",
"metadata": {},
"source": [
"## AWS Configuration\n",
"\n",
"Let's verify your AWS credentials and region configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-aws-config",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.exceptions import NoCredentialsError, ClientError\n",
"\n",
"try:\n",
" # Check AWS credentials\n",
" session = boto3.Session()\n",
" credentials = session.get_credentials()\n",
" \n",
" if credentials:\n",
" print(\"✅ AWS credentials are configured\")\n",
" print(f\"Region: {session.region_name or 'Not set (will use us-east-1)'}\")\n",
" else:\n",
" print(\"❌ AWS credentials not found. Please configure your AWS CLI or set environment variables.\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error checking AWS configuration: {e}\")"
]
},
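{
"cell_type": "markdown",
"id": "configure-credentials-note",
"metadata": {},
"source": [
"If the credentials check above failed, configure credentials with the AWS CLI (`aws configure`) or by setting environment variables. The next cell is an optional, minimal sketch using the standard AWS SDK environment variables; the placeholder values are illustrative only, and defaulting the region to us-west-2 simply matches the region this notebook uses for Bedrock."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "configure-credentials",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Optional: a minimal sketch of configuring credentials via environment variables.\n",
"# Replace the placeholders with your own values before running, or skip this cell\n",
"# if `aws configure` or an attached IAM role already provides credentials.\n",
"# os.environ[\"AWS_ACCESS_KEY_ID\"] = \"<your-access-key-id>\"\n",
"# os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"<your-secret-access-key>\"\n",
"\n",
"# Default the region to us-west-2, which this notebook uses for Bedrock calls.\n",
"os.environ.setdefault(\"AWS_DEFAULT_REGION\", \"us-west-2\")\n",
"print(f\"Region in use: {os.environ['AWS_DEFAULT_REGION']}\")"
]
},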
{
"cell_type": "markdown",
"id": "bedrock-setup",
"metadata": {},
"source": [
"## Amazon Bedrock Setup\n",
"\n",
"Let's test the connection to Amazon Bedrock and verify model access."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "test-bedrock-connection",
"metadata": {},
"outputs": [],
"source": [
"from botocore.config import Config\n",
"\n",
"MODEL_ID = \"us.amazon.nova-2-omni-v1:0\"\n",
"REGION_ID = \"us-west-2\"\n",
"\n",
"def test_bedrock_connection():\n",
" \"\"\"Test connection to Amazon Bedrock\"\"\"\n",
" try:\n",
" config = Config(\n",
" read_timeout=2 * 60,\n",
" )\n",
" bedrock = boto3.client(\n",
" service_name=\"bedrock-runtime\",\n",
" region_name=REGION_ID,\n",
" config=config,\n",
" )\n",
" \n",
" # Test with a simple text-only request\n",
" response = bedrock.converse(\n",
" modelId=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [{\"text\": \"Hello, can you respond with just 'Hello back!'?\"}],\n",
" }\n",
" ],\n",
" inferenceConfig={\"maxTokens\": 50},\n",
" )\n",
" \n",
" print(\"✅ Successfully connected to Amazon Bedrock\")\n",
" print(\"✅ Nova 2 Omni model is accessible\")\n",
" print(f\"Test response: {response['output']['message']['content'][0]['text']}\")\n",
" return True\n",
" \n",
" except ClientError as e:\n",
" error_code = e.response['Error']['Code']\n",
" if error_code == 'AccessDeniedException':\n",
" print(\"❌ Access denied. Please check your IAM permissions include 'bedrock:InvokeModel'\")\n",
" elif error_code == 'ValidationException':\n",
" print(\"❌ Model not found. Please verify the model ID is correct.\")\n",
" else:\n",
" print(f\"❌ Bedrock error: {e}\")\n",
" return False\n",
" \n",
" except Exception as e:\n",
" print(f\"❌ Connection error: {e}\")\n",
" return False\n",
"\n",
"# Test the connection\n",
"connection_success = test_bedrock_connection()"
]
},
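{
"cell_type": "markdown",
"id": "iam-policy-note",
"metadata": {},
"source": [
"If the connection test reports `AccessDeniedException`, your IAM identity likely cannot invoke the model. The next cell prints a minimal example policy that grants `bedrock:InvokeModel`; it is a sketch only: the broad `Resource` value is illustrative, so in practice scope it to the specific model or inference-profile ARNs you use and attach the policy through your usual IAM workflow."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "iam-policy-example",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# A minimal example IAM policy for calling the Converse API on Bedrock models.\n",
"# Illustrative only: replace the wildcard Resource with the specific model or\n",
"# inference-profile ARNs you actually use.\n",
"example_policy = {\n",
"    \"Version\": \"2012-10-17\",\n",
"    \"Statement\": [\n",
"        {\n",
"            \"Effect\": \"Allow\",\n",
"            \"Action\": [\"bedrock:InvokeModel\"],\n",
"            \"Resource\": \"*\",\n",
"        }\n",
"    ],\n",
"}\n",
"\n",
"print(json.dumps(example_policy, indent=2))"
]
},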
{
"cell_type": "markdown",
"id": "next-steps",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"If all checks passed successfully, you're ready to explore Nova 2 Omni capabilities!\n",
"\n",
"### Available Notebooks:\n",
"\n",
"1. **01_speech_understanding_examples.ipynb** - Audio processing:\n",
" - Transcribe audio with speaker diarization\n",
" - Summarize and analyze audio content\n",
" - Call analytics with structured output\n",
"\n",
"2. **02_image_generation_examples.ipynb** - Image generation:\n",
" - Text-to-image with aspect ratio control\n",
" - Image editing and style transfer\n",
" - Text in images and creative control\n",
"\n",
"3. **03_multimodal_understanding_examples.ipynb** - Multimodal analysis:\n",
" - Image and video understanding\n",
" - Video summarization and classification\n",
" - Audio content analysis\n",
"\n",
"4. **04_langchain_multimodal_reasoning.ipynb** - LangChain integration:\n",
" - Tool use with structured outputs\n",
" - Reasoning effort configuration\n",
" - MMMU-style evaluation patterns\n",
"\n",
"5. **05_langgraph_multimodal_reasoning.ipynb** - LangGraph workflows:\n",
" - Stateful reasoning workflows\n",
" - Multi-step reasoning chains\n",
" - Conditional routing with tools\n",
"\n",
"6. **06_strands_multimodal_reasoning.ipynb** - Multi-agent systems:\n",
" - Specialized agents for different modalities\n",
" - Agent orchestration and coordination\n",
" - Collaborative reasoning patterns\n",
"\n",
"7. **07_document_understanding_examples.ipynb** - Document processing:\n",
" - OCR and text extraction\n",
" - Key information extraction with JSON\n",
" - Object detection and counting\n",
"\n",
"### Tips for Success:\n",
"\n",
"- Start with the speech understanding examples if you're interested in audio processing\n",
"- The model supports various audio formats: mp3, opus, wav, aac, flac, mp4, ogg, mkv\n",
"- For best results with transcription, use temperature=0.0\n",
"- For creative tasks, experiment with different temperature values (0.1-0.9)\n",
"\n",
"Happy exploring! 🚀"
]
},
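{
"cell_type": "markdown",
"id": "temperature-config-note",
"metadata": {},
"source": [
"The cell below is an optional sketch of the temperature tip above: it shows where `temperature` goes in the Converse API's `inferenceConfig`. It assumes the `MODEL_ID` and `REGION_ID` variables from the connection-test cell are already defined, and the prompt is just an illustrative placeholder."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "temperature-config-example",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.config import Config\n",
"\n",
"# Sketch: pass temperature through inferenceConfig on a Converse call.\n",
"# temperature=0.0 keeps output deterministic (recommended for transcription);\n",
"# values around 0.1-0.9 allow more variation for creative tasks.\n",
"client = boto3.client(\n",
"    \"bedrock-runtime\",\n",
"    region_name=REGION_ID,\n",
"    config=Config(read_timeout=2 * 60),\n",
")\n",
"\n",
"response = client.converse(\n",
"    modelId=MODEL_ID,\n",
"    messages=[\n",
"        {\n",
"            \"role\": \"user\",\n",
"            \"content\": [{\"text\": \"In one sentence, what does temperature control in text generation?\"}],\n",
"        }\n",
"    ],\n",
"    inferenceConfig={\"maxTokens\": 100, \"temperature\": 0.0},\n",
")\n",
"\n",
"print(response[\"output\"][\"message\"][\"content\"][0][\"text\"])"
]
},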
{
"cell_type": "code",
"execution_count": null,
"id": "99b8b917-12b0-4a43-bdbd-a59778e03930",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.19"
}
},
"nbformat": 4,
"nbformat_minor": 5
}