273 changes: 273 additions & 0 deletions nova-omni/getting-started/00_setup.ipynb
@@ -0,0 +1,273 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "setup-title",
"metadata": {},
"source": [
"# Amazon Nova 2 Omni - Setup and Configuration\n",
"\n",
"This notebook helps you set up your environment for working with Amazon Nova 2 Omni model.\n",
"\n",
"## What is Amazon Nova 2 Omni?\n",
"\n",
"Amazon Nova 2 Omni is a multimodal foundation model that can understand and generate content across text, images, and audio. Key capabilities include:\n",
"\n",
"- **Speech Understanding**: Transcribe, summarize, analyze, and answer questions about audio content\n",
"- **Image Generation**: Create high-quality images from text descriptions\n",
"- **Multimodal Reasoning**: Process and understand multiple input modalities simultaneously\n",
"\n",
"**Supported Audio Formats:** mp3, opus, wav, aac, flac, mp4, ogg, mkv\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "prerequisites",
"metadata": {},
"source": [
"## Prerequisites Check\n",
"\n",
"Let's verify that your environment is properly configured."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-python",
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"print(f\"Python version: {sys.version}\")\n",
"\n",
"# Check Python version\n",
"if sys.version_info >= (3, 12):\n",
" print(\"✅ Python 3.12+ is installed\")\n",
"else:\n",
" print(\"❌ Python 3.12+ is required. Please upgrade your Python version.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "install-dependencies",
"metadata": {},
"outputs": [],
"source": [
"# Install required boto3/botocore versions\n",
"!pip install boto3==1.42.4 botocore==1.42.4 --force-reinstall --no-cache-dir -q"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-dependencies",
"metadata": {},
"outputs": [],
"source": [
"# Verify installed versions\n",
"import boto3\n",
"import botocore\n",
"\n",
"print(f\"boto3 version: {boto3.__version__}\")\n",
"print(f\"botocore version: {botocore.__version__}\")\n",
"\n",
"if boto3.__version__ == '1.42.4' and botocore.__version__ == '1.42.4':\n",
" print(\"✅ Correct boto3/botocore versions installed\")\n",
"else:\n",
" print(\"⚠️ Version mismatch detected\")"
]
},
{
"cell_type": "markdown",
"id": "aws-setup",
"metadata": {},
"source": [
"## AWS Configuration\n",
"\n",
"Let's verify your AWS credentials and region configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "check-aws-config",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.exceptions import NoCredentialsError, ClientError\n",
"\n",
"try:\n",
" # Check AWS credentials\n",
" session = boto3.Session()\n",
" credentials = session.get_credentials()\n",
" \n",
" if credentials:\n",
" print(\"✅ AWS credentials are configured\")\n",
" print(f\"Region: {session.region_name or 'Not set (will use us-east-1)'}\")\n",
" else:\n",
" print(\"❌ AWS credentials not found. Please configure your AWS CLI or set environment variables.\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error checking AWS configuration: {e}\")"
]
},
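{
"cell_type": "markdown",
"id": "configure-credentials-note",
"metadata": {},
"source": [
"If the credentials check above failed, configure credentials with the AWS CLI (`aws configure`) or by setting environment variables. The next cell is an optional, minimal sketch using the standard AWS SDK environment variables; the placeholder values are illustrative only, and defaulting the region to us-west-2 simply matches the region this notebook uses for Bedrock."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "configure-credentials",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Optional: a minimal sketch of configuring credentials via environment variables.\n",
"# Replace the placeholders with your own values before running, or skip this cell\n",
"# if `aws configure` or an attached IAM role already provides credentials.\n",
"# os.environ[\"AWS_ACCESS_KEY_ID\"] = \"<your-access-key-id>\"\n",
"# os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"<your-secret-access-key>\"\n",
"\n",
"# Default the region to us-west-2, which this notebook uses for Bedrock calls.\n",
"os.environ.setdefault(\"AWS_DEFAULT_REGION\", \"us-west-2\")\n",
"print(f\"Region in use: {os.environ['AWS_DEFAULT_REGION']}\")"
]
},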
{
"cell_type": "markdown",
"id": "bedrock-setup",
"metadata": {},
"source": [
"## Amazon Bedrock Setup\n",
"\n",
"Let's test the connection to Amazon Bedrock and verify model access."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "test-bedrock-connection",
"metadata": {},
"outputs": [],
"source": [
"from botocore.config import Config\n",
"\n",
"MODEL_ID = \"us.amazon.nova-2-omni-v1:0\"\n",
"REGION_ID = \"us-west-2\"\n",
"\n",
"def test_bedrock_connection():\n",
" \"\"\"Test connection to Amazon Bedrock\"\"\"\n",
" try:\n",
" config = Config(\n",
" read_timeout=2 * 60,\n",
" )\n",
" bedrock = boto3.client(\n",
" service_name=\"bedrock-runtime\",\n",
" region_name=REGION_ID,\n",
" config=config,\n",
" )\n",
" \n",
" # Test with a simple text-only request\n",
" response = bedrock.converse(\n",
" modelId=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [{\"text\": \"Hello, can you respond with just 'Hello back!'?\"}],\n",
" }\n",
" ],\n",
" inferenceConfig={\"maxTokens\": 50},\n",
" )\n",
" \n",
" print(\"✅ Successfully connected to Amazon Bedrock\")\n",
" print(\"✅ Nova 2 Omni model is accessible\")\n",
" print(f\"Test response: {response['output']['message']['content'][0]['text']}\")\n",
" return True\n",
" \n",
" except ClientError as e:\n",
" error_code = e.response['Error']['Code']\n",
" if error_code == 'AccessDeniedException':\n",
" print(\"❌ Access denied. Please check your IAM permissions include 'bedrock:InvokeModel'\")\n",
" elif error_code == 'ValidationException':\n",
" print(\"❌ Model not found. Please verify the model ID is correct.\")\n",
" else:\n",
" print(f\"❌ Bedrock error: {e}\")\n",
" return False\n",
" \n",
" except Exception as e:\n",
" print(f\"❌ Connection error: {e}\")\n",
" return False\n",
"\n",
"# Test the connection\n",
"connection_success = test_bedrock_connection()"
]
},
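{
"cell_type": "markdown",
"id": "iam-policy-note",
"metadata": {},
"source": [
"If the connection test reports `AccessDeniedException`, your IAM identity likely cannot invoke the model. The next cell prints a minimal example policy that grants `bedrock:InvokeModel`; it is a sketch only: the broad `Resource` value is illustrative, so in practice scope it to the specific model or inference-profile ARNs you use and attach the policy through your usual IAM workflow."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "iam-policy-example",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# A minimal example IAM policy for calling the Converse API on Bedrock models.\n",
"# Illustrative only: replace the wildcard Resource with the specific model or\n",
"# inference-profile ARNs you actually use.\n",
"example_policy = {\n",
"    \"Version\": \"2012-10-17\",\n",
"    \"Statement\": [\n",
"        {\n",
"            \"Effect\": \"Allow\",\n",
"            \"Action\": [\"bedrock:InvokeModel\"],\n",
"            \"Resource\": \"*\",\n",
"        }\n",
"    ],\n",
"}\n",
"\n",
"print(json.dumps(example_policy, indent=2))"
]
},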
{
"cell_type": "markdown",
"id": "next-steps",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"If all checks passed successfully, you're ready to explore Nova 2 Omni capabilities!\n",
"\n",
"### Available Notebooks:\n",
"\n",
"1. **01_speech_understanding_examples.ipynb** - Audio processing:\n",
" - Transcribe audio with speaker diarization\n",
" - Summarize and analyze audio content\n",
" - Call analytics with structured output\n",
"\n",
"2. **02_image_generation_examples.ipynb** - Image generation:\n",
" - Text-to-image with aspect ratio control\n",
" - Image editing and style transfer\n",
" - Text in images and creative control\n",
"\n",
"3. **03_multimodal_understanding_examples.ipynb** - Multimodal analysis:\n",
" - Image and video understanding\n",
" - Video summarization and classification\n",
" - Audio content analysis\n",
"\n",
"4. **04_langchain_multimodal_reasoning.ipynb** - LangChain integration:\n",
" - Tool use with structured outputs\n",
" - Reasoning effort configuration\n",
" - MMMU-style evaluation patterns\n",
"\n",
"5. **05_langgraph_multimodal_reasoning.ipynb** - LangGraph workflows:\n",
" - Stateful reasoning workflows\n",
" - Multi-step reasoning chains\n",
" - Conditional routing with tools\n",
"\n",
"6. **06_strands_multimodal_reasoning.ipynb** - Multi-agent systems:\n",
" - Specialized agents for different modalities\n",
" - Agent orchestration and coordination\n",
" - Collaborative reasoning patterns\n",
"\n",
"7. **07_document_understanding_examples.ipynb** - Document processing:\n",
" - OCR and text extraction\n",
" - Key information extraction with JSON\n",
" - Object detection and counting\n",
"\n",
"### Tips for Success:\n",
"\n",
"- Start with the speech understanding examples if you're interested in audio processing\n",
"- The model supports various audio formats: mp3, opus, wav, aac, flac, mp4, ogg, mkv\n",
"- For best results with transcription, use temperature=0.0\n",
"- For creative tasks, experiment with different temperature values (0.1-0.9)\n",
"\n",
"Happy exploring! 🚀"
]
},
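{
"cell_type": "markdown",
"id": "temperature-config-note",
"metadata": {},
"source": [
"The cell below is an optional sketch of the temperature tip above: it shows where `temperature` goes in the Converse API's `inferenceConfig`. It assumes the `MODEL_ID` and `REGION_ID` variables from the connection-test cell are already defined, and the prompt is just an illustrative placeholder."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "temperature-config-example",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.config import Config\n",
"\n",
"# Sketch: pass temperature through inferenceConfig on a Converse call.\n",
"# temperature=0.0 keeps output deterministic (recommended for transcription);\n",
"# values around 0.1-0.9 allow more variation for creative tasks.\n",
"client = boto3.client(\n",
"    \"bedrock-runtime\",\n",
"    region_name=REGION_ID,\n",
"    config=Config(read_timeout=2 * 60),\n",
")\n",
"\n",
"response = client.converse(\n",
"    modelId=MODEL_ID,\n",
"    messages=[\n",
"        {\n",
"            \"role\": \"user\",\n",
"            \"content\": [{\"text\": \"In one sentence, what does temperature control in text generation?\"}],\n",
"        }\n",
"    ],\n",
"    inferenceConfig={\"maxTokens\": 100, \"temperature\": 0.0},\n",
")\n",
"\n",
"print(response[\"output\"][\"message\"][\"content\"][0][\"text\"])"
]
},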
{
"cell_type": "code",
"execution_count": null,
"id": "99b8b917-12b0-4a43-bdbd-a59778e03930",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.19"
}
},
"nbformat": 4,
"nbformat_minor": 5
}