diff --git a/nova-act/tutorials/research/00-setup/README.md b/nova-act/tutorials/research/00-setup/README.md new file mode 100644 index 00000000..e33f1fb5 --- /dev/null +++ b/nova-act/tutorials/research/00-setup/README.md @@ -0,0 +1,181 @@ +# Amazon Nova Act Tutorials - Setup Guide + +## Overview +This setup guide prepares your environment for all Amazon Nova Act tutorials (01-04). Complete this setup once to run any tutorial script. + +## What This Setup Includes +- Python virtual environment creation +- All required dependencies installation +- Nova Act SDK installation and verification +- API key configuration guidance +- Optional Chrome browser installation +- Environment validation + +## Prerequisites +- **Operating System:** macOS Sierra+, Ubuntu 22.04+, WSL2, or Windows 10+ +- **Python:** 3.10 or higher +- **Internet connection:** Required for package installation +- **Terminal access:** Command line interface + +## Quick Setup + +### Automated Setup (Recommended) +```bash +cd tutorials/research/00-setup +./setup.sh +``` + +This script will: +1. ✓ Check Python version (3.10+ required) +2. ✓ Create virtual environment at `./venv` +3. ✓ Install all dependencies +4. ✓ Verify Nova Act installation +5. ✓ Guide you through API key setup +6. ✓ Optionally install Chrome browser + +### Manual Setup +```bash +# 1. Navigate to setup directory +cd tutorials/research/00-setup + +# 2. Create and activate virtual environment +python3 -m venv venv +source venv/bin/activate # macOS/Linux +# OR venv\Scripts\activate # Windows + +# 3. Install dependencies +pip install --upgrade pip +pip install -r requirements.txt + +# 4. Verify installation +python3 -c "import nova_act; print(f'Nova Act version: {nova_act.__version__}')" +``` + +## API Key Setup + +### Step 1: Get Your API Key +1. Visit [https://nova.amazon.com/act](https://nova.amazon.com/act) +2. Sign up or log in +3. 
Generate an API key + +### Step 2: Set Environment Variable +**Current session:** +```bash +export NOVA_ACT_API_KEY="your_api_key_here" +``` + +**Persistent (recommended):** +```bash +# For bash +echo 'export NOVA_ACT_API_KEY="your_api_key_here"' >> ~/.bashrc +source ~/.bashrc + +# For zsh +echo 'export NOVA_ACT_API_KEY="your_api_key_here"' >> ~/.zshrc +source ~/.zshrc +``` + +### Step 3: Verify +```bash +echo $NOVA_ACT_API_KEY # Should print your API key +``` + +## Dependencies Installed +- **nova-act** (>=1.0.0) - Amazon Nova Act SDK +- **playwright** (>=1.30.0) - Browser automation +- **pydantic** (>=2.0.0) - Data validation +- **pandas** (>=2.0.0) - Data analysis (Tutorial 03) +- **requests** (>=2.31.0) - HTTP library (Tutorial 03) +- **boto3** (>=1.28.0) - AWS SDK (Tutorial 04, optional) + +## Verifying Setup +```bash +# Activate environment +source venv/bin/activate + +# Check installations +python3 --version # Should be 3.10+ +python3 -c "import nova_act; print(nova_act.__version__)" +echo $NOVA_ACT_API_KEY # Should print your key +``` + +## Running Tutorials +Once setup is complete: + +```bash +# Tutorial 01 - Getting Started +cd ../01-getting-started && python 1_getting_started.py + +# Tutorial 02 - Human in the Loop +cd ../02-human-in-loop && python 1_captcha_handling.py + +# Tutorial 03 - Tool Use +cd ../03-tool-use && python 1_page_object_usage.py + +# Tutorial 04 - Observability +cd ../04-observability && python 1_observability.py +``` + +**Note:** Always activate the virtual environment first: +```bash +source tutorials/research/00-setup/venv/bin/activate +``` + +## Troubleshooting + +### Python Version Issues +Install Python 3.10+ from [python.org](https://www.python.org/downloads/) or use pyenv. + +### Virtual Environment Issues +```bash +# Install venv module (Ubuntu/Debian) +sudo apt-get install python3-venv +``` + +### Permission Issues +```bash +chmod +x setup.sh +./setup.sh +``` + +### Import Errors +1. 
Ensure virtual environment is activated +2. Reinstall dependencies: `pip install -r requirements.txt` + +### API Key Issues +1. Verify it's set: `echo $NOVA_ACT_API_KEY` +2. Check for typos or extra spaces +3. Re-export if needed + +### Browser Issues +```bash +# Install Chrome +playwright install chrome +# Or use default Chromium +playwright install +``` + +## Security Notes +- Never commit your API key - use environment variables +- Protect your virtual environment +- Follow security best practices in tutorials +- Be cautious with sensitive data + +## What's Next +After setup completion: +1. Start with Tutorial 01 - Getting Started +2. Progress through tutorials in order (01-04) +3. Explore Nova Act samples +4. Build your own automations + +## Additional Resources +- [Nova Act Documentation](https://nova.amazon.com/act) +- [Nova Act GitHub Repository](https://github.com/aws/nova-act) +- [Playwright Documentation](https://playwright.dev/python/) + +## Getting Help +1. Check this README for common issues +2. Review tutorial-specific READMEs +3. Check Nova Act documentation +4. Report issues on GitHub +5. 
Email: nova-act@amazon.com diff --git a/nova-act/tutorials/research/00-setup/requirements.txt b/nova-act/tutorials/research/00-setup/requirements.txt new file mode 100644 index 00000000..7e7a2dca --- /dev/null +++ b/nova-act/tutorials/research/00-setup/requirements.txt @@ -0,0 +1,21 @@ +# Amazon Nova Act Tutorials - Complete Dependencies +# This file includes all dependencies needed for tutorials 01-04 + +# Core dependencies +nova-act>=1.0.0 +playwright>=1.30.0 + +# Data processing (Tutorial 03) +pandas>=2.0.0 + +# API integration (Tutorial 03) +requests>=2.31.0 + +# Data validation (all tutorials) +pydantic>=2.0.0 + +# Optional: Excel export support (Tutorial 03) +openpyxl>=3.1.0 + +# Optional: AWS S3 integration (Tutorial 04) +boto3>=1.28.0 diff --git a/nova-act/tutorials/research/00-setup/setup.sh b/nova-act/tutorials/research/00-setup/setup.sh new file mode 100755 index 00000000..ba79549c --- /dev/null +++ b/nova-act/tutorials/research/00-setup/setup.sh @@ -0,0 +1,135 @@ +#!/bin/bash +# Amazon Nova Act Tutorials - Setup Script +# This script sets up your environment for all Nova Act tutorials (01-04) + +set -e # Exit on error + +echo "======================================================================" +echo "Amazon Nova Act Tutorials - Environment Setup" +echo "======================================================================" + +# Check Python version +echo "" +echo "Checking Python version..." +python_version=$(python3 --version 2>&1 | awk '{print $2}') +echo "✓ Found Python $python_version" + +# Check if Python 3.10+ +required_version="3.10" +if ! python3 -c "import sys; exit(0 if sys.version_info >= (3, 10) else 1)"; then + echo "✗ Python 3.10 or higher is required" + echo " Current version: $python_version" + exit 1 +fi + +# Create virtual environment +echo "" +echo "Creating virtual environment..." +if [ -d "venv" ]; then + echo "⚠️ Virtual environment already exists at ./venv" + read -p "Do you want to recreate it? 
(y/N): " -n 1 -r + echo + if [[ $REPLY =~ ^[Yy]$ ]]; then + rm -rf venv + python3 -m venv venv + echo "✓ Virtual environment recreated" + else + echo "✓ Using existing virtual environment" + fi +else + python3 -m venv venv + echo "✓ Virtual environment created at ./venv" +fi + +# Activate virtual environment +echo "" +echo "Activating virtual environment..." +source venv/bin/activate +echo "✓ Virtual environment activated" + +# Upgrade pip +echo "" +echo "Upgrading pip..." +pip install --upgrade pip --quiet +echo "✓ pip upgraded" + +# Install dependencies +echo "" +echo "Installing dependencies..." +pip install -r requirements.txt +echo "✓ All dependencies installed" + +# Verify Nova Act installation +echo "" +echo "Verifying Nova Act installation..." +python3 -c "import nova_act; print(f'✓ Nova Act version: {nova_act.__version__}')" + +# Check for API key +echo "" +echo "======================================================================" +echo "API Key Setup" +echo "======================================================================" +if [ -z "$NOVA_ACT_API_KEY" ]; then + echo "⚠️ NOVA_ACT_API_KEY environment variable is not set" + echo "" + echo "To get an API key:" + echo " 1. Visit https://nova.amazon.com/act" + echo " 2. Sign up and generate an API key" + echo " 3. Set it as an environment variable:" + echo "" + echo " export NOVA_ACT_API_KEY=\"your_api_key_here\"" + echo "" + echo " 4. Add to your shell profile (~/.bashrc or ~/.zshrc) to persist:" + echo "" + echo " echo 'export NOVA_ACT_API_KEY=\"your_api_key_here\"' >> ~/.zshrc" + echo "" +else + echo "✓ NOVA_ACT_API_KEY is set" +fi + +# Optional: Install Chrome +echo "" +echo "======================================================================" +echo "Browser Setup (Optional)" +echo "======================================================================" +echo "Nova Act works best with Google Chrome." 
+echo "Playwright will use Chromium by default, but you can install Chrome:" +echo "" +echo " playwright install chrome" +echo "" +read -p "Install Chrome now? (y/N): " -n 1 -r +echo +if [[ $REPLY =~ ^[Yy]$ ]]; then + playwright install chrome + echo "✓ Chrome installed" +else + echo "✓ Skipping Chrome installation (will use Chromium)" +fi + +# Summary +echo "" +echo "======================================================================" +echo "Setup Complete!" +echo "======================================================================" +echo "" +echo "✓ Virtual environment created at: ./venv" +echo "✓ All dependencies installed" +echo "✓ Nova Act verified" +echo "" +echo "Next steps:" +echo " 1. Set your API key (if not already set):" +echo " export NOVA_ACT_API_KEY=\"your_api_key_here\"" +echo "" +echo " 2. Activate the virtual environment:" +echo " source venv/bin/activate" +echo "" +echo " 3. Run any tutorial script:" +echo " python ../01-getting-started/1_getting_started.py" +echo " python ../02-human-in-loop/1_captcha_handling.py" +echo " python ../03-tool-use/1_page_object_usage.py" +echo " python ../04-observability/1_observability.py" +echo "" +echo " 4. Read the README in each tutorial directory for details" +echo "" +echo "Happy automating with Nova Act!" +echo "======================================================================" diff --git a/nova-act/tutorials/research/01-getting-started/1_getting_started.py b/nova-act/tutorials/research/01-getting-started/1_getting_started.py new file mode 100644 index 00000000..6859c38b --- /dev/null +++ b/nova-act/tutorials/research/01-getting-started/1_getting_started.py @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +""" +Getting Started with Amazon Nova Act + +This tutorial demonstrates the basics of using Amazon Nova Act to automate +web browser tasks. + +Prerequisites: +- Complete the centralized setup first (see ../00-setup/README.md) +- Python 3.10 or higher +- Amazon Nova Act API key + +Setup: +1. 
Run the centralized setup (one-time): + cd ../00-setup + ./setup.sh + +2. Activate the virtual environment: + source ../00-setup/venv/bin/activate # On macOS/Linux + # OR + ..\\00-setup\\venv\\Scripts\\activate # On Windows + +3. Run this tutorial: + python 1_getting_started.py + +Note: The setup script installs all dependencies and guides you through API key configuration. +""" + +import os +from nova_act import NovaAct, BOOL_SCHEMA +from pydantic import BaseModel + + +def verify_installation(): + """Verify that Nova Act is installed correctly.""" + try: + import nova_act + print("\033[93m[OK]\033[0m Amazon Nova Act successfully imported!") + print(f" Version: {nova_act.__version__}") + return True + except ImportError: + print("\033[91m[ERROR]\033[0m Failed to import Amazon Nova Act. Please check your installation.") + return False + + +def check_api_key(): + """Check if the API key is set.""" + api_key = os.getenv('NOVA_ACT_API_KEY') + if api_key: + print("\033[93m[OK]\033[0m API key found!") + return api_key + else: + print("\033[91m[ERROR]\033[0m API key not found. Please set the NOVA_ACT_API_KEY environment variable.") + print(" Example: export NOVA_ACT_API_KEY='your_api_key_here'") + return None + + +def example_basic_automation(api_key: str): + """ + Example 1: Basic automation + Navigate to a website and extract information. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 1: Basic Automation\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Starting basic automation with natural language commands") + print(" Nova Act can understand plain English instructions like 'click' and 'return information'.") + print(" You'll see Nova Act navigate the page, click elements, and extract the requested data.") + print("\n\033[94m→ Next:\033[0m Executing a multi-step task: clicking a button and extracting blog information") + + with NovaAct(starting_page="https://nova.amazon.com/act", nova_act_api_key=api_key) as nova: + result = nova.act("Click learn more. Then, return the title and publication date of the blog.") + print(f"\n\033[93mResult:\033[0m {result}") + + +def example_extract_structured_data(api_key: str): + """ + Example 2: Extract structured data using Pydantic schemas + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 2: Extract Structured Data\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Using Pydantic schemas for structured data extraction") + print(" Schemas ensure Nova Act returns data in exactly the format you specify.") + print(" You'll see Nova Act extract specific fields and validate them against the schema.") + print("\n\033[94m→ Next:\033[0m Defining a schema and extracting page title, search bar presence, and main heading") + + class PageInfo(BaseModel): + title: str + has_search_bar: bool + main_heading: str + + with NovaAct(starting_page="https://nova.amazon.com/act", nova_act_api_key=api_key) as nova: + result = nova.act( + "Extract the page title, whether there's a search bar, and the main heading", + schema=PageInfo.model_json_schema() + ) + + if result.matches_schema: + page_info = PageInfo.model_validate(result.parsed_response) + print(f"\n\033[93m[OK]\033[0m Successfully extracted structured data:") + print(f" Title: {page_info.title}") + print(f" Has search bar: 
{page_info.has_search_bar}") + print(f" Main heading: {page_info.main_heading}") + else: + print(f"\n\033[91m[ERROR]\033[0m Response did not match schema: {result}") + + +def example_boolean_response(api_key: str): + """ + Example 3: Navigate and find specific page + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 3: Navigate and Find Page\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Using boolean responses for simple yes/no questions") + print(" Boolean schemas are perfect for existence checks and simple decision-making.") + print(" You'll see Nova Act search the website and return a true/false answer.") + print("\n\033[94m→ Next:\033[0m Navigating to Amazon and searching for Black Friday Deals page") + + with NovaAct(starting_page="https://amazon.com", nova_act_api_key=api_key) as nova: + result = nova.act("Look for a 'Black Friday Deals' page on this website", schema=BOOL_SCHEMA) + + if result.matches_schema: + if result.parsed_response: + print("\n\033[92m[OK]\033[0m Found a 'Black Friday Deals' page") + else: + print("\n\033[93m[OK]\033[0m No 'Black Friday Deals' page found") + else: + print(f"\n\033[91m[ERROR]\033[0m Invalid result: {result}") + + +def example_multi_step_workflow(api_key: str): + """ + Example 4: Multi-step workflow + Break down complex tasks into smaller, reliable steps. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 4: Multi-Step Workflow\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Breaking complex tasks into smaller, reliable steps") + print(" Multi-step workflows are more reliable than single complex commands.") + print(" You'll see Nova Act execute each step sequentially with clear progress updates.") + print("\n\033[94m→ Next:\033[0m Executing a two-step process: navigation followed by information extraction") + + with NovaAct(starting_page="https://nova.amazon.com/act", nova_act_api_key=api_key) as nova: + # Step 1: Navigate to a section + print("\nStep 1: Clicking 'Learn More'...") + nova.act("Click the 'Learn More' button") + + # Step 2: Extract information + print("Step 2: Extracting blog information...") + result = nova.act("Return the title and publication date of the blog") + print(f"\nBlog info: {result}") + + +def main(): + """Main function to run all examples.""" + print("="*60) + print("Getting Started with Amazon Nova Act") + print("="*60) + + # Step 1: Verify installation + if not verify_installation(): + return + + # Step 2: Check API key + api_key = check_api_key() + if not api_key: + return + + print("\nThis tutorial includes 4 examples. 
Press Enter after each to continue...") + + # Run examples + try: + # Example 1 + example_basic_automation(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Basic automation with natural language commands") + print(f"\033[94m→ Next:\033[0m Learn to extract structured data using schemas") + input("\n>> Press Enter to continue to Example 2...") + + # Example 2 + example_extract_structured_data(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Structured data extraction with Pydantic models") + print(f"\033[94m→ Next:\033[0m Simple yes/no questions using boolean responses") + input("\n>> Press Enter to continue to Example 3...") + + # Example 3 + example_boolean_response(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Navigation and page search functionality") + print(f"\033[94m→ Next:\033[0m Breaking complex tasks into multiple steps") + input("\n>> Press Enter to continue to Example 4...") + + # Example 4 + example_multi_step_workflow(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Multi-step workflow with sequential actions") + + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94m[OK] All examples completed successfully!\033[0m") + print(f"\033[94m{'='*60}\033[0m") + print("\nNext Steps:") + print("- Explore the other tutorials in this series") + print("- Check out the samples in the Nova Act repository") + print("- Read the full documentation at https://nova.amazon.com/act") + + except KeyboardInterrupt: + print("\n\nTutorial interrupted by user") + except Exception as e: + print(f"\n\033[91m[ERROR]\033[0m Error running examples: {e}") + print("\nTroubleshooting:") + print("- Make sure your API key is valid") + print("- Check your internet connection") + print("- Ensure Chrome/Chromium is installed") + + +if __name__ == "__main__": + main() diff --git a/nova-act/tutorials/research/01-getting-started/README.md b/nova-act/tutorials/research/01-getting-started/README.md new file mode 100644 index 00000000..713a4526 --- /dev/null +++ 
b/nova-act/tutorials/research/01-getting-started/README.md @@ -0,0 +1,70 @@ +# Getting Started with Amazon Nova Act + +## Overview +This tutorial introduces Amazon Nova Act (research preview), a Python SDK for building agents that reliably take actions in web browsers. You'll learn basic setup, API usage, and create your first automation script. + +## Learning Objectives +- Install and set up Amazon Nova Act +- Create your first automation script +- Understand basic Nova Act API concepts +- Extract structured data from web pages +- Build multi-step automation workflows + +## Prerequisites +**⚠️ Complete the centralized setup first!** +- Complete setup in `../00-setup/` (see [Setup Guide](../00-setup/README.md)) +- Python 3.10 or higher +- Amazon Nova Act API key +- Basic Python programming knowledge + +## What You'll Build +Automation scripts that navigate websites and extract information using natural language commands. + +## Quick Start +```bash +# 1. Complete setup (one-time) +cd ../00-setup && ./setup.sh + +# 2. Activate environment +source ../00-setup/venv/bin/activate + +# 3. Run tutorial +python 1_getting_started.py +``` + +## Key Concepts + +### NovaAct Class +Main interface to the SDK. Creates browser sessions and handles automation. + +### act() Method +Natural language interface for browser actions. Takes a prompt describing what to do and optionally a schema for structured responses. + +### Structured Data Extraction +Use Pydantic models to extract specific information from web pages in a structured format. 
+ +### Best Practices +- Be prescriptive and succinct in prompts +- Break large tasks into smaller steps +- Use separate act() calls for different actions +- Don't mix actions and data extraction in one call + +## What This Tutorial Covers +The `1_getting_started.py` script demonstrates: +- Installation verification +- API key validation +- Basic web automation +- Structured data extraction with schemas +- Boolean responses for yes/no questions +- Multi-step workflow patterns + +## Troubleshooting +- Ensure centralized setup is complete +- Verify virtual environment is activated +- Check API key is configured +- Nova Act doesn't support Jupyter notebooks - use .py files + +## Next Steps +- Explore and modify the tutorial script +- Try the Human in the Loop tutorial +- Read full documentation at https://nova.amazon.com/act diff --git a/nova-act/tutorials/research/02-human-in-loop/1_captcha_handling.py b/nova-act/tutorials/research/02-human-in-loop/1_captcha_handling.py new file mode 100644 index 00000000..ed072692 --- /dev/null +++ b/nova-act/tutorials/research/02-human-in-loop/1_captcha_handling.py @@ -0,0 +1,216 @@ +#!/usr/bin/env python3 +""" +CAPTCHA Handling with Amazon Nova Act + +This script demonstrates how to detect and handle CAPTCHAs in Nova Act workflows. +CAPTCHAs are security measures designed to distinguish humans from bots, and Nova Act +cannot solve them automatically - it requires human intervention. + +Prerequisites: +- Complete the centralized setup first (see ../00-setup/README.md) +- Completion of the Getting Started tutorial + +Setup: +1. Run the centralized setup (one-time): + cd ../00-setup + ./setup.sh + +2. Activate the virtual environment: + source ../00-setup/venv/bin/activate + +3. 
Run this tutorial: + python 1_captcha_handling.py +""" + +import os +from nova_act import NovaAct, BOOL_SCHEMA + + +def check_api_key(): + """Check if the API key is set.""" + api_key = os.getenv('NOVA_ACT_API_KEY') + if not api_key: + print("\033[91m[ERROR]\033[0m API key not found. Please set the NOVA_ACT_API_KEY environment variable.") + print(" Example: export NOVA_ACT_API_KEY='your_api_key_here'") + return None + print("\033[93m[OK]\033[0m API key found!") + return api_key + + +def example_detect_captcha(api_key: str): + """ + Example 1: Detecting a CAPTCHA on a webpage + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 1: Detecting CAPTCHAs\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Learning to detect CAPTCHAs on webpages") + print(" CAPTCHA detection helps identify when human intervention is needed.") + print(" You'll see Nova Act analyze a page and determine if a CAPTCHA is present.") + print("\n\033[94m→ Next:\033[0m Navigating to a CAPTCHA demo page and testing detection capabilities") + + # Using nopecha.com for CAPTCHA handling demonstration + with NovaAct(starting_page="https://nopecha.com/captcha/hcaptcha", nova_act_api_key=api_key) as nova: + try: + # Check if a CAPTCHA is present + result = nova.act("Is there a captcha on the screen?", schema=BOOL_SCHEMA) + + if result.matches_schema and result.parsed_response: + print("\033[93m[OK]\033[0m CAPTCHA detected!") + else: + print("\033[93m[OK]\033[0m No CAPTCHA found") + + print(f" Raw response: {result.response}") + + except Exception as e: + if "HumanValidationError" in str(e): + print("\033[93m[OK]\033[0m CAPTCHA detected! 
(Nova Act correctly refused to solve it)") + print(" This is expected behavior - Nova Act will not solve CAPTCHAs") + else: + print(f"\033[91m[ERROR]\033[0m Unexpected error: {e}") + + +def example_pause_for_captcha(api_key: str): + """ + Example 2: Pausing automation for CAPTCHA solving + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 2: Pausing for Human Input\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Implementing human-in-the-loop workflow for CAPTCHA handling") + print(" When CAPTCHAs are detected, automation should pause and wait for human intervention.") + print(" You'll see Nova Act detect a CAPTCHA and demonstrate proper handling procedures.") + print("\n\033[94m→ Next:\033[0m Checking for CAPTCHAs and implementing pause-and-wait logic") + + with NovaAct(starting_page="https://nopecha.com/captcha/hcaptcha", nova_act_api_key=api_key) as nova: + # Simulate filling out a form + print("Checking for CAPTCHA before form submission...") + + try: + # Check for CAPTCHA before submitting + result = nova.act("Is there a captcha on the screen?", schema=BOOL_SCHEMA) + + if result.matches_schema and result.parsed_response: + print("\n\033[93m[WARNING]\033[0m CAPTCHA detected. Please solve it manually.") + input("Press Enter after you have solved the CAPTCHA...") + print("\033[93m[OK]\033[0m Continuing with automation...") + else: + print("\033[93m[OK]\033[0m No CAPTCHA detected, proceeding automatically") + + except Exception as e: + if "HumanValidationError" in str(e): + print("\n\033[93m[WARNING]\033[0m CAPTCHA detected! 
Nova Act correctly refused to interact with it.") + print("In a real scenario, you would pause here for human intervention.") + input("Press Enter to continue (simulating CAPTCHA solved)...") + print("\033[93m[OK]\033[0m Continuing with automation...") + else: + raise e + + print("\033[93m[OK]\033[0m Form submission flow completed") + + +def example_advanced_captcha_detection(api_key: str): + """ + Example 3: Advanced CAPTCHA detection with specific types + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 3: Advanced CAPTCHA Detection\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Advanced CAPTCHA detection with detailed analysis") + print(" Advanced detection identifies specific CAPTCHA types and security challenges.") + print(" You'll see Nova Act provide detailed descriptions of security elements on the page.") + print("\n\033[94m→ Next:\033[0m Analyzing page security challenges and identifying CAPTCHA variations") + + with NovaAct(starting_page="https://nopecha.com/captcha/hcaptcha", nova_act_api_key=api_key) as nova: + try: + # Get detailed analysis of security challenges + captcha_check = nova.act( + "Describe any security challenges on this page (CAPTCHA, reCAPTCHA, image verification, etc.)" + ) + + print(f"\n\033[93mResult:\033[0m Security challenge analysis:") + print(f" {captcha_check.response}") + + # Check for specific CAPTCHA types + print("\nChecking for specific CAPTCHA types...") + + recaptcha_present = nova.act("Is there a reCAPTCHA checkbox on the screen?", schema=BOOL_SCHEMA) + if recaptcha_present.matches_schema and recaptcha_present.parsed_response: + print("\033[93m[WARNING]\033[0m reCAPTCHA detected") + else: + print("\033[93m[OK]\033[0m No reCAPTCHA found") + + except Exception as e: + if "HumanValidationError" in str(e): + print("\n\033[93m[OK]\033[0m Advanced CAPTCHA detection triggered security measures") + print("This demonstrates Nova Act's built-in CAPTCHA protection") + else: + 
print(f"\033[91m[ERROR]\033[0m Unexpected error: {e}") + + print("\n\033[93m[OK]\033[0m Advanced CAPTCHA detection completed") + + +def main(): + """Main function to run all CAPTCHA handling examples.""" + print("="*60) + print("CAPTCHA Handling with Amazon Nova Act") + print("="*60) + + # Check API key + api_key = check_api_key() + if not api_key: + return + + print("\n\033[93m[WARNING]\033[0m Important Notes:") + print("- Nova Act will NOT solve CAPTCHAs automatically") + print("- Human intervention is always required for CAPTCHAs") + print("- CAPTCHAs are security measures - respect their purpose") + + print("\nThis tutorial includes 3 examples. Press Enter after each to continue...") + + try: + # Example 1 + example_detect_captcha(api_key) + print(f"\n\033[92m✓ Completed:\033[0m CAPTCHA detection using boolean schemas") + print(f"\033[94m→ Next:\033[0m Pausing automation for human CAPTCHA solving") + input("\n>> Press Enter to continue to Example 2...") + + # Example 2 + example_pause_for_captcha(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Human intervention workflow for CAPTCHA handling") + print(f"\033[94m→ Next:\033[0m Advanced CAPTCHA type detection and analysis") + input("\n>> Press Enter to continue to Example 3...") + + # Example 3 + example_advanced_captcha_detection(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Advanced CAPTCHA detection for multiple types") + + print("\n" + "="*60) + print("\033[93m[OK]\033[0m All CAPTCHA handling examples completed!") + print("="*60) + print("\nKey Takeaways:") + print("- Always check for CAPTCHAs before critical actions") + print("- Use BOOL_SCHEMA for yes/no CAPTCHA detection") + print("- Provide clear instructions to users") + print("- Validate that CAPTCHAs were actually solved") + + print("\nNext Steps:") + print("- Move on to Tool Use tutorial (03-tool-use)") + print("- Explore the Observability tutorial (04-observability)") + print("- Practice CAPTCHA handling with real websites") + + except 
KeyboardInterrupt: + print("\n\n[WARNING] Tutorial interrupted by user") + except Exception as e: + print(f"\n\033[91m[ERROR]\033[0m Error running examples: {e}") + print("\nTroubleshooting:") + print("- Ensure your API key is valid") + print("- Check your internet connection") + print("- Try with different websites that have CAPTCHAs") + + +if __name__ == "__main__": + main() diff --git a/nova-act/tutorials/research/02-human-in-loop/README.md b/nova-act/tutorials/research/02-human-in-loop/README.md new file mode 100644 index 00000000..917fe081 --- /dev/null +++ b/nova-act/tutorials/research/02-human-in-loop/README.md @@ -0,0 +1,68 @@ +# Human in the Loop with Amazon Nova Act + +## Overview +This tutorial covers incorporating human input into automation workflows for tasks requiring human judgment, security challenges, or sensitive data handling. + +## Learning Objectives +- Understand when human intervention is needed in automation +- Detect and handle CAPTCHAs with human assistance +- Learn proper error handling for security challenges +- Understand Nova Act's built-in CAPTCHA protection + +## Prerequisites +**⚠️ Complete the centralized setup first!** +- Complete setup in `../00-setup/` +- Completion of Tutorial 01 - Getting Started +- Basic understanding of web security concepts + +## When Human Intervention is Needed +- CAPTCHAs and security measures +- Complex decision-making requiring judgment +- Sensitive data entry (passwords, credit cards) +- Authentication flows (2FA, SSO) +- Exception handling for unusual situations + +## Tutorial Script + +### CAPTCHA Handling (`1_captcha_handling.py`) +Learn to detect and handle CAPTCHAs in automation workflows. 
+ +**What you'll learn:** +- Detecting CAPTCHAs using Nova Act's analysis +- Handling `HumanValidationError` exceptions +- Pausing automation for human CAPTCHA solving +- Advanced detection for different CAPTCHA types + +**Key concepts:** +- Nova Act will NOT solve CAPTCHAs automatically +- `HumanValidationError` is expected and correct behavior +- Use `BOOL_SCHEMA` for yes/no CAPTCHA detection +- Always handle CAPTCHA exceptions gracefully + +## Security Notes + +### Critical Security Notice +⚠️ **Nova Act correctly refuses to solve CAPTCHAs** - this is intentional security behavior that should be respected. + +## Best Practices + +### CAPTCHA Handling +- Always check before critical actions +- Handle `HumanValidationError` exceptions properly +- Provide clear instructions to users +- Use try/except blocks for CAPTCHA detection +- Validate completion after human intervention + +## Quick Start +```bash +# Activate environment +source ../00-setup/venv/bin/activate + +# Run the tutorial +python 1_captcha_handling.py +``` + +## Next Steps +- Practice with real websites using CAPTCHAs +- Understand Nova Act's security protections +- Move on to Tool Use tutorial (03-tool-use) diff --git a/nova-act/tutorials/research/03-tool-use/1_data_processing.py b/nova-act/tutorials/research/03-tool-use/1_data_processing.py new file mode 100644 index 00000000..940c2d8e --- /dev/null +++ b/nova-act/tutorials/research/03-tool-use/1_data_processing.py @@ -0,0 +1,457 @@ +#!/usr/bin/env python3 +""" +Data Processing with Amazon Nova Act and Pandas + +This script demonstrates how to extract data with Nova Act and process it +using pandas for analysis, transformation, and visualization. + +Prerequisites: +- Complete the centralized setup first (see ../00-setup/README.md) +- Completion of previous tutorials + +Setup: +1. Run the centralized setup (one-time): + cd ../00-setup + ./setup.sh + +2. Activate the virtual environment: + source ../00-setup/venv/bin/activate + +3. 
Run this tutorial: + python 1_data_processing.py + +Note: The setup script installs pandas and all required dependencies. +""" + +import os +import json +from nova_act import NovaAct +from pydantic import BaseModel +from typing import List + +try: + import pandas as pd + PANDAS_AVAILABLE = True +except ImportError: + PANDAS_AVAILABLE = False + print("\033[93m[WARNING]\033[0m pandas not installed. Install with: pip install pandas") + + +def check_api_key(): + """Check if the API key is set.""" + api_key = os.getenv('NOVA_ACT_API_KEY') + if not api_key: + print("\033[91m[ERROR]\033[0m API key not found. Please set the NOVA_ACT_API_KEY environment variable.") + return None + print("\033[93m[OK]\033[0m API key found!") + return api_key + + +def example_extract_to_dataframe(api_key: str): + """ + Example 1: Extract structured data and convert to DataFrame + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 1: Extract to DataFrame\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + print("\n\033[93m[OK]\033[0m Extracting structured data and converting to pandas DataFrame") + print(" DataFrames make it easy to analyze, filter, and export web-scraped data.") + print(" You'll see Nova Act extract product information and organize it into a table format.") + print("\n\033[94m→ Next:\033[0m Scraping Amazon product data and creating a DataFrame for analysis") + + # Define schema for structured data + class Product(BaseModel): + name: str + price: float + rating: float + + class ProductList(BaseModel): + products: List[Product] + + with NovaAct(starting_page="https://www.amazon.com/gp/movers-and-shakers/music/ref=zg_bsms_nav_music_0_amazon-renewed", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Extracting structured data...") + + # Extract data with schema + result = nova.act( + "Extract the first 5 products with their exact names, actual 
prices in dollars, and real star ratings (look for stars or rating numbers like 3.2, 4.7, etc.)", + schema=ProductList.model_json_schema() + ) + + if result.matches_schema: + # Convert to pandas DataFrame (model_dump() is the Pydantic v2 replacement for .dict()) + product_list = ProductList.model_validate(result.parsed_response) + df = pd.DataFrame([product.model_dump() for product in product_list.products]) + + print("\n\033[93m[OK]\033[0m Data extracted and converted to DataFrame:") + print(df) + + # Save to CSV + csv_path = "/tmp/extracted_data.csv" + df.to_csv(csv_path, index=False) + print(f"\n\033[93m[OK]\033[0m Data saved to: {csv_path}") + else: + print("\033[91m[ERROR]\033[0m Could not extract structured data") + + +def example_data_analysis(api_key: str): + """ + Example 2: Perform data analysis on extracted data + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 2: Data Analysis\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + print("\n\033[93m[OK]\033[0m Performing data analysis on extracted product information") + print(" Data analysis helps identify trends, averages, and insights from scraped data.") + print(" You'll see Nova Act extract data and then calculate statistics like average prices and ratings.") + print("\n\033[94m→ Next:\033[0m Scraping product data and computing analytical insights") + + class Product(BaseModel): + name: str + price: float + rating: float + + class ProductList(BaseModel): + products: List[Product] + + with NovaAct(starting_page="https://www.amazon.com/blackfriday?ref_=nav_cs_td_bf_dt_cr", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Extracting product data...") + + result = nova.act( + "Extract the first 5 products with their exact names, actual prices in dollars, and real star ratings (look for stars or rating numbers like 3.2, 4.7, etc.)", + schema=ProductList.model_json_schema() + ) + + if result.matches_schema: + product_list = 
ProductList.model_validate(result.parsed_response) + df = pd.DataFrame([p.model_dump() for p in product_list.products]) + + print("\n\033[93m[OK]\033[0m Product DataFrame:") + print(df) + + # Perform analysis + print(f"\n\033[93m[OK]\033[0m Statistical Analysis:") + print(f" Total products: {len(df)}") + print(f" Average price: ${df['price'].mean():.2f}") + print(f" Price range: ${df['price'].min():.2f} - ${df['price'].max():.2f}") + print(f" Average rating: {df['rating'].mean():.2f}") + + # Find best value (high rating, low price) + df['value_score'] = df['rating'] / df['price'] + best_value = df.loc[df['value_score'].idxmax()] + print(f"\n\033[93m[OK]\033[0m Best value product:") + print(f" Name: {best_value['name']}") + print(f" Price: ${best_value['price']:.2f}") + print(f" Rating: {best_value['rating']:.2f}") + + +def example_data_transformation(api_key: str): + """ + Example 3: Transform and clean extracted data + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 3: Data Transformation\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + print("\n\033[93m[OK]\033[0m Transforming and cleaning extracted data") + print(" Data transformation converts raw scraped data into clean, usable formats.") + print(" You'll see Nova Act extract data and then apply cleaning and formatting operations.") + print("\n\033[94m→ Next:\033[0m Extracting raw data and applying transformation techniques") + + with NovaAct(starting_page="https://www.amazon.com/blackfriday?ref_=nav_cs_td_bf_dt_cr", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Extracting raw data...") + + # Extract some data + result = nova.act( + "Extract the page title and main heading", + schema={"type": "object", "properties": {"title": {"type": "string"}, "heading": {"type": "string"}}, "required": ["title", "heading"]} + ) + + if result.response: + # Create DataFrame from raw data + 
raw_data = { + 'content': [result.response], + 'url': [nova.page.url], + 'timestamp': [pd.Timestamp.now()] + } + df = pd.DataFrame(raw_data) + + print("\n\033[93m[OK]\033[0m Raw DataFrame:") + print(df) + + # Transform data + df['content_length'] = df['content'].str.len() + df['domain'] = df['url'].str.extract(r'https?://([^/]+)') + df['date'] = df['timestamp'].dt.date + df['time'] = df['timestamp'].dt.time + + print("\n\033[93m[OK]\033[0m Transformed DataFrame:") + print(df[['domain', 'content_length', 'date', 'time']]) + + # Save transformed data + output_path = "/tmp/transformed_data.json" + df.to_json(output_path, orient='records', date_format='iso') + print(f"\n\033[93m[OK]\033[0m Transformed data saved to: {output_path}") + + +def example_multi_page_aggregation(api_key: str): + """ + Example 4: Aggregate data from multiple pages + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 4: Multi-Page Aggregation\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + # Use two different Amazon pages for aggregation + pages = [ + "https://www.amazon.com/gp/browse.html?node=120955898011&ref_=nav_cs_handmade", + "https://www.amazon.com/fmc/ssd-storefront?ref_=nav_cs_SSD_nav_storefront" + ] + + all_data = [] + + for page_url in pages: + print(f"\n\033[93m[OK]\033[0m Processing: {page_url}") + + with NovaAct(starting_page=page_url, nova_act_api_key=api_key) as nova: + result = nova.act( + "What is the main heading on this page?", + schema={"type": "object", "properties": {"heading": {"type": "string"}}, "required": ["heading"]} + ) + + if result.response: + all_data.append({ + 'url': page_url, + 'heading': result.response, + 'timestamp': pd.Timestamp.now() + }) + + # Create combined DataFrame + df = pd.DataFrame(all_data) + + print("\n\033[93m[OK]\033[0m Combined DataFrame:") + print(df) + + # Aggregate statistics + print(f"\n\033[93m[OK]\033[0m 
Aggregation:") + print(f" Total pages processed: {len(df)}") + print(f" Average heading length: {df['heading'].str.len().mean():.1f} characters") + + # Save aggregated data + output_path = "/tmp/aggregated_data.csv" + df.to_csv(output_path, index=False) + print(f"\n\033[93m[OK]\033[0m Aggregated data saved to: {output_path}") + + +def example_data_filtering(api_key: str): + """ + Example 5: Filter and query extracted data + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 5: Data Filtering\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + class DataPoint(BaseModel): + category: str + value: float + status: str + + class DataSet(BaseModel): + data: List[DataPoint] + + with NovaAct(starting_page="https://www.amazon.com/blackfriday?ref_=nav_cs_td_bf_dt_cr", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Extracting dataset...") + + result = nova.act( + "Extract the first 5 data points with categories, values, and status", + schema=DataSet.model_json_schema() + ) + + if result.matches_schema: + dataset = DataSet.model_validate(result.parsed_response) + df = pd.DataFrame([d.model_dump() for d in dataset.data]) + + print("\n\033[93m[OK]\033[0m Full DataFrame:") + print(df) + + # Filter data + high_value = df[df['value'] > df['value'].median()] + print(f"\n\033[93m[OK]\033[0m High value items (>{df['value'].median():.2f}):") + print(high_value) + + # Group by category + category_stats = df.groupby('category')['value'].agg(['count', 'mean', 'sum']) + print("\n\033[93m[OK]\033[0m Statistics by category:") + print(category_stats) + + # Filter by status + active_items = df[df['status'] == 'active'] + print(f"\n\033[93m[OK]\033[0m Active items: {len(active_items)}") + + +def example_export_formats(api_key: str): + """ + Example 6: Export data in various formats + """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 6: Export 
Formats\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\033[93m[WARNING]\033[0m Skipping - pandas not installed") + return + + with NovaAct(starting_page="https://www.amazon.com/gp/movers-and-shakers/?ref_=nav_em_ms_0_1_1_4", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Extracting data for export...") + + result = nova.act( + "Extract the page title and URL", + schema={"type": "object", "properties": {"title": {"type": "string"}, "url": {"type": "string"}}, "required": ["title", "url"]} + ) + + if result.response: + # Create sample DataFrame + data = { + 'title': [result.response], + 'url': [nova.page.url], + 'timestamp': [pd.Timestamp.now()], + 'status': ['success'] + } + df = pd.DataFrame(data) + + print("\n\033[93m[OK]\033[0m Exporting to multiple formats...") + + # CSV + csv_path = "/tmp/export.csv" + df.to_csv(csv_path, index=False) + print(f" [OK] CSV: {csv_path}") + + # JSON + json_path = "/tmp/export.json" + df.to_json(json_path, orient='records', date_format='iso') + print(f" [OK] JSON: {json_path}") + + # Excel (requires openpyxl) + try: + excel_path = "/tmp/export.xlsx" + df.to_excel(excel_path, index=False) + print(f" [OK] Excel: {excel_path}") + except ImportError: + print(f" [WARNING] Excel: Skipped (install openpyxl)") + + # HTML + html_path = "/tmp/export.html" + df.to_html(html_path, index=False) + print(f" [OK] HTML: {html_path}") + + print("\n\033[93m[OK]\033[0m All exports completed") + + +def main(): + """Main function to run all data processing examples.""" + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mData Processing with Amazon Nova Act and Pandas\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + if not PANDAS_AVAILABLE: + print("\n\033[91m[ERROR]\033[0m pandas is required for this tutorial") + print(" Install with: pip install pandas") + return + + # Check API key + api_key = check_api_key() + if not api_key: + return + + print("\n[DATA] Data Processing Workflow:") + print(" 
1. Extract structured data with Nova Act") + print(" 2. Convert to pandas DataFrame") + print(" 3. Analyze, transform, and filter") + print(" 4. Export in various formats") + + print("\nThis tutorial includes 6 examples. Press Enter after each to continue...") + + try: + # Example 1 + example_extract_to_dataframe(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Data extraction to pandas DataFrame") + print(f"\033[94m→ Next:\033[0m Statistical analysis and data insights") + input("\n>> Press Enter to continue to Example 2...") + + # Example 2 + example_data_analysis(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Data analysis with pandas statistics") + print(f"\033[94m→ Next:\033[0m Data transformation and cleaning operations") + input("\n>> Press Enter to continue to Example 3...") + + # Example 3 + example_data_transformation(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Data transformation and cleaning") + print(f"\033[94m→ Next:\033[0m Multi-page data aggregation workflow") + input("\n>> Press Enter to continue to Example 4...") + + # Example 4 + example_multi_page_aggregation(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Multi-page data collection and aggregation") + print(f"\033[94m→ Next:\033[0m Data filtering and querying techniques") + input("\n>> Press Enter to continue to Example 5...") + + # Example 5 + example_data_filtering(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Data filtering and querying operations") + print(f"\033[94m→ Next:\033[0m Exporting data to multiple file formats") + input("\n>> Press Enter to continue to Example 6...") + + # Example 6 + example_export_formats(api_key) + print(f"\n\033[92m✓ Completed:\033[0m Data export to multiple formats") + + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94m[OK] All data processing examples completed!\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\nKey Takeaways:") + print("- Always use Pydantic schemas for structured extraction") + print("- Convert to DataFrame for 
powerful analysis") + print("- Use pandas for filtering, grouping, and aggregation") + print("- Export to multiple formats (CSV, JSON, Excel, HTML)") + print("- Aggregate data from multiple pages") + + print("\nNext Steps:") + print("- Explore the Observability tutorial (04-observability)") + print("- Build your own data extraction pipelines") + print("- Experiment with other pandas features") + + except KeyboardInterrupt: + print("\n\n[WARNING] Tutorial interrupted by user") + except Exception as e: + print(f"\n\033[91m[ERROR]\033[0m Error running examples: {e}") + print("\nTroubleshooting:") + print("- Ensure pandas is installed: pip install pandas") + print("- Check your API key is valid") + print("- Verify internet connection") + + +if __name__ == "__main__": + main() diff --git a/nova-act/tutorials/research/03-tool-use/README.md b/nova-act/tutorials/research/03-tool-use/README.md new file mode 100644 index 00000000..298cfbe4 --- /dev/null +++ b/nova-act/tutorials/research/03-tool-use/README.md @@ -0,0 +1,142 @@ +# Python Library Integration with Amazon Nova Act + +## Overview +This tutorial demonstrates extracting product data from Amazon pages using Nova Act and processing it with pandas for analysis, transformation, and export. You'll learn to build complete data pipelines from web extraction to structured analysis. 
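The tutorial script scores "best value" as rating divided by price, which is undefined when a price is missing or zero. The following is a dependency-free sketch of that same scoring step with a guard added. `best_value` is a hypothetical helper, not part of the tutorial, and the tutorial itself performs this step with pandas.

```python
from typing import Dict, List, Optional


def best_value(products: List[Dict[str, object]]) -> Optional[Dict[str, object]]:
    """Return the product with the highest rating-to-price ratio.

    Products whose price or rating is missing, non-numeric, or whose price
    is not positive are skipped, because the ratio is undefined for them.
    Returns None when no product can be scored.
    """
    scored = []
    for product in products:
        price = product.get("price")
        rating = product.get("rating")
        if not isinstance(price, (int, float)) or not isinstance(rating, (int, float)):
            continue
        if price <= 0:
            continue
        scored.append((rating / price, product))
    if not scored:
        return None
    # max() over the score keeps the first product on ties.
    return max(scored, key=lambda pair: pair[0])[1]
```

The same guard can be expressed in pandas by filtering out rows with a non-positive price before computing the score column.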
+ +## Learning Objectives +- Extract structured product data (names, prices, ratings) from Amazon pages +- Convert extracted data to pandas DataFrames for analysis +- Perform statistical analysis on real product data +- Transform and clean extracted data +- Aggregate data from multiple Amazon product categories +- Filter and query product datasets +- Export data to multiple formats (CSV, JSON, Excel, HTML) + +## Prerequisites +**⚠️ Complete the centralized setup first!** +- Complete setup in `../00-setup/` +- Completion of previous tutorials (01-getting-started, 02-human-in-loop) +- Basic understanding of pandas (helpful but not required) + +## Tutorial Script + +### Data Processing (`1_data_processing.py`) +Six examples demonstrating complete data processing workflows with real Amazon product data. + +**Example 1: Extract to DataFrame** +- **Source**: Amazon Music Movers & Shakers +- **Data**: First 5 products with names, prices, and star ratings +- **Output**: Pandas DataFrame and CSV export +- **Focus**: Basic structured data extraction and DataFrame conversion + +**Example 2: Data Analysis** +- **Source**: Amazon Black Friday deals +- **Analysis**: Statistical analysis (averages, ranges, best value products) +- **Calculations**: Price statistics, rating analysis, value scoring +- **Focus**: Real-world product analysis and insights + +**Example 3: Data Transformation** +- **Source**: Amazon Black Friday deals +- **Transforms**: Content length, domain extraction, date/time parsing +- **Output**: JSON export with transformed fields +- **Focus**: Data cleaning and feature engineering + +**Example 4: Multi-Page Aggregation** +- **Sources**: Amazon Handmade + SSD Storefront pages +- **Process**: Extract headings from multiple product categories +- **Analysis**: Cross-category comparison and aggregation +- **Focus**: Combining data from multiple sources + +**Example 5: Data Filtering** +- **Source**: Amazon Black Friday deals +- **Operations**: Median filtering, 
category grouping, status filtering +- **Analysis**: High-value item identification, category statistics +- **Focus**: Advanced pandas filtering and querying + +**Example 6: Export Formats** +- **Source**: Amazon Movers & Shakers +- **Exports**: CSV, JSON, Excel, HTML formats +- **Data**: Page metadata with timestamps +- **Focus**: Multi-format data export workflows + +## Key Concepts + +### Structured Data Extraction +Uses Pydantic schemas to extract specific product information: +```python +class Product(BaseModel): + name: str + price: float + rating: float +``` + +### Real Amazon Data Sources +- **Music Movers & Shakers**: Trending music products with ratings +- **Black Friday Deals**: Seasonal promotions with varied pricing +- **Handmade Products**: Artisan items with unique characteristics +- **SSD Storefront**: Technology products with specifications +- **General Movers & Shakers**: Cross-category trending items + +### Data Processing Pipeline +1. **Extract**: Get structured product data using Nova Act with schemas +2. **Validate**: Ensure data matches expected schema format +3. **Transform**: Clean and enhance data with pandas operations +4. **Analyze**: Perform statistical analysis and insights +5. 
**Export**: Save results in multiple formats for different uses + +### Best Practices Demonstrated +- Always use Pydantic schemas for reliable extraction +- Validate data with `result.matches_schema` before processing +- Limit extractions to manageable sizes (first 5 products) +- Handle missing or malformed data gracefully +- Use descriptive prompts for accurate extraction +- Save intermediate results to prevent data loss + +## Data Analysis Techniques + +### Statistical Analysis +- Price ranges and averages across product categories +- Rating distributions and quality metrics +- Value scoring (rating-to-price ratios) +- Cross-category comparisons + +### Data Transformation +- Content length analysis for product descriptions +- Domain extraction from URLs +- Timestamp parsing and date/time operations +- Feature engineering for enhanced analysis + +### Filtering and Querying +- Median-based filtering for high-value items +- Category-based grouping and aggregation +- Status-based filtering for active products +- Multi-criteria product selection + +## Export Capabilities +- **CSV**: Spreadsheet-compatible format for analysis tools +- **JSON**: API-friendly format for web applications +- **Excel**: Business-ready format with formatting support +- **HTML**: Web-displayable format for reports and dashboards + +## Quick Start +```bash +# Activate environment +source ../00-setup/venv/bin/activate + +# Run the tutorial +python 1_data_processing.py +``` + +## Real-World Applications +- **E-commerce Analysis**: Product pricing and rating trends +- **Market Research**: Competitive analysis across categories +- **Inventory Management**: Product performance tracking +- **Business Intelligence**: Sales and customer preference insights +- **Automated Reporting**: Regular data extraction and analysis pipelines + +## Next Steps +- Build custom product analysis workflows +- Explore the Observability tutorial (04-observability) +- Create automated reporting systems +- Integrate with 
business intelligence tools + - Develop real-time product monitoring systems diff --git a/nova-act/tutorials/research/04-observability/1_observability.py b/nova-act/tutorials/research/04-observability/1_observability.py new file mode 100644 index 00000000..8b8e9a5f --- /dev/null +++ b/nova-act/tutorials/research/04-observability/1_observability.py @@ -0,0 +1,362 @@ +#!/usr/bin/env python3 +""" +Observability in Amazon Nova Act + +This script demonstrates Nova Act's built-in logging, tracing, and session +recording capabilities for monitoring and debugging automation workflows. + +Prerequisites: +- Complete the centralized setup first (see ../00-setup/README.md) +- Completion of previous tutorials + +Setup: +1. Run the centralized setup (one-time): + cd ../00-setup + ./setup.sh + +2. Activate the virtual environment: + source ../00-setup/venv/bin/activate + +3. Run this tutorial: + python 1_observability.py +""" + +import os +import logging +from nova_act import NovaAct, BOOL_SCHEMA +from pydantic import BaseModel + + +def check_api_key(): + """Check if the API key is set.""" + api_key = os.getenv('NOVA_ACT_API_KEY') + if not api_key: + print("\033[91m[ERROR]\033[0m API key not found. Please set the NOVA_ACT_API_KEY environment variable.") + return None + print("\033[93m[OK]\033[0m API key found!") + return api_key + + +def example_basic_logging(api_key: str): + """ + Example 1: Basic logging with Nova Act + + Nova Act automatically logs all actions at INFO level or above. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 1: Basic Logging\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + print("\n\033[93m[OK]\033[0m Nova Act logging is automatic") + print(" Default log level: INFO") + print(" Logs include: actions, decisions, errors") + print(" Nova Act automatically logs every action it takes, including what it sees and decides.") + print(" You'll see detailed output showing the browser automation process in real-time.") + print("\n\033[94m→ Next:\033[0m Performing a simple navigation task with automatic logging enabled") + + with NovaAct(starting_page="https://nova.amazon.com/act/gym/next-dot", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Performing actions (check console for logs)...") + result = nova.act("Navigate to the main page and describe what you see") + + print("\n\033[93m[OK]\033[0m Action completed") + print(" Check the console output above for detailed execution logs") + + print(f"\n\033[38;5;208m📁 NAVIGATE TO VIEW ACT RUN:\033[0m") + print(f"\033[38;5;208m Look for the orange-colored output above showing 'View your act run here: /path/to/file.html'\033[0m") + print(f"\033[38;5;208m Open that HTML file in your browser to see detailed trace information\033[0m") + + +def example_debug_logging(api_key: str): + """ + Example 2: Debug-level logging + + Set NOVA_ACT_LOG_LEVEL environment variable for more detailed logs. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 2: Debug-Level Logging\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + # Set log level to DEBUG for more detailed output + os.environ['NOVA_ACT_LOG_LEVEL'] = str(logging.DEBUG) + print("\n\033[93m[OK]\033[0m Log level set to DEBUG") + print(" This provides maximum detail about Nova Act's operations") + print(" DEBUG level shows internal reasoning, screenshot analysis, and decision trees.") + print(" You'll see much more verbose output including Nova Act's thought process.") + print("\n\033[94m→ Next:\033[0m Running an action with debug-level logging to see detailed internal operations") + + with NovaAct(starting_page="https://nova.amazon.com/act/gym/next-dot", nova_act_api_key=api_key) as nova: + print("\n\033[93m[OK]\033[0m Performing action with debug logging...") + result = nova.act("Click on the first link if available") + + print("\n\033[93m[OK]\033[0m Action completed with debug logging") + print(" Notice the increased detail in console output") + + print(f"\n\033[38;5;208m📁 NAVIGATE TO VIEW ACT RUN:\033[0m") + print(f"\033[38;5;208m Look for the orange-colored output above showing 'View your act run here: /path/to/file.html'\033[0m") + print(f"\033[38;5;208m Open that HTML file in your browser to see detailed trace with debug information\033[0m") + + # Reset to INFO level + os.environ['NOVA_ACT_LOG_LEVEL'] = str(logging.INFO) + + +def example_trace_files(api_key: str): + """ + Example 3: HTML trace files + + Nova Act generates HTML trace files after each act() call. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 3: HTML Trace Files\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + # Specify a custom logs directory + logs_dir = "/tmp/nova-act-traces" + os.makedirs(logs_dir, exist_ok=True) + + print(f"\n\033[93m[OK]\033[0m Logs directory: {logs_dir}") + print(" Nova Act will save HTML trace files here") + print(" HTML trace files contain screenshots, actions, and decision timelines.") + print(" Each trace file is a complete visual record of what Nova Act saw and did.") + print("\n\033[94m→ Next:\033[0m Running an action that generates an HTML trace file for detailed inspection") + + with NovaAct( + starting_page="https://nova.amazon.com/act/gym/next-dot", + nova_act_api_key=api_key, + logs_directory=logs_dir + ) as nova: + print("\n\033[93m[OK]\033[0m Performing actions...") + result = nova.act("Look for a 'Why Go' page on this website") + + print("\n\033[93m[OK]\033[0m Action completed") + print(f" Trace files saved to: {logs_dir}") + print(" Open the HTML files in a browser to view detailed traces") + print(" Traces include: screenshots, actions, decisions, timing") + + print(f"\n\033[38;5;208m📁 NAVIGATE TO VIEW ACT RUN:\033[0m") + print(f"\033[38;5;208m 1. Look for the orange-colored output above: 'View your act run here: /path/to/file.html'\033[0m") + print(f"\033[38;5;208m 2. Copy that file path and open it in your web browser\033[0m") + print(f"\033[38;5;208m 3. Or navigate to {logs_dir} and open any .html file\033[0m") + print(f"\033[38;5;208m 4. The trace shows screenshots, actions, and Nova Act's decision process\033[0m") + + +def example_session_recording(api_key: str): + """ + Example 4: Session video recording + + Record the entire browser session as a video. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 4: Session Video Recording\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + logs_dir = "/tmp/nova-act-videos" + os.makedirs(logs_dir, exist_ok=True) + + print(f"\n\033[93m[OK]\033[0m Video directory: {logs_dir}") + print(" Nova Act will record the browser session") + print(" Session recording captures the entire browser window as a video file.") + print(" This creates a movie of exactly what Nova Act did, perfect for debugging or demos.") + print("\n\033[94m→ Next:\033[0m Running an action while recording the browser session to video") + + with NovaAct( + starting_page="https://nova.amazon.com/act/gym/next-dot", + nova_act_api_key=api_key, + logs_directory=logs_dir, + record_video=True # Enable video recording + ) as nova: + print("\n\033[93m[OK]\033[0m Recording session...") + nova.act("Scroll down the page") + nova.act("Scroll back to the top") + + print("\n\033[93m[OK]\033[0m Session recorded") + print(f" Video saved to: {logs_dir}") + print(" Look for .webm video files") + + +def example_error_debugging(api_key: str): + """ + Example 5: Debugging failed automations + + Demonstrates how to use observability features to debug issues. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 5: Error Debugging\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + logs_dir = "/tmp/nova-act-debug" + os.makedirs(logs_dir, exist_ok=True) + + print(f"\n\033[93m[OK]\033[0m Debug logs directory: {logs_dir}") + print(" Enabling debug logging and trace files") + print(" Error debugging shows how Nova Act handles failures and provides diagnostic information.") + print(" You'll see detailed error logs and trace files that help identify what went wrong.") + print("\n\033[94m→ Next:\033[0m Attempting an impossible action to demonstrate error logging and recovery") + + # Enable debug logging + os.environ['NOVA_ACT_LOG_LEVEL'] = str(logging.DEBUG) + + try: + with NovaAct( + starting_page="https://nova.amazon.com/act/gym/next-dot", + nova_act_api_key=api_key, + logs_directory=logs_dir + ) as nova: + print("\n\033[93m[OK]\033[0m Attempting action that might fail...") + + # Try an action that might not work + result = nova.act("Click on the non-existent button labeled 'XYZ123'") + + print("\n\033[93m[OK]\033[0m Action completed (or failed gracefully)") + + except Exception as e: + print(f"\n[WARNING] Action failed: {e}") + print("\n\033[93m[OK]\033[0m Debugging information available:") + print(f" 1. Console logs (above) show detailed error") + print(f" 2. Trace files in {logs_dir} show what Nova Act saw") + print(f" 3. Screenshots in trace files show page state") + print("\n Use these to understand why the action failed") + + # Reset log level + os.environ['NOVA_ACT_LOG_LEVEL'] = str(logging.INFO) + + +def example_custom_logging(api_key: str): + """ + Example 6: Custom logging in your automation + + Add your own logging alongside Nova Act's logs. 
+ """ + print(f"\n\033[94m{'='*60}\033[0m") + print(f"\033[94mExample 6: Custom Logging\033[0m") + print(f"\033[94m{'='*60}\033[0m") + + # Set up custom logger + logger = logging.getLogger(__name__) + logger.setLevel(logging.INFO) + + # Add console handler if not already present + if not logger.handlers: + handler = logging.StreamHandler() + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + handler.setFormatter(formatter) + logger.addHandler(handler) + + print("\n\033[93m[OK]\033[0m Custom logger configured") + print(" A custom logger lets you add your own timestamped messages alongside Nova Act's logs.") + print(" This logger is configured to show INFO-level messages with timestamps and module names.") + print(" You'll see both your custom messages and Nova Act's internal logging in the output.") + print("\n\033[94m→ Next:\033[0m Running a multi-step workflow with custom logging at each stage") + + with NovaAct(starting_page="https://nova.amazon.com/act/gym/next-dot", nova_act_api_key=api_key) as nova: + logger.info("Starting custom automation workflow") + + logger.info("Step 1: Navigating to page") + result = nova.act("Go to the main content area", schema=BOOL_SCHEMA) + + logger.info(f"Step 1 completed: {result.response[:50] if result.response else 'No response'}...") + + logger.info("Step 2: Extracting information") + + class PageTitle(BaseModel): + title: str + + result = nova.act("What is the page title?", schema=PageTitle.model_json_schema()) + + logger.info(f"Step 2 completed: {result.response}") + + logger.info("Automation workflow completed successfully") + + print("\n\033[93m[OK]\033[0m Custom logging integrated with Nova Act logs") + + +def main(): + """Main function to demonstrate observability features.""" + print("="*60) + print("Observability in Amazon Nova Act") + print("="*60) + + # Check API key + api_key = check_api_key() + if not api_key: + return + + print("\n[DATA] Observability Features:") + print(" 1. 
Automatic logging (INFO level by default)")
+    print(" 2. Debug logging (set NOVA_ACT_LOG_LEVEL)")
+    print(" 3. HTML trace files (generated after each act())")
+    print(" 4. Session video recording (record_video=True)")
+    print(" 5. Custom logs directory (logs_directory parameter)")
+
+    print("\nThis tutorial includes 6 examples. Press Enter after each to continue...")
+
+    try:
+        # Example 1
+        example_basic_logging(api_key)
+        print("\n\033[92m✓ Completed:\033[0m Basic logging and console output")
+        print("\033[94m→ Next:\033[0m Debug-level logging for detailed troubleshooting")
+        input("\n>> Press Enter to continue to Example 2...")
+
+        # Example 2
+        example_debug_logging(api_key)
+        print("\n\033[92m✓ Completed:\033[0m Debug logging with detailed information")
+        print("\033[94m→ Next:\033[0m HTML trace file generation and analysis")
+        input("\n>> Press Enter to continue to Example 3...")
+
+        # Example 3
+        example_trace_files(api_key)
+        print("\n\033[92m✓ Completed:\033[0m HTML trace file generation")
+        print("\033[94m→ Next:\033[0m Session video recording for visual debugging")
+        input("\n>> Press Enter to continue to Example 4...")
+
+        # Example 4
+        example_session_recording(api_key)
+        print("\n\033[92m✓ Completed:\033[0m Session video recording")
+        print("\033[94m→ Next:\033[0m Error debugging using observability tools")
+        input("\n>> Press Enter to continue to Example 5...")
+
+        # Example 5
+        example_error_debugging(api_key)
+        print("\n\033[92m✓ Completed:\033[0m Error debugging with trace analysis")
+        print("\033[94m→ Next:\033[0m Custom logging integration patterns")
+        input("\n>> Press Enter to continue to Example 6...")
+
+        # Example 6
+        example_custom_logging(api_key)
+        print("\n\033[92m✓ Completed:\033[0m Custom logging integration with Nova Act")
+
+        print("\n" + "="*60)
+        print("\033[93m[OK]\033[0m All observability examples completed!")
+        print("="*60)
+
+        print("\nKey Takeaways:")
+        print("- Nova Act automatically logs all actions")
+        print("- Use NOVA_ACT_LOG_LEVEL for debug output")
+        print("- HTML trace files provide visual debugging")
+        print("- Video recording captures entire sessions")
+        print("- Custom logging integrates with Nova Act logs")
+
+        print("\nBest Practices:")
+        print("- Always specify logs_directory in production")
+        print("- Use debug logging when troubleshooting")
+        print("- Review trace files when actions fail")
+        print("- Record videos for complex workflows")
+        print("- Add custom logging for business logic")
+
+        print("\nNext Steps:")
+        print("- Review the generated trace files and videos")
+        print("- Practice debugging with intentional failures")
+        print("- Integrate observability into your workflows")
+        print("- Explore S3 integration for log storage (see README)")
+
+    except KeyboardInterrupt:
+        print("\n\n[WARNING] Tutorial interrupted by user")
+    except Exception as e:
+        print(f"\n\033[91m[ERROR]\033[0m Error running examples: {e}")
+        print("\nTroubleshooting:")
+        print("- Check your API key is valid")
+        print("- Ensure logs directories are writable")
+        print("- Verify internet connection")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/nova-act/tutorials/research/04-observability/README.md b/nova-act/tutorials/research/04-observability/README.md
new file mode 100644
index 00000000..def762ce
--- /dev/null
+++ b/nova-act/tutorials/research/04-observability/README.md
@@ -0,0 +1,131 @@
+# Observability in Amazon Nova Act
+
+## Overview
+Observability is crucial for understanding, debugging, and optimizing automation workflows. Nova Act provides built-in logging, tracing, and session recording capabilities to monitor automations and diagnose issues.
+
+## Learning Objectives
+- Understand the importance of monitoring automation workflows
+- Access and interpret Nova Act execution logs
+- Use built-in tracing functionality to visualize workflow execution
+- Record and analyze session videos
+- Apply debugging techniques to resolve issues
+- Integrate custom logging with Nova Act
+
+## Prerequisites
+**⚠️ Complete the centralized setup first!**
+- Complete setup in `../00-setup/`
+- Complete the previous tutorials (01-03)
+- Basic understanding of Python logging
+
+## Why Observability Matters
+- **Debugging** - Understand why automations fail
+- **Performance Optimization** - Identify bottlenecks
+- **Monitoring** - Ensure automations run as expected
+- **Compliance** - Maintain audit trails
+- **Learning** - Understand AI agent decision-making
+- **Troubleshooting** - Quickly diagnose issues
+
+## Nova Act Observability Features
+
+### 1. Automatic Logging
+Nova Act automatically logs all actions at INFO level or above to the console.
+
+### 2. Debug Logging
+Enable detailed debug information:
+```bash
+export NOVA_ACT_LOG_LEVEL=10 # DEBUG level
+```
+
+### 3. HTML Trace Files
+Self-contained HTML files with:
+- Screenshots of each step
+- Actions taken by Nova Act
+- AI decision-making process
+- Timing information
+- Error details
+
+### 4. Session Video Recording
+Record entire browser sessions as WebM videos for visual debugging.
+
+### 5. S3 Integration
+Store session data in Amazon S3 for long-term retention with automatic upload.
+
+## Tutorial Script
+
+### Observability (`1_observability.py`)
+Comprehensive demonstration of all observability features including:
+- Basic and debug logging
+- HTML trace generation
+- Video recording
+- Error debugging techniques
+- Custom logging integration
+
+## Debugging Workflow
+1. **Enable Debug Logging** - Set `NOVA_ACT_LOG_LEVEL=10`
+2. **Review Console Logs** - Look for errors and unexpected behavior
+3. **Examine Trace Files** - View HTML traces in a browser
+4. **Watch Session Video** - Identify UI timing issues
+5. **Iterate and Fix** - Adjust based on findings
+
+## Best Practices
+
+### Production Deployments
+- Always set `logs_directory` for persistent storage
+- Use appropriate log levels (INFO for production, DEBUG for development)
+- Implement log rotation for large trace files
+- Store logs securely with S3 integration
+
+### Development
+- Enable debug logging for detailed troubleshooting
+- Record videos for complex workflows
+- Review trace files regularly to understand Nova Act behavior
+- Add custom logging for workflow milestones
+
+### Monitoring
+- Track success/failure rates and execution times
+- Set up alerts for failed automations
+- Review error patterns regularly
+- Optimize workflows based on trace analysis
+
+## Common Debugging Scenarios
+
+### Action Fails Silently
+- Enable debug logging
+- Review trace screenshots
+- Check element detection
+- Verify page load completion
+
+### Intermittent Failures
+- Record video to see timing issues
+- Add explicit waits
+- Check for dynamic content
+- Review network timing
+
+### Unexpected Behavior
+- Review trace screenshots
+- Check prompt clarity
+- Verify page structure
+- Add more specific instructions
+
+## Quick Start
+```bash
+# Activate environment
+source ../00-setup/venv/bin/activate
+
+# Run observability tutorial
+python 1_observability.py
+```
+
+## Log Levels
+- **DEBUG (10)** - Development, detailed troubleshooting
+- **INFO (20)** - Production, general monitoring (default)
+- **WARNING (30)** - Potential issues
+- **ERROR (40)** - Failures, exceptions
+- **CRITICAL (50)** - System failures
+
+## Next Steps
+- Review generated trace files and videos
+- Practice debugging with intentional failures
+- Integrate observability into production workflows
+- Set up S3 storage for long-term retention
+- Implement monitoring and alerting
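One possible starting point for the monitoring and custom-logging practices above is sketched below, using only the Python standard library. The helper names (`resolve_log_level`, `RunStats`, `tracked_step`) are hypothetical and not part of the Nova Act SDK. Reusing `NOVA_ACT_LOG_LEVEL` to configure your own logger is likewise an assumption made for this sketch; the README only documents that variable as controlling Nova Act's own output.

```python
import logging
import os
import time
from contextlib import contextmanager


def resolve_log_level(default: int = logging.INFO) -> int:
    """Read NOVA_ACT_LOG_LEVEL as a numeric level (e.g. 10 for DEBUG).

    Falls back to `default` when the variable is unset or not an integer.
    Applying this variable to a custom logger is an assumption of this sketch.
    """
    raw = os.environ.get("NOVA_ACT_LOG_LEVEL", "")
    try:
        return int(raw)
    except ValueError:
        return default


class RunStats:
    """Hypothetical in-memory counters for success/failure rates and timings."""

    def __init__(self) -> None:
        self.successes = 0
        self.failures = 0
        self.durations: list[float] = []

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0


@contextmanager
def tracked_step(name: str, stats: RunStats, logger: logging.Logger):
    """Log a workflow milestone and record its outcome and duration."""
    start = time.perf_counter()
    logger.info("step started: %s", name)
    try:
        yield
    except Exception:
        stats.failures += 1
        logger.exception("step failed: %s", name)
        raise  # let the caller decide how to handle the failure
    else:
        stats.successes += 1
        logger.info("step succeeded: %s", name)
    finally:
        stats.durations.append(time.perf_counter() - start)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_log_level())
    workflow_logger = logging.getLogger("workflow")
    run_stats = RunStats()
    with tracked_step("example step", run_stats, workflow_logger):
        pass  # placeholder: a Nova Act call would go here
    workflow_logger.info("success rate: %.0f%%", run_stats.success_rate * 100)
```

Wrapping each automation step in `tracked_step` keeps the business-level milestones in the same log stream as the console output, and the counters in `RunStats` could later feed whatever alerting mechanism you choose.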