Skip to content

πŸ”₯ Needlessly over-engineered audio transcription platform β€’ Whisper + GPT-4 β€’ Multilingual UI β€’ YouTube integration β€’ AI analysis magic ✨

License

Notifications You must be signed in to change notification settings

mustafaras/echoforge_whisper

Repository files navigation

πŸ”₯ Echo-Forge v0.1 by Whisper AI - Multilingual

ef copy 2

AI-Powered Multilingual Audio Transcription & Analysis Platform

License: MIT Python OpenAI Streamlit Language AI Models Status Platform

Transform audio into insights with AI-powered transcription and analysis

πŸš€ Quick Start β€’ πŸ“– Features β€’ πŸ”§ Setup β€’ 🌐 Languages


🌟 Overview

Echo-Forge is an advanced audio transcription platform powered by OpenAI's Whisper-1 and GPT-4 models. It provides high-quality transcription, intelligent analysis, and multilingual support through a modern web interface.

Whether you're transcribing meetings, lectures, interviews, or YouTube videos, echo-forge delivers professional-grade results with comprehensive AI analysis. The platform combines cutting-edge speech recognition with advanced language models to provide not just transcription, but deep insights into your audio content.

echoforge.mp4

✨ Key Features

  • 🌍 Multilingual Support: Turkish-English interface with 12+ transcription languages
  • πŸ€– AI Analysis: GPT-4o, GPT-4 Turbo integration for content insights
  • 🎡 Audio Processing: Support for MP3, WAV, M4A, MP4, FLAC, OGG, AAC
  • 🎬 YouTube Integration: Direct video transcription from URLs
  • πŸ“€ Export Options: PDF, Word, Excel, QR codes, and email sharing
  • πŸ“Š Advanced Analytics: Speech patterns, emotion detection, keyword extraction
  • ⚑ Real-time Processing: Live progress tracking with smart memory management
  • πŸ”’ Privacy-First: Local file processing with secure API communications

🎯 Use Cases

  • Business Meetings: Automatic meeting minutes with action items extraction
  • Educational Content: Lecture transcription with key concept identification
  • Research Interviews: Qualitative research analysis with theme detection
  • Content Creation: YouTube video analysis for SEO and content optimization
  • Legal Documentation: Accurate transcription for legal proceedings
  • Medical Records: Clinical interview transcription with specialized terminology

πŸš€ Quick Start

image

πŸ“‹ System Requirements

  • Python: 3.8+ (Recommended: Python 3.10+)
  • OpenAI API Key: With access to Whisper-1 and GPT models
  • Memory: Minimum 4GB RAM (8GB+ recommended for batch processing)
  • Storage: 1GB+ free space for temporary files
  • Internet: Stable connection for AI analysis features

πŸ”§ Installation & Setup

Method 1: Standard Installation

# 1. Clone the repository
git clone https://github.com/mustafaras/echoforge_whisper.git
cd echoforge_whisper

# 2. Create virtual environment (recommended)
python -m venv echo-forge-env

# Windows
echo-forge-env\Scripts\activate

# Linux/Mac  
source echo-forge-env/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment
echo "OPENAI_API_KEY=your_api_key_here" > .env

# 5. Launch application
streamlit run app.py --server.port 8502

Method 2: Quick Launch

# Use the included launcher script
python run_multilingual.py

# This automatically:
# - Checks dependencies
# - Validates API configuration  
# - Launches optimized server
# - Opens browser

πŸ”‘ API Configuration

  1. Get OpenAI API Key: Visit OpenAI Platform
  2. Set Environment Variable:
    # Windows
    set OPENAI_API_KEY=sk-your-actual-api-key-here
    
    # Linux/Mac
    export OPENAI_API_KEY="sk-your-actual-api-key-here"
  3. Or use .env file (recommended):
    OPENAI_API_KEY=sk-your-actual-api-key-here
    ECHO_FORGE_LANGUAGE=en  # Default language (en/tr)
    

🌐 First Steps

  1. Open http://localhost:8502 in your browser
  2. Select your preferred interface language (Turkish/English)
  3. Upload an audio file or paste a YouTube URL
  4. Configure transcription settings
  5. Enable AI analysis features
  6. Click "πŸš€ Process" and wait for results
  7. Export your transcription and analysis

πŸ“– Features

🎡 Advanced Audio Processing

Supported File Formats

  • Audio: MP3, WAV, M4A, FLAC, OGG, AAC
  • Video: MP4, MPEG4, AVI, MOV (audio extraction)
  • File Size: Up to 25MB per file with intelligent chunking for larger files
  • Quality: Automatic quality assessment and optimization

Audio Analysis Features

  • Waveform Visualization: Interactive plotly-based audio wave analysis
  • Quality Metrics: Sample rate, channel configuration, bit depth analysis
  • Noise Detection: Background noise and clarity assessment
  • Duration Analysis: Precise timing with millisecond accuracy
  • Batch Processing: Process multiple files simultaneously with progress tracking

πŸ€– Comprehensive AI Analysis

AI Model Support

  • GPT-4o: Latest OpenAI model for cutting-edge analysis
  • GPT-4 Turbo: Enhanced processing with improved context understanding
  • GPT-4: Advanced reasoning and comprehensive analysis
  • GPT-3.5 Turbo: Fast, efficient analysis for standard workflows

Analysis Features

  • πŸ“ Content Summarization:

    • Multi-level summaries (executive, detailed, technical)
    • Key point extraction and topic identification
    • Content categorization and structure analysis
  • πŸ”‘ Intelligent Keyword Extraction:

    • AI-powered identification of significant terms
    • Frequency analysis with statistical importance
    • Contextual relevance ranking
    • Visual keyword highlighting in transcripts
  • ⚑ Speech Pattern Analysis:

    • Speaking rate calculation (Words Per Minute)
    • Pace categorization: Slow (0-120), Normal (120-160), Fast (160-200), Very Fast (200+)
    • Speech quality and clarity assessment
    • Natural speaking pattern detection
  • πŸ’­ Emotion & Sentiment Analysis:

    • Primary emotion detection (positive, negative, neutral, mixed)
    • Confidence scoring for emotional analysis
    • Tone classification (professional, casual, formal, emotional)
    • Sentiment progression throughout content
  • πŸ“Š Text Statistics & Analytics:

    • Comprehensive word, character, and sentence counts
    • Vocabulary richness and unique word analysis
    • Language complexity and reading level assessment
    • Content quality and coherence metrics

Configurable Analysis Depths

  • πŸ” Basic: Essential transcription with core metrics
  • πŸ“Š Medium: Standard AI features with keyword extraction
  • 🎯 Detailed: Comprehensive analysis with emotion detection
  • πŸš€ Comprehensive: Full-spectrum analysis with all available features

🌐 Multilingual System

Interface Languages

  • πŸ‡ΉπŸ‡· Turkish (TΓΌrkΓ§e): Complete native interface with 12 direct languages translations
  • πŸ‡ΊπŸ‡Έ English: Full international support with technical terminology
  • Real-time Language Switching: Instant UI transformation without page refresh
  • Context-Aware Translations: Natural expressions and cultural considerations
  • Persistent Settings: User language preferences saved across sessions

Transcription Languages

Supported Languages: Turkish, English, Spanish, Dutch, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese (Simplified), Arabic

Language Features:

  • Auto-detection for mixed-language content
  • Regional dialect support where available
  • Specialized terminology recognition
  • Custom language model fine-tuning

🎬 YouTube Integration

Video Processing Features

  • Direct URL Processing: Simple paste-and-process workflow
  • Video Information Extraction:
    • Automatic metadata retrieval (title, channel, duration, description)
    • Video quality assessment before processing
    • Length validation with warnings for extended content

Smart Download System

  • Rate Limiting Protection: Intelligent YouTube API restriction handling
  • High-Quality Audio: Optimal audio extraction from video sources
  • Progress Tracking: Real-time download and processing status
  • Error Recovery: Automatic retry mechanisms for failed downloads

πŸ“€ Professional Export & Sharing

Document Generation

  • πŸ“„ PDF Reports:

    • Professional templates with branded headers
    • Complete metadata integration (audio info, analysis, timestamps)
    • Visual elements (charts, graphs, waveforms)
    • Multi-page structured reports with table of contents
  • πŸ“ Word Documents:

    • Fully formatted documents with professional styling
    • Editable templates for customization
    • Embedded analysis results with tables and charts
    • Automated cover pages with project details
  • πŸ“Š Excel Workbooks:

    • Multi-sheet structure (transcription, analysis, statistics)
    • Structured data tables for further analysis
    • Automatic chart generation for metrics
    • Advanced data analysis with pivot tables

Sharing & Collaboration

  • πŸ”² QR Code Generation: Instant QR codes for quick sharing and mobile access
  • πŸ“§ Email Integration: Direct email sending with professional attachments
  • πŸ“¦ ZIP Archives: Complete project packages with all files and formats
  • πŸ”— Secure Sharing: Access-controlled sharing options

πŸ“Š Analytics & History Management

Advanced History System

  • πŸ“š Complete Transaction Log: Detailed records of all transcription operations
  • πŸ” Advanced Search & Filter: Filter by language, date, file type, quality metrics
  • ⭐ Smart Favorites System: One-click favoriting with organized collections
  • πŸ“€ Data Export: Complete database export in JSON/CSV formats
  • πŸ—‘οΈ Intelligent Cleanup: Automatic cleanup with configurable retention policies
image

Real-Time Analytics Dashboard

  • πŸ“ˆ Usage Statistics: Processing metrics, time analysis, success rates
  • 🌍 Language Distribution: Multilingual usage patterns and trends
  • πŸ’° Cost Tracking: OpenAI API usage monitoring and optimization
  • ⚑ Performance Metrics: Memory usage, processing speed, efficiency analysis

πŸŽ›οΈ Advanced Configuration

Sidebar Control Panel

image

πŸ”Œ API Status & Health Monitoring

  • Real-Time Status: Live connection status with OpenAI services
  • Health Indicators: Visual status indicators for all connected services
  • Error Diagnostics: Detailed error reporting with suggested solutions
  • Rate Limiting: API usage tracking with limit notifications and optimization tips

βš™οΈ Language & Format Settings

  • 🌐 Transcription Language Selection:

    • Complete language list with native names
    • Auto-detection mode for mixed-language content
    • Custom language preferences with user history
    • Regional dialect support where available
  • πŸ“ Output Format Configuration:

    • Multiple text formats (plain text, formatted, JSON, XML)
    • Response formatting options (structured vs. natural language)
    • Timestamp inclusion with customizable formats
    • Confidence scoring for transcription accuracy

πŸš€ Advanced Processing Settings

  • 🌑️ Temperature Control: AI creativity vs. consistency balance (0.0-1.0)
  • πŸ“ Token Management: Maximum response length configuration (100-4000 tokens)
  • πŸ”„ Retry Logic: Automatic retry counts for failed operations
  • ⏱️ Timeout Settings: Customizable timeout values for different operations

πŸ€– AI Analysis Configuration

  • Analysis Type Selection:

    • Summary Analysis: Multi-level content summarization
    • Keywords Analysis: Frequency and contextual keyword extraction
    • Speech Speed Analysis: Speaking rate and pace evaluation
    • Emotion Analysis: Sentiment and emotional tone detection
  • Analysis Depth Control:

    • Basic: Essential features only
    • Medium: Standard analysis with core features
    • Detailed: Comprehensive analysis with advanced metrics
    • Comprehensive: Full-spectrum analysis with all available features

πŸ‘οΈ View & Navigation Controls

  • πŸ“± Interface Modes:
    • Main File Upload: Primary audio processing interface
    • YouTube Transcription: Video content processing mode
    • Smart Translation: Translation center for existing content
    • History View: Complete transaction history browser
    • Favorites Collection: Curated favorites management
    • Statistics Dashboard: Analytics and performance metrics

⚑ Quick Actions Panel

  • πŸ”„ System Operations:

    • Complete page refresh with state preservation
    • Browser cache clearing with confirmation
    • Intelligent memory cleanup and optimization
    • Processing data reset for current session
  • 🧠 Memory Management:

    • Real-time memory usage tracking with visual indicators
    • Smart cleanup of processed files and temporary data
    • Performance alerts for memory and resource issues
    • Automatic optimization suggestions

Advanced Options

Performance Tuning

  • Concurrent Processing: Configure parallel file processing limits
  • Memory Optimization: Set cache size and cleanup thresholds
  • API Rate Limiting: Customize API call frequency and batching
  • Background Processing: Enable/disable background task execution

Security & Privacy Settings

  • Data Retention: Configure automatic data deletion (default: 30 days)
  • Local Processing: Enable local-only mode for sensitive content
  • Encryption: Optional encryption for stored data and exports
  • Audit Logging: Comprehensive activity logging for compliance

πŸ“š Detailed Usage Examples

Basic Audio Transcription Workflow

Single File Processing

1. 🌐 Language Selection    β†’ Choose interface language (Turkish/English)
2. πŸ“ File Upload          β†’ Drag & drop or click to select audio file
3. βš™οΈ Configuration        β†’ Set transcription language & output format
4. πŸš€ Processing           β†’ Click "πŸš€ Process" button
5. πŸ“Š Analysis Review      β†’ Examine results and AI analysis
6. πŸ“€ Export & Download    β†’ Choose export format and download

Step-by-Step Example:

File: "meeting_recording.mp3" (15 minutes, English)
Settings: English transcription, AI Analysis enabled
Result: Full transcript + Summary + Keywords + Speech analysis (145 WPM)
Export: PDF report with visual analysis charts

Batch Processing Example

Files: [interview1.mp3, interview2.wav, interview3.m4a]
Configuration: Turkish transcription, Comprehensive AI analysis
Processing: 3 files processed simultaneously with live progress
Output: Individual transcriptions + consolidated batch analysis
Export: ZIP archive with separate PDF reports for each file

Advanced AI Analysis Scenarios

Business Meeting Analysis

Use Case: Weekly team meeting transcription
Input: 45-minute meeting recording
Configuration:
- Analysis Depth: Comprehensive
- AI Model: GPT-4 Turbo
- Features: All analysis types enabled

Generated Insights:
β”œβ”€β”€ πŸ“‹ Executive Summary      # Key decisions and action items
β”œβ”€β”€ 🎯 Action Items          # Extracted tasks and responsibilities  
β”œβ”€β”€ πŸ’° Budget Discussions    # Financial mentions and decisions
β”œβ”€β”€ πŸ“Š Participation Metrics # Individual contribution analysis
β”œβ”€β”€ πŸ•’ Topic Timeline        # Time spent on each agenda item
└── πŸ“ˆ Strategic Insights    # Long-term planning elements

Educational Content Processing

Use Case: University lecture analysis
Input: 90-minute physics lecture
Analysis Results:
β”œβ”€β”€ πŸ“š Key Concepts         # Important physics principles
β”œβ”€β”€ πŸŽ“ Learning Objectives  # Educational goals identified
β”œβ”€β”€ ❓ Q&A Segments         # Student questions and explanations
β”œβ”€β”€ πŸ“– Reference Materials  # Mentioned textbooks and papers
β”œβ”€β”€ πŸ”¬ Formula Recognition  # Mathematical expressions
└── πŸ“ Study Guide         # Auto-generated revision materials

YouTube Integration Examples

Content Creator Analytics

Video: "Product Review: New Smartphone" (20 minutes)
URL: https://youtube.com/watch?v=example123

Processing Results:
β”œβ”€β”€ πŸ“Ί Video Metadata      # Title, channel, view count, description
β”œβ”€β”€ πŸ“ Full Transcript     # Complete speech-to-text conversion
β”œβ”€β”€ 🎯 Product Features    # Mentioned specifications and benefits
β”œβ”€β”€ πŸ’­ Sentiment Analysis  # Positive/negative opinion tracking
β”œβ”€β”€ πŸ”‘ SEO Keywords       # Content optimization suggestions
β”œβ”€β”€ πŸ“Š Engagement Points   # High-energy moments and key topics
└── 🎬 Content Structure  # Intro, demo, pros/cons, conclusion

Educational Video Processing

Video: "Introduction to Machine Learning" (35 minutes)
Analysis Focus: Educational content extraction

Results:
β”œβ”€β”€ οΏ½ Course Outline      # Structured learning progression
β”œβ”€β”€ πŸ” Technical Terms    # ML terminology and definitions
β”œβ”€β”€ πŸ“Š Concept Hierarchy  # Beginner β†’ Advanced topic flow
β”œβ”€β”€ πŸ’‘ Key Insights       # Important takeaways and principles
β”œβ”€β”€ πŸ§ͺ Practical Examples # Real-world applications mentioned
└── πŸ“š Further Reading    # Recommended resources and papers

Translation Center Workflows

International Business Presentation

Scenario: Turkish presentation for global team
Original: 30-minute Turkish business presentation
Target: Professional English translation

Workflow:
1. πŸ“ Source Processing    β†’ Turkish transcription with full analysis
2. πŸ”„ Translation Setup   β†’ English target with business terminology
3. πŸ€– AI Translation      β†’ GPT-4o for highest quality
4. πŸ“Š Quality Assessment  β†’ Translation accuracy and fluency review
5. πŸ“„ Bilingual Reports   β†’ Side-by-side comparison documents

Output:
β”œβ”€β”€ πŸ“„ Original Text      # Source Turkish transcription
β”œβ”€β”€ πŸ”„ English Translation # Professional business English
β”œβ”€β”€ πŸ“Š Comparison View    # Parallel text comparison
β”œβ”€β”€ πŸ’Ό Business Terminology # Specialized term translations
└── πŸ“ˆ Quality Metrics    # Translation confidence scores

Professional Use Cases

Legal Deposition Transcription

Content: 2-hour legal deposition
Requirements: High accuracy, speaker identification
Configuration:
- Model: GPT-4o (highest accuracy)
- Analysis: Detailed with speaker patterns
- Export: Legal-compliant PDF format

Features:
β”œβ”€β”€ πŸ‘₯ Speaker Identification # Multiple speaker detection
β”œβ”€β”€ ⏱️ Precise Timestamps    # Legal-grade time marking
β”œβ”€β”€ πŸ“‹ Question-Answer Format # Q&A structure preservation
β”œβ”€β”€ πŸ” Key Statement Extraction # Important admissions/denials
β”œβ”€β”€ πŸ“Š Speaking Time Analysis # Individual participation metrics
└── πŸ“„ Court-Ready Format   # Professional legal document

Medical Interview Analysis

Content: Patient consultation recording
Focus: Clinical documentation and analysis
Settings: Medical terminology recognition enabled

Analysis Output:
β”œβ”€β”€ πŸ₯ Medical Terms       # Identified medical vocabulary
β”œβ”€β”€ 🩺 Symptoms Mentioned  # Patient-reported symptoms
β”œβ”€β”€ πŸ’Š Medications Listed  # Current and prescribed medications
β”œβ”€β”€ πŸ“‹ Treatment Plan     # Discussed treatment options
β”œβ”€β”€ ⚠️ Important Alerts   # Critical information flagged
└── πŸ“„ Clinical Summary   # Professional medical documentation

πŸ”§ Project Structure

multilingual/
β”œβ”€β”€ app.py                 # Main application
β”œβ”€β”€ config.py              # Multilingual configuration
β”œβ”€β”€ database.py            # Data management
β”œβ”€β”€ utils.py               # Core utilities
β”œβ”€β”€ export_utils.py        # Export functionality
β”œβ”€β”€ youtube_transcriber.py # YouTube integration
β”œβ”€β”€ translation_tab.py     # Translation center
β”œβ”€β”€ uploads/               # Temporary files
└── exports/               # Generated files

🚨 Comprehensive Troubleshooting

Common Issues & Solutions

Language Interface Problems

Issue: Language not switching properly
Solutions:
1. Clear browser cache: Ctrl+Shift+Delete (Chrome/Firefox)
2. Force page refresh: Ctrl+F5 or Cmd+Shift+R (Mac)
3. Use "πŸ”„ Refresh" button in sidebar Quick Actions
4. Check browser language settings
5. Verify ECHO_FORGE_LANGUAGE environment variable

API Connection Issues

Issue: OpenAI API authentication failures
Solutions:
1. Verify API key format (must start with "sk-")
2. Check API key validity at https://platform.openai.com/api-keys
3. Confirm billing status at https://platform.openai.com/usage
4. Test connection using sidebar API status indicator
5. Check firewall/proxy settings blocking API calls

Example API key validation:
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
     https://api.openai.com/v1/models

File Upload & Processing Problems

Issue: File upload failures or processing errors
Solutions:
1. Verify file format (MP3, WAV, M4A, MP4, FLAC, OGG, AAC)
2. Check file size (maximum 25MB recommended)
3. Test audio quality using waveform visualization
4. Convert files to WAV/MP3 for better compatibility
5. Use "🧠 Memory Status" to check available resources

Supported formats check:
File β†’ Properties β†’ Details β†’ Check codec information

Memory & Performance Issues

Issue: Application running slowly or memory errors
Solutions:
1. Monitor memory usage in sidebar "🧠 Memory Status"
2. Use "πŸ—‘οΈ Clear Processing Data" in Quick Actions
3. Close unnecessary browser tabs and applications
4. Reduce batch processing file count (max 5-10 files)
5. Enable automatic cleanup in advanced settings
6. Restart browser if memory usage exceeds 4GB

Memory optimization commands:
# Clear browser cache
Ctrl+Shift+Delete β†’ Clear browsing data

# Check system memory
Task Manager β†’ Performance β†’ Memory

YouTube Integration Issues

Issue: YouTube video processing failures
Solutions:
1. Verify video URL format and accessibility
2. Check video length (recommended under 2 hours)
3. Ensure video has audio track
4. Test with different video quality settings
5. Check regional restrictions and availability

YouTube URL validation:
Valid formats:
- https://youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://m.youtube.com/watch?v=VIDEO_ID

Export & Download Problems

Issue: Export generation failures or download issues
Solutions:
1. Check available disk space (minimum 1GB recommended)
2. Verify export format compatibility with your system
3. Try different export formats (PDF, Word, Excel)
4. Clear browser download cache
5. Disable browser popup blockers for the application
6. Use "Generate ZIP Archive" for complete packages

Export troubleshooting:
Browser β†’ Settings β†’ Downloads β†’ Check download location
Ensure sufficient disk space for export files

Performance Optimization Tips

System Requirements Optimization

Recommended Configuration:
β”œβ”€β”€ CPU: Multi-core processor (4+ cores ideal)
β”œβ”€β”€ RAM: 8GB+ (16GB for heavy batch processing)  
β”œβ”€β”€ Storage: SSD for faster file operations
β”œβ”€β”€ Network: Stable 10+ Mbps for API calls
└── Browser: Chrome/Firefox with 4GB+ available memory

Performance monitoring:
# Windows: Task Manager β†’ Performance
# Mac: Activity Monitor β†’ Memory/CPU
# Linux: htop or system monitor

Browser Optimization

Browser Settings for Optimal Performance:
1. Enable hardware acceleration
2. Clear cache and cookies regularly
3. Disable unnecessary extensions
4. Increase memory allocation for JavaScript
5. Use Incognito/Private mode for testing

Chrome optimization:
chrome://settings/system β†’ Use hardware acceleration
chrome://settings/privacy β†’ Clear browsing data

Network & API Optimization

API Performance Tips:
1. Use stable internet connection (avoid mobile hotspots)
2. Monitor API usage in sidebar to prevent rate limiting
3. Choose appropriate AI models for your needs:
   - GPT-3.5 Turbo: Fast, basic analysis
   - GPT-4: Balanced performance and quality
   - GPT-4 Turbo: Enhanced performance
   - GPT-4o: Latest features, highest quality

Rate limiting management:
- Monitor API calls per minute in sidebar
- Use batch processing for multiple files
- Enable automatic retry for failed requests

Advanced Debugging

Developer Tools & Logging

Debug Information Access:
1. Enable debug mode in config.py
2. Use browser developer tools (F12)
3. Check console for JavaScript errors
4. Review network tab for API call status
5. Export debug information from sidebar

Browser debug steps:
F12 β†’ Console β†’ Look for red error messages
F12 β†’ Network β†’ Check failed API requests
F12 β†’ Application β†’ Clear storage if needed

Log File Analysis

Log Files Location:
β”œβ”€β”€ whisper_ai.log          # Main application log
β”œβ”€β”€ streamlit.log           # Streamlit framework log
β”œβ”€β”€ api_calls.log           # API interaction log
└── error_trace.log         # Detailed error traces

Log analysis commands:
# View recent errors
tail -50 whisper_ai.log | grep ERROR

# Monitor real-time logs  
tail -f whisper_ai.log

# Search for specific issues
grep "API" whisper_ai.log | tail -20

🀝 Contributing Guidelines

We welcome contributions to echo-forge! Whether you're fixing bugs, adding features, improving documentation, or translating the interface, your help is appreciated.

πŸš€ Getting Started

Development Setup

# 1. Fork the repository on GitHub
# 2. Clone your fork locally
git clone https://github.com/YOUR_USERNAME/echoforge_whisper.git
cd echoforge_whisper

# 3. Create development environment
python -m venv dev-env
source dev-env/bin/activate  # Linux/Mac
dev-env\Scripts\activate     # Windows

# 4. Install development dependencies
pip install -r requirements-dev.txt

# 5. Set up pre-commit hooks
pre-commit install

# 6. Create feature branch
git checkout -b feature/amazing-feature

Development Guidelines

  • Code Style: Follow PEP 8 for Python code
  • Documentation: Update README and inline comments for new features
  • Testing: Add tests for new functionality
  • Multilingual: Update both Turkish and English text in config.py
  • Performance: Consider memory usage and API cost implications

🌍 Translation Contributions

Adding New Interface Languages

# In config.py, add new language entries:
UI_TEXTS = {
    "en": {
        "app_title": "echo-forge",
        "upload_audio": "Upload Audio File",
        # ... existing English translations
    },
    "tr": {
        "app_title": "echo-forge", 
        "upload_audio": "Ses DosyasΔ± YΓΌkle",
        # ... existing Turkish translations
    },
    "es": {  # New Spanish translation
        "app_title": "echo-forge",
        "upload_audio": "Subir Archivo de Audio",
        # ... add all required translations
    }
}

Translation Requirements

  • Complete all 1,200+ UI text entries
  • Maintain consistent terminology
  • Consider cultural context and technical accuracy
  • Test interface functionality in new language
  • Update language selector in sidebar

πŸ”§ Technical Contributions

Feature Development Process

  1. Issue Discussion: Open GitHub issue to discuss new features
  2. Design Review: Share implementation approach for feedback
  3. Development: Implement feature with tests and documentation
  4. Code Review: Submit pull request for team review
  5. Testing: Verify functionality across different scenarios
  6. Documentation: Update README and inline documentation

Code Standards

# Example of well-documented function
def analyze_audio_content(
    audio_text: str, 
    analysis_type: str = "comprehensive",
    ai_model: str = "gpt-4-turbo",
    language: str = "en"
) -> Dict[str, Any]:
    """
    Analyze transcribed audio content using AI models.
    
    Args:
        audio_text: Transcribed text to analyze
        analysis_type: Type of analysis (basic, medium, detailed, comprehensive)
        ai_model: AI model to use (gpt-4o, gpt-4-turbo, gpt-4, gpt-3.5-turbo)
        language: Interface language for results (en, tr)
    
    Returns:
        Dictionary containing analysis results with keys:
        - summary: Content summary
        - keywords: Extracted keywords
        - emotions: Sentiment analysis
        - speech_rate: Speaking pace analysis
        
    Raises:
        ValueError: If analysis_type is not supported
        APIError: If OpenAI API call fails
    """

πŸ§ͺ Testing Contributions

Test Categories

  • Unit Tests: Individual function testing
  • Integration Tests: Component interaction testing
  • UI Tests: Interface functionality testing
  • Performance Tests: Memory and speed optimization
  • Multilingual Tests: Language switching and translation accuracy

Running Tests

# Run all tests
pytest tests/

# Run specific test categories
pytest tests/unit/
pytest tests/integration/
pytest tests/ui/

# Run with coverage report
pytest --cov=. tests/

# Run performance benchmarks
pytest tests/performance/ --benchmark-only

πŸ“Š Performance & Optimization

Contribution Areas

  • Memory Management: Optimize file processing and cleanup
  • API Efficiency: Reduce API calls and improve batching
  • UI Responsiveness: Enhance user interface performance
  • Caching Systems: Implement intelligent result caching
  • Error Handling: Improve error recovery and user feedback

Performance Testing

# Memory usage profiling
python -m memory_profiler app.py

# API call optimization
python scripts/benchmark_api_calls.py

# UI responsiveness testing  
python scripts/ui_performance_test.py

πŸ› Bug Reports & Issue Tracking

Bug Report Template

## Bug Description
Brief description of the issue

## Steps to Reproduce
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

## Expected Behavior
What you expected to happen

## Actual Behavior  
What actually happened

## Environment
- OS: [e.g. Windows 10, macOS 12.0, Ubuntu 20.04]
- Python Version: [e.g. 3.10.5]
- Browser: [e.g. Chrome 91.0, Firefox 89.0]
- echo-forge Version: [e.g. v0.1.0]

## Additional Context
Add any other context about the problem here

πŸ“ Documentation Contributions

Documentation Types

  • README Updates: Keep main documentation current
  • API Documentation: Document function parameters and returns
  • User Guides: Create tutorials and how-to guides
  • Developer Docs: Technical implementation details
  • Translation Guides: Multilingual documentation

Documentation Standards

  • Clear, concise language for international users
  • Include code examples and practical scenarios
  • Maintain consistency with existing documentation style
  • Update both English and Turkish versions when applicable
  • Include screenshots for UI-related documentation

🎯 Feature Requests

High-Priority Areas

  • New Language Support: Additional transcription and interface languages
  • Export Formats: New document types and sharing options
  • AI Model Integration: Support for new AI models and providers
  • Advanced Analytics: Enhanced analysis features and visualizations
  • Enterprise Features: SSO, audit logging, bulk processing improvements

Feature Request Process

  1. Research: Check existing issues and feature requests
  2. Proposal: Create detailed feature proposal with use cases
  3. Discussion: Engage with maintainers and community
  4. Implementation: Develop feature following contribution guidelines
  5. Review: Collaborate on code review and testing
  6. Documentation: Update documentation and examples

πŸ”„ Pull Request Process

PR Submission Checklist

  • Code follows project style guidelines
  • Tests added for new functionality
  • Documentation updated appropriately
  • Multilingual text updated in config.py
  • Performance impact considered and optimized
  • Security implications reviewed
  • Backward compatibility maintained

PR Review Criteria

  • Functionality: Feature works as intended
  • Code Quality: Clean, readable, maintainable code
  • Performance: No significant performance degradation
  • Security: No security vulnerabilities introduced
  • Documentation: Adequate documentation provided
  • Testing: Comprehensive test coverage

πŸ‘₯ Community Guidelines

Communication Standards

  • Respectful: Treat all contributors with respect and professionalism
  • Inclusive: Welcome contributors of all backgrounds and skill levels
  • Constructive: Provide helpful feedback and suggestions
  • Patient: Allow time for review and response
  • Collaborative: Work together toward common goals

Getting Help

  • GitHub Issues: Technical questions and bug reports
  • Discussions: General questions and feature brainstorming
  • Code Review: Detailed technical feedback on implementations
  • Documentation: Help with understanding and improving docs

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Mustafa Raşit - Project Creator & Lead Developer
  • OpenAI for Whisper and GPT models
  • Streamlit for the web framework
  • The open-source community for excellent libraries

πŸ“§ Contact

Project Maintainer: Mustafa Raşit

For questions, support, and contributions:


Built with ❀️ using OpenAI Whisper & GPT-4

About

πŸ”₯ Needlessly over-engineered audio transcription platform β€’ Whisper + GPT-4 β€’ Multilingual UI β€’ YouTube integration β€’ AI analysis magic ✨

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages