Enterprise-level call center quality analysis system using Whisper transcription and a dual-database architecture (PostgreSQL + MongoDB) for comprehensive agent performance evaluation.
Tested & Validated:
- System tested with 18 different audio files
- Database contains comprehensive analysis results with complete call records, scores, and transcripts
Core Analysis Module:
- Script Adherence Analysis - Intelligent triple-layer matching system:
- Layer 1: Exact phrase matching for precise verification
- Layer 2: Fuzzy matching (85% threshold) for natural language variations
- Layer 3: LLM-powered verification using Groq's LLaMA 3.3 70B when Layers 1 & 2 score < 80%
- Supports built-in scripts (Appointment Booking, Complaint Handling) and custom script creation
- Detailed scoring breakdown with element-level analysis
- Identifies missing elements and provides actionable recommendations
AI Components:
- OpenAI Whisper (base model) for audio transcription at 16kHz
- Groq API (llama-3.3-70b-versatile) as primary LLM with 14,400 requests/day limit
- OpenRouter API (llama-3.2-3b-instruct:free) as fallback LLM
- Optional speaker diarization with pyannote.audio
Admin & Analytics:
- Admin panel with agent management, team analytics, and system monitoring
- Script editor for built-in (Appointment Booking, Complaint Handling) and custom scripts
- Real-time dashboard with KPI cards and performance metrics
- Database export to CSV functionality
The AI Call Analysis System processes calls through the following automated pipeline:
Upload & Input:
- User uploads call recording via web interface
- Selects agent details and script type (Appointment Booking, Complaint Handling, or Custom)
- Optional speaker diarization can be enabled for multi-speaker calls
- System accepts multiple audio formats (MP3, WAV, M4A, OGG, FLAC)
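A format check of this kind is commonly done with an extension whitelist; a minimal sketch (the helper name and exact behavior in app.py are assumptions):

```python
# Accepted upload formats, mirroring the list above
ALLOWED_EXTENSIONS = {"mp3", "wav", "m4a", "ogg", "flac"}

def allowed_file(filename):
    """True if the filename has an extension in the whitelist (case-insensitive)."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
```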
Audio Preprocessing:
- Converts audio to 16kHz sample rate for Whisper compatibility
- Applies audio normalization and quality enhancement
- Trims silence from edges
- Applies high-pass filter (80Hz cutoff) to remove background noise
- Calculates Signal-to-Noise Ratio (SNR) for quality assessment
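The filtering and SNR steps above can be sketched as follows, using numpy/scipy as a stand-in for the librosa/pydub pipeline (the 80 Hz cutoff mirrors the value stated above; the SNR estimator is an illustrative simplification):

```python
import numpy as np
from scipy.signal import butter, sosfilt

TARGET_SR = 16000  # Whisper expects 16 kHz input

def highpass(audio, sr=TARGET_SR, cutoff=80.0):
    """Remove low-frequency rumble below the cutoff (mains hum, handling noise)."""
    sos = butter(4, cutoff, btype="highpass", fs=sr, output="sos")
    return sosfilt(sos, audio)

def estimate_snr_db(audio, frame=2048):
    """Crude SNR estimate: mean frame energy vs. the quietest frame's energy."""
    frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
    energies = (frames ** 2).mean(axis=1) + 1e-12
    return float(10 * np.log10(energies.mean() / energies.min()))

# Demo: 1 s of 440 Hz speech-band tone contaminated with 50 Hz hum
t = np.arange(TARGET_SR) / TARGET_SR
signal = np.sin(2 * np.pi * 440 * t)
hum = 0.5 * np.sin(2 * np.pi * 50 * t)
cleaned = highpass(signal + hum)  # 50 Hz sits below the cutoff and is attenuated
```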
Transcription:
- OpenAI Whisper base model transcribes preprocessed audio
- Generates complete text transcript with word count
- Language detection (default: English)
- Transcript saved to MongoDB for future reference
Script Adherence Analysis:
- Layer 1 - Exact Matching: Searches for exact phrases from the script template
- Layer 2 - Fuzzy Matching: Uses RapidFuzz (85% threshold) for natural language variations
- Layer 3 - LLM Verification: If score < 80%, triggers intelligent AI verification:
- Primary: Groq's LLaMA 3.3 70B (14,400 requests/day)
- Fallback: OpenRouter's LLaMA 3.2 3B (free tier) if Groq fails or is rate-limited
- LLM analyzes semantic meaning to verify if agent conveyed required message in their own words
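The three layers can be sketched roughly as below, using difflib as a stand-in for RapidFuzz and stubbing out the LLM call (the 85/80 thresholds mirror the settings above; function names are illustrative):

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 85   # percent, mirrors the RapidFuzz setting
LLM_THRESHOLD = 80     # Layer 3 only runs when Layers 1+2 score below this

def fuzzy_score(phrase, transcript):
    """Best similarity (0-100) of the phrase against same-length word windows."""
    words = transcript.lower().split()
    n = len(phrase.split())
    windows = [" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))]
    return max((SequenceMatcher(None, phrase.lower(), w).ratio() * 100
                for w in windows), default=0.0)

def check_element(phrases, transcript):
    # Layer 1: exact substring match
    for p in phrases:
        if p.lower() in transcript.lower():
            return {"matched": True, "layer": 1, "score": 100.0}
    # Layer 2: fuzzy match against the transcript
    best = max(fuzzy_score(p, transcript) for p in phrases)
    if best >= FUZZY_THRESHOLD:
        return {"matched": True, "layer": 2, "score": best}
    # Layer 3 (stubbed): below LLM_THRESHOLD, the real system asks the LLM
    # whether the agent conveyed the required message in their own words
    return {"matched": False, "layer": 3, "score": best}

transcript = "Good morning! Thanks for calling, how can I help you today?"
result = check_element(["thank you for calling"], transcript)  # escalates to Layer 3
```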
Script Types & Element Scoring:
- Supports 3 script types:
- Built-in: Appointment Booking (12 elements)
- Built-in: Complaint Handling (8 elements)
- Custom: User-created scripts with AI-powered analysis
- Each script element is scored individually based on its weight
- Missing elements are flagged with severity levels
Performance Scoring:
- Aggregates scores from all script elements
- Calculates overall performance score (0-100)
- Assigns rating: Poor (<60), Good (60-69), Very Good (70-79), Excellent (80-89), Outstanding (90+)
- Determines percentile rank among all agent calls
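The aggregation and rating steps can be sketched as a weighted mean plus threshold banding (field names are assumptions; the actual analyzer code may differ):

```python
def overall_score(element_results):
    """Weighted aggregate of per-element scores (each element: score 0-100, weight)."""
    total_weight = sum(e["weight"] for e in element_results)
    return sum(e["score"] * e["weight"] for e in element_results) / total_weight

def rating(score):
    """Map a 0-100 score to the rating bands listed above."""
    if score < 60:
        return "Poor"
    if score < 70:
        return "Good"
    if score < 80:
        return "Very Good"
    if score < 90:
        return "Excellent"
    return "Outstanding"

# Hypothetical element results for one call
results = [
    {"name": "greeting",        "score": 100.0, "weight": 10},
    {"name": "verify_identity", "score": 77.0,  "weight": 15},
    {"name": "offer_slots",     "score": 0.0,   "weight": 20},  # missing element
]
score = overall_score(results)
missing = [e["name"] for e in results if e["score"] == 0.0]
```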
Insights & Recommendations:
- Identifies missing script elements as issues
- Generates actionable recommendations for improvement
- Highlights positive aspects of the call
- Creates detailed breakdown report
Data Storage:
- PostgreSQL: Stores call metadata, scores, agent stats, and issues
- MongoDB: Stores full transcript and detailed analysis report
- Updates agent performance metrics (average score, total calls, high/low performers)
- Maintains historical data for trend analysis
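Updating an agent's running average without rescanning every call can be done incrementally; a minimal sketch (the real implementation may simply recompute from the calls table):

```python
def update_agent_stats(avg_score, total_calls, new_score):
    """Fold a new call score into an agent's running average.

    Returns the updated (average_score, total_calls) pair, matching the
    columns kept on the agents table.
    """
    total = total_calls + 1
    return (avg_score * total_calls + new_score) / total, total
```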
Report Generation:
- Creates comprehensive HTML report with visualizations
- Displays progress bars for each script element
- Shows LLM verification badges when AI was triggered
- Provides download links for transcript and detailed analysis
Backend: Flask 3.0.0, Python 3.8+
Databases: PostgreSQL (structured data), MongoDB (transcripts)
AI/NLP: Whisper base, Groq/OpenRouter APIs, RapidFuzz (fuzzy matching), NLTK
Audio: FFmpeg, librosa, pydub
Prerequisites:
- Python 3.8 or higher
- PostgreSQL installed and running
- MongoDB installed and running
- FFmpeg installed and in PATH
- 4GB+ RAM recommended
Setup Steps:
- Clone or navigate to the project directory
- Create and activate a virtual environment:
python -m venv venv
.\venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables (create a .env file):
GROQ_API_KEY=your_groq_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
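The analyzers can read these keys at runtime; a minimal sketch, assuming the keys are exposed as environment variables as above (Flask projects commonly load the .env file via python-dotenv first):

```python
import os

def llm_available():
    """True if at least one LLM API key is configured; without keys the
    system falls back to exact + fuzzy matching only."""
    return bool(os.getenv("GROQ_API_KEY") or os.getenv("OPENROUTER_API_KEY"))
```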
PostgreSQL Configuration:
Create database:
CREATE DATABASE call_analysis_db;
Create tables:
CREATE TABLE agents (
agent_id SERIAL PRIMARY KEY,
agent_name VARCHAR(100) NOT NULL,
email VARCHAR(150),
team VARCHAR(50),
average_score DECIMAL(5, 2) DEFAULT 0.0,
total_calls INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE calls (
call_id SERIAL PRIMARY KEY,
agent_id INTEGER REFERENCES agents(agent_id),
call_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
audio_filename VARCHAR(255),
overall_score DECIMAL(5, 2),
script_adherence_score DECIMAL(5, 2),
communication_score DECIMAL(5, 2),
sentiment_score DECIMAL(5, 2),
outcome_score DECIMAL(5, 2),
empathy_score DECIMAL(5, 2),
audio_duration DECIMAL(10, 2),
word_count INTEGER,
uploaded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE missing_elements (
id SERIAL PRIMARY KEY,
call_id INTEGER REFERENCES calls(call_id),
element_name VARCHAR(100),
element_description TEXT,
points_lost DECIMAL(5, 2),
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE custom_scripts (
script_id SERIAL PRIMARY KEY,
script_name VARCHAR(100) NOT NULL,
script_config JSONB NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
MongoDB Setup:
MongoDB will automatically create collections (transcripts, reports) on first use. Default connection: localhost:27017
Start the server:
python app.py
Default Routes:
- / - Upload page
- /dashboard - Analytics dashboard
- /admin - Admin panel (agents, teams, scripts, settings)
- /script-editor - Script management interface
- /api/export_data - Export database to CSV
Upload and Analyze Calls:
- Navigate to homepage (/)
- Select agent from dropdown
- Choose script type (Appointment Booking, Complaint Handling, or Custom)
- Optional: Enable speaker diarization
- Upload audio file (MP3, WAV, M4A)
- View detailed analysis report with scores and recommendations
Admin Panel:
- Add/edit/delete agents and teams
- View system statistics and API status
- Manage script configurations
- Export all data to CSV
Script Editor:
- View built-in scripts
- Create custom scripts with JSON configuration
- Define script elements, weights, and matching phrases
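A custom script configuration of the kind stored in the script_config JSONB column might look like this (all field names are assumptions; check custom_script_analyzer.py for the actual schema):

```python
# Hypothetical custom-script configuration mirroring the structure described
# above: named elements, per-element weights, and matching phrases
CUSTOM_SCRIPT = {
    "script_name": "Insurance Renewal",
    "elements": [
        {"name": "greeting", "weight": 10,
         "phrases": ["thank you for calling", "good morning"]},
        {"name": "policy_confirmation", "weight": 25,
         "phrases": ["confirm your policy number"]},
        {"name": "closing", "weight": 15,
         "phrases": ["anything else i can help"]},
    ],
}

def validate_script(config):
    """Reject configs with no elements, non-positive weights, or empty phrase lists."""
    elements = config.get("elements", [])
    if not elements:
        return False
    return all(e.get("weight", 0) > 0 and e.get("phrases") for e in elements)
```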
AI-Call-Analysis-System/
├── analyzers/
│ ├── script_adherence_production.py (Appointment Booking analyzer)
│ ├── complaint_script_adherence.py (Complaint Handling analyzer)
│ └── custom_script_analyzer.py (Custom script analyzer)
├── models/
│ ├── transcriber.py (Whisper transcription + audio processing)
│ └── speaker_diarization.py (Optional multi-speaker detection)
├── database/
│ ├── config.py (PostgreSQL + MongoDB managers)
│ └── __init__.py (Database initialization)
├── templates/
│ ├── upload.html (Upload interface)
│ ├── dashboard.html (Analytics dashboard)
│ ├── admin.html (Admin panel)
│ └── script_editor.html (Script management)
├── static/ (CSS, JavaScript assets)
├── app.py (Flask application)
└── requirements.txt
GET /api/dashboard_stats - Dashboard statistics (total calls, avg score, agents, score distribution)
GET /api/agents - List all agents with performance metrics
POST /api/agents - Create new agent (JSON: name, email, team)
DELETE /api/agents/<agent_id> - Delete agent by ID
GET /api/system_stats - System status (database connections, API health, LLM availability)
GET /api/export_data - Export all data to CSV files (agents, calls, missing_elements, transcripts)
Database Credentials (app.py) - replace the defaults with your own credentials:
POSTGRES_CONFIG = {
'host': 'localhost',
'database': 'call_analysis_db',
'user': 'postgres',
'password': '123456',
'port': 5432
}
MONGO_URI = 'mongodb://localhost:27017/'
MONGO_DB = 'call_analysis_db'
LLM Verification Settings (analyzers/):
LLM_VERIFICATION_THRESHOLD = 80
FUZZY_MATCH_THRESHOLD = 85
Whisper Model (models/transcriber.py):
model_name = "base" # Options: tiny, base, small, medium, large
target_sr = 16000  # Sample rate for transcription
Database Connection Errors:
Verify PostgreSQL and MongoDB are running:
psql -U postgres -d call_analysis_db
mongosh  # use the legacy "mongo" shell on older installs
LLM API Errors:
Check environment variables are set correctly. System falls back to exact + fuzzy matching if APIs unavailable.
Audio Transcription Fails:
Ensure FFmpeg is installed and accessible. Test with:
ffmpeg -version
LLM API Limits:
- Groq: 14,400 requests/day (llama-3.3-70b-versatile)
- OpenRouter: Free tier (llama-3.2-3b-instruct:free), subject to OpenRouter's own rate limits
- LLM verification only triggers when the combined Layer 1 + 2 score < 80%
Whisper Model Speed:
- Tiny: Fastest, lower accuracy
- Base: Balanced (default)
- Small/Medium: Higher accuracy, slower
- Large: Best accuracy, requires GPU
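One way to encode this trade-off is a small selection helper; the thresholds here are illustrative assumptions, not values from the codebase:

```python
def pick_whisper_model(max_seconds_per_call, gpu_available=False):
    """Choose a Whisper model size from the speed/accuracy trade-off above."""
    if gpu_available:
        return "large"   # best accuracy; practical only with a GPU
    if max_seconds_per_call < 60:
        return "small"   # short calls can afford a slower, more accurate model
    return "base"        # balanced default for CPU-bound batch processing
```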
Database Optimization:
Index on call_date and agent_id recommended for large datasets:
CREATE INDEX idx_call_date ON calls(call_date);
CREATE INDEX idx_agent_id ON calls(agent_id);
Future Enhancements:
- Real-Time Call Analysis
- Live audio streaming integration for ongoing call monitoring
- Real-time script adherence tracking with instant agent feedback
- Dynamic alerts for deviations during active calls
- WebSocket-based live dashboard updates
- Cloud Deployment & Scalability
- AWS/Azure/GCP deployment with containerization (Docker/Kubernetes)
- Auto-scaling infrastructure for handling thousands of concurrent calls
- Cloud-based storage integration (S3, Azure Blob) for audio files
- Serverless architecture for cost-effective processing
- Multi-Language Support
- Whisper-based transcription for 50+ languages
- Multilingual LLM verification with language-specific script templates
- Automatic language detection and routing
- Regional dialect and accent handling
- Advanced Analytics & AI Features
- Predictive analytics for agent performance forecasting
- Emotion detection and stress level analysis during calls
- Automated coaching recommendations using GPT-4
- Voice biometrics for caller authentication and fraud detection