AI-powered production alert analysis system using n8n, MCP servers, and OpenAI GPT-4o-mini.
This project provides an intelligent alert analysis workflow that:
- Monitors Slack channels for production alerts
- Analyzes alerts using AI, with context from GitHub code and Kibana logs
- Provides root cause analysis and actionable recommendations
- Integrates multiple data sources via MCP (Model Context Protocol) servers
┌─────────────┐      ┌──────────────┐      ┌─────────────────┐
│    Slack    │─────▶│     n8n      │─────▶│     OpenAI      │
│   Alerts    │      │   Workflow   │      │   GPT-4o-mini   │
└─────────────┘      └──────┬───────┘      └─────────────────┘
                            │
                ┌───────────┴───────────┐
                │                       │
        ┌───────▼──────┐        ┌───────▼──────┐
        │  GitHub MCP  │        │  Kibana MCP  │
        │  (Port 3000) │        │  (Port 3001) │
        └──────────────┘        └──────────────┘
Location: n8n-workflow/
- Production Alert Analyzer - Enhanced workflow with AI-powered analysis
- Slack Integration - Monitors alerts and posts responses
- AI Agent - Routes by severity (CRITICAL vs HIGH/MEDIUM)
- Code Context - Optionally includes relevant code snippets
Features:
- Alert parsing and structured data extraction (illustrated below)
- Severity-based routing (different analysis depth)
- Immediate acknowledgment responses
- Threaded Slack replies
- Metrics logging
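For example, the parsed alert handed to the AI agent might look roughly like this (field names are illustrative, not the workflow's exact schema):
{
  "service": "order-service",
  "severity": "CRITICAL",
  "title": "Database connection timeout",
  "channel": "#mcp-testing",
  "timestamp": "2025-01-15T10:32:00Z"
}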
Documentation: See n8n-workflow/WORKFLOW_GUIDE.md
Location: dockers/
Multi-MCP server setup with HTTP bridges for n8n integration:
GitHub MCP (port 3000):
- Code search across repositories
- File contents retrieval
- Commit history and PR management
- 40+ GitHub tools available
Kibana MCP (port 3001):
- Log search and analysis
- Visualization management
- Saved objects access
- Real-time error tracking
Documentation:
- dockers/MULTI_MCP_SETUP.md - Complete setup guide
- dockers/QUICK_START.md - Quick start guide
Location: sample-app/
Spring Boot Order Service that simulates realistic production failures:
- Database timeouts and connection pool exhaustion
- Payment gateway failures
- Inventory service unavailability
- High error rates and memory issues
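With the service running (see the Quick Start below), you can exercise it directly to provoke the simulated failures. A hedged sketch -- the endpoint, port, and payload here are hypothetical, so check the controllers under sample-app/src/main/java for the real routes:
curl -X POST http://localhost:8080/api/orders \
  -H "Content-Type: application/json" \
  -d '{"productId": "SKU-1001", "quantity": 2}'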
Location: sample-logs/
15 realistic log entries for testing:
- Elasticsearch/Kibana import ready
- Matches alert scenarios
- Includes stack traces with code references
- Structured JSON format
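A single entry might look roughly like this (field names are illustrative; see kibana-sample-logs.json for the actual structure):
{
  "@timestamp": "2025-01-15T10:31:58Z",
  "level": "ERROR",
  "service": "order-service",
  "message": "Timeout acquiring connection from pool",
  "logger": "com.example.orders.OrderRepository",
  "stack_trace": "java.sql.SQLTransientConnectionException: ... at OrderRepository.java:42"
}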
Documentation: sample-logs/README.md
Location: sample-alerts/
10 pre-formatted Slack alerts covering:
- Database timeouts (CRITICAL)
- Payment gateway failures (HIGH)
- Inventory issues (MEDIUM)
- High error rates (CRITICAL)
- Memory and performance issues
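An illustrative alert in the same spirit (the real ones live in sample-slack-alerts.md):
CRITICAL: Database Connection Timeout
Service: order-service
Error rate: 45% of requests failing
Message: Connection pool exhausted, timeouts after 30s
Time: 2025-01-15 10:32 UTC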
- Docker and Docker Compose
- GitHub Personal Access Token
- Kibana server access (URL, username, password)
- n8n instance (cloud or self-hosted)
- OpenAI API key
cd dockers
# Copy and configure environment
cp .env.multi-mcp.example .env
nano .env # Add your credentials
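# Example .env contents (key names here are illustrative -- use the keys defined in .env.multi-mcp.example):
# GITHUB_PERSONAL_ACCESS_TOKEN=ghp_xxxxxxxxxxxx
# KIBANA_URL=https://your-kibana.example.com
# KIBANA_USERNAME=elastic
# KIBANA_PASSWORD=your-password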
# Update repository path in multi-mcp-docker-compose.yml
# Edit the volumes section to point to your repo
# Start services
docker compose -f multi-mcp-docker-compose.yml up -d
# Verify
curl http://localhost:3000/health # GitHub MCP
curl http://localhost:3001/health # Kibana MCP
Next, import the sample logs:
cd sample-logs
# Generate bulk import file
node convert-to-bulk.js
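# The generated file uses the Elasticsearch bulk format: an action line followed by the document,
# one JSON object per line. Roughly (index name shown is illustrative):
# { "index": { "_index": "logs-order-service-2025.01" } }
# { "@timestamp": "2025-01-15T10:31:58Z", "level": "ERROR", "service": "order-service", ... }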
# Import to Elasticsearch (replace with your details)
curl -X POST "https://YOUR_KIBANA_URL:9200/_bulk" \
-H "Content-Type: application/x-ndjson" \
-H "Authorization: ApiKey YOUR_API_KEY" \
--data-binary @kibana-bulk-import.ndjson
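# Optional sanity check -- the index pattern below assumes the logs-order-service-* pattern used in the next step:
curl -X GET "https://YOUR_KIBANA_URL:9200/logs-order-service-*/_count" \
  -H "Authorization: ApiKey YOUR_API_KEY"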
# Create index pattern in Kibana UI: logs-order-service-*
Then import the n8n workflow:
- Open the n8n UI
- Go to Workflows → Import from File
- Select n8n-workflow/production-alert-analyzer.json
- Configure credentials:
- Slack API: Add your Slack workspace credentials
- OpenAI API: Add your OpenAI API key
- Configure MCP clients:
- GitHub MCP: http://localhost:3000/message (or http://host.docker.internal:3000/message if n8n runs in Docker)
- Kibana MCP: http://localhost:3001/message (or http://host.docker.internal:3001/message if n8n runs in Docker)
- Update channel IDs for your Slack workspace
- Activate workflow
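If the MCP client nodes fail to connect, it can help to smoke-test the bridge endpoints from the machine running n8n. A minimal sketch, assuming the bridges accept raw MCP JSON-RPC on /message (a session handshake may also be required -- check the bridge docs in dockers/ for the exact contract):
curl -s http://localhost:3000/message \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'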
# Option 1: Start sample application
cd sample-app
mvn clean install
mvn spring-boot:run
# Generate traffic
cd ../sample-alerts
./test-workflow.sh
# Option 2: Post sample alert to Slack
# Copy any alert from sample-alerts/sample-slack-alerts.md
# Paste into your #mcp-testing channel
To verify the full flow:
- Open Slack and navigate to your configured input channel (e.g., #mcp-testing)
- Copy an alert from sample-alerts/sample-slack-alerts.md
- Paste the alert into the channel
- Wait 1-2 seconds for the acknowledgment
- Check the output channel (e.g., #n8n-output) for the AI analysis (5-10 seconds)
Use the Python script to post Datadog alert configurations to Slack:
cd sample-alerts
# List all available alerts
python post_alerts_to_slack.py --list
# Post specific alerts by index
python post_alerts_to_slack.py --token YOUR_TOKEN --channel #alerts --alerts 1,3,5
# Post a range of alerts
python post_alerts_to_slack.py --token YOUR_TOKEN --channel #alerts --alerts 1-3
# Interactive mode - select which alerts to post
python post_alerts_to_slack.py --token YOUR_TOKEN --channel #alerts --interactive
# Using environment variables
export SLACK_TOKEN=xoxb-your-token
export SLACK_CHANNEL=#alerts
python post_alerts_to_slack.py --alerts 1,9
Get a Slack OAuth token:
- Go to https://api.slack.com/apps
- Create App → OAuth & Permissions
- Add scopes: chat:write, chat:write.public
- Install to workspace → copy the Bot User OAuth Token
The AI will provide:
- Root Cause Analysis: Identifies the primary issue
- Immediate Actions: Actionable next steps
- Investigation Checklist: What to check
- Long-term Prevention: Recommendations to prevent recurrence
- Code References: Specific files and line numbers (if code context enabled)
- Log Analysis: Related error patterns from Kibana
Edit in n8n workflow:
- Input Channel: where alerts are posted (default: #mcp-testing)
- Output Channel: where analysis is sent (default: #n8n-output)
Adjust the model temperature in the OpenAI node:
- 0.1-0.3: more deterministic, consistent responses
- 0.4-0.7: more creative, varied responses
Edit "Route by Severity" node to change which alerts get detailed analysis:
- CRITICAL: Full root cause analysis
- HIGH/MEDIUM: Quick analysis
hackathon-2025/
├── dockers/                          # MCP server setup
│   ├── multi-mcp-docker-compose.yml
│   ├── mcp-bridge/                   # HTTP bridge for MCP servers
│   ├── kibana-bridge/                # Kibana-specific bridge
│   ├── MULTI_MCP_SETUP.md
│   └── QUICK_START.md
├── n8n-workflow/                     # n8n workflows
│   ├── production-alert-analyzer.json
│   ├── WORKFLOW_GUIDE.md
│   └── CODE_CONTEXT_GUIDE.md
├── sample-app/                       # Spring Boot test application
│   └── src/main/java/...
├── sample-logs/                      # Sample Kibana logs
│   ├── kibana-sample-logs.json
│   ├── convert-to-bulk.js
│   └── README.md
├── sample-alerts/                    # Sample Slack alerts
│   ├── sample-slack-alerts.md
│   └── test-workflow.sh
└── TESTING_GUIDE.md                  # Complete testing guide
- TESTING_GUIDE.md - Complete testing guide with scenarios
- n8n-workflow/WORKFLOW_GUIDE.md - n8n workflow details
- n8n-workflow/CODE_CONTEXT_GUIDE.md - Adding code context
- dockers/MULTI_MCP_SETUP.md - Multi-MCP server setup
- sample-logs/README.md - Log import and testing
- sample-logs/KIBANA_IMPORT_GUIDE.md - Kibana import details
# Check logs
docker compose -f multi-mcp-docker-compose.yml logs
# Verify environment variables
cat .env
# Restart services
docker compose -f multi-mcp-docker-compose.yml restart
If the workflow doesn't trigger:
- Verify the workflow is Active (toggle in top-right)
- Check Slack credentials are valid
- Verify channel IDs match your workspace
- Test with simple message: "test CRITICAL alert"
If analysis quality is poor:
- Add more context to alerts (service, metrics, stack traces)
- Enable code context in workflow
- Lower AI temperature (0.1-0.3)
- Enhance system prompt with examples
# Test bridges
curl http://localhost:3000/health
curl http://localhost:3001/health
# If n8n in Docker, use:
# http://host.docker.internal:3000/message
# http://host.docker.internal:3001/message
Typical performance:
- Average Response Time: 5-10 seconds
- Token Usage: 500-1500 tokens per alert
- Cost: ~$0.001-0.003 per alert (GPT-4o-mini)
- Throughput: 100+ alerts/hour
- Never commit .env files to version control
- Use read-only repository mounts when possible
- Restrict network access to MCP bridge ports
- Use API keys with minimal required permissions
- Consider adding authentication to HTTP bridges in production
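One low-effort hardening step is binding the bridge ports to localhost only in the compose file, so they are not reachable from other hosts. A sketch (the service name is illustrative -- match it to multi-mcp-docker-compose.yml):
services:
  github-mcp-bridge:
    ports:
      - "127.0.0.1:3000:3000"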
- Test with all sample alerts
- Customize prompts for your services
- Add incident ticket creation (Jira/ServiceNow)
- Integrate with monitoring systems (Datadog/Prometheus)
- Build a runbook database for AI reference
- Add historical incident matching
- Create a metrics dashboard
- Set up on-call escalation
This is a hackathon project. Feel free to:
- Add more MCP servers
- Enhance AI prompts
- Add new alert types
- Improve error handling
- Add more test scenarios
MIT License - See LICENSE file for details
For issues or questions:
- Check relevant documentation in subdirectories
- Review Docker logs: docker compose logs
- Test MCP bridges: curl http://localhost:3000/health
- Verify n8n execution logs in the UI