AI Integration Guide

Complete documentation for integrating with Cortex Linux's built-in Sapiens 0.27B reasoning engine.


Table of Contents

  1. Overview
  2. Command-Line Interface
  3. HTTP API
  4. Python Integration
  5. Bash Integration
  6. Performance Characteristics
  7. Advanced Configuration

Overview

Cortex Linux integrates the Sapiens 0.27B reasoning engine directly into the operating system. The engine is accessible through multiple interfaces:

  • CLI: cortex-ai command-line tool
  • HTTP API: RESTful endpoints on port 8080
  • System Services: Background services for automated tasks
  • Library APIs: Python and other language bindings

Engine Specifications

  • Model: Sapiens 0.27B (270 million parameters)
  • Memory Usage: ~200MB RAM
  • Latency: 50-200ms for typical queries
  • Throughput: 5-10 queries/second (depending on hardware)
  • Capabilities: Reasoning, planning, debugging, optimization

Command-Line Interface

Basic Usage

# Check version
cortex-ai --version

# Get help
cortex-ai --help

# Check service status
cortex-ai status
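
In scripts, it helps to fail fast when the engine is down. A minimal sketch, assuming cortex-ai status exits non-zero when the service is unavailable:

# Abort early if the AI service is not running
cortex-ai status >/dev/null 2>&1 || { echo "cortex-ai service unavailable" >&2; exit 1; }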

Core Commands

Reason

Perform general reasoning on a query.

# Basic reasoning
cortex-ai reason "What are the security implications of running a service as root?"

# With context
cortex-ai reason "Analyze this error: $ERROR_MESSAGE" --context "Application: web server, Language: Python"

# Save output to file
cortex-ai reason "Explain quantum computing" --output analysis.md

# Verbose output
cortex-ai reason "Debug this issue" --verbose

Example Output:

Query: What are the security implications of running a service as root?

Analysis:
Running services as root presents significant security risks:

1. Privilege Escalation: If the service is compromised, attackers gain full system access
2. Expanded Attack Surface: Root processes can modify critical system files
3. Least-Privilege Violation: Running as root breaks the principle of least privilege
4. Audit Trail: Root actions are harder to track and attribute

Recommendations:
- Create dedicated service user with minimal required permissions
- Use systemd user services or containerization
- Implement proper file permissions and SELinux/AppArmor policies
- Monitor root process execution with auditd

Confidence: 0.87
Processing time: 145ms
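
Because the CLI prints a trailing Confidence line, scripts can gate on it. A sketch that assumes the plain-text output format shown above:

ANALYSIS=$(cortex-ai reason "Is it safe to enable password SSH login?")
CONFIDENCE=$(awk -F': ' '/^Confidence:/ {print $2}' <<< "$ANALYSIS")
# Only act on the analysis when confidence is at least 0.8
awk -v c="$CONFIDENCE" 'BEGIN {exit !(c >= 0.8)}' && echo "$ANALYSIS"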

Plan

Generate structured plans for tasks or projects.

# Create a deployment plan
cortex-ai plan "Deploy a Python web application to production" \
  --requirements "High availability, zero downtime, monitoring"

# Plan with existing configuration
cortex-ai plan "Optimize database performance" \
  --config /etc/postgresql/postgresql.conf \
  --output plan.json

# Multi-step planning
cortex-ai plan "Migrate from MySQL to PostgreSQL" \
  --steps 10 \
  --format markdown

Example Output:

# Deployment Plan: Python Web Application

## Phase 1: Preparation (Days 1-2)
1. Review application requirements
2. Set up staging environment
3. Configure CI/CD pipeline
4. Prepare database migrations

## Phase 2: Infrastructure (Days 3-5)
1. Provision load balancer
2. Configure multiple application servers
3. Set up database replication
4. Configure monitoring and logging

## Phase 3: Deployment (Days 6-7)
1. Deploy to staging
2. Run integration tests
3. Blue-green deployment to production
4. Verify zero-downtime transition

## Phase 4: Validation (Day 8)
1. Monitor application metrics
2. Verify high availability
3. Load testing
4. Rollback plan ready

Risk Assessment:
- Medium: Database migration complexity
- Low: Application deployment
- Low: Infrastructure setup

Estimated Duration: 8 days
Confidence: 0.82
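
A plan saved with --output plan.json can be post-processed with jq. A sketch, assuming the file mirrors the /plan API response documented below (plan, estimated_duration, risk_level):

cortex-ai plan "Migrate from MySQL to PostgreSQL" --output plan.json
# Extract just the markdown plan body (assumed field name)
jq -r '.plan' plan.json > migration-plan.md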

Debug

Analyze errors, logs, and system issues.

# Debug from log file
cortex-ai debug /var/log/app/error.log

# Debug with stack trace
cortex-ai debug --stack-trace stacktrace.txt

# Debug system issue
cortex-ai debug "High CPU usage on server" \
  --metrics /var/log/system-metrics.json

# Interactive debugging
cortex-ai debug --interactive

Example Output:

Error Analysis: /var/log/app/error.log

Primary Issue: Database connection timeout
Location: app/database.py:142
Frequency: 15 occurrences in last hour

Root Cause Analysis:
1. Connection pool exhausted (max_connections: 10, active: 10)
2. Long-running queries blocking connection release
3. Missing connection timeout configuration

Suggested Fixes:
1. Increase connection pool size:
   MAX_CONNECTIONS=20

2. Add connection timeout:
   CONNECTION_TIMEOUT=30

3. Implement query timeout:
   QUERY_TIMEOUT=10

4. Add connection retry logic with exponential backoff

Code Changes Required:
- app/database.py: Update connection pool configuration
- app/models.py: Add query timeout decorators
- app/utils.py: Implement retry mechanism

Confidence: 0.91
Processing time: 178ms

Optimize

Optimize code, configuration, or system performance.

# Optimize code file
cortex-ai optimize /path/to/script.py --language python

# Optimize configuration
cortex-ai optimize /etc/nginx/nginx.conf --target performance

# Optimize system
cortex-ai optimize --target system --output recommendations.json

# Optimize with constraints
cortex-ai optimize "Database query" \
  --constraints "Must use existing indexes, no schema changes"

Example Output:

Optimization Analysis: /path/to/script.py

Current Performance:
- Execution time: 2.3s
- Memory usage: 45MB
- CPU utilization: 85%

Optimization Opportunities:

1. Database Query Optimization (High Impact)
   - Replace N+1 queries with JOIN
   - Add database indexes on frequently queried columns
   - Estimated improvement: 60% faster

2. Caching Strategy (Medium Impact)
   - Implement Redis caching for user data
   - Cache expensive computations
   - Estimated improvement: 40% faster

3. Algorithm Improvements (Medium Impact)
   - Replace O(n²) sorting with O(n log n)
   - Use generators instead of lists for large datasets
   - Estimated improvement: 30% faster

4. Code Refactoring (Low Impact)
   - Remove redundant function calls
   - Optimize string concatenation
   - Estimated improvement: 10% faster

Recommended Changes:
[Detailed code suggestions with line numbers]

Expected Results:
- Execution time: 0.7s (70% improvement)
- Memory usage: 32MB (29% reduction)
- CPU utilization: 55% (35% reduction)

Confidence: 0.85
Processing time: 203ms

Advanced CLI Options

# Set model parameters
cortex-ai reason "Query" \
  --temperature 0.7 \
  --max-tokens 500 \
  --top-p 0.9

# Use specific model version
cortex-ai reason "Query" --model sapiens-0.27b-v2

# Enable streaming output
cortex-ai reason "Long query" --stream

# Set timeout
cortex-ai reason "Query" --timeout 30

# Use custom config file
cortex-ai reason "Query" --config /path/to/config.yaml

HTTP API

The Cortex AI HTTP API provides RESTful endpoints for programmatic access.

Base URL

  • Local: http://localhost:8080
  • Network: http://your-server-ip:8080

Authentication

By default, the API runs without authentication. For production, enable API keys:

# /etc/cortex-ai/config.yaml
security:
  api_key_required: true
  api_key: your-secret-key-here

Then include in requests:

curl -H "X-API-Key: your-secret-key-here" ...

Endpoints

POST /reason

Perform reasoning on a query.

Request:

curl -X POST http://localhost:8080/reason \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are best practices for securing a REST API?",
    "context": "Python Flask application",
    "temperature": 0.7,
    "max_tokens": 500
  }'

Response:

{
  "result": "Best practices for securing a REST API include:\n\n1. Authentication: Use JWT tokens or OAuth 2.0\n2. HTTPS: Always use TLS encryption\n3. Input validation: Sanitize all user inputs\n4. Rate limiting: Prevent abuse and DDoS attacks\n5. CORS: Configure Cross-Origin Resource Sharing properly\n6. Error handling: Don't expose sensitive information in errors",
  "confidence": 0.89,
  "processing_time_ms": 156,
  "tokens_used": 127,
  "model_version": "sapiens-0.27b"
}
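
Individual fields can be pulled out of the response with jq, for example:

RESPONSE=$(curl -s -X POST http://localhost:8080/reason \
  -H "Content-Type: application/json" \
  -d '{"query": "What are best practices for securing a REST API?"}')
echo "$RESPONSE" | jq -r '.result'
echo "$RESPONSE" | jq -r '"confidence=\(.confidence) tokens=\(.tokens_used)"'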

POST /plan

Generate structured plans.

Request:

curl -X POST http://localhost:8080/plan \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Set up CI/CD pipeline for microservices",
    "requirements": ["GitHub Actions", "Docker", "Kubernetes"],
    "format": "markdown",
    "steps": 8
  }'

Response:

{
  "plan": "# CI/CD Pipeline Setup Plan\n\n## Phase 1: Repository Setup\n...",
  "estimated_duration": "2 weeks",
  "risk_level": "medium",
  "confidence": 0.84,
  "processing_time_ms": 234
}

POST /debug

Debug errors and issues.

Request:

curl -X POST http://localhost:8080/debug \
  -H "Content-Type: application/json" \
  -d '{
    "error_log": "Traceback (most recent call last):\n  File \"app.py\", line 42, in <module>\n    result = process_data(data)\nKeyError: 'missing_key'",
    "context": {
      "language": "Python",
      "framework": "Flask",
      "environment": "production"
    }
  }'

Response:

{
  "analysis": {
    "error_type": "KeyError",
    "location": "app.py:42",
    "root_cause": "Accessing dictionary key that doesn't exist",
    "severity": "medium"
  },
  "suggestions": [
    "Use dict.get() with default value",
    "Check key existence with 'in' operator",
    "Validate input data before processing"
  ],
  "code_fixes": [
    {
      "file": "app.py",
      "line": 42,
      "current": "result = process_data(data)",
      "suggested": "result = process_data(data) if 'required_key' in data else default_value"
    }
  ],
  "confidence": 0.92,
  "processing_time_ms": 167
}

POST /optimize

Optimize code or configuration.

Request:

curl -X POST http://localhost:8080/optimize \
  -H "Content-Type: application/json" \
  -d '{
    "target": "code",
    "content": "def process_items(items):\n    result = []\n    for item in items:\n        if item.valid:\n            result.append(item.process())\n    return result",
    "language": "python",
    "goals": ["performance", "readability"]
  }'

Response:

{
  "optimizations": [
    {
      "type": "list_comprehension",
      "description": "Replace loop with list comprehension",
      "current_code": "def process_items(items):\n    result = []\n    for item in items:\n        if item.valid:\n            result.append(item.process())\n    return result",
      "optimized_code": "def process_items(items):\n    return [item.process() for item in items if item.valid]",
      "improvement_estimate": "15% faster, more Pythonic"
    }
  ],
  "overall_improvement": "15-20% performance gain",
  "confidence": 0.88,
  "processing_time_ms": 189
}

GET /health

Check API health status.

Request:

curl http://localhost:8080/health

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "model_loaded": true,
  "memory_usage_mb": 187,
  "uptime_seconds": 3600,
  "requests_processed": 1234
}
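
The endpoint is convenient as a liveness probe. A sketch for a cron job, where restarting via systemctl assumes cortex-ai runs as a systemd unit:

# Restart the service if the API is unreachable or reports an unhealthy state
curl -sf http://localhost:8080/health | jq -e '.status == "healthy" and .model_loaded' >/dev/null \
    || sudo systemctl restart cortex-ai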

GET /status

Get detailed system status.

Request:

curl http://localhost:8080/status

Response:

{
  "service": "cortex-ai",
  "version": "1.0.0",
  "model": {
    "name": "sapiens-0.27b",
    "version": "1.0",
    "loaded": true,
    "memory_mb": 200
  },
  "system": {
    "cpu_usage_percent": 45.2,
    "memory_usage_mb": 187,
    "disk_usage_percent": 23.1
  },
  "performance": {
    "avg_response_time_ms": 156,
    "requests_per_second": 6.2,
    "total_requests": 1234
  }
}

Error Responses

All endpoints return standard HTTP status codes:

  • 200 OK: Successful request
  • 400 Bad Request: Invalid request parameters
  • 401 Unauthorized: Missing or invalid API key
  • 429 Too Many Requests: Rate limit exceeded
  • 500 Internal Server Error: Server error

Error response format:

{
  "error": {
    "code": "INVALID_QUERY",
    "message": "Query cannot be empty",
    "details": {}
  }
}
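
A minimal sketch of client-side error handling with curl and jq, assuming the error envelope above:

# Capture the body and HTTP status code in one request
RESPONSE=$(curl -s -w '\n%{http_code}' -X POST http://localhost:8080/reason \
    -H "Content-Type: application/json" \
    -d '{"query": "Explain cgroups"}')
STATUS=$(tail -n1 <<< "$RESPONSE")
BODY=$(sed '$d' <<< "$RESPONSE")

if [ "$STATUS" -eq 200 ]; then
    jq -r '.result' <<< "$BODY"
else
    # Print the code and message from the standard error envelope
    jq -r '"\(.error.code): \(.error.message)"' <<< "$BODY" >&2
fi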

Python Integration

Installation

# Install Python SDK
pip install cortex-ai

# Or from source
git clone https://github.com/cortexlinux/cortex-python-sdk
cd cortex-python-sdk
pip install -e .

Basic Usage

from cortex import AI

# Initialize client
ai = AI()

# Or with custom configuration
ai = AI(
    host='localhost',
    port=8080,
    api_key='your-key'  # Optional
)

# Reason
result = ai.reason("What is the difference between REST and GraphQL?")
print(result.result)
print(f"Confidence: {result.confidence}")

# Plan
plan = ai.plan(
    task="Deploy application to Kubernetes",
    requirements=["High availability", "Auto-scaling"],
    format="markdown"
)
print(plan.plan)

# Debug
debug_result = ai.debug(
    error_log="Error: Connection refused",
    context={"application": "web server", "language": "Python"}
)
for suggestion in debug_result.suggestions:
    print(f"- {suggestion}")

# Optimize
optimization = ai.optimize(
    code="""
def slow_function(data):
    result = []
    for item in data:
        if item > 0:
            result.append(item * 2)
    return result
    """,
    language="python",
    goals=["performance"]
)
print(optimization.optimizations[0].optimized_code)

Advanced Usage

from cortex import AI
import asyncio

# Async client
async def main():
    ai = AI()
    
    # Async reasoning
    result = await ai.reason_async(
        "Analyze this architecture",
        temperature=0.7,
        max_tokens=1000
    )
    print(result.result)

asyncio.run(main())

# Streaming
ai = AI()
for chunk in ai.reason_stream("Long query that generates streaming output"):
    print(chunk, end='', flush=True)

# Batch processing
queries = [
    "What is microservices architecture?",
    "Explain container orchestration",
    "Describe CI/CD best practices"
]

results = ai.reason_batch(queries)
for query, result in zip(queries, results):
    print(f"Q: {query}\nA: {result.result}\n")

Error Handling

from cortex import AI
from cortex.exceptions import CortexError, CortexTimeoutError

ai = AI()

try:
    result = ai.reason("Query", timeout=5)
except CortexTimeoutError:
    print("Request timed out")
except CortexError as e:
    print(f"Error: {e.message}")
except Exception as e:
    print(f"Unexpected error: {e}")

Bash Integration

Basic Script Integration

#!/bin/bash

# Function to query Cortex AI
cortex_reason() {
    local query="$1"
    # Build the JSON body with jq so quotes and newlines in the query are escaped safely
    curl -s -X POST http://localhost:8080/reason \
        -H "Content-Type: application/json" \
        -d "$(jq -n --arg q "$query" '{query: $q}')" | \
        jq -r '.result'
}

# Usage
RESULT=$(cortex_reason "How do I check disk usage in Linux?")
echo "$RESULT"

Advanced Bash Integration

#!/bin/bash

# Configuration
CORTEX_HOST="${CORTEX_HOST:-localhost}"
CORTEX_PORT="${CORTEX_PORT:-8080}"
CORTEX_API_KEY="${CORTEX_API_KEY:-}"

# Helper function
cortex_api() {
    local endpoint="$1"
    local data="$2"
    local headers=(-H "Content-Type: application/json")
    
    if [ -n "$CORTEX_API_KEY" ]; then
        headers+=(-H "X-API-Key: $CORTEX_API_KEY")
    fi
    
    curl -s -X POST "http://${CORTEX_HOST}:${CORTEX_PORT}${endpoint}" \
        "${headers[@]}" \
        -d "$data"
}

# Reason wrapper
cortex_reason() {
    local query="$1"
    local context="${2:-}"
    local json_data
    
    # Build the JSON body with jq so special characters are escaped safely
    if [ -n "$context" ]; then
        json_data=$(jq -n --arg q "$query" --arg c "$context" '{query: $q, context: $c}')
    else
        json_data=$(jq -n --arg q "$query" '{query: $q}')
    fi
    
    cortex_api "/reason" "$json_data" | jq -r '.result'
}

# Debug wrapper
cortex_debug() {
    local error_log="$1"
    local log_content
    
    if [ -f "$error_log" ]; then
        log_content=$(cat "$error_log")
    else
        log_content="$error_log"
    fi
    
    cortex_api "/debug" "{\"error_log\": $(jq -Rs . <<< "$log_content")}" | \
        jq -r '.suggestions[]'
}

# Example usage
if [ "$1" = "debug" ]; then
    cortex_debug "$2"
elif [ "$1" = "reason" ]; then
    cortex_reason "$2" "$3"
else
    echo "Usage: $0 {debug|reason} <input> [context]"
fi

System Integration Examples

# Auto-debug on error
trap 'cortex_debug "$(tail -50 /var/log/app/error.log)"' ERR

# Optimize configuration on change (cortex_optimize wrapper sketch below)
inotifywait -m -e modify /etc/nginx/nginx.conf | while read -r path events file; do
    cortex_optimize /etc/nginx/nginx.conf
done

# Monitor and reason about system metrics
while true; do
    METRICS=$(collect_metrics)
    ANALYSIS=$(cortex_reason "Analyze these system metrics: $METRICS")
    echo "$ANALYSIS" >> /var/log/system-analysis.log
    sleep 300
done
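
The cortex_optimize helper above is not defined in the earlier script. A minimal sketch against the /optimize endpoint, reusing cortex_api from the advanced script and assuming jq 1.6+ for --rawfile (the goals list is illustrative):

cortex_optimize() {
    local file="$1"
    # JSON-encode the file contents safely with jq, then post to /optimize
    cortex_api "/optimize" "$(jq -n --rawfile content "$file" \
        '{target: "code", content: $content, goals: ["performance"]}')" | \
        jq -r '.optimizations[].description'
}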

Performance Characteristics

Latency

Typical response times for different query types:

| Query Type        | Average Latency | P95 Latency | P99 Latency |
|-------------------|-----------------|-------------|-------------|
| Simple reasoning  | 50-100ms        | 150ms       | 200ms       |
| Complex reasoning | 100-200ms       | 300ms       | 500ms       |
| Planning          | 150-250ms       | 400ms       | 600ms       |
| Debugging         | 100-180ms       | 280ms       | 450ms       |
| Optimization      | 180-300ms       | 500ms       | 800ms       |

Throughput

  • Single-threaded: 5-7 queries/second
  • Multi-threaded (4 cores): 15-20 queries/second
  • Concurrent requests: Up to 50 simultaneous requests (with queuing)

Resource Usage

  • Memory: ~200MB base + ~10MB per active request
  • CPU: 15-30% per active request (single core)
  • Disk I/O: Minimal (model loaded in memory)
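
As a rough upper bound, the 50-request concurrency ceiling works out to about 200MB + 50 × 10MB ≈ 700MB of RAM, which is worth budgeting for on busy hosts.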

Optimization Tips

# /etc/cortex-ai/config.yaml
ai:
  # Increase threads for multi-core systems
  num_threads: 4
  
  # Adjust memory allocation
  max_memory_mb: 512
  
  # Enable request batching
  batch_size: 10
  
  # Cache frequent queries
  enable_caching: true
  cache_ttl_seconds: 3600
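
Assuming cortex-ai runs as a systemd service (the unit name matches the service field reported by /status), restart it after editing the file so the new settings take effect:

sudo systemctl restart cortex-ai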

Advanced Configuration

Custom Model Parameters

Edit /etc/cortex-ai/config.yaml:

ai:
  model_path: /usr/lib/cortex-ai/models/sapiens-0.27b
  
  # Generation parameters
  temperature: 0.7        # Creativity (0.0-1.0)
  top_p: 0.9             # Nucleus sampling
  top_k: 40              # Top-k sampling
  max_tokens: 512        # Maximum output length
  
  # Performance
  num_threads: 4         # CPU threads
  max_memory_mb: 512     # Memory limit
  batch_size: 1          # Batch processing
  
  # Caching
  enable_caching: true
  cache_ttl_seconds: 3600
  cache_max_size_mb: 100

Logging Configuration

logging:
  level: INFO  # DEBUG, INFO, WARNING, ERROR
  file: /var/log/cortex-ai/cortex-ai.log
  max_size_mb: 100
  backup_count: 5
  format: json  # json or text

Rate Limiting

rate_limiting:
  enabled: true
  requests_per_minute: 60
  requests_per_hour: 1000
  burst_size: 10
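
Clients that exceed these limits receive 429 Too Many Requests (see Error Responses above). A minimal retry-with-exponential-backoff sketch:

cortex_reason_with_retry() {
    local query="$1" attempt status
    for attempt in 1 2 3 4 5; do
        status=$(curl -s -o /tmp/cortex-resp.json -w '%{http_code}' \
            -X POST http://localhost:8080/reason \
            -H "Content-Type: application/json" \
            -d "$(jq -n --arg q "$query" '{query: $q}')")
        if [ "$status" != "429" ]; then
            jq -r '.result' /tmp/cortex-resp.json
            return 0
        fi
        # Back off 1s, 2s, 4s, 8s before retrying
        sleep $((2 ** (attempt - 1)))
    done
    echo "still rate limited after $attempt attempts" >&2
    return 1
}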
