AI Integration
Complete documentation for integrating with Cortex Linux's built-in Sapiens 0.27B reasoning engine.
- Overview
- Command-Line Interface
- HTTP API
- Python Integration
- Bash Integration
- Performance Characteristics
- Advanced Configuration
Cortex Linux integrates the Sapiens 0.27B reasoning engine directly into the operating system. The engine is accessible through multiple interfaces:
- CLI: cortex-ai command-line tool
- HTTP API: RESTful endpoints on port 8080
- System Services: Background services for automated tasks
- Library APIs: Python and other language bindings
- Model: Sapiens 0.27B (270 million parameters)
- Memory Usage: ~200MB RAM
- Latency: 50-200ms for typical queries
- Throughput: 5-10 queries/second (depending on hardware)
- Capabilities: Reasoning, planning, debugging, optimization
# Check version
cortex-ai --version
# Get help
cortex-ai --help
# Check service status
cortex-ai status
Perform general reasoning on a query.
# Basic reasoning
cortex-ai reason "What are the security implications of running a service as root?"
# With context
cortex-ai reason "Analyze this error: $ERROR_MESSAGE" --context "Application: web server, Language: Python"
# Save output to file
cortex-ai reason "Explain quantum computing" --output analysis.md
# Verbose output
cortex-ai reason "Debug this issue" --verboseExample Output:
Query: What are the security implications of running a service as root?
Analysis:
Running services as root presents significant security risks:
1. Privilege Escalation: If the service is compromised, attackers gain full system access
2. Expanded Attack Surface: Root processes can modify critical system files
3. Principle of Least Privilege: Violates security best practices
4. Audit Trail: Root actions are harder to track and attribute
Recommendations:
- Create dedicated service user with minimal required permissions
- Use systemd user services or containerization
- Implement proper file permissions and SELinux/AppArmor policies
- Monitor root process execution with auditd
Confidence: 0.87
Processing time: 145ms
Generate structured plans for tasks or projects.
# Create a deployment plan
cortex-ai plan "Deploy a Python web application to production" \
--requirements "High availability, zero downtime, monitoring"
# Plan with existing configuration
cortex-ai plan "Optimize database performance" \
--config /etc/postgresql/postgresql.conf \
--output plan.json
# Multi-step planning
cortex-ai plan "Migrate from MySQL to PostgreSQL" \
--steps 10 \
--format markdown
Example Output:
# Deployment Plan: Python Web Application
## Phase 1: Preparation (Days 1-2)
1. Review application requirements
2. Set up staging environment
3. Configure CI/CD pipeline
4. Prepare database migrations
## Phase 2: Infrastructure (Days 3-5)
1. Provision load balancer
2. Configure multiple application servers
3. Set up database replication
4. Configure monitoring and logging
## Phase 3: Deployment (Days 6-7)
1. Deploy to staging
2. Run integration tests
3. Blue-green deployment to production
4. Verify zero-downtime transition
## Phase 4: Validation (Day 8)
1. Monitor application metrics
2. Verify high availability
3. Load testing
4. Rollback plan ready
Risk Assessment:
- Medium: Database migration complexity
- Low: Application deployment
- Low: Infrastructure setup
Estimated Duration: 8 days
Confidence: 0.82
Analyze errors, logs, and system issues.
# Debug from log file
cortex-ai debug /var/log/app/error.log
# Debug with stack trace
cortex-ai debug --stack-trace stacktrace.txt
# Debug system issue
cortex-ai debug "High CPU usage on server" \
--metrics /var/log/system-metrics.json
# Interactive debugging
cortex-ai debug --interactive
Example Output:
Error Analysis: /var/log/app/error.log
Primary Issue: Database connection timeout
Location: app/database.py:142
Frequency: 15 occurrences in last hour
Root Cause Analysis:
1. Connection pool exhausted (max_connections: 10, active: 10)
2. Long-running queries blocking connection release
3. Missing connection timeout configuration
Suggested Fixes:
1. Increase connection pool size:
MAX_CONNECTIONS=20
2. Add connection timeout:
CONNECTION_TIMEOUT=30
3. Implement query timeout:
QUERY_TIMEOUT=10
4. Add connection retry logic with exponential backoff
Code Changes Required:
- app/database.py: Update connection pool configuration
- app/models.py: Add query timeout decorators
- app/utils.py: Implement retry mechanism
Confidence: 0.91
Processing time: 178ms
Optimize code, configuration, or system performance.
# Optimize code file
cortex-ai optimize /path/to/script.py --language python
# Optimize configuration
cortex-ai optimize /etc/nginx/nginx.conf --target performance
# Optimize system
cortex-ai optimize --target system --output recommendations.json
# Optimize with constraints
cortex-ai optimize "Database query" \
--constraints "Must use existing indexes, no schema changes"Example Output:
Optimization Analysis: /path/to/script.py
Current Performance:
- Execution time: 2.3s
- Memory usage: 45MB
- CPU utilization: 85%
Optimization Opportunities:
1. Database Query Optimization (High Impact)
- Replace N+1 queries with JOIN
- Add database indexes on frequently queried columns
- Estimated improvement: 60% faster
2. Caching Strategy (Medium Impact)
- Implement Redis caching for user data
- Cache expensive computations
- Estimated improvement: 40% faster
3. Algorithm Improvements (Medium Impact)
- Replace O(n²) sorting with O(n log n)
- Use generators instead of lists for large datasets
- Estimated improvement: 30% faster
4. Code Refactoring (Low Impact)
- Remove redundant function calls
- Optimize string concatenation
- Estimated improvement: 10% faster
Recommended Changes:
[Detailed code suggestions with line numbers]
Expected Results:
- Execution time: 0.7s (70% improvement)
- Memory usage: 32MB (29% reduction)
- CPU utilization: 55% (35% reduction)
Confidence: 0.85
Processing time: 203ms
# Set model parameters
cortex-ai reason "Query" \
--temperature 0.7 \
--max-tokens 500 \
--top-p 0.9
# Use specific model version
cortex-ai reason "Query" --model sapiens-0.27b-v2
# Enable streaming output
cortex-ai reason "Long query" --stream
# Set timeout
cortex-ai reason "Query" --timeout 30
# Use custom config file
cortex-ai reason "Query" --config /path/to/config.yamlThe Cortex AI HTTP API provides RESTful endpoints for programmatic access.
- Local: http://localhost:8080
- Network: http://your-server-ip:8080
By default, the API runs without authentication. For production, enable API keys:
# /etc/cortex-ai/config.yaml
security:
  api_key_required: true
  api_key: your-secret-key-here
Then include in requests:
curl -H "X-API-Key: your-secret-key-here" ...Perform reasoning on a query.
Request:
curl -X POST http://localhost:8080/reason \
-H "Content-Type: application/json" \
-d '{
"query": "What are best practices for securing a REST API?",
"context": "Python Flask application",
"temperature": 0.7,
"max_tokens": 500
}'
Response:
{
"result": "Best practices for securing a REST API include:\n\n1. Authentication: Use JWT tokens or OAuth 2.0\n2. HTTPS: Always use TLS encryption\n3. Input validation: Sanitize all user inputs\n4. Rate limiting: Prevent abuse and DDoS attacks\n5. CORS: Configure Cross-Origin Resource Sharing properly\n6. Error handling: Don't expose sensitive information in errors",
"confidence": 0.89,
"processing_time_ms": 156,
"tokens_used": 127,
"model_version": "sapiens-0.27b"
}
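The same request can be made from any HTTP client. Below is a minimal sketch using Python's requests library; the URL, payload fields, and optional X-API-Key header mirror the examples in this section, while the variable names and the 30-second timeout are illustrative:

```python
import requests

CORTEX_URL = "http://localhost:8080"
API_KEY = None  # set this if api_key_required is enabled in the config

headers = {"X-API-Key": API_KEY} if API_KEY else {}

# POST a reasoning query, mirroring the curl example above
payload = {
    "query": "What are best practices for securing a REST API?",
    "context": "Python Flask application",
    "temperature": 0.7,
    "max_tokens": 500,
}
response = requests.post(f"{CORTEX_URL}/reason", json=payload, headers=headers, timeout=30)
response.raise_for_status()

data = response.json()
print(data["result"])
print(f"Confidence: {data['confidence']} ({data['processing_time_ms']} ms)")
```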
Generate structured plans.
Request:
curl -X POST http://localhost:8080/plan \
-H "Content-Type: application/json" \
-d '{
"task": "Set up CI/CD pipeline for microservices",
"requirements": ["GitHub Actions", "Docker", "Kubernetes"],
"format": "markdown",
"steps": 8
}'
Response:
{
"plan": "# CI/CD Pipeline Setup Plan\n\n## Phase 1: Repository Setup\n...",
"estimated_duration": "2 weeks",
"risk_level": "medium",
"confidence": 0.84,
"processing_time_ms": 234
}
Debug errors and issues.
Request:
curl -X POST http://localhost:8080/debug \
-H "Content-Type: application/json" \
-d '{
"error_log": "Traceback (most recent call last):\n File \"app.py\", line 42, in <module>\n result = process_data(data)\nKeyError: 'missing_key'",
"context": {
"language": "Python",
"framework": "Flask",
"environment": "production"
}
}'
Response:
{
"analysis": {
"error_type": "KeyError",
"location": "app.py:42",
"root_cause": "Accessing dictionary key that doesn't exist",
"severity": "medium"
},
"suggestions": [
"Use dict.get() with default value",
"Check key existence with 'in' operator",
"Validate input data before processing"
],
"code_fixes": [
{
"file": "app.py",
"line": 42,
"current": "result = process_data(data)",
"suggested": "result = process_data(data) if 'required_key' in data else default_value"
}
],
"confidence": 0.92,
"processing_time_ms": 167
}
Optimize code or configuration.
Request:
curl -X POST http://localhost:8080/optimize \
-H "Content-Type: application/json" \
-d '{
"target": "code",
"content": "def process_items(items):\n result = []\n for item in items:\n if item.valid:\n result.append(item.process())\n return result",
"language": "python",
"goals": ["performance", "readability"]
}'
Response:
{
"optimizations": [
{
"type": "list_comprehension",
"description": "Replace loop with list comprehension",
"current_code": "def process_items(items):\n result = []\n for item in items:\n if item.valid:\n result.append(item.process())\n return result",
"optimized_code": "def process_items(items):\n return [item.process() for item in items if item.valid]",
"improvement_estimate": "15% faster, more Pythonic"
}
],
"overall_improvement": "15-20% performance gain",
"confidence": 0.88,
"processing_time_ms": 189
}
Check API health status.
Request:
curl http://localhost:8080/health
Response:
{
"status": "healthy",
"version": "1.0.0",
"model_loaded": true,
"memory_usage_mb": 187,
"uptime_seconds": 3600,
"requests_processed": 1234
}
Get detailed system status.
Request:
curl http://localhost:8080/status
Response:
{
"service": "cortex-ai",
"version": "1.0.0",
"model": {
"name": "sapiens-0.27b",
"version": "1.0",
"loaded": true,
"memory_mb": 200
},
"system": {
"cpu_usage_percent": 45.2,
"memory_usage_mb": 187,
"disk_usage_percent": 23.1
},
"performance": {
"avg_response_time_ms": 156,
"requests_per_second": 6.2,
"total_requests": 1234
}
}
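Both endpoints are convenient for automated monitoring. Below is a small polling sketch; the 60-second interval, the check_cortex helper, and the plain print alert are illustrative, and the field names follow the responses shown above:

```python
import time
import requests

CORTEX_URL = "http://localhost:8080"

def check_cortex() -> bool:
    """Return True if /health reports a healthy service with the model loaded."""
    try:
        health = requests.get(f"{CORTEX_URL}/health", timeout=5).json()
    except requests.RequestException:
        return False
    return health.get("status") == "healthy" and health.get("model_loaded", False)

while True:
    if check_cortex():
        status = requests.get(f"{CORTEX_URL}/status", timeout=5).json()
        print(f"avg response time: {status['performance']['avg_response_time_ms']} ms")
    else:
        # Replace with your alerting mechanism (email, webhook, journal entry, ...)
        print("WARNING: cortex-ai is unhealthy or the model is not loaded")
    time.sleep(60)
```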
All endpoints return standard HTTP status codes:
- 200 OK: Successful request
- 400 Bad Request: Invalid request parameters
- 401 Unauthorized: Missing or invalid API key
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Server error
Error response format:
{
"error": {
"code": "INVALID_QUERY",
"message": "Query cannot be empty",
"details": {}
}
}
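A client can branch on these status codes and surface the structured error body instead of failing silently. The sketch below retries on 429 with a simple exponential backoff and raises on other errors; the reason helper, retry count, and backoff delays are illustrative (the limits themselves come from the rate_limiting configuration shown later in this document):

```python
import time
import requests

CORTEX_URL = "http://localhost:8080"

def reason(query: str, max_retries: int = 3) -> dict:
    """POST to /reason, retrying on 429 and raising structured errors otherwise."""
    for attempt in range(max_retries):
        resp = requests.post(f"{CORTEX_URL}/reason", json={"query": query}, timeout=30)
        if resp.status_code == 200:
            return resp.json()
        if resp.status_code == 429:
            # Rate limited: back off and try again
            time.sleep(2 ** attempt)
            continue
        # 400 / 401 / 500: report the error object documented above and stop
        error = resp.json().get("error", {})
        raise RuntimeError(f"{resp.status_code} {error.get('code')}: {error.get('message')}")
    raise RuntimeError("rate limited: retries exhausted")

result = reason("Summarize the main risks of running services as root")
print(result["result"])
```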
# Install Python SDK
pip install cortex-ai
# Or from source
git clone https://github.com/cortexlinux/cortex-python-sdk
cd cortex-python-sdk
pip install -e .
from cortex import AI
# Initialize client
ai = AI()
# Or with custom configuration
ai = AI(
    host='localhost',
    port=8080,
    api_key='your-key'  # Optional
)
# Reason
result = ai.reason("What is the difference between REST and GraphQL?")
print(result.result)
print(f"Confidence: {result.confidence}")
# Plan
plan = ai.plan(
    task="Deploy application to Kubernetes",
    requirements=["High availability", "Auto-scaling"],
    format="markdown"
)
print(plan.plan)
# Debug
debug_result = ai.debug(
    error_log="Error: Connection refused",
    context={"application": "web server", "language": "Python"}
)
for suggestion in debug_result.suggestions:
    print(f"- {suggestion}")
# Optimize
optimization = ai.optimize(
    code="""
def slow_function(data):
    result = []
    for item in data:
        if item > 0:
            result.append(item * 2)
    return result
""",
    language="python",
    goals=["performance"]
)
print(optimization.optimizations[0].optimized_code)
from cortex import AI
import asyncio
# Async client
async def main():
    ai = AI()
    # Async reasoning
    result = await ai.reason_async(
        "Analyze this architecture",
        temperature=0.7,
        max_tokens=1000
    )
    print(result.result)
asyncio.run(main())  # Run the async example
# Streaming
ai = AI()
for chunk in ai.reason_stream("Long query that generates streaming output"):
    print(chunk, end='', flush=True)
# Batch processing
queries = [
    "What is microservices architecture?",
    "Explain container orchestration",
    "Describe CI/CD best practices"
]
results = ai.reason_batch(queries)
for query, result in zip(queries, results):
    print(f"Q: {query}\nA: {result.result}\n")
from cortex import AI
from cortex.exceptions import CortexError, CortexTimeoutError
ai = AI()
try:
    result = ai.reason("Query", timeout=5)
except CortexTimeoutError:
    print("Request timed out")
except CortexError as e:
    print(f"Error: {e.message}")
except Exception as e:
    print(f"Unexpected error: {e}")
#!/bin/bash
# Function to query Cortex AI
cortex_reason() {
    local query="$1"
    curl -s -X POST http://localhost:8080/reason \
        -H "Content-Type: application/json" \
        -d "{\"query\": \"$query\"}" | \
        jq -r '.result'
}
# Usage
RESULT=$(cortex_reason "How do I check disk usage in Linux?")
echo "$RESULT"
#!/bin/bash
# Configuration
CORTEX_HOST="${CORTEX_HOST:-localhost}"
CORTEX_PORT="${CORTEX_PORT:-8080}"
CORTEX_API_KEY="${CORTEX_API_KEY:-}"
# Helper function
cortex_api() {
    local endpoint="$1"
    local data="$2"
    local headers=(-H "Content-Type: application/json")
    if [ -n "$CORTEX_API_KEY" ]; then
        headers+=(-H "X-API-Key: $CORTEX_API_KEY")
    fi
    curl -s -X POST "http://${CORTEX_HOST}:${CORTEX_PORT}${endpoint}" \
        "${headers[@]}" \
        -d "$data"
}
# Reason wrapper
cortex_reason() {
    local query="$1"
    local context="${2:-}"
    local json_data="{\"query\": \"$query\""
    if [ -n "$context" ]; then
        json_data+=", \"context\": \"$context\""
    fi
    json_data+="}"
    cortex_api "/reason" "$json_data" | jq -r '.result'
}
# Debug wrapper
cortex_debug() {
    local error_log="$1"
    local log_content
    if [ -f "$error_log" ]; then
        log_content=$(cat "$error_log")
    else
        log_content="$error_log"
    fi
    cortex_api "/debug" "{\"error_log\": $(jq -Rs . <<< "$log_content")}" | \
        jq -r '.suggestions[]'
}
# Example usage
if [ "$1" = "debug" ]; then
    cortex_debug "$2"
elif [ "$1" = "reason" ]; then
    cortex_reason "$2" "$3"
else
    echo "Usage: $0 {debug|reason} <input> [context]"
fi
# Auto-debug on error
trap 'cortex_debug "$(tail -50 /var/log/app/error.log)"' ERR
# Optimize configuration on change
# (cortex_optimize is an assumed wrapper around the /optimize endpoint, analogous to cortex_reason above)
inotifywait -m /etc/nginx/nginx.conf | while read event; do
    cortex_optimize /etc/nginx/nginx.conf
done
# Monitor and reason about system metrics
# (collect_metrics is an assumed helper that gathers the metrics to analyze)
while true; do
    METRICS=$(collect_metrics)
    ANALYSIS=$(cortex_reason "Analyze these system metrics: $METRICS")
    echo "$ANALYSIS" >> /var/log/system-analysis.log
    sleep 300
done
Typical response times for different query types:
| Query Type | Average Latency | P95 Latency | P99 Latency |
|---|---|---|---|
| Simple reasoning | 50-100ms | 150ms | 200ms |
| Complex reasoning | 100-200ms | 300ms | 500ms |
| Planning | 150-250ms | 400ms | 600ms |
| Debugging | 100-180ms | 280ms | 450ms |
| Optimization | 180-300ms | 500ms | 800ms |
- Single-threaded: 5-7 queries/second
- Multi-threaded (4 cores): 15-20 queries/second
- Concurrent requests: Up to 50 simultaneous requests (with queuing)
- Memory: ~200MB base + ~10MB per active request
- CPU: 15-30% per active request (single core)
- Disk I/O: Minimal (model loaded in memory)
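These numbers depend heavily on hardware, so it is worth measuring on your own system. The sketch below drives the /reason endpoint with a few concurrent workers and reports latency and throughput; the query text, request count, and worker count are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

CORTEX_URL = "http://localhost:8080"
PAYLOAD = {"query": "Explain the principle of least privilege in one paragraph."}
NUM_REQUESTS = 20
WORKERS = 4

def timed_request(_: int) -> float:
    """Send one /reason request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    requests.post(f"{CORTEX_URL}/reason", json=PAYLOAD, timeout=60).raise_for_status()
    return time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    latencies = sorted(pool.map(timed_request, range(NUM_REQUESTS)))
elapsed = time.perf_counter() - start

print(f"avg latency: {sum(latencies) / len(latencies) * 1000:.0f} ms")
print(f"p95 latency: {latencies[int(0.95 * (len(latencies) - 1))] * 1000:.0f} ms")
print(f"throughput:  {NUM_REQUESTS / elapsed:.1f} requests/second")
```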
# /etc/cortex-ai/config.yaml
ai:
  # Increase threads for multi-core systems
  num_threads: 4
  # Adjust memory allocation
  max_memory_mb: 512
  # Enable request batching
  batch_size: 10
  # Cache frequent queries
  enable_caching: true
  cache_ttl_seconds: 3600
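With enable_caching on, a repeated identical query should come back noticeably faster while its cache entry is still within the TTL. A quick wall-clock check against the /reason endpoint (the query text is illustrative):

```python
import time

import requests

CORTEX_URL = "http://localhost:8080"
PAYLOAD = {"query": "What is the difference between a process and a thread?"}

def timed_reason() -> float:
    """Return the wall-clock time of one /reason request in seconds."""
    start = time.perf_counter()
    requests.post(f"{CORTEX_URL}/reason", json=PAYLOAD, timeout=30).raise_for_status()
    return time.perf_counter() - start

first = timed_reason()   # cold: answered by the model
second = timed_reason()  # warm: should be served from the cache
print(f"first call: {first * 1000:.0f} ms, repeat call: {second * 1000:.0f} ms")
```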
Edit /etc/cortex-ai/config.yaml:
ai:
  model_path: /usr/lib/cortex-ai/models/sapiens-0.27b
  # Generation parameters
  temperature: 0.7        # Creativity (0.0-1.0)
  top_p: 0.9              # Nucleus sampling
  top_k: 40               # Top-k sampling
  max_tokens: 512         # Maximum output length
  # Performance
  num_threads: 4          # CPU threads
  max_memory_mb: 512      # Memory limit
  batch_size: 1           # Batch processing
  # Caching
  enable_caching: true
  cache_ttl_seconds: 3600
  cache_max_size_mb: 100
logging:
  level: INFO             # DEBUG, INFO, WARNING, ERROR
  file: /var/log/cortex-ai/cortex-ai.log
  max_size_mb: 100
  backup_count: 5
  format: json            # json or text
rate_limiting:
  enabled: true
  requests_per_minute: 60
  requests_per_hour: 1000
  burst_size: 10
- Explore Use Cases: Use Cases and Tutorials
- Developer Resources: Developer Documentation
- Performance Tuning: Technical Specifications
Last updated: 2024