Releases: CoreWorxLab/CAAL

v1.6.0 - OpenAI-Compatible & OpenRouter Providers

07 Feb 18:33

OpenAI-Compatible & OpenRouter Providers

This release adds two new LLM providers, restructures the settings UI around the voice pipeline architecture, and fixes Apple Silicon deployment.

New LLM Providers

  • OpenAI-compatible: Connect to any server exposing the OpenAI API — LM Studio, vLLM, LocalAI, text-generation-inference, and others. Configurable base URL with optional API key.
  • OpenRouter: Access 200+ models (GPT-4, Claude, Llama, Mistral, etc.) through a single API key. Searchable model dropdown for easy selection.
  • Both providers support streaming and tool calling

Voice Pipeline UI Restructure

  • Settings panel reorganized into 5 tabs: Agent, Prompt, Voice Pipeline, AI Provider, Integrations
  • STT decoupled from LLM — speech recognition (Speaches or Groq Whisper) is now an independent choice, no longer auto-coupled to the LLM provider
  • Voice Pipeline tab groups STT + TTS together, reflecting the actual audio flow
  • AI Provider tab combines provider selection with LLM parameters (temperature, context size, max turns)
  • Setup wizard expanded from 3 to 4 steps: STT → LLM → TTS → Integrations
  • Restart prompt when changing providers — notifies that the agent needs a session restart

Apple Silicon Deployment

  • Speaches container added to docker-compose.apple.yaml for Piper TTS and faster-whisper STT
  • PIPER_URL environment variable decouples Piper TTS from the STT service URL
  • SyncOpenAITTS wrapper bypasses httpx async context issues in LiveKit's forked subprocesses
  • Conditional Piper auto-switch — only switches from Kokoro to Piper for non-English when a Piper service is actually available
  • WebRTC port range fixed — livekit.yaml and compose files now agree on 51000-51100 (previously mismatched)
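The PIPER_URL decoupling means the agent can point its Piper TTS at the Speaches container independently of the STT URL. The entry below is illustrative only — the hostname and port are assumptions based on the compose service name, not verified defaults:

```shell
# Illustrative .env entry — hostname/port assumed from the
# docker-compose.apple.yaml service name, not a documented default.
PIPER_URL=http://speaches:8000/v1
```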

Connection Testing

  • Test endpoints for OpenAI-compatible (/setup/test-openai-compatible) and OpenRouter (/setup/test-openrouter)
  • Model discovery — tests return available models for dropdown population
  • URL validation on settings save for all URL fields
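The URL validation can be pictured as a minimal scheme check like the sketch below. This is an assumed pattern for illustration, not CAAL's actual implementation:

```shell
# Sketch of URL validation on settings save — an assumed scheme check,
# not the code CAAL ships.
validate_url() {
  case "$1" in
    http://*|https://*) echo "valid" ;;
    *) echo "invalid" ;;
  esac
}
validate_url "http://localhost:1234/v1"   # → valid
validate_url "localhost:1234"             # → invalid
```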

i18n

  • English, French, and Italian translations updated with new pipeline/provider keys
  • Groq API key sharing noted contextually when both STT and LLM use Groq

Other

  • OpenRouterProvider inherits from OpenAICompatibleProvider (DRY — 76 lines vs 227)
  • OpenRouter includes attribution headers (X-Title: CAAL Voice Assistant)
  • .planning/ directory excluded via .gitignore

Thanks to @mmaudet for this contribution!

Migration

  • No action required for existing users — your Ollama/Groq setup continues to work unchanged
  • STT provider defaults to Speaches (same as before)
  • New providers are opt-in via Settings > AI Provider

Full Changelog: v1.5.0...v1.6.0

v1.5.0 - CAAL Tool API & Registry

31 Jan 21:12
07d4b3c

CAAL Tool API & Registry

This release introduces the CAAL Tool API - a complete system for building, sharing, and installing voice-callable tools. Browse community tools, install with one click, and share your own workflows with the registry.

Tool Registry

  • Browse & install community tools directly from the CAAL web UI
  • Category browsing across Smart Home, Media, Homelab, Productivity, Developer, Utilities, Sports, and Social
  • Search, sort, and filter to find the right tool
  • Guided installation with credential and variable configuration
  • Install tracking shows which tools are from the registry vs custom

Tool Submission

  • Share to Registry button for your custom workflows
  • Client-side sanitization - secrets never leave your network
  • GitHub OAuth - PRs attributed to you via fork-based submission
  • Automated review - bot validates structure, security, and quality
  • LLM metadata generation - auto-suggests name, category, description, and voice triggers

Tool Management

  • Installed Tools view with card and list modes
  • Registry sync shows update availability
  • Workflow detail modal with n8n link for installed tools

Multilingual Support (i18n)

  • Three languages: English, French, Italian
  • Full-stack i18n: Web frontend, mobile app, and voice agent
  • Language selector in setup wizard and settings
  • Localized prompts per language in prompt/{lang}/default.md
  • Language-aware TTS: Auto-selects Piper voices for non-English
  • Wake greetings per language with file-based storage
  • 12-item checklist in docs/I18N.md for adding new languages

Thanks to @mmaudet for the i18n foundation!

Theme System

  • Three themes: Midnight (dark blue), Grey Slate (default), Light
  • Surface depth system for visual hierarchy
  • Live switching with persistent preference

Settings Panel Redesign

  • Tabbed layout: Agent, Prompt, Providers, LLM, Integrations, Wake
  • Theme switcher in settings
  • Improved visual hierarchy with surface elevation

Other Improvements

  • Reduced default LLM temperature to 0.15 for more reliable tool calling
  • Added reasoning_effort: low for Groq GPT-OSS models
  • Registry cache system for faster tool browsing
  • Workflow sanitizer with 15+ secret detection patterns

Documentation

  • New CAAL Tool API wiki page - complete system reference
  • New I18N Guide for adding languages
  • Updated wiki with tool suite patterns and reorganized navigation
  • Slimmed down N8N-WORKFLOWS.md to focus on integration mechanics

Breaking Changes

  • Removed deprecated n8n-workflows/ folder (tools now live in caal-tools)

Migration

  • No action required for existing users
  • Registry features require n8n MCP connection (already configured in most setups)
  • Default theme is Grey Slate (original aesthetic preserved)
  • Language defaults to English; change via Settings > Agent > Language

Full Changelog: v1.4.0...v1.5.0

v1.4.0 - GPU-Free Mode & First-Start Wizard

15 Jan 17:14
8d52534

🎉 The Accessibility Release

CAAL now runs without a GPU. Deploy on TrueNAS, cloud VMs, or any machine with Docker.

New Features

GPU-Free Deployment

  • docker compose -f docker-compose.cpu.yaml up -d
  • Uses Groq (LLM + STT) + Piper (TTS) - all CPU
  • Free tier available at console.groq.com

First-Start Wizard

  • Browser-based setup - minimal .env changes required
  • Choose LLM provider (Ollama or Groq)
  • Choose TTS provider (Kokoro or Piper)
  • Configure Home Assistant and n8n integrations

Home Assistant Integration

  • Native MCP connection to Home Assistant
  • Simplified tools: hass_control(action, target, value) and hass_get_state(target)
  • See docs/HOME-ASSISTANT.md
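A call to the simplified tools might take the shape below. The argument names come from the release notes; the concrete values (and the JSON tool-call framing) are illustrative assumptions:

```json
{
  "name": "hass_control",
  "arguments": {
    "action": "turn_on",
    "target": "living room lights",
    "value": null
  }
}
```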

Settings Panel Redesign

  • Slide-out panel with 6 tabs replacing the cramped modal
  • Real-time provider testing (Ollama, Groq, HASS, n8n)

Mobile App Updates

  • Provider and integration settings
  • Wizard completion check before connecting
  • MCP connection error banner

Providers Added

| Provider | Type | Notes |
|----------|------|-------|
| Groq | LLM | llama-3.3-70b, gpt-oss-20b, etc. |
| Groq | STT | Whisper Large v3 Turbo |
| Piper | TTS | CPU-friendly, 35+ languages |

Infrastructure

  • Auto-generated self-signed HTTPS certificates
  • MCP connection errors shown to users
  • Piper models persist across container restarts
  • Explicit Docker network naming

Breaking Changes

None - existing configurations continue to work.


📱 Mobile APK attached below

v1.3.0 - Server-side Wake Word with OpenWakeWord

04 Jan 03:27

What's New

This release replaces client-side Picovoice wake word detection with server-side OpenWakeWord, enabling unlimited device support without per-device licensing restrictions.

Highlights

  • Server-side OpenWakeWord - Wake word detection now runs on the server, removing the 1-device limit from Picovoice
  • Multi-device support - Connect from multiple phones/browsers simultaneously
  • Custom wake word models - Train your own wake words with OpenWakeWord
  • Hey Cal default - New custom-trained "Hey Cal" wake word (Hey Jarvis also included)
  • Wake word model selector - Choose between wake words in frontend and mobile settings
  • Apple Silicon support - Startup script and MLX-Audio for M1/M2/M3 Macs

Improvements

  • Fix wake word timeout cutting off speech mid-sentence (VAD-based detection)
  • Post-response follow-up window for natural conversation flow
  • Wake word state indicators in mobile app
  • Optimized greeting latency with direct TTS
  • Docker startup fix for missing config files

Breaking Changes

  • Picovoice client-side wake word removed from mobile app and web frontend

v1.2.0 - Apple Silicon Support

01 Jan 20:27

Apple Silicon Support (M1/M2/M3/M4)

CAAL now runs on Apple Silicon Macs using mlx-audio for Metal-accelerated STT/TTS.

New Features

  • docker-compose.apple.yaml - Minimal compose for Apple Silicon (LiveKit + Agent + Frontend)
  • mlx-audio integration - Single service handles both STT and TTS with Metal GPU acceleration
  • Configurable models - MLX_AUDIO_URL, WHISPER_MODEL, TTS_MODEL env vars

Setup

  1. Install: pip install "mlx-audio[all]"
  2. Start server: python -m mlx_audio.server --host 0.0.0.0 --port 8000
  3. Pre-load models:
    curl -X POST "http://localhost:8000/v1/models?model_name=mlx-community/whisper-medium-mlx"
    curl -X POST "http://localhost:8000/v1/models?model_name=prince-canuma/Kokoro-82M"
  4. Run: docker compose -f docker-compose.apple.yaml up -d

Models

| Component | Model |
|-----------|-------|
| STT | mlx-community/whisper-medium-mlx |
| TTS | prince-canuma/Kokoro-82M |

See README for full documentation.

v1.1.0

31 Dec 22:07
2aa0d1e

New Features

Multi-MCP Server Support

  • Configure additional MCP servers via mcp_servers.json
  • n8n remains in .env as foundational server
  • Tools from non-n8n servers prefixed with server_name__ to avoid collisions
  • Supports SSE and streamable_http transports
  • Tested with Home Assistant MCP and Memento memory server
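A mcp_servers.json entry might look like the following — the key names and URL are assumptions for illustration, since the schema is not shown in these notes:

```json
{
  "home_assistant": {
    "url": "http://homeassistant.local:8123/mcp_server/sse",
    "transport": "sse"
  }
}
```

Per the prefixing rule above, a hypothetical tool `get_state` from this server would surface as `home_assistant__get_state`, avoiding collisions with n8n tools.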

Runtime Settings UI

  • Configure agent settings from the web frontend
  • Change TTS voice, model, temperature without rebuilding
  • Settings persist in settings.json

Context Management

  • Tool data cache preserves structured responses for follow-up queries
  • Sliding window keeps conversation manageable (OLLAMA_MAX_TURNS)
  • Prevents context overflow - system prompt never truncated

Frontend Improvements

  • Tool use indicator shows if response used a tool (green wrench)
  • Reload tools button in control bar
  • Click tool indicator to see parameters used

Bug Fixes

  • Fixed TTS reading scores like "30-23" as "30 minus 23" → now "30 to 23"
  • Fixed OLLAMA_NUM_CTX default to 8192
  • Fixed Windows line ending issues (.gitattributes)

Configuration

New .env options:

  • OLLAMA_MAX_TURNS - Max conversation turns in sliding window (default: 20)
  • TOOL_CACHE_SIZE - Tool responses to cache (default: 3)
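In .env these might be set as follows — the values shown are just the documented defaults:

```shell
# Defaults from this release; adjust to taste.
OLLAMA_MAX_TURNS=20   # conversation turns kept in the sliding window
TOOL_CACHE_SIZE=3     # structured tool responses retained for follow-ups
```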

New config file:

  • mcp_servers.json - Additional MCP servers (optional)

v1.0.0 - Initial Release

24 Dec 08:45

CAAL v1.0.0

Local voice assistant with n8n workflow integrations. Your voice never leaves your network.

Features

  • Local Voice Pipeline: Speaches (Faster-Whisper STT) + Kokoro (TTS) + Ollama (LLM)
  • Wake Word Detection: "Hey Cal" activation via Picovoice Porcupine
  • n8n Integrations: Home Assistant, APIs, databases - anything n8n can connect to
  • Web Search: DuckDuckGo integration for real-time information
  • Webhook API: External triggers for announcements and tool reload
  • Self-Modifying: Create new n8n workflows via voice commands

Requirements

  • Docker with NVIDIA Container Toolkit
  • Ollama running on your network
  • n8n with MCP enabled
  • 12GB+ VRAM recommended

Quick Start

git clone https://github.com/CoreWorxLab/CAAL.git
cd CAAL
cp .env.example .env
nano .env  # Set CAAL_HOST_IP, OLLAMA_HOST, N8N_MCP_URL, N8N_MCP_TOKEN
docker compose up -d

📺 Watch the walkthrough on YouTube