
Conversation

maceip commented Jan 8, 2026

Motivation and Context

This PR adds LiteRT-LM as a new LLMProvider to Koog, enabling on-device inference with Google's LiteRT-LM engine. Users can run LLMs locally on Android and JVM platforms without requiring network connectivity, which is valuable for:

  • Privacy-sensitive applications
  • Offline-capable AI features
  • Reduced latency for on-device inference
  • Edge deployment scenarios

The implementation follows the existing Ollama provider patterns for consistency with Koog's architecture.

Breaking Changes

None. This is a purely additive change introducing a new module and provider.


Type of the changes

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Tests improvement
  • Refactoring

Checklist

  • The pull request has a description of the proposed change
  • I read the Contributing Guidelines before opening the pull request
  • The pull request uses develop as the base branch
  • Tests for the changes have been added
  • All new and existing tests passed

Additional steps for pull requests adding a new feature

  • An issue describing the proposed change exists
  • The pull request includes a link to the issue
  • The change was discussed and approved in the issue
  • Docs have been added / updated

Summary of Changes

New module: prompt-executor-litertlm-client

Key components:

  • LiteRTLMClient: Main client implementing the LLMClient interface with a conversation API
  • ManagedConversation: Thread-safe multi-turn conversation with history tracking
  • LiteRTLMToolBridge: Bridges Koog's ToolDescriptor to LiteRT-LM's annotation-based tool system
  • LiteRTLMClientFactory: KMP factory with platform-specific implementations (JVM actual; stubs for iOS/JS/Wasm)
  • LiteRTLMModels: Model definitions for Gemma 3n variants

Features:

  • Stateless execute() and executeStreaming() for single requests
  • conversation() API for multi-turn interactions with context preservation
  • Multimodal support (image/audio) via typed methods (sendImage, sendAudio, etc.)
  • Tool calling support via ToolExecutor callback pattern
  • Thread-safe concurrent access with Mutex synchronization
  • Conversation history tracking with timestamps
  • Kotlin 2.3 forward-compatible @MustUseReturnValue annotations
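
For illustration, a minimal sketch of how this API surface might be used. The names execute, conversation, and sendImage come from the feature list above; the constructor parameter, the model constant, and the prompt DSL details are assumptions rather than confirmed signatures:

```kotlin
// Hypothetical usage sketch; names not listed in the PR description are assumed.
suspend fun demo(imageBytes: ByteArray) {
    // Assumed constructor parameter: path to a local .litertlm model file.
    val client = LiteRTLMClient(modelPath = "/models/gemma-3n-e4b.litertlm")

    // Stateless single request via execute().
    val response = client.execute(
        prompt = prompt("demo") { user("Summarize this repo in one sentence.") },
        model = LiteRTLMModels.GEMMA_3N_E4B, // assumed constant name
    )
    println(response)

    // Multi-turn conversation with preserved context; send() is assumed,
    // sendImage() is named in the feature list.
    val chat = client.conversation()
    chat.send("What can you see in this image?")
    chat.sendImage(imageBytes)
}
```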

maceip and others added 14 commits January 9, 2026 11:14

Add support for LiteRT-LM, Google's on-device inference engine that
enables running LLMs locally on Android and JVM platforms.

Changes:
- Add LiteRTLM provider to LLMProvider sealed class
- Create LiteRTLMModels with Gemma-3n-E4B model support
- Add prompt-executor-litertlm-client module with LiteRTLMClient
- Add litertlm-jvm and litertlm-android dependencies to version catalog

The LiteRT-LM client supports:
- Synchronous and streaming response generation
- Temperature control via SamplerConfig
- System message and conversation context
- CPU and GPU backends for inference

Note: The LiteRT-LM library dependency is marked as compileOnly. Users
must add the LiteRT-LM runtime dependency to their project when using
this provider.
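
Since the dependency is compileOnly, a consuming JVM project would declare the runtime itself. A sketch in Gradle Kotlin DSL; the coordinates and version placeholders are illustrative, not confirmed artifacts:

```kotlin
// build.gradle.kts of the consuming project. Check LiteRT-LM's releases and
// Koog's version catalog for the real group/artifact/version strings.
dependencies {
    implementation("ai.koog:prompt-executor-litertlm-client:<koog-version>")
    // Runtime that the Koog module declares only as compileOnly:
    runtimeOnly("com.google.ai.edge:litertlm-jvm:<litertlm-version>")
}
```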
Add test coverage for the LiteRT-LM client:
- Unit tests for configuration, error handling, and provider validation
- Integration test template for local testing with actual models

The integration tests are disabled by default and require:
- LiteRT-LM library dependency
- A valid model file (set via MODEL_PATH env var)
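
As a sketch, the disabled-by-default gating might look like the following; the class and test names are illustrative, not the module's actual test code:

```kotlin
import org.junit.jupiter.api.Assumptions.assumeTrue
import org.junit.jupiter.api.Test

class LiteRTLMIntegrationTest {
    @Test
    fun `responds from a local model`() {
        // Runs only when MODEL_PATH points at a real .litertlm file;
        // otherwise the test is skipped rather than failed.
        val modelPath = System.getenv("MODEL_PATH")
        assumeTrue(modelPath != null, "MODEL_PATH not set; skipping integration test")

        // ... construct the client with modelPath and assert on a completion ...
    }
}
```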
Add prebuilt native libraries from google-ai-edge/LiteRT-LM for
Android ARM64 platform:
- libGemmaModelConstraintProvider.so
- libLiteRtGpuAccelerator.so
- libLiteRtOpenClAccelerator.so
- libLiteRtTopKOpenClSampler.so
- libLiteRtTopKWebGpuSampler.so
- libLiteRtWebGpuAccelerator.so

These libraries enable GPU-accelerated inference on Android devices.

Source: https://github.com/google-ai-edge/LiteRT-LM/tree/main/prebuilt/android_arm64

Model files (.litertlm) are too large for git. Users should download
models separately for testing.

…support

Addresses several implementation issues:

1. Conversation history support (issue 2):
   - Now processes all messages in the prompt, not just the last user message
   - Maintains context through multi-turn conversations
   - Handles System, User, Assistant, Tool.Call, Tool.Result, Reasoning

2. Multimodal content handling (issue 3):
   - Added support for Image content via Content.ImageBytes
   - Added support for Audio content via Content.AudioBytes
   - Validates model capabilities before processing
   - File attachments converted to text representation

3. Configurable sampler (issue 4):
   - Added defaultTopK, defaultTopP, defaultTemperature constructor params
   - Temperature still overridable via prompt.params (see the sketch after this list)

4. Tool support (issue 6):
   - Tools parameter accepted in createConversationConfig
   - Added TODO noting LiteRT-LM uses annotation-based tool registration
   - Tool calls/results from history preserved as context strings
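
A sketch of the configurable defaults from item 3 at the call site; the parameter names defaultTopK, defaultTopP, and defaultTemperature come from the commit message, while modelPath and the per-prompt override syntax are assumptions:

```kotlin
// Constructor defaults (names per the commit); modelPath is assumed.
val client = LiteRTLMClient(
    modelPath = "/models/gemma-3n-e4b.litertlm",
    defaultTopK = 40,
    defaultTopP = 0.95,
    defaultTemperature = 0.8,
)

// A temperature set in prompt.params still overrides the default:
val request = prompt("demo", LLMParams(temperature = 0.2)) {
    user("Answer in one word.")
}
```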
These binaries are for the Android NDK and are not usable by the JVM-only
LiteRT-LM client module. Users targeting Android should use the
litertlm-android dependency directly which includes the native libs.

- executeStreaming now passes the tools parameter instead of emptyList()
- Added guard for empty content parts in buildUserMessage

Add expect/actual pattern for cross-platform LiteRT-LM client creation:
- commonMain: LiteRTLMClientConfig and factory function declarations
- jvmMain: Full implementation using LiteRTLMClient
- androidMain: Stub with guidance for adding litertlm-android dependency
- appleMain/jsMain/wasmJsMain: Stubs returning UnsupportedOperationException

Also update build.gradle.kts to work with full KMP via convention plugin.
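
Roughly, the expect/actual split described above looks like this; the config field and error message are placeholders, and each actual lives in its own source set:

```kotlin
// commonMain: shared declaration usable from any target.
expect fun createLiteRTLMClient(config: LiteRTLMClientConfig): LLMClient

// jvmMain: full implementation backed by LiteRTLMClient.
actual fun createLiteRTLMClient(config: LiteRTLMClientConfig): LLMClient =
    LiteRTLMClient(modelPath = config.modelPath) // assumed field name

// appleMain / jsMain / wasmJsMain: stubs that fail loudly on unsupported targets.
actual fun createLiteRTLMClient(config: LiteRTLMClientConfig): LLMClient =
    throw UnsupportedOperationException(
        "LiteRT-LM is only supported on JVM; on Android, use the litertlm-android dependency."
    )
```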
Refactored configuration and client to match official LiteRT-LM patterns:
- Add LiteRTLMEngineConfig with visionBackend, audioBackend, maxNumTokens
- Add LiteRTLMSamplerConfig with Double types and seed parameter
- Add NPU backend option
- Add require() validation in config classes (matching official style)
- Add ImageFile/AudioFile support for file:// URLs
- Add cancelProcess() for conversation cancellation
- Use @Volatile and a synchronized lock pattern for thread safety
- Update KDoc with example usage matching official documentation style
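
For example, the require()-based validation style mentioned above typically reads like this; the field names echo the commit, but the exact bounds and defaults are illustrative:

```kotlin
// Illustrative only: the real class is LiteRTLMSamplerConfig, but its exact
// fields, defaults, and bounds may differ from this sketch.
data class LiteRTLMSamplerConfig(
    val topK: Int = 40,
    val topP: Double = 0.95,
    val temperature: Double = 0.8,
    val seed: Int? = null,
) {
    init {
        require(topK > 0) { "topK must be positive, was $topK" }
        require(topP in 0.0..1.0) { "topP must be in [0, 1], was $topP" }
        require(temperature >= 0.0) { "temperature must be non-negative, was $temperature" }
    }
}
```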
… duplication

- Add sendMultimodal() private helper for sync multimodal sends
- Add sendMultimodalStreaming() private helper for async multimodal sends
- Public API unchanged, internal code quality improved

- Add back agent modules to settings.gradle.kts
- Add dokka entries for agent modules in build.gradle.kts
- Restore test-utils dependency in litertlm-client commonTest


Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

maceip force-pushed the add-litert-lm-provider branch from 3f84fe4 to b4af5ca on January 9, 2026 at 10:18