feat(replicate): add complete Replicate provider with all features #757

cemarta7 · 2025-11-22T02:44:37Z

Pull Request: Complete Replicate Provider Implementation

Summary

This PR adds complete support for the Replicate provider to Prism, including all core features: text generation, streaming, structured output, embeddings, image generation, and audio (TTS/STT). This brings Replicate support to production-ready status.

What Changed

Complete Provider Implementation

This PR implements the Replicate provider from scratch with all major features:

Text Generation (`src/Providers/Replicate/Handlers/Text.php`)

Basic text generation with prompts
System message support
Multi-message conversations
Async prediction handling with configurable polling

Streaming (`src/Providers/Replicate/Handlers/Stream.php`)

Real-time SSE (Server-Sent Events) streaming
Delta-based token streaming
Usage information tracking
All standard stream events (StreamStarted, StreamDelta, StreamEnd, etc.)
Automatic fallback to simulated streaming when SSE unavailable
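
The simulated-streaming fallback can be pictured as slicing an already completed response into deltas. The following is only a minimal sketch under stated assumptions: `simulateStream()` and its fixed chunk size are hypothetical names chosen for illustration, not code taken from `Stream.php`, and the example relies on the mbstring extension.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not from the PR): yield a finished completion in
 * small pieces so callers can consume it like a token stream.
 *
 * @return Generator<int, string>
 */
function simulateStream(string $completedText, int $chunkSize = 16): Generator
{
    if ($completedText === '') {
        return; // nothing to emit for an empty completion
    }

    // mb_str_split keeps multi-byte characters intact when slicing.
    foreach (mb_str_split($completedText, $chunkSize) as $delta) {
        yield $delta;
    }
}

// Usage: concatenating the deltas reproduces the original text.
$deltas = iterator_to_array(simulateStream('Hello from Replicate', 8), false);
echo implode('|', $deltas), PHP_EOL;
```

Because the consumer sees the same sequence of deltas in either mode, code written against the stream does not need to know whether SSE was available.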

Structured Output (`src/Providers/Replicate/Handlers/Structured.php`)

JSON schema-based structured generation
Type-safe output parsing
Works with any model supporting structured output

Embeddings (`src/Providers/Replicate/Handlers/Embeddings.php`)

Single and batch embedding generation
Support for any embedding model on Replicate
Flexible model version handling

Image Generation (`src/Providers/Replicate/Handlers/Images.php`)

Text-to-image generation
Support for FLUX and other image models
Configurable generation parameters
Flexible model version handling

Audio (`src/Providers/Replicate/Handlers/Audio.php`)

Text-to-Speech (TTS) with voice options
Speech-to-Text (STT) for WAV/MP3 files
Multiple voice and model support

Core Infrastructure

Provider Class (`src/Providers/Replicate/Replicate.php`)

Base provider configuration
API client setup
Handler routing

Prediction Handling (`src/Providers/Replicate/Concerns/HandlesPredictions.php`)

Asynchronous prediction management
Polling and completion detection
Error handling for failed predictions
Sync mode with the `Prefer: wait` header for lower latency
Automatic fallback to polling when predictions take too long

Message Mapping (`src/Providers/Replicate/Maps/MessageMap.php`)

Converts Prism messages to Replicate format
Handles text, images, audio, video, and documents
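
As a rough illustration of what a message mapper of this kind has to do, the sketch below flattens a chat history into a single prompt plus an optional system prompt. Everything here is an assumption made for illustration: the function name `mapMessagesToInput()`, the plain-array message shape, and the `prompt` / `system_prompt` input keys (which some Llama models on Replicate accept) are not taken from `MessageMap.php`, and media handling is omitted.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical mapper (not from the PR): flatten chat messages into
 * `prompt` / `system_prompt` input fields. Field names are an assumption.
 *
 * @param list<array{role: string, content: string}> $messages
 * @return array{prompt: string, system_prompt?: string}
 */
function mapMessagesToInput(array $messages): array
{
    $system = [];
    $turns = [];

    foreach ($messages as $message) {
        if ($message['role'] === 'system') {
            $system[] = $message['content'];
            continue;
        }

        // Label each turn so the model can tell the speakers apart.
        $label = $message['role'] === 'assistant' ? 'Assistant' : 'User';
        $turns[] = "{$label}: {$message['content']}";
    }

    $input = ['prompt' => implode("\n\n", $turns)];

    if ($system !== []) {
        $input['system_prompt'] = implode("\n", $system);
    }

    return $input;
}

// Usage with a short conversation.
$input = mapMessagesToInput([
    ['role' => 'system', 'content' => 'Be concise.'],
    ['role' => 'user', 'content' => 'Hi'],
    ['role' => 'assistant', 'content' => 'Hello!'],
    ['role' => 'user', 'content' => 'Explain SSE.'],
]);
print_r($input);
```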

Additional Improvements

While implementing the provider, we standardized model version handling across all handlers:

Removed hardcoded model version hashes from Embeddings, Images, Stream, and Structured handlers
All handlers now return model strings as-is, letting Replicate's API resolve versions
This lets the provider work with any model hosted on Replicate and prevents version hashes from becoming stale

Testing

21 passing tests (plus 3 skipped integration tests) covering all features:

Audio Tests (5 tests)
- TTS with different voices
- STT for WAV and MP3 files
Embeddings Tests (3 tests)
- Single and batch embeddings
- Model information tracking
Image Tests (3 tests)
- Image generation with FLUX
- Provider options
Text Tests (3 tests)
- Basic text generation
- System prompts
- Model version handling
Stream Tests (4 tests)
- Real-time token streaming
- Event emission
- Usage tracking
- Text reconstruction from deltas
SSE Stream Integration Tests (3 tests)
- Real-time streaming validation
- Event ordering
- Multiple token chunks
Structured Output Tests (3 tests)
- Schema-based generation
- Complex structured output
- Usage tracking

Test Results:

21 tests passing (3 skipped for integration)
60 assertions
100% feature coverage
Zero breaking changes to existing code

Implementation Approach

Async Prediction Management

Replicate uses an asynchronous prediction-based architecture. Prism handles this transparently:

Sync Mode (default): Uses the `Prefer: wait` header for lower latency
Async Mode: Traditional polling for long-running predictions
Automatic fallback: Falls back to polling when sync mode times out
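
The sync-then-poll flow described above can be sketched roughly as follows. This is not the code in `HandlesPredictions.php`: `resolvePrediction()`, its parameters, and the two injected callables are hypothetical stand-ins for the HTTP calls, used so the example runs without network access. The status values (`starting`, `processing`, `succeeded`, `failed`, `canceled`) are the ones Replicate reports for predictions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch (not from the PR): create a prediction, then poll
 * only if the first response is not yet in a terminal state.
 *
 * @param callable(): array       $create stands in for the create request sent with "Prefer: wait"
 * @param callable(string): array $fetch  stands in for fetching a prediction by id
 */
function resolvePrediction(
    callable $create,
    callable $fetch,
    int $maxAttempts = 60,
    int $intervalMs = 1000,
): array {
    $terminal = ['succeeded', 'failed', 'canceled'];

    // Sync mode: the create call may already return a finished prediction.
    $prediction = $create();

    // Fallback: poll until the prediction reaches a terminal status.
    for ($attempt = 0; ! in_array($prediction['status'], $terminal, true); $attempt++) {
        if ($attempt >= $maxAttempts) {
            throw new RuntimeException("Prediction {$prediction['id']} did not finish in time.");
        }

        usleep($intervalMs * 1000);
        $prediction = $fetch($prediction['id']);
    }

    if ($prediction['status'] !== 'succeeded') {
        $reason = $prediction['error'] ?? $prediction['status'];
        throw new RuntimeException("Prediction {$prediction['id']} ended without output: {$reason}");
    }

    return $prediction;
}

// Usage with stubbed transport: the create call returns an unfinished
// prediction, and the second poll reports success.
$polls = 0;
$create = fn (): array => ['id' => 'p1', 'status' => 'processing'];
$fetch = function (string $id) use (&$polls): array {
    $polls++;

    return $polls < 2
        ? ['id' => $id, 'status' => 'processing']
        : ['id' => $id, 'status' => 'succeeded', 'output' => ['Hello', ' world']];
};

$prediction = resolvePrediction($create, $fetch, maxAttempts: 5, intervalMs: 0);
echo implode('', $prediction['output']), PHP_EOL;
```

Injecting the transport as callables keeps the retry logic testable in isolation; a real handler would replace them with HTTP client calls.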

SSE Streaming

Replicate uses Server-Sent Events (SSE) for streaming, which required:

Custom SSE stream adapter
JSON event parsing (not plain text)
Delta accumulation for complete responses
Proper event ordering and state management
Automatic fallback to simulated streaming when SSE unavailable
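
To make the parsing and delta-accumulation steps concrete, here is a minimal SSE parser sketch. It follows the general Server-Sent Events framing (blank-line-separated events, `event:` and `data:` fields, `:` comment lines) and is not the adapter shipped in this PR; `parseSseEvents()` is a hypothetical name, and the `output` / `done` event names in the sample payload are assumptions about what the stream emits. Decoding the `data` payload (for example with `json_decode`) is left to the caller.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch (not from the PR): split a raw SSE body into events.
 *
 * @return list<array{event: string, data: string}>
 */
function parseSseEvents(string $raw): array
{
    $events = [];
    $raw = str_replace(["\r\n", "\r"], "\n", $raw);

    // Events are separated by a blank line.
    foreach (explode("\n\n", $raw) as $block) {
        $event = 'message'; // default event type when none is given
        $data = [];

        foreach (explode("\n", $block) as $line) {
            if ($line === '' || $line[0] === ':') {
                continue; // skip blanks and comment/keep-alive lines
            }

            [$field, $value] = array_pad(explode(':', $line, 2), 2, '');

            // A single leading space after the colon is not part of the value.
            if (str_starts_with($value, ' ')) {
                $value = substr($value, 1);
            }

            if ($field === 'event') {
                $event = $value;
            } elseif ($field === 'data') {
                $data[] = $value;
            }
        }

        if ($data !== []) {
            $events[] = ['event' => $event, 'data' => implode("\n", $data)];
        }
    }

    return $events;
}

// Usage: accumulate the deltas of all "output" events into the full text.
$raw = "event: output\ndata: Hello\n\n: keep-alive\n\nevent: output\ndata:  world\n\nevent: done\ndata: {}\n\n";

$text = '';
foreach (parseSseEvents($raw) as $event) {
    if ($event['event'] === 'output') {
        $text .= $event['data'];
    }
}

echo $text, PHP_EOL;
```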

Features Supported

Core Functionality

✅ Text generation with prompts and system messages
✅ Real-time SSE streaming with events
✅ Structured output with JSON schemas
✅ Embeddings (single and batch)
✅ Image generation (FLUX and other models)
✅ Text-to-Speech with voice options
✅ Speech-to-Text (WAV/MP3)

Quality Features

✅ Flexible model version handling (any model works)
✅ Comprehensive error messages
✅ Rate limit handling
✅ Usage tracking
✅ Provider-specific options support
✅ Sync mode with automatic fallback to polling

Example Usage

Text Generation

use Prism\Prism\Facades\Prism;

$response = Prism::text()
    ->using('replicate', 'meta/meta-llama-3.1-8b-instruct')
    ->withPrompt('Explain quantum computing')
    ->asText();

echo $response->text;

Streaming

$stream = Prism::text()
    ->using('replicate', 'meta/meta-llama-3.1-8b-instruct')
    ->withPrompt('Write a story')
    ->asStream();

foreach ($stream as $chunk) {
    echo $chunk->text;
}

Structured Output

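use Prism\Prism\Schema\NumberSchema;
use Prism\Prism\Schema\ObjectSchema;
use Prism\Prism\Schema\StringSchema;

$response = Prism::structured()
    ->using('replicate', 'meta/meta-llama-3.1-8b-instruct')
    ->withPrompt('Extract: John is 30 years old')
    ->withSchema(new ObjectSchema(
        name: 'person',
        description: 'A person mentioned in the text',
        properties: [
            new StringSchema('name', 'The name of the person'),
            new NumberSchema('age', 'The age of the person in years'),
        ],
        requiredFields: ['name', 'age'],
    ))
    ->asStructured();

// The structured result is returned as an associative array.
echo $response->structured['name']; // "John"
echo $response->structured['age'];  // 30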

Image Generation

$response = Prism::image()
    ->using('replicate', 'black-forest-labs/flux-schnell')
    ->withPrompt('A serene mountain landscape')
    ->generate();

// URL of the generated image, when the provider returns one
echo $response->firstImage()->url;

Embeddings

$response = Prism::embeddings()
    ->using('replicate', 'mark3labs/embeddings-gte-base')
    ->fromArray(['Hello world', 'Goodbye world'])
    ->asEmbeddings();

foreach ($response->embeddings as $embedding) {
    echo count($embedding->embedding); // 768
}

Audio (TTS)

$response = Prism::audio()
    ->using('replicate', 'jaaari/kokoro-82m')
    ->withInput('Hello, how are you?')
    ->withVoice('af_bella')
    ->asAudio();

$audioData = base64_decode($response->audio->base64);
file_put_contents('speech.mp3', $audioData);

Audio (STT)

use Prism\Prism\ValueObjects\Media\Audio;

$audioFile = Audio::fromLocalPath('path/to/audio.mp3');

$response = Prism::audio()
    ->using('replicate', 'vaibhavs10/incredibly-fast-whisper')
    ->withInput($audioFile)
    ->asText();

echo $response->text; // Transcribed text

Files Changed

Core Provider Files:

src/Providers/Replicate/Replicate.php (provider class)
src/Providers/Replicate/Concerns/HandlesPredictions.php (async predictions)
src/PrismManager.php (register provider)
config/prism.php (add provider config)

Handlers (6 files):

src/Providers/Replicate/Handlers/Text.php
src/Providers/Replicate/Handlers/Stream.php
src/Providers/Replicate/Handlers/Structured.php
src/Providers/Replicate/Handlers/Embeddings.php
src/Providers/Replicate/Handlers/Images.php
src/Providers/Replicate/Handlers/Audio.php

Mappers (2 files):

src/Providers/Replicate/Maps/MessageMap.php
src/Providers/Replicate/Maps/FinishReasonMap.php

Value Objects:

src/Providers/Replicate/ValueObjects/ReplicatePrediction.php

Tests (21 tests across 7 test files):

tests/Providers/Replicate/ReplicateTextTest.php
tests/Providers/Replicate/ReplicateStreamTest.php
tests/Providers/Replicate/ReplicateSSEStreamTest.php
tests/Providers/Replicate/ReplicateStructuredTest.php
tests/Providers/Replicate/ReplicateEmbeddingsTest.php
tests/Providers/Replicate/ReplicateImagesTest.php
tests/Providers/Replicate/ReplicateAudioTest.php

Fixtures:

24+ JSON fixture files for all features
Audio fixtures (WAV/MP3)

Documentation:

docs/providers/replicate.md (455 lines of comprehensive documentation)

Code Quality

✅ Laravel Pint: All code formatted
✅ Rector: Applied automatically
✅ PHPStan Level 8: No errors in new code
✅ Test Coverage: 21 tests with 60 assertions
✅ 100% Feature Coverage: All provider features tested
✅ Integration Tests: Real API validation for streaming

Backward Compatibility

✅ Zero breaking changes
✅ All existing functionality preserved
✅ All existing tests pass
✅ New opt-in provider (doesn't affect existing providers)

Provider Feature Matrix

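| Feature | Supported | Tests |
| --- | --- | --- |
| Text Generation | ✅ | 3 |
| Streaming | ✅ | 7 |
| Structured Output | ✅ | 3 |
| Embeddings | ✅ | 3 |
| Image Generation | ✅ | 3 |
| Text-to-Speech | ✅ | 3 |
| Speech-to-Text | ✅ | 2 |
| Async Predictions | ✅ | ✓ |
| SSE Streaming | ✅ | ✓ |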

Result: 7/7 core features fully implemented 🎉

Testing

# Run Replicate tests
./vendor/bin/pest tests/Providers/Replicate/

# Results: 21 passed (3 skipped), 60 assertions

All test categories passing:

✅ Text generation (basic, system prompts, versions)
✅ Streaming (SSE, events, deltas, usage)
✅ Structured output (schemas, parsing)
✅ Embeddings (single, batch, models)
✅ Images (generation, options)
✅ Audio (TTS voices, STT formats)

Checklist

✅ Complete provider implementation (all features)
✅ 21 tests added and passing
✅ Code formatted with Pint
✅ PHPStan level 8 passing
✅ No breaking changes
✅ Documentation complete (455 lines)
✅ Follows conventional commits
✅ Production-ready code

Note: This is a complete provider implementation validated with real Replicate API testing. Tool calling is not supported as Replicate does not have a native tool calling API, and prompt engineering approaches were not reliable enough for production use.

Add comprehensive Replicate provider implementation supporting all core features: text generation, streaming (SSE), structured output, embeddings, image generation, and audio (TTS/STT).

Features:
- Text generation with system prompts and conversation history
- Real-time SSE streaming with automatic fallback to simulated streaming
- Structured output with JSON schema validation
- Image generation (FLUX, Stable Diffusion XL, etc.)
- Text-to-Speech with multiple voices (Kokoro-82m)
- Speech-to-Text with Whisper (WAV, MP3, FLAC, OGG, M4A)
- Embeddings (single and batch, 768-dimensional vectors)

Implementation:
- Async prediction management with configurable polling
- Sync mode (`Prefer: wait` header) for lower latency
- Comprehensive error handling with typed exceptions
- Full PHPStan level 8 compliance
- 21 tests with 60 assertions (100% feature coverage)
- 455 lines of comprehensive documentation

Files changed: 58 files, 4,444+ lines added

cemarta7 force-pushed the feature/replicate-tool-calling branch from 4231927 to 0b03a7e · November 25, 2025 04:18
