# Go client for woolball server

Transform idle browsers into a powerful distributed AI inference network. For detailed examples and model lists, visit our GitHub repository.
This SDK is automatically generated by the Swagger Codegen project
```shell
go get github.com/woolball-xyz/go-sdk
```

Woolball Server is an open-source network server that orchestrates AI inference jobs across a distributed network of browser-based compute nodes. Instead of relying on expensive cloud infrastructure, harness the collective power of idle browsers to run AI models efficiently and cost-effectively.
| Provider | Task | Models | Status |
|---|---|---|---|
| Transformers.js | Speech-to-Text | ONNX Models | ✅ Ready |
| Transformers.js | Text-to-Speech | ONNX Models | ✅ Ready |
| Kokoro.js | Text-to-Speech | ONNX Models | ✅ Ready |
| Transformers.js | Translation | ONNX Models | ✅ Ready |
| Transformers.js | Text Generation | ONNX Models | ✅ Ready |
| WebLLM | Text Generation | MLC Models | ✅ Ready |
| MediaPipe | Text Generation | LiteRT Models | ✅ Ready |
## Text Generation

Generate text with powerful language models.
### 🤖 Available Models
| Model | Quantization | Description |
|---|---|---|
| HuggingFaceTB/SmolLM2-135M-Instruct | fp16 | Compact model for basic text generation |
| HuggingFaceTB/SmolLM2-360M-Instruct | q4 | Balanced performance and size |
| Mozilla/Qwen2.5-0.5B-Instruct | q4 | Efficient model for general tasks |
| onnx-community/Qwen2.5-Coder-0.5B-Instruct | q8 | Specialized for code generation |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TextGenerationApi

	// Chat messages are passed as a JSON-encoded string.
	input := `[{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of Brazil?"}]`

	response, httpResp, err := api.TextGeneration(context.Background(), "transformers", "HuggingFaceTB/SmolLM2-135M-Instruct", input, 50, 1.0, 0.7, 1.0, "fp16", 20, 250, 0, 0, true, 1, 0, 0, 0, 0, 0.0, 0.0, 0, 0, 0)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Response:", response)
	fmt.Println("HTTP Response:", httpResp)
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | - | Model ID (e.g., "HuggingFaceTB/SmolLM2-135M-Instruct") |
| dtype | string | - | Quantization level (e.g., "fp16", "q4") |
| max_length | number | 20 | Maximum length the generated tokens can have (includes input prompt) |
| max_new_tokens | number | null | Maximum number of tokens to generate, ignoring the prompt length |
| min_length | number | 0 | Minimum length of the sequence to be generated (includes input prompt) |
| min_new_tokens | number | null | Minimum number of tokens to generate, ignoring the prompt length |
| do_sample | boolean | false | Whether to use sampling; greedy decoding is used otherwise |
| num_beams | number | 1 | Number of beams for beam search; 1 means no beam search |
| temperature | number | 1.0 | Value used to modulate the next-token probabilities |
| top_k | number | 50 | Number of highest-probability vocabulary tokens to keep for top-k filtering |
| top_p | number | 1.0 | If < 1, only the most probable tokens whose probabilities add up to top_p or higher are kept |
| repetition_penalty | number | 1.0 | Parameter for repetition penalty; 1.0 means no penalty |
| no_repeat_ngram_size | number | 0 | If > 0, all ngrams of that size can only occur once |
### 🤖 Available Models
| Model | Description |
|---|---|
| DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC | DeepSeek R1 distilled model with reasoning capabilities |
| DeepSeek-R1-Distill-Llama-8B-q4f16_1-MLC | DeepSeek R1 distilled Llama-based model |
| SmolLM2-1.7B-Instruct-q4f32_1-MLC | Compact instruction-following model |
| Llama-3.1-8B-Instruct-q4f32_1-MLC | Meta's Llama 3.1 8B instruction model |
| Qwen3-8B-q4f32_1-MLC | Alibaba's Qwen3 8B model |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TextGenerationApi

	// Chat messages are passed as a JSON-encoded string.
	input := `[{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of Brazil?"}]`

	response, _, err := api.TextGeneration(context.Background(), "webllm", "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC", input, 0, 0.95, 0.7, 0, "", 0, 0, 0, 0, false, 0, 0, 0, 0, 0, 0.0, 0.0, 0, 0, 0)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(response)
}
```

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID from MLC (e.g., "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC") |
| provider | string | Must be set to "webllm" when using WebLLM models |
| context_window_size | number | Size of the context window for the model |
| sliding_window_size | number | Size of the sliding window for attention |
| attention_sink_size | number | Size of the attention sink |
| repetition_penalty | number | Penalty for repeating tokens |
| frequency_penalty | number | Penalty for token frequency |
| presence_penalty | number | Penalty for token presence |
| top_p | number | If < 1, only the most probable tokens whose probabilities add up to top_p or higher are kept |
| temperature | number | Value used to modulate the next-token probabilities |
| bos_token_id | number | Beginning-of-sequence token ID (optional) |
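To build intuition for how frequency_penalty and presence_penalty differ, here is an illustrative sketch of the widely used penalty formula (an assumption for illustration, not WebLLM's internal code): a token's logit is reduced by `count * frequency_penalty`, plus a flat `presence_penalty` once the token has appeared at all.

```go
package main

import "fmt"

// applyPenalties adjusts a token's raw logit given how many times the token
// has already appeared in the generated text. Illustrative only.
func applyPenalties(logit float64, count int, frequencyPenalty, presencePenalty float64) float64 {
	logit -= float64(count) * frequencyPenalty // grows with every repeat
	if count > 0 {
		logit -= presencePenalty // flat cost once the token has appeared
	}
	return logit
}

func main() {
	// A token seen 3 times is penalized more under frequency_penalty...
	fmt.Println(applyPenalties(2.0, 3, 0.5, 0.0)) // 0.5
	// ...while presence_penalty costs the same after 1 or 3 repeats.
	fmt.Println(applyPenalties(2.0, 1, 0.0, 0.5)) // 1.5
	fmt.Println(applyPenalties(2.0, 3, 0.0, 0.5)) // 1.5
}
```

In short: frequency_penalty discourages heavy repetition, presence_penalty encourages introducing new tokens at all.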
### 🤖 Available Models
| Model | Device Type | Description |
|---|---|---|
| https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma2-2b-it-cpu-int8.task | CPU | Gemma2 2B model optimized for CPU inference |
| https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma2-2b-it-gpu-int8.bin | GPU | Gemma2 2B model optimized for GPU inference |
| https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-1b-it-int4.task | CPU/GPU | Gemma3 1B model with INT4 quantization |
| https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-4b-it-int4-web.task | Web | Gemma3 4B model optimized for web deployment |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TextGenerationApi

	// Chat messages are passed as a JSON-encoded string.
	input := `[{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Explain quantum computing in simple terms."}]`

	response, _, err := api.TextGeneration(context.Background(), "mediapipe", "https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-1b-it-int4.task", input, 40, 0, 0.7, 0, "", 0, 0, 0, 0, false, 0, 0, 0, 0, 0, 0.0, 0.0, 0, 500, 12345)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(response)
}
```

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID for MediaPipe LiteRT models on DigitalOcean Spaces |
| provider | string | Must be set to "mediapipe" when using MediaPipe models |
| maxTokens | number | Maximum number of tokens to generate |
| randomSeed | number | Random seed for reproducible results |
| topK | number | Number of highest-probability vocabulary tokens to keep for top-k filtering |
| temperature | number | Value used to modulate the next-token probabilities |
## Speech-to-Text

Convert audio to text with Whisper models.
| Model | Quantization | Description |
|---|---|---|
| onnx-community/whisper-large-v3-turbo_timestamped | q4 | High accuracy with timestamps |
| onnx-community/whisper-small | q4 | ⚡ Fast processing |
```go
package main

import (
	"context"
	"fmt"
	"os"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.SpeechRecognitionApi

	// The input can be raw audio bytes or a string containing a URL.
	audioFile, err := os.ReadFile("/path/to/your/file.mp3")
	if err != nil {
		fmt.Println("Error reading file:", err)
		return
	}
	input := swagger.Object(audioFile)

	response, _, err := api.SpeechToText(context.Background(), "onnx-community/whisper-large-v3-turbo_timestamped", "q4", input, "true", false, 0, 0, false, "en", "", 0)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Transcription:", response)
}
```

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID from Hugging Face (e.g., "onnx-community/whisper-large-v3-turbo_timestamped") |
| dtype | string | Quantization level (e.g., "q4") |
| return_timestamps | boolean \| 'word' | Return timestamps ("word" for word-level). Default is false. |
| stream | boolean | Stream results in real time. Default is false. |
| chunk_length_s | number | Length of audio chunks to process, in seconds. Default is 0 (no chunking). |
| stride_length_s | number | Length of overlap between consecutive audio chunks, in seconds. If not provided, defaults to chunk_length_s / 6. |
| force_full_sequences | boolean | Whether to force outputting full sequences. Default is false. |
| language | string | Source language (auto-detected if null). Set this to potentially improve performance when the source language is known. |
| task | null \| 'transcribe' \| 'translate' | The task to perform. Default is null, meaning it is auto-detected. |
| num_frames | number | The number of frames in the input audio. |
## Text-to-Speech

Generate natural speech from text.
### 🤖 Available Models
| Language | Model | Flag |
|---|---|---|
| English | Xenova/mms-tts-eng | 🇺🇸 |
| Spanish | Xenova/mms-tts-spa | 🇪🇸 |
| French | Xenova/mms-tts-fra | 🇫🇷 |
| German | Xenova/mms-tts-deu | 🇩🇪 |
| Portuguese | Xenova/mms-tts-por | 🇵🇹 |
| Russian | Xenova/mms-tts-rus | 🇷🇺 |
| Arabic | Xenova/mms-tts-ara | 🇸🇦 |
| Korean | Xenova/mms-tts-kor | 🇰🇷 |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TextToSpeechApi

	// The empty string is the voice argument, which MMS models do not use.
	response, _, err := api.TextToSpeech(context.Background(), "Xenova/mms-tts-eng", "q8", "Hello, this is a test for text to speech.", "", false)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Audio generated:", response)
}
```

| Parameter | Type | Description | Required For |
|---|---|---|---|
| model | string | Model ID | All providers |
| dtype | string | Quantization level (e.g., "q8") | All providers |
| stream | boolean | Whether to stream the audio response. Default is false. | All providers |
### 🤖 Available Models
| Model | Quantization | Description |
|---|---|---|
| onnx-community/Kokoro-82M-ONNX | q8 | High-quality English TTS with multiple voices |
| onnx-community/Kokoro-82M-v1.0-ONNX | q8 | Alternative Kokoro model version |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TextToSpeechApi

	// Kokoro models require a voice ID (here "af_nova"; see the voice list below).
	response, _, err := api.TextToSpeech(context.Background(), "onnx-community/Kokoro-82M-ONNX", "q8", "Hello, this is a test using Kokoro voices.", "af_nova", false)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Kokoro audio generated:", response)
}
```

| Parameter | Type | Description | Required For |
|---|---|---|---|
| model | string | Model ID | Required |
| dtype | string | Quantization level (e.g., "q8") | Required |
| voice | string | Voice ID (see below) | Required |
| stream | boolean | Whether to stream the audio response. Default is false. | Optional |
### Available Voice Options

#### 🇺🇸 American Voices

- 👩 Female: `af_heart`, `af_alloy`, `af_aoede`, `af_bella`, `af_jessica`, `af_nova`, `af_sarah`
- 👨 Male: `am_adam`, `am_echo`, `am_eric`, `am_liam`, `am_michael`, `am_onyx`

#### 🇬🇧 British Voices

- 👩 Female: `bf_emma`, `bf_isabella`, `bf_alice`, `bf_lily`
- 👨 Male: `bm_george`, `bm_lewis`, `bm_daniel`, `bm_fable`
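The voice IDs above encode accent and gender in their two-letter prefix (`af` = American female, `am` = American male, `bf` = British female, `bm` = British male). A small helper (illustrative; not part of the SDK) can decode this convention:

```go
package main

import (
	"fmt"
	"strings"
)

// voiceInfo decodes the accent and gender encoded in a Kokoro voice ID
// prefix (af_, am_, bf_, bm_). Illustrative helper, not part of the SDK.
func voiceInfo(voice string) (accent, gender string) {
	prefix, _, ok := strings.Cut(voice, "_")
	if !ok || len(prefix) != 2 {
		return "unknown", "unknown"
	}
	switch prefix[0] {
	case 'a':
		accent = "American"
	case 'b':
		accent = "British"
	default:
		accent = "unknown"
	}
	switch prefix[1] {
	case 'f':
		gender = "female"
	case 'm':
		gender = "male"
	default:
		gender = "unknown"
	}
	return accent, gender
}

func main() {
	a, g := voiceInfo("bf_emma")
	fmt.Println(a, g) // British female
}
```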
## Translation

Translate between 200+ languages.
| Model | Quantization | Description |
|---|---|---|
| Xenova/nllb-200-distilled-600M | q8 | Multilingual translation model supporting 200+ languages |
```go
package main

import (
	"context"
	"fmt"

	swagger "github.com/woolball-xyz/go-sdk"
)

func main() {
	cfg := swagger.NewConfiguration()
	cfg.BasePath = "http://localhost:9002"
	client := swagger.NewAPIClient(cfg)
	api := client.TranslationApi

	// Source and target languages use FLORES200 codes (e.g., English -> Portuguese).
	response, _, err := api.Translation(context.Background(), "Xenova/nllb-200-distilled-600M", "q8", "Hello, how are you today?", "eng_Latn", "por_Latn")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Translation:", response)
}
```

Uses FLORES200 format - supports 200+ languages!

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID (e.g., "Xenova/nllb-200-distilled-600M") |
| dtype | string | Quantization level (e.g., "q8") |
| srcLang | string | Source language code in FLORES200 format (e.g., "eng_Latn") |
| tgtLang | string | Target language code in FLORES200 format (e.g., "por_Latn") |
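FLORES200 codes combine a three-letter language code with a script tag (e.g., `eng_Latn`, `rus_Cyrl`). If your application starts from two-letter codes, a small lookup table keeps the mapping explicit. The entries below are a sketch covering the languages listed in this README; verify them against the full FLORES200 code list before relying on them:

```go
package main

import "fmt"

// flores200 maps a few common two-letter language codes to FLORES200 codes.
// Illustrative subset only; FLORES200 covers 200+ languages.
var flores200 = map[string]string{
	"en": "eng_Latn",
	"es": "spa_Latn",
	"fr": "fra_Latn",
	"de": "deu_Latn",
	"pt": "por_Latn",
	"ru": "rus_Cyrl",
	"ar": "arb_Arab",
	"ko": "kor_Hang",
}

func main() {
	// English -> Portuguese, as in the example above.
	fmt.Println(flores200["en"], "->", flores200["pt"])
}
```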
We welcome contributions! Here's how you can help:
- Report bugs via GitHub Issues
- Suggest features in our Discord
- Submit PRs for improvements
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by the Woolball team