Evolution Audio Converter

This project is a Go microservice that processes audio files, converts them to Opus or MP3 (selected with the format parameter as ogg or mp3), and returns the audio's duration together with the converted file (as base64 or an S3 URL). The service accepts audio sent as form-data, base64, or a URL.

Requirements

Before starting, you'll need to have the following installed:

Go (version 1.21 or higher)
Docker (to run the project in a container)
FFmpeg (for audio processing)

Installation

Clone the Repository

Clone this repository to your local machine:

```
git clone https://github.com/EvolutionAPI/evolution-audio-converter.git
cd evolution-audio-converter
```

Install Dependencies

Install the project dependencies:

```
go mod tidy
```

Install FFmpeg

The service depends on FFmpeg to convert the audio. Make sure FFmpeg is installed on your system.

On Ubuntu:
```
sudo apt update
sudo apt install ffmpeg
```
On macOS (via Homebrew):
```
brew install ffmpeg
```
On Windows, download a build from the official FFmpeg website and add it to your system PATH.

Configuration

Create a .env file in the project's root directory. Here are the available configuration options:

Basic Configuration

```
PORT=4040
API_KEY=your_secret_api_key_here
```

Transcription Configuration

```
ENABLE_TRANSCRIPTION=true
TRANSCRIPTION_PROVIDER=openai  # or groq
OPENAI_API_KEY=your_openai_key_here
GROQ_API_KEY=your_groq_key_here
TRANSCRIPTION_LANGUAGE=en  # Default transcription language (optional)
```

Storage Configuration

```
ENABLE_S3_STORAGE=true
S3_ENDPOINT=play.min.io
S3_ACCESS_KEY=your_access_key_here
S3_SECRET_KEY=your_secret_key_here
S3_BUCKET_NAME=audio-files
S3_REGION=us-east-1
S3_USE_SSL=true
S3_URL_EXPIRATION=24h
```

Storage Options

The service supports two storage modes for the converted audio:

Base64 (default): Returns the audio file encoded in base64 format
S3 Compatible Storage: Uploads to S3-compatible storage (AWS S3, MinIO, etc.) and returns a presigned URL

When S3 storage is enabled, the response will include a url instead of the audio field:

```
{
  "duration": 120,
  "format": "ogg",
  "url": "https://your-s3-endpoint/bucket/file.ogg?signature...",
  "transcription": "Transcribed text here..." // if transcription was requested
}
```

If S3 upload fails, the service automatically falls back to base64 encoding.
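
The fallback described above can be sketched as follows. This is not the service's actual code: the `Uploader` interface, the `deliver` function, and the two fake stores are hypothetical names introduced only to show the "try S3, otherwise return base64" decision.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// Uploader is a hypothetical abstraction over S3-compatible storage.
// Upload returns a presigned URL for the stored object.
type Uploader interface {
	Upload(name string, data []byte) (string, error)
}

// deliver returns either a URL (upload succeeded) or the base64-encoded
// audio (no uploader configured, or the upload failed). Exactly one of
// the two return values is non-empty.
func deliver(name string, data []byte, up Uploader) (audio, url string) {
	if up != nil {
		if u, err := up.Upload(name, data); err == nil {
			return "", u
		}
		// Upload failed: fall through to the base64 response.
	}
	return base64.StdEncoding.EncodeToString(data), ""
}

// brokenStore always fails, simulating an unreachable bucket.
type brokenStore struct{}

func (brokenStore) Upload(string, []byte) (string, error) {
	return "", errors.New("storage unavailable")
}

// fakeStore always succeeds and returns a placeholder URL.
type fakeStore struct{}

func (fakeStore) Upload(name string, _ []byte) (string, error) {
	return "https://your-s3-endpoint/bucket/" + name, nil
}

func main() {
	audio, url := deliver("file.ogg", []byte("RIFF"), brokenStore{})
	fmt.Printf("failed upload:     audio=%q url=%q\n", audio, url)

	audio, url = deliver("file.ogg", []byte("RIFF"), fakeStore{})
	fmt.Printf("successful upload: audio=%q url=%q\n", audio, url)
}
```

Because of this fallback, a client cannot assume that enabling S3 guarantees a url in every response; it should be prepared for either field.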

Running the Project

Locally

To run the service locally:

```
go run main.go -dev
```

The server will be available at http://localhost:4040.

Using Docker

Build the Docker image:
```
docker build -t audio-service .
```

Run the container:

```
docker run -p 4040:4040 --env-file=.env audio-service
```

Using Dokploy with Nixpacks

This project is configured to work with Dokploy using Nixpacks for automatic deployment.

Requirements

The project includes a nixpacks.toml configuration file that automatically installs FFmpeg during the build process.

Environment Variables

Configure the following environment variables in your Dokploy deployment:

```
# Required
API_KEY=your_secret_api_key_here

# Optional
PORT=4040
CORS_ALLOW_ORIGINS=*

# Transcription (optional)
ENABLE_TRANSCRIPTION=true
TRANSCRIPTION_PROVIDER=openai
OPENAI_API_KEY=your_openai_key_here
OPENAI_API_URL=https://api.openai.com  # Use custom proxy like LiteLLM
GROQ_API_KEY=your_groq_key_here
TRANSCRIPTION_LANGUAGE=en

# S3 Storage (optional)
ENABLE_S3_STORAGE=true
S3_ENDPOINT=your_s3_endpoint
S3_ACCESS_KEY=your_access_key
S3_SECRET_KEY=your_secret_key
S3_BUCKET_NAME=audio-files
S3_REGION=us-east-1
S3_USE_SSL=true
S3_URL_EXPIRATION=24h
```

Deployment Steps

Connect your Git repository to Dokploy
Select "Nixpacks" as the build provider (or "Docker" for maximum compatibility)
Configure the environment variables above
Deploy the application

Note: If you encounter FFmpeg-related errors with Nixpacks, switch to Docker build provider for better compatibility. See TROUBLESHOOTING.md for more details.


API Usage

Authentication

All requests must include the apikey header with your API key.
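
To make the header requirement concrete, here is a minimal sketch of how an apikey check could be written as `net/http` middleware. It is an assumption-laden illustration, not the service's implementation: the `requireAPIKey` and `statusFor` names, the 401 status code, and the error body are all hypothetical.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireAPIKey rejects requests whose "apikey" header does not match
// expected. The 401 response is an assumption made for this sketch.
func requireAPIKey(expected string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := r.Header.Get("apikey")
		// Constant-time comparison avoids leaking key contents via timing.
		if subtle.ConstantTimeCompare([]byte(got), []byte(expected)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one in-memory request through the middleware and
// returns the resulting HTTP status code.
func statusFor(key string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := requireAPIKey("your_secret_api_key_here", ok)

	req := httptest.NewRequest(http.MethodPost, "/process-audio", nil)
	if key != "" {
		req.Header.Set("apikey", key)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("valid key:  ", statusFor("your_secret_api_key_here"))
	fmt.Println("wrong key:  ", statusFor("wrong"))
	fmt.Println("missing key:", statusFor(""))
}
```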

Endpoints

Process Audio

POST /process-audio

Accepts audio supplied in any of these ways:

Form-data (file field)
Base64 (base64 field)
URL (url field)

Optional parameters:

format: Output format (mp3 or ogg, default: ogg)
transcribe: Enable transcription (true or false)
language: Transcription language code (e.g., "en", "es", "pt")
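
The optional parameters above could be normalised along these lines. The `Options` type and `parseOptions` function are hypothetical, and rejecting unknown formats with an error is an assumption; the real service may handle invalid values differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Options holds the optional /process-audio parameters.
type Options struct {
	Format     string // "ogg" (documented default) or "mp3"
	Transcribe bool
	Language   string // e.g. "en", "es", "pt"; empty means unspecified
}

// parseOptions applies the documented default (ogg) and validates the
// raw form values. Returning an error for bad input is an assumption.
func parseOptions(format, transcribe, language string) (Options, error) {
	opts := Options{Format: "ogg", Language: strings.TrimSpace(language)}

	switch f := strings.ToLower(strings.TrimSpace(format)); f {
	case "":
		// Keep the default.
	case "ogg", "mp3":
		opts.Format = f
	default:
		return Options{}, fmt.Errorf("unsupported format %q (expected mp3 or ogg)", format)
	}

	if transcribe != "" {
		b, err := strconv.ParseBool(transcribe)
		if err != nil {
			return Options{}, fmt.Errorf("invalid transcribe value %q: %w", transcribe, err)
		}
		opts.Transcribe = b
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions("", "true", "pt")
	fmt.Printf("%+v err=%v\n", opts, err)

	_, err = parseOptions("wav", "", "")
	fmt.Println("err:", err)
}
```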

Transcribe Only

POST /transcribe

Transcribes audio without format conversion.

Optional parameters:

language: Transcription language code

Example Requests

Form-data Upload

```
curl -X POST -F "file=@audio.mp3" \
  -F "format=ogg" \
  -F "transcribe=true" \
  -F "language=en" \
  http://localhost:4040/process-audio \
  -H "apikey: your_secret_api_key_here"
```

Base64 Upload

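```
# Strip line wraps from the base64 output and URL-encode it so that
# "+", "/" and "=" survive form encoding.
curl -X POST \
  --data-urlencode "base64=$(base64 < audio.mp3 | tr -d '\n')" \
  -d "format=ogg" \
  http://localhost:4040/process-audio \
  -H "apikey: your_secret_api_key_here"
```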

URL Upload

```
curl -X POST \
  -d "url=https://example.com/audio.mp3" \
  -d "format=ogg" \
  http://localhost:4040/process-audio \
  -H "apikey: your_secret_api_key_here"
```

Response Format

With S3 storage disabled (default):

```
{
  "duration": 120,
  "audio": "UklGR... (base64 of the file)",
  "format": "ogg",
  "transcription": "Transcribed text here..." // if requested
}
```

With S3 storage enabled:

```
{
  "duration": 120,
  "url": "https://your-s3-endpoint/bucket/file.ogg?signature...",
  "format": "ogg",
  "transcription": "Transcribed text here..." // if requested
}
```
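
A client can decode both response shapes into a single type and check which field is populated. The `ProcessAudioResponse` struct and its `Source` method below are hypothetical; the JSON field names come from the examples above, and duration is decoded as a float on the assumption that it may not always be a whole number.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProcessAudioResponse covers both documented response shapes.
// Either Audio (base64) or URL (S3) is expected to be set.
type ProcessAudioResponse struct {
	Duration      float64 `json:"duration"`
	Format        string  `json:"format"`
	Audio         string  `json:"audio,omitempty"`
	URL           string  `json:"url,omitempty"`
	Transcription string  `json:"transcription,omitempty"`
}

// Source reports where the converted audio can be found.
func (r ProcessAudioResponse) Source() string {
	switch {
	case r.URL != "":
		return "url"
	case r.Audio != "":
		return "base64"
	default:
		return "none"
	}
}

// decodeResponse parses a raw JSON response body.
func decodeResponse(body []byte) (ProcessAudioResponse, error) {
	var resp ProcessAudioResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

func main() {
	bodies := []string{
		`{"duration": 120, "audio": "UklGRg==", "format": "ogg"}`,
		`{"duration": 120, "url": "https://your-s3-endpoint/bucket/file.ogg", "format": "ogg"}`,
	}
	for _, b := range bodies {
		resp, err := decodeResponse([]byte(b))
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("format=%s duration=%v source=%s\n", resp.Format, resp.Duration, resp.Source())
	}
}
```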

License

This project is licensed under the MIT license.
