8 changes: 7 additions & 1 deletion CMakeLists.txt
@@ -34,7 +34,13 @@ if(CMAKE_BUILD_TYPE MATCHES "Release")
set_target_properties(ixwebsocket PROPERTIES LINK_FLAGS_RELEASE "-s -w") #-static-libgcc -static-libstdc++
endif()

add_library(mod_openai_audio_stream SHARED mod_openai_audio_stream.c mod_openai_audio_stream.h openai_audio_streamer_glue.h openai_audio_streamer_glue.cpp buffer/ringbuffer.c base64.cpp)
add_library(mod_openai_audio_stream SHARED
mod_openai_audio_stream.c
mod_openai_audio_stream.h
openai_audio_streamer_glue.h
openai_audio_streamer_glue.cpp
base64.cpp
)

set_property(TARGET mod_openai_audio_stream PROPERTY POSITION_INDEPENDENT_CODE ON)

18 changes: 11 additions & 7 deletions README.md
@@ -1,14 +1,18 @@
# mod_openai_audio_stream
# mod_openai_realtime

![Build](https://github.com/VoiSmart/mod_openai_audio_stream/actions/workflows/build.yml/badge.svg?branch=main)
![Code-Checks](https://github.com/VoiSmart/mod_openai_audio_stream/actions/workflows/code-checks.yml/badge.svg?branch=main)
![Build](https://github.com/VoiSmart/mod_openai_realtime/actions/workflows/build.yml/badge.svg?branch=main)
![Code-Checks](https://github.com/VoiSmart/mod_openai_realtime/actions/workflows/code-checks.yml/badge.svg?branch=main)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue?style=flat)](LICENSE)

**mod_openai_audio_stream** is a FreeSWITCH module that streams L16 audio from a channel to an OpenAI Realtime WebSocket endpoint. The stream follows OpenAI's Realtime API specification and enables real-time audio playback directly in the channel.
**mod_openai_realtime** is a FreeSWITCH module that streams L16 audio from a channel to an OpenAI Realtime WebSocket endpoint. The stream follows OpenAI's Realtime API specification and enables real-time audio playback directly in the channel.

> [!WARNING]
> This is a standalone fork of `mod_audio_stream`, not affiliated with the original project.
> Legacy naming (`mod_openai_audio_stream`) is retained for backward compatibility but will be updated in a future major release.

It is a fork of [mod_audio_stream](https://github.com/amigniter/mod_audio_stream), specifically adapted for streaming audio to OpenAI's Realtime API and playing the responses back to the user via FreeSWITCH and WebSocket.

The goal of **mod_openai_audio_stream** is to provide a simple, lightweight, yet effective module for streaming audio and receiving responses directly from OpenAI’s Realtime WebSocket into the call through FreeSWITCH. It uses [ixwebsocket](https://machinezone.github.io/IXWebSocket/), a C++ WebSocket library compiled as a static library.
The goal of **mod_openai_realtime** is to provide a simple, lightweight, yet effective module for streaming audio and receiving responses directly from OpenAI’s Realtime WebSocket into the call through FreeSWITCH. It uses [ixwebsocket](https://machinezone.github.io/IXWebSocket/), a C++ WebSocket library compiled as a static library.


## Important Notes
@@ -76,7 +80,7 @@ The following is **a simple dialplan example** that demonstrates how to use the

* Make sure to replace `sk-xxxxxxxxxxxxxxxxxx` with your actual OpenAI API key.
* The dialplan answers the call and starts streaming audio to OpenAI's Realtime API using `uuid_openai_audio_stream`, so you can try it out and see the OpenAI events in the FreeSWITCH console within the `mod_openai_audio_stream::json` events and other module events.
* The playback action with `silence_stream://-1//` is needed for audio playback to work properly. For more details, check issue [#16](https://github.com/VoiSmart/mod_openai_audio_stream/issues/16).
* The playback action with `silence_stream://-1//` is needed for audio playback to work properly. For more details, check issue [#16](https://github.com/VoiSmart/mod_openai_realtime/issues/16).

#### Next steps

@@ -85,7 +89,7 @@ The **getting started** example is a basic demonstration of how to use the modul
This way you can build more complex applications **allowing for function calls, updating instructions**, and other interactions with OpenAI's Realtime API. Check out the [OpenAI Realtime documentation](https://platform.openai.com/docs/guides/realtime) and [API reference](https://platform.openai.com/docs/api-reference/realtime) for more details on how to structure your requests and handle responses.
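As a rough illustration of such a request, the sketch below builds a command in the `uuid_openai_audio_stream <uuid> send_json <base64json>` form shown in the module's usage string. The helper names `base64_encode` and `build_send_json_command` are hypothetical and are not part of the module; the repository ships its own `base64.cpp`, whose interface is not shown here. The encoder is plain RFC 4648 base64.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: standard base64 (RFC 4648) with '=' padding.
 * Writes a NUL-terminated string into out.
 * Returns 0 on success, -1 if out is too small. */
static int base64_encode(const unsigned char *in, size_t len,
                         char *out, size_t out_size)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t needed = 4 * ((len + 2) / 3);
    size_t i, o = 0;

    if (out_size < needed + 1) {
        return -1;
    }
    /* Full 3-byte groups become 4 output characters. */
    for (i = 0; i + 2 < len; i += 3) {
        unsigned v = ((unsigned)in[i] << 16) | ((unsigned)in[i + 1] << 8) | in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < len) {
        unsigned v = (unsigned)in[i] << 16;
        if (i + 1 < len) {
            v |= (unsigned)in[i + 1] << 8;
        }
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return 0;
}

/* Hypothetical helper: format the API command for a JSON payload.
 * The fixed 1024-byte scratch buffer limits the JSON to 765 bytes.
 * Returns 0 on success, -1 if either buffer is too small. */
static int build_send_json_command(const char *uuid, const char *json,
                                   char *out, size_t out_size)
{
    char encoded[1024];
    int n;

    if (base64_encode((const unsigned char *)json, strlen(json),
                      encoded, sizeof(encoded)) != 0) {
        return -1;
    }
    n = snprintf(out, out_size, "uuid_openai_audio_stream %s send_json %s",
                 uuid, encoded);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A caller might pass a payload such as `{"type":"session.update","session":{"instructions":"Answer briefly."}}` to update instructions. That payload shape is an assumption here and should be checked against the API reference linked above.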

### Channel variables
The following channel variables can be used to fine tune websocket connection and also configure mod_openai_audio_stream logging:
The following channel variables can be used to fine-tune websocket connection and also configure mod_openai_realtime logging:

| Variable | Description | Default |
| -------------------------------------- | ------------------------------------------------------- | ------- |
123 changes: 0 additions & 123 deletions buffer/ringbuffer.c

This file was deleted.

74 changes: 0 additions & 74 deletions buffer/ringbuffer.h

This file was deleted.

12 changes: 6 additions & 6 deletions mod_openai_audio_stream.c
@@ -170,12 +170,12 @@ static switch_status_t send_json(switch_core_session_t *session, char* json) {
#define STREAM_API_SYNTAX \
"USAGE:\n" \
"--------------------------------------------------------------------------------\n" \
"uuid_openai_audio_stream <uuid> [start | stop | send_json | pause | resume |\n" \
" mute | unmute]\n" \
" [wss-url | path | user | openai | all | base64json]\n" \
" [mono | mixed | stereo]\n" \
" [8000 | 16000 | 24000]\n" \
" [mute_user]\n" \
"uuid_openai_audio_stream <uuid> start <wss-url> <mono | mixed | stereo> \n" \
" [8k | 16k | 24k | <other rate>] [mute_user]\n" \
" where <rate> = 8k|16k|24k or any multiple of 8000 (default: 8k)\n" \
Comment on lines +173 to +175

Copilot AI Dec 16, 2025

The formatting is inconsistent: line 173 has a trailing space between "stereo>" and the `\n` escape, and the continuation lines use different indentation. Consider aligning all continuation lines consistently for readability.

Suggested change
"uuid_openai_audio_stream <uuid> start <wss-url> <mono | mixed | stereo> \n" \
" [8k | 16k | 24k | <other rate>] [mute_user]\n" \
" where <rate> = 8k|16k|24k or any multiple of 8000 (default: 8k)\n" \
"uuid_openai_audio_stream <uuid> start <wss-url> <mono | mixed | stereo>\n" \
" [8k | 16k | 24k | <other rate>] [mute_user]\n" \
" where <rate> = 8k|16k|24k or any multiple of 8000 (default: 8k)\n" \

"uuid_openai_audio_stream <uuid> [stop | pause | resume]\n" \
"uuid_openai_audio_stream <uuid> [mute | unmute] [user | openai | all]\n" \
"uuid_openai_audio_stream <uuid> send_json <base64json>\n" \
"--------------------------------------------------------------------------------\n"

typedef enum {
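
The new usage text describes the rate argument as `8k|16k|24k or any multiple of 8000 (default: 8k)`. A minimal sketch of that rule follows, assuming the argument arrives as a plain string. `parse_sampling_rate` is a hypothetical name and is not the module's actual parser, which this diff does not show.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from the module's source.
 * Maps a rate argument to a sampling rate in Hz:
 *   - NULL or empty string -> 8000 (the documented default)
 *   - "8k", "16k", "24k"   -> 8000, 16000, 24000
 *   - a decimal number     -> accepted only if it is a positive multiple of 8000
 * Returns 0 for anything else. */
static int parse_sampling_rate(const char *arg)
{
    char *end = NULL;
    long value;

    if (arg == NULL || *arg == '\0') {
        return 8000;
    }
    if (strcmp(arg, "8k") == 0) {
        return 8000;
    }
    if (strcmp(arg, "16k") == 0) {
        return 16000;
    }
    if (strcmp(arg, "24k") == 0) {
        return 24000;
    }

    /* Reject input with no digits or with trailing characters. */
    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0') {
        return 0;
    }
    if (value <= 0 || value > INT_MAX || value % 8000 != 0) {
        return 0;
    }
    return (int)value;
}
```

Under these assumptions `parse_sampling_rate("48000")` is accepted while `"44100"` is rejected. Returning 0 for invalid input lets a caller print the usage text rather than silently falling back to the default.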
3 changes: 0 additions & 3 deletions mod_openai_audio_stream.h
@@ -4,7 +4,6 @@
#include <switch.h>
#include <limits.h>
#include <speex/speex_resampler.h>
#include "buffer/ringbuffer.h"

#define MY_BUG_NAME "audio_stream"
#define MAX_SESSION_ID (256)
@@ -33,9 +32,7 @@ struct private_data {
int user_audio_muted:1;
int openai_audio_muted:1;
int close_requested:1;
RingBuffer *buffer;
switch_buffer_t *sbuffer;
uint8_t *data;
int rtp_packets;
switch_buffer_t *playback_buffer;
};