A Swift package for voice-to-prescription functionality with audio recording and real-time transcription capabilities for medical consultation applications.
EkaScribe empowers healthcare applications with advanced voice recording and transcription capabilities. It integrates seamlessly into medical consultation workflows, enabling doctors to record patient interactions and automatically generate prescriptions through AI-powered voice analysis.
- Install the package via Swift Package Manager
- Configure authentication and doctor information at app launch
- Initialize VoiceToRxViewModel in your view
- Start recording with your desired templates and languages
- Handle results when processing completes
The SDK requires iOS 17.0+ and provides a simple, straightforward API for voice-to-prescription functionality.
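As a condensed sketch of those steps (the template ID is a placeholder, and authentication is assumed to be configured at launch as shown in the Integration Guide):

import SwiftUI
import EkaVoiceToRx

struct QuickStartView: View {
  @StateObject private var viewModel = VoiceToRxViewModel(
    voiceToRxInitConfig: .shared,
    voiceToRxDelegate: nil
  )

  var body: some View {
    Button("Start Recording") {
      Task {
        do {
          // Template ID below is a placeholder - fetch real IDs via getTemplates()
          try await viewModel.startRecording(
            conversationType: .conversation,
            inputLanguage: [.english],
            templates: [OutputFormatTemplate(
              templateID: "your-template-id",
              templateType: .defaultType,
              templateName: "SOAP Notes"
            )],
            modelType: .lite
          )
        } catch {
          print("Failed to start recording: \(error.localizedDescription)")
        }
      }
    }
  }
}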
- 🎙️ Voice Activity Detection (VAD) - Intelligent audio recording with automatic speech detection
- 🔄 Real-time Transcription - Live audio-to-text conversion during consultations
- 🏥 Medical Context Aware - Specialized for healthcare terminology and prescription generation
- 📊 Session Management - Complete recording session lifecycle management
- ☁️ Cloud Integration - Automatic audio upload and processing
- 📝 Template Support - Customizable output formats (SOAP notes, prescriptions, etc.)
- Requirements
- Installation
- Integration Guide
- Core Components
- Configuration
- Recording Management
- Template Management
- Session History
- Result Management
- Error Handling
- API Reference
⚠️ Important: This SDK requires iOS 17.0 or later.
- iOS: 17.0+
- Swift: 5.9+
- Xcode: 15.0+
- SDK Version: 1.3.1+
Add the following permissions to your app's Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record medical consultations</string>
<key>NSLocalNetworkUsageDescription</key>
<string>This app needs network access to upload and process audio recordings</string>

Add EkaScribe to your project using Swift Package Manager:
- In Xcode, select File → Add Package Dependencies
- Enter the repository URL:
  https://github.com/eka-care/EkaVoiceToRx.git
  Or use SSH:
  git@github.com:eka-care/EkaVoiceToRx.git
- Choose the version or branch
- Add to your target

Alternatively, declare the dependency directly in your Package.swift:
dependencies: [
.package(url: "https://github.com/eka-care/EkaVoiceToRx.git", from: "1.3.1")
]

After adding the package, import it in your Swift files:
import EkaVoiceToRx

Follow these steps to integrate the SDK into your app:
Set up authentication tokens and doctor information when your app launches. This should be done once at app startup:
import SwiftUI
import EkaVoiceToRx
@main
struct YourApp: App {
init() {
configureEkaScribe()
}
var body: some Scene {
WindowGroup {
ContentView()
}
}
private func configureEkaScribe() {
// 1. Set authentication tokens (required)
AuthTokenHolder.shared.authToken = "your_auth_token"
AuthTokenHolder.shared.refreshToken = "your_refresh_token"
AuthTokenHolder.shared.bid = "your_business_id"
// 2. Configure doctor/user information (required)
let config = V2RxInitConfigurations.shared
config.clientId = "your_client_id"
config.ownerName = "Dr. Smith"
config.ownerOID = "doctor_oid_123"
config.ownerUUID = "doctor_uuid_456"
}
}

Important: Make sure to set authentication tokens and doctor information before making any SDK calls.
Create a VoiceToRxViewModel instance in your SwiftUI view or UIKit view controller:
SwiftUI:
@StateObject private var viewModel = VoiceToRxViewModel(
voiceToRxInitConfig: .shared,
voiceToRxDelegate: nil
)

UIKit:
private let viewModel = VoiceToRxViewModel(
voiceToRxInitConfig: .shared,
voiceToRxDelegate: nil
)

Start a recording session with your desired configuration:
The startRecording() method accepts the following parameters:
- conversationType (VoiceConversationType): The type of conversation (see Conversation Types)
- inputLanguage ([InputLanguageType]): Array of supported languages (see Input Languages)
- templates ([OutputFormatTemplate]): Array of output format templates (see Template Management)
- modelType (ModelType): The AI model to use for processing (see Model Types)
// Your implementation
private func startRecording() async {
do {
// 1. Fetch available templates (recommended)
// See Template Management section for details on getTemplates API
// VoiceToRxRepo.shared.getTemplates { result in ... }
// 2. Configure templates
let outputFormat: [OutputFormatTemplate] = [
OutputFormatTemplate(
templateID: "template-id-here",
templateType: .defaultType,
templateName: "SOAP Notes"
)
]
// 3. Set patient information (if applicable)
V2RxInitConfigurations.shared.subOwnerOID = patientOID
V2RxInitConfigurations.shared.subOwnerName = patientName
// 4. Calling SDK function
try await viewModel.startRecording(
conversationType: VoiceConversationType.conversation,
inputLanguage: [InputLanguageType.english, InputLanguageType.hindi],
templates: outputFormat,
modelType: ModelType.lite
)
// Successfully started recording
} catch {
// Handle errors
await MainActor.run {
handleRecordingError(error)
}
}
}

Note: It's recommended to use the getTemplates() API first to fetch available templates before creating the OutputFormatTemplate array. See the Template Management section for details.
The central view model that manages the entire voice recording and processing workflow.
public class VoiceToRxViewModel: ObservableObject {
@Published public var screenState: RecordConsultationState
public init(
voiceToRxInitConfig: V2RxInitConfigurations,
voiceToRxDelegate: VoiceToRxDelegate?
)
}

The screenState property tracks the current state of the recording session:
public enum RecordConsultationState {
case retry // Ready to retry after error
case startRecording // Initial state, ready to start
case listening(conversationType: VoiceConversationType) // Currently recording
case paused // Recording paused
case processing // Processing audio/generating prescription
case resultDisplay(success: Bool, value: String?) // Results available
case deletedRecording // Recording deleted
}

public enum VoiceConversationType: String, CaseIterable {
case conversation = "consultation" // Doctor-patient conversation
case dictation // Direct prescription dictation
}

public enum ModelType {
case pro // High accuracy model
case lite // Faster, lighter model
}

public enum InputLanguageType {
case english
case hindi
case tamil
case telugu
case kannada
case malayalam
case bengali
case gujarati
case marathi
case punjabi
case urdu
case odia
case assamese
}

Configure the SDK when your app launches. This should be done once at app startup:
import SwiftUI
import EkaVoiceToRx
@main
struct YourApp: App {
init() {
configureEkaScribe()
}
var body: some Scene {
WindowGroup {
ContentView()
}
}
private func configureEkaScribe() {
// 1. Set authentication tokens (required)
AuthTokenHolder.shared.authToken = KeychainHelper.fetchAuthToken()
AuthTokenHolder.shared.refreshToken = KeychainHelper.fetchRefreshToken()
AuthTokenHolder.shared.bid = JWTDecoder.shared.businessId
// 2. Configure doctor/user information (required)
let config = V2RxInitConfigurations.shared
config.clientId = "your_client_id"
config.ownerName = "Dr. Smith"
config.ownerOID = "doctor_oid_123"
config.ownerUUID = "doctor_uuid_456"
// 3. Set up data container (optional - only if using SwiftData)
// config.modelContainer = SwiftDataRepoContext.modelContext.container
}
}

Important: Make sure to set authentication tokens and doctor information before making any SDK calls.
Before starting a recording session, configure patient-specific information. This should be done each time you start a new recording:
// Set patient information for the current session (required before recording)
V2RxInitConfigurations.shared.subOwnerOID = selectedPatientOid
V2RxInitConfigurations.shared.subOwnerName = patientName
// Optional: Set appointment ID for context
V2RxInitConfigurations.shared.appointmentID = appointmentID

Note: Patient information can be set to empty strings if not applicable:
V2RxInitConfigurations.shared.subOwnerOID = ""
V2RxInitConfigurations.shared.subOwnerName = ""

public class V2RxInitConfigurations {
public static let shared: V2RxInitConfigurations
// Doctor/User Information
public var ownerName: String? // Doctor name
public var ownerOID: String? // Doctor OID
public var ownerUUID: String? // Doctor UUID
public var clientId: String? // Client identifier
// Patient Information (set per session)
public var subOwnerOID: String? // Patient OID
public var subOwnerName: String? // Patient name
// Optional Context
public var appointmentID: String? // Appointment identifier
// Data Storage
public var modelContainer: ModelContainer? // SwiftData container (optional)
}

Set authentication tokens before making any SDK calls:
AuthTokenHolder.shared.authToken = "your_auth_token"
AuthTokenHolder.shared.refreshToken = "your_refresh_token"
AuthTokenHolder.shared.bid = "your_business_id"

The startRecording() method accepts the following parameters:
- conversationType (VoiceConversationType): The type of conversation (see Conversation Types)
- inputLanguage ([InputLanguageType]): Array of supported languages (see Input Languages)
- templates ([OutputFormatTemplate]): Array of output format templates (see Template Management)
- modelType (ModelType): The AI model to use for processing (see Model Types)
// Start recording
try await viewModel.startRecording(
conversationType: VoiceConversationType.conversation,
inputLanguage: [InputLanguageType.english, InputLanguageType.hindi],
templates: templates,
modelType: ModelType.lite
)

Note: It's recommended to use the getTemplates() API first to fetch available templates before creating the OutputFormatTemplate array. See the Template Management section for details.
// Pause recording
viewModel.pauseRecording()
// Resume recording
do {
try viewModel.resumeRecording()
} catch {
// Handle resume error
print("Failed to resume: \(error.localizedDescription)")
}

// Stop recording and process
Task {
do {
try await viewModel.stopRecording()
// The viewModel will transition to .processing, then .resultDisplay
} catch {
// Handle error
print("Failed to stop recording: \(error.localizedDescription)")
}
}
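The view model also exposes cleanup and retry helpers (listed in the API Reference); a minimal sketch of how they might be used:

// Discard the current recording entirely (requires an active session ID)
if let sessionID = viewModel.sessionID {
  viewModel.deleteRecording(id: sessionID)
}

// Reset the view model between consultations
viewModel.clearSession()

// Re-attempt processing after a failure (e.g., when screenState is .retry)
viewModel.retryIfNeeded()

Monitor the screenState for result availability. Results are automatically provided when processing completes: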
switch viewModel.screenState {
case .resultDisplay(success: let success, value: let value):
  if success {
    // Get the result text (base64 encoded)
    let base64EncodedText = value ?? ""
    // Decode base64 to get the actual result
    guard let decodedData = Data(base64Encoded: base64EncodedText),
          let resultText = String(data: decodedData, encoding: .utf8) else {
      showError("Failed to decode result")
      return
    }
    // Get the session ID for future reference
    let sessionID = viewModel.sessionID?.uuidString ?? ""
    // Display results to user
    showResults(resultText, sessionID: sessionID)
  } else {
    // Handle error
    showError("Failed to process recording")
  }
default:
  break
}

Note: The value parameter contains base64-encoded formatted output based on your selected template. You need to decode it before displaying to users. For more detailed results including multiple template outputs, use VoiceToRxRepo.shared.fetchResultStatusResponse().
Use VoiceToRxRepo to fetch available templates:
import EkaVoiceToRx
VoiceToRxRepo.shared.getTemplates { result in
switch result {
case .success(let response):
let templates = response.items
// Use templates
break
case .failure(let error):
print("Failed to fetch templates: \(error.localizedDescription)")
break
}
}

Update user's favorite templates configuration:
let templateIDs = ["template-id-1", "template-id-2"]
VoiceToRxRepo.shared.updateConfig(
templates: templateIDs
) { result in
switch result {
case .success:
print("Templates updated successfully")
case .failure(let error):
print("Failed to update: \(error.localizedDescription)")
}
}

Pass templates when starting a recording:
// Your implementation
private func startRecordingWithTemplates() async {
do {
// 1. Fetch available templates first (recommended)
// VoiceToRxRepo.shared.getTemplates { result in ... }
// 2. Configure templates
let templates: [OutputFormatTemplate] = [
OutputFormatTemplate(
templateID: "template-id",
templateType: .defaultType,
templateName: "Template Name"
)
]
// 3. Calling SDK function
try await viewModel.startRecording(
conversationType: VoiceConversationType.conversation,
inputLanguage: [InputLanguageType.english],
templates: templates,
modelType: ModelType.pro
)
} catch {
// Handle error
await MainActor.run {
handleError(error)
}
}
}

Retrieve past recording sessions:
VoiceToRxRepo.shared.getEkaScribeHistory { result in
switch result {
case .success(let response):
let sessions = response.data
// Display sessions
break
case .failure(let error):
print("Failed to fetch history: \(error.localizedDescription)")
break
}
}

Get the complete result response with selected template outputs:
VoiceToRxRepo.shared.fetchResultStatusResponse(sessionID: sessionID) { result in
switch result {
case .success(let response):
// Note: All values in the response are base64 encoded and need to be decoded before display
// response.data.templateResults.custom - custom template results (base64 encoded)
// response.data.templateResults.transcript - transcript results (base64 encoded)
// response.data.output - general output (base64 encoded)
// response.data.audioMatrix - audio quality metrics
// Example: Decoding a result value
if let encodedValue = response.data?.templateResults?.custom?.first?.value,
let decodedData = Data(base64Encoded: encodedValue),
let decodedText = String(data: decodedData, encoding: .utf8) {
// Use decodedText for display
print("Decoded result: \(decodedText)")
}
break
case .failure(let error):
print("Failed to fetch response: \(error.localizedDescription)")
break
}
}

Note: The values returned by fetchResultStatusResponse are base64 encoded and must be decoded before displaying to users.
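For instance, to decode every custom-template result plus the transcript, reusing the decodeScribeResult helper sketched earlier (field access follows the VoiceToRxStatusData model in the API Reference):

// Inside the .success(let response) branch of fetchResultStatusResponse
for output in response.data?.templateResults?.custom ?? [] {
  if let text = decodeScribeResult(output.value) {
    print("Template output: \(text)")
  }
}
if let transcript = decodeScribeResult(
  response.data?.templateResults?.transcript?.first?.value
) {
  print("Transcript: \(transcript)")
}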
Switch the output format for an existing session:
VoiceToRxRepo.shared.switchTemplate(
templateID: newTemplateID,
sessionID: sessionID
) { result in
switch result {
case .success(let response):
// Updated response with new template output
break
case .failure(let error):
print("Failed to switch template: \(error.localizedDescription)")
break
}
}

Update edited content back to the server:
let request = UpdateResultRequest(
// Configure request with updated content
)
VoiceToRxRepo.shared.updateResult(
sessionID: sessionID,
request: request
) { result in
switch result {
case .success:
print("Result updated successfully")
case .failure(let error):
print("Failed to update: \(error.localizedDescription)")
}
}

The SDK provides specific error types:
public enum EkaScribeError: Error {
case freeSessionLimitReached
case microphonePermissionDenied
case microphoneIsInUse
case vadDetectorFailed
case noSessionId
case audioSessionSetupFailed
}

Here's a complete example of handling errors when starting a recording:
// Your implementation
private func handleRecording() async {
do {
let templates: [OutputFormatTemplate] = [
// Your templates
]
// Calling SDK function
try await viewModel.startRecording(
conversationType: VoiceConversationType.conversation,
inputLanguage: [InputLanguageType.english, InputLanguageType.hindi],
templates: templates,
modelType: ModelType.lite
)
// Successfully started recording
} catch let error as EkaScribeError {
// Handle SDK-specific errors
await MainActor.run {
switch error {
case .freeSessionLimitReached:
// Show upgrade prompt
showUpgradeAlert()
case .microphonePermissionDenied:
// Show permission alert
showPermissionAlert()
case .microphoneIsInUse:
// Show microphone in use alert
showMicrophoneInUseAlert()
default:
// Handle other SDK-specific errors
showGenericError(error.localizedDescription)
}
}
} catch {
// Handle generic errors (network, API, etc.)
await MainActor.run {
showGenericError(error.localizedDescription)
}
}
}

@Published public var screenState: RecordConsultationState

// Core recording methods
public func startRecording(
conversationType: VoiceConversationType,
inputLanguage: [InputLanguageType],
templates: [OutputFormatTemplate],
modelType: ModelType
) async throws
public func stopRecording() async throws
public func pauseRecording()
public func resumeRecording() throws
// Session management
public func deleteRecording(id: UUID)
public func clearSession()
public func retryIfNeeded()

public func getTemplates(completion: @escaping (Result<TemplateResponse, Error>) -> Void)
public func updateConfig(templates: [String], completion: @escaping (Result<Void, Error>) -> Void)

public func getEkaScribeHistory(completion: @escaping (Result<ScribeHistoryResponse, Error>) -> Void)

public func fetchResultStatusResponse(
sessionID: String,
completion: @escaping (Result<VoiceToRxStatusResponse, Error>) -> Void
)
// Note: All response values are base64 encoded. Decode before displaying.
public func switchTemplate(
templateID: String,
sessionID: String,
completion: @escaping (Result<VoiceToRxStatusResponse, Error>) -> Void
)
public func updateResult(
sessionID: String,
request: UpdateResultRequest,
completion: @escaping (Result<Void, ErrorStatus>) -> Void
)

public class V2RxInitConfigurations {
public static let shared: V2RxInitConfigurations
public var modelContainer: ModelContainer?
public var clientId: String?
public var ownerName: String?
public var ownerOID: String?
public var ownerUUID: String?
public var subOwnerOID: String?
public var subOwnerName: String?
public var appointmentID: String?
}

public class AuthTokenHolder {
public static let shared: AuthTokenHolder
public var authToken: String?
public var refreshToken: String?
public var bid: String?
}

public struct OutputFormatTemplate {
public let templateID: String
public let templateType: TemplateType
public let templateName: String
public init(
templateID: String,
templateType: TemplateType,
templateName: String
)
}

public struct VoiceToRxStatusResponse {
public let data: VoiceToRxStatusData?
}
public struct VoiceToRxStatusData {
public let templateResults: TemplateResults?
public let output: [VoiceToRxOutput]?
public let audioMatrix: AudioMatrix?
}
public struct TemplateResults {
public let custom: [VoiceToRxOutput]?
public let transcript: [VoiceToRxOutput]?
}

class RecordingViewController: UIViewController {
private var viewModel: VoiceToRxViewModel?
deinit {
// Clean up resources
viewModel?.clearSession()
}
override func viewWillDisappear(_ animated: Bool) {
super.viewWillDisappear(animated)
// Pause recording when leaving screen
if case .listening = viewModel?.screenState {
viewModel?.pauseRecording()
}
}
}

Always observe screenState changes to keep your UI in sync:
.onChange(of: viewModel.screenState) { oldState, newState in
// Update UI based on state
handleStateChange(newState)
}
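A handleStateChange implementation can switch over each state; the UI helpers below (showIdleUI, showRecordingUI, and friends) are hypothetical placeholders, not SDK APIs:

private func handleStateChange(_ state: RecordConsultationState) {
  switch state {
  case .startRecording:
    showIdleUI()
  case .listening(let conversationType):
    showRecordingUI(for: conversationType)
  case .paused:
    showPausedUI()
  case .processing:
    showProcessingSpinner()
  case .resultDisplay(let success, let value):
    handleResult(success: success, encodedValue: value)
  case .retry:
    showRetryPrompt()
  case .deletedRecording:
    showIdleUI()
  }
}

Always handle errors from async operations using try-catch: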
// Your implementation
private func startRecording() async {
do {
let templates: [OutputFormatTemplate] = [
// Your templates
]
// Calling SDK function
try await viewModel.startRecording(
conversationType: VoiceConversationType.conversation,
inputLanguage: [InputLanguageType.english, InputLanguageType.hindi],
templates: templates,
modelType: ModelType.lite
)
// Successfully started recording
} catch {
// Handle error appropriately
await MainActor.run {
showError(error)
}
}
}

If microphone permission is denied:
- Check Info.plist has NSMicrophoneUsageDescription
- Request permission programmatically if needed (see the sketch below)
- Guide users to Settings if permission was previously denied
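On iOS 17+, the permission can be checked and requested with AVAudioApplication before starting a session; a minimal sketch (not part of the SDK):

import AVFoundation
import UIKit

@MainActor
func ensureMicrophoneAccess() async -> Bool {
  switch AVAudioApplication.shared.recordPermission {
  case .granted:
    return true
  case .undetermined:
    // Prompts the user; requires NSMicrophoneUsageDescription in Info.plist
    return await AVAudioApplication.requestRecordPermission()
  case .denied:
    // Previously denied - guide the user to Settings
    if let url = URL(string: UIApplication.openSettingsURLString) {
      _ = await UIApplication.shared.open(url)
    }
    return false
  @unknown default:
    return false
  }
}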
If sessions fail to create:
- Verify ownerOID is set in V2RxInitConfigurations.shared (a preflight check like the sketch below can catch this)
- Check AuthTokenHolder.shared.authToken is valid
- Ensure network connectivity
- Check for free session limit if applicable
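A preflight check along these lines (configurationProblem is a hypothetical helper, not an SDK API) can surface the first two issues before calling startRecording:

/// Returns a human-readable reason if the SDK configuration is incomplete,
/// or nil when the required fields are present.
func configurationProblem() -> String? {
  if (AuthTokenHolder.shared.authToken ?? "").isEmpty {
    return "Missing auth token - set AuthTokenHolder.shared.authToken at launch"
  }
  if (V2RxInitConfigurations.shared.ownerOID ?? "").isEmpty {
    return "Missing doctor OID - set V2RxInitConfigurations.shared.ownerOID"
  }
  return nil
}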
If results are not available after processing:
- Check screenState transitions to .resultDisplay
- Verify the session ID is available via viewModel.sessionID
- Use VoiceToRxRepo.shared.fetchResultStatusResponse to get the full response
- Check network connectivity for result retrieval