MrLionware/screencapturekit-audio-capture

ScreenCaptureKit Audio Capture

Native Node.js addon for capturing per-application audio on macOS using the ScreenCaptureKit framework

Capture real-time audio from any macOS application with a simple, event-driven API. Built with N-API for Node.js compatibility and ScreenCaptureKit for system-level audio access.


Features

  • 🎵 Per-App Audio Capture - Isolate audio from specific applications, windows, or displays
  • 🎭 Multi-Source Capture - Capture from multiple apps simultaneously with mixed output
  • ⚡ Real-Time Streaming - Low-latency callbacks with Event or Stream-based APIs
  • 🔄 Multi-Process Service - Server/client architecture for sharing audio across processes
  • 📊 Audio Utilities - Built-in RMS/peak/dB analysis and WAV file export
  • 📘 TypeScript-First - Full type definitions with memory-safe resource cleanup

Requirements

  • macOS 13.0 (Ventura) or later
  • Node.js 14.0.0 or later (Node.js 18+ recommended for running the automated test suite)
  • Screen Recording permission (granted in System Settings under Privacy & Security)

Installation

npm install screencapturekit-audio-capture

Prebuilt binaries are included, so no compilation or Xcode is required on Apple Silicon (ARM64) machines.

Fallback Compilation

If no prebuild is available for your architecture, the addon will compile from source automatically. This requires:

  • Xcode Command Line Tools (minimum version 14.0)
    xcode-select --install
  • macOS SDK 13.0 or later

The build process links these macOS frameworks:

  • ScreenCaptureKit - Per-application audio capture
  • AVFoundation - Audio processing
  • CoreMedia - Media sample handling
  • CoreVideo - Video frame handling
  • Foundation - Core Objective-C runtime

All frameworks are part of the macOS system and require no additional installation.

Package Contents

When installed from npm, the package includes:

  • src/ - TypeScript SDK source code and native C++/Objective-C++ code
  • dist/ - Compiled JavaScript and TypeScript declarations
  • binding.gyp - Native build configuration
  • README.md, LICENSE, CHANGELOG.md

Note: Example files are available in the GitHub repository but are not included in the npm package to reduce installation size.

Run npm ls screencapturekit-audio-capture to see where the package is installed.

Project Structure

screencapturekit-audio-capture/
├── src/                        # Source code
│   ├── capture/                # Core audio capture functionality
│   │   ├── index.ts            # Barrel exports
│   │   ├── audio-capture.ts    # Main AudioCapture class
│   │   └── audio-stream.ts     # Readable stream wrapper
│   │
│   ├── native/                 # Native C++/Objective-C++ code
│   │   ├── addon.mm            # Node.js N-API bindings
│   │   ├── wrapper.h           # C++ header
│   │   └── wrapper.mm          # ScreenCaptureKit implementation
│   │
│   ├── service/                # Multi-process capture service
│   │   ├── index.ts            # Barrel exports
│   │   ├── server.ts           # WebSocket server for shared capture
│   │   └── client.ts           # WebSocket client
│   │
│   ├── utils/                  # Utility modules
│   │   ├── index.ts            # Barrel exports
│   │   ├── stt-converter.ts    # Speech-to-text transform stream
│   │   └── native-loader.ts    # Native addon loader
│   │
│   ├── core/                   # Shared types, errors, and lifecycle
│   │   ├── index.ts            # Barrel exports
│   │   ├── types.ts            # TypeScript type definitions
│   │   ├── errors.ts           # Error classes and codes
│   │   └── cleanup.ts          # Resource cleanup utilities
│   │
│   └── index.ts                # Main package exports
│
├── dist/                       # Compiled JavaScript output
├── tests/                      # Test suites (unit, integration, edge-cases)
├── readme_examples/            # Runnable example scripts
├── prebuilds/                  # Prebuilt native binaries
├── build/                      # Native compilation output
│
├── package.json                # Package manifest
├── tsconfig.json               # TypeScript configuration
├── binding.gyp                 # Native addon build configuration
├── CHANGELOG.md                # Version history
├── LICENSE                     # MIT License
└── README.md                   # This file

Quick Start

See readme_examples/basics/01-quick-start.ts for runnable code

import { AudioCapture } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
const app = capture.selectApp(['Spotify', 'Music', 'Safari'], { fallbackToFirst: true });
if (!app) throw new Error('No capturable audio app is running'); // selectApp() can return null

capture.on('audio', (sample) => {
  console.log(`Volume: ${AudioCapture.rmsToDb(sample.rms).toFixed(1)} dB`);
});

capture.startCapture(app.processId);
setTimeout(() => capture.stopCapture(), 10000);

Quick Integration Guide

All integration patterns below have runnable examples in readme_examples/

Common patterns for integrating audio capture into your application:

Pattern           Example File                             Description
STT Integration   voice/02-stt-integration.ts              Stream + event-based approaches for speech-to-text
Voice Agent       voice/03-voice-agent.ts                  Real-time processing with low-latency config
Recording         voice/04-audio-recording.ts              Capture to WAV file with efficient settings
Robust Capture    basics/05-robust-capture.ts              Production error handling with fallbacks
Multi-App         capture-targets/13-multi-app-capture.ts  Capture game + Discord, Zoom + Music, etc.
Multi-Process     advanced/20-capture-service.ts           Share audio across multiple processes
Graceful Cleanup  advanced/21-graceful-cleanup.ts          Resource lifecycle and cleanup utilities

Key Configuration Patterns

For STT engines:

{ format: 'int16', channels: 1, minVolume: 0.01 }  // Int16 mono, silence filtered

For low-latency voice processing:

{ format: 'int16', channels: 1, bufferSize: 1024, minVolume: 0.005 }

For recording:

{ format: 'int16', channels: 2, bufferSize: 4096 }  // Stereo, larger buffer for stability

Audio Sample Structure

Property     Type                 Description
data         Buffer               Audio data (Float32 or Int16)
sampleRate   number               Sample rate in Hz (e.g., 48000)
channels     number               1 = mono, 2 = stereo
format       'float32' | 'int16'  Audio format
rms          number               RMS volume (0.0-1.0)
peak         number               Peak volume (0.0-1.0)
timestamp    number               Timestamp in seconds
durationMs   number               Duration in milliseconds
sampleCount  number               Total samples across all channels
framesCount  number               Frames per channel
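
How these fields relate to one another can be sketched as follows. This is an illustration only, assuming interleaved Float32 samples; describeSamples and SampleStats are hypothetical names, not part of the package, and the native layer may compute its values differently.

```typescript
// Hypothetical helper: derives AudioSample-style metadata from interleaved float32 PCM.
interface SampleStats {
  sampleCount: number; // total samples across all channels
  framesCount: number; // samples per channel
  durationMs: number;
  rms: number;
  peak: number;
}

function describeSamples(samples: Float32Array, sampleRate: number, channels: number): SampleStats {
  let sumSquares = 0;
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  const sampleCount = samples.length;
  const framesCount = sampleCount / channels;
  return {
    sampleCount,
    framesCount,
    durationMs: (framesCount / sampleRate) * 1000,
    rms: sampleCount > 0 ? Math.sqrt(sumSquares / sampleCount) : 0,
    peak,
  };
}

// Two stereo frames at 48 kHz
const stats = describeSamples(new Float32Array([0.5, -0.5, 0.5, -0.5]), 48000, 2);
console.log(stats.framesCount, stats.rms, stats.peak); // 2 0.5 0.5
```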

Module Exports

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';
import type { AudioSample, ApplicationInfo } from 'screencapturekit-audio-capture';

// Multi-process capture service (for sharing audio across processes)
import { AudioCaptureServer, AudioCaptureClient } from 'screencapturekit-audio-capture';

// Resource cleanup utilities
import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';

Export                   Description
AudioCapture             High-level event-based API (recommended)
AudioStream              Readable stream (via createAudioStream())
STTConverter             Transform stream for STT (via createSTTStream())
AudioCaptureServer       WebSocket server for shared capture (multi-process)
AudioCaptureClient       WebSocket client to receive shared audio
AudioCaptureError        Error class with codes and details
ErrorCode                Error code enum for type-safe handling
cleanupAll               Dispose all AudioCapture and AudioCaptureServer instances
getActiveInstanceCount   Get total active instance count
installGracefulShutdown  Install process exit handlers for cleanup
ScreenCaptureKit         Low-level native binding (advanced)

Types: AudioSample, ApplicationInfo, WindowInfo, DisplayInfo, CaptureOptions, PermissionStatus, ActivityInfo, ServerOptions, ClientOptions, and more.

Testing

Note: Test files are available in the GitHub repository but are not included in the npm package.

Tests are written in TypeScript and live under tests/. They use Node's built-in test runner with tsx (Node 18+).

Test Commands:

  • npm test - Runs every suite in tests/**/*.test.ts (unit, integration, edge-cases) against the mocked ScreenCaptureKit layer; works cross-platform.
  • npm run test:unit - Fast coverage for utilities, audio metrics, selection, and capture control.
  • npm run test:integration - Multi-component flows (window/display capture, activity tracking, capability guards) using the shared mock.
  • npm run test:edge-cases - Boundary/error handling coverage.

Type Checking:

  • npm run typecheck - Type-check the SDK source code.
  • npm run typecheck:tests - Type-check the test files.

For true hardware validation, run the example scripts on macOS with Screen Recording permission enabled.

Stream-Based API

See readme_examples/streams/06-stream-basics.ts and readme_examples/streams/07-stream-processing.ts for runnable examples

Use Node.js Readable streams for composable audio processing:

const audioStream = capture.createAudioStream('Spotify', { minVolume: 0.01 });
audioStream.pipe(yourWritableStream);

// Object mode for metadata access
const metaStream = capture.createAudioStream('Spotify', { objectMode: true });
metaStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));

When to Use Streams vs Events

Use Case Recommended API
Piping through transforms Stream
Backpressure handling Stream
Multiple listeners Event
Maximum simplicity Event

Both APIs use the same underlying capture mechanism, so the capture path itself performs the same with either one.

Stream API Best Practices

  1. Always handle errors - Attach an error handler to prevent crashes
  2. Use pipeline() - Better error handling than chaining .pipe()
  3. Clean up resources - Call stream.stop() when done
  4. Choose the right mode - Normal mode for raw data, object mode for metadata
  5. Stream must flow - Attach a data listener to start capture

import { pipeline } from 'stream';

// Recommended pattern
pipeline(audioStream, transform, writable, (err) => {
  if (err) console.error('Pipeline failed:', err);
});

// Always handle SIGINT
process.on('SIGINT', () => audioStream.stop());

Troubleshooting Stream Issues

Issue                      Cause                                       Solution
"Application not found"    App not running                             Use selectApp() with fallbacks
No data events             App not playing audio / minVolume too high  Verify app is playing; lower or remove threshold
"stream.push() after EOF"  Stopping abruptly                           Use pipeline() for proper cleanup
"Already capturing"        Multiple streams from one instance          Create separate AudioCapture instances
Memory growing             Not consuming data                          Attach data listener; use circular buffer
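
For the "Memory growing" case, a circular (ring) buffer keeps only the most recent audio so memory stays bounded. The sketch below is one possible shape, assuming Float32 samples; RingBuffer is a hypothetical class and is not provided by the package.

```typescript
// Fixed-capacity ring buffer: once full, the oldest samples are overwritten,
// so memory stays bounded even if the consumer falls behind.
class RingBuffer {
  private readonly storage: Float32Array;
  private writeIndex = 0;
  private filled = 0;

  constructor(capacity: number) {
    this.storage = new Float32Array(capacity);
  }

  write(chunk: Float32Array): void {
    for (let i = 0; i < chunk.length; i++) {
      this.storage[this.writeIndex] = chunk[i];
      this.writeIndex = (this.writeIndex + 1) % this.storage.length;
    }
    this.filled = Math.min(this.filled + chunk.length, this.storage.length);
  }

  // Returns the retained samples, oldest first.
  snapshot(): Float32Array {
    const out = new Float32Array(this.filled);
    const start = (this.writeIndex - this.filled + this.storage.length) % this.storage.length;
    for (let i = 0; i < this.filled; i++) {
      out[i] = this.storage[(start + i) % this.storage.length];
    }
    return out;
  }
}

const ring = new RingBuffer(4);
ring.write(new Float32Array([1, 2, 3]));
ring.write(new Float32Array([4, 5, 6]));
console.log(Array.from(ring.snapshot())); // [3, 4, 5, 6]
```

In an 'audio' handler, each chunk would be written into the ring after conversion to Float32Array, and a consumer would call snapshot() whenever it needs the latest window.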

Stream Performance Tips

  • Normal mode is faster than object mode (no metadata calculation)
  • Batch processing is more efficient than per-sample processing
  • Default highWaterMark is suitable for most cases

See readme_examples/streams/07-stream-processing.ts for a complete production-ready stream example

API Reference

Class: AudioCapture

High-level event-based API (recommended).

Methods Overview

# Category Method Description
Discovery
1 getApplications(opts?) List all capturable apps
2 getAudioApps(opts?) List apps likely to produce audio
3 findApplication(id) Find app by name or bundle ID
4 findByName(name) Alias for findApplication()
5 getApplicationByPid(pid) Find app by process ID
6 getWindows(opts?) List all capturable windows
7 getDisplays() List all displays
Selection
8 selectApp(ids?, opts?) Smart app selection with fallbacks
Capture
9 startCapture(app, opts?) Start capturing from an app
10 captureWindow(id, opts?) Capture from a specific window
11 captureDisplay(id, opts?) Capture from a display
12 captureMultipleApps(ids, opts?) Capture multiple apps (mixed)
13 captureMultipleWindows(ids, opts?) Capture multiple windows (mixed)
14 captureMultipleDisplays(ids, opts?) Capture multiple displays (mixed)
15 stopCapture() Stop current capture
16 isCapturing() Check if currently capturing
17 getStatus() Get detailed capture status
18 getCurrentCapture() Get current capture target info
Streams
19 createAudioStream(app, opts?) Create Node.js Readable stream
20 createSTTStream(app?, opts?) Stream pre-configured for STT
Activity
21 enableActivityTracking(opts?) Track which apps produce audio
22 disableActivityTracking() Stop tracking and clear cache
23 getActivityInfo() Get tracking stats
Lifecycle
24 dispose() Release resources and stop capture
25 isDisposed() Check if instance is disposed

Static Methods

# Method Description
S1 AudioCapture.verifyPermissions() Check screen recording permission
S2 AudioCapture.bufferToFloat32Array(buf) Convert Buffer to Float32Array
S3 AudioCapture.rmsToDb(rms) Convert RMS (0-1) to decibels
S4 AudioCapture.peakToDb(peak) Convert peak (0-1) to decibels
S5 AudioCapture.calculateDb(buf, method?) Calculate dB from audio buffer
S6 AudioCapture.writeWav(buf, opts) Create WAV file from PCM data
S7 AudioCapture.cleanupAll() Dispose all active instances
S8 AudioCapture.getActiveInstanceCount() Get number of active instances

Events

Event Payload Description
'start' CaptureInfo Capture started
'audio' AudioSample Audio data received
'stop' CaptureInfo Capture stopped
'error' AudioCaptureError Error occurred

Method Reference

Discovery Methods

[1] getApplications(options?): ApplicationInfo[]

List all capturable applications.

Option Type Default Description
includeEmpty boolean false Include apps with empty names (helpers, background services)
const apps = capture.getApplications();
const allApps = capture.getApplications({ includeEmpty: true });

[2] getAudioApps(options?): ApplicationInfo[]

List apps likely to produce audio. Filters system apps, utilities, and background processes.

Option Type Default Description
includeSystemApps boolean false Include system apps (Finder, etc.)
includeEmpty boolean false Include apps with empty names
sortByActivity boolean false Sort by recent audio activity (requires [21])
appList Array null Reuse prefetched app list

const audioApps = capture.getAudioApps();
// Returns ApplicationInfo objects, e.g. for Spotify, Safari, Music, Zoom
// Excludes: Finder, Terminal, System Settings, etc.

// Sort by activity (most active first)
capture.enableActivityTracking();
const sorted = capture.getAudioApps({ sortByActivity: true });

[3] findApplication(identifier): ApplicationInfo | null

Find app by name or bundle ID (case-insensitive, partial match).

Parameter Type Description
identifier string App name or bundle ID
const spotify = capture.findApplication('Spotify');
const safari = capture.findApplication('com.apple.Safari');
const partial = capture.findApplication('spot'); // Matches "Spotify"

[4] findByName(name): ApplicationInfo | null

Alias for findApplication(). Provided for semantic clarity.


[5] getApplicationByPid(processId): ApplicationInfo | null

Find app by process ID.

Parameter Type Description
processId number Process ID
const app = capture.getApplicationByPid(12345);

[6] getWindows(options?): WindowInfo[]

List all capturable windows.

Option Type Default Description
onScreenOnly boolean false Only include visible windows
requireTitle boolean false Only include windows with titles
processId number - Filter by owning process ID

Returns WindowInfo:

  • windowId: Unique window identifier
  • title: Window title
  • owningProcessId: PID of owning app
  • owningApplicationName: App name
  • owningBundleIdentifier: Bundle ID
  • frame: { x, y, width, height }
  • layer: Window layer level
  • onScreen: Whether visible
  • active: Whether active
const windows = capture.getWindows({ onScreenOnly: true, requireTitle: true });
windows.forEach(w => console.log(`${w.windowId}: ${w.title} (${w.owningApplicationName})`));

[7] getDisplays(): DisplayInfo[]

List all displays.

Returns DisplayInfo:

  • displayId: Unique display identifier
  • width: Display width in pixels
  • height: Display height in pixels
  • frame: { x, y, width, height }
  • isMainDisplay: Whether this is the primary display
const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);

Selection Method

[8] selectApp(identifiers?, options?): ApplicationInfo | null

Smart app selection with multiple fallback strategies.

Parameter Type Description
identifiers string | number | Array | null App name, PID, bundle ID, or array to try in order
Option Type Default Description
audioOnly boolean true Only search audio apps
fallbackToFirst boolean false Return first app if no match
throwOnNotFound boolean false Throw error instead of returning null
sortByActivity boolean false Sort by recent activity (requires [21])
appList Array null Reuse prefetched app list
// Try multiple apps in order
const app = capture.selectApp(['Spotify', 'Music', 'Safari']);

// Get first audio app
const first = capture.selectApp();

// Fallback to first if none match
const fallback = capture.selectApp(['Spotify'], { fallbackToFirst: true });

// Throw on failure
try {
  const app = capture.selectApp(['Spotify'], { throwOnNotFound: true });
} catch (err) {
  console.log('Not found:', err.details.availableApps);
}
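
The fallback order described above can be sketched as a pure function. This is not the package's implementation; AppLike, pickApp, and the bundleIdentifier field name are assumptions made for illustration.

```typescript
// Hypothetical shape; the real ApplicationInfo may use different field names.
interface AppLike {
  processId: number;
  applicationName: string;
  bundleIdentifier: string;
}

// Tries each identifier in order; numbers match PIDs, strings match names or
// bundle IDs case-insensitively and partially.
function pickApp(
  apps: AppLike[],
  identifiers: Array<string | number>,
  fallbackToFirst = false,
): AppLike | null {
  for (const id of identifiers) {
    const match = apps.find((app) => {
      if (typeof id === 'number') return app.processId === id;
      const needle = id.toLowerCase();
      return (
        app.applicationName.toLowerCase().includes(needle) ||
        app.bundleIdentifier.toLowerCase().includes(needle)
      );
    });
    if (match) return match;
  }
  return fallbackToFirst && apps.length > 0 ? apps[0] : null;
}

const running: AppLike[] = [
  { processId: 101, applicationName: 'Safari', bundleIdentifier: 'com.apple.Safari' },
  { processId: 202, applicationName: 'Music', bundleIdentifier: 'com.apple.Music' },
];
console.log(pickApp(running, ['Spotify', 'music'])?.applicationName); // Music
console.log(pickApp(running, ['Zoom'], true)?.applicationName);       // Safari
```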

Capture Methods

All capture methods accept CaptureOptions:

Option Type Default Description
format 'float32' | 'int16' 'float32' Audio format
channels 1 | 2 2 Mono or stereo
sampleRate number 48000 Requested sample rate (system-dependent)
bufferSize number system Buffer size in frames (affects latency)
minVolume number 0 Min RMS threshold (0-1), filters silence
excludeCursor boolean true Reserved for future video features

Buffer Size Guidelines (latency figures assume a 48 kHz sample rate):

  • 1024: ~21ms latency, higher CPU
  • 2048: ~43ms latency, balanced (recommended)
  • 4096: ~85ms latency, lower CPU
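
These figures follow from dividing the buffer size in frames by the sample rate. A minimal sketch, assuming the default 48000 Hz rate (bufferLatencyMs is a hypothetical helper, not a package export):

```typescript
// Time needed to fill one buffer: frames / sampleRate, in milliseconds.
function bufferLatencyMs(bufferSizeFrames: number, sampleRate = 48000): number {
  return (bufferSizeFrames / sampleRate) * 1000;
}

for (const size of [1024, 2048, 4096]) {
  console.log(`${size} frames: ~${bufferLatencyMs(size).toFixed(1)} ms`);
}
```

If the system delivers a different sample rate than requested, pass the actual rate from sample.sampleRate to get a realistic estimate.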

[9] startCapture(appIdentifier, options?): boolean

Start capturing from an application.

Parameter Type Description
appIdentifier string | number | ApplicationInfo App name, bundle ID, PID, or app object
capture.startCapture('Spotify');                  // By name
capture.startCapture('com.spotify.client');       // By bundle ID
capture.startCapture(12345);                      // By PID
capture.startCapture(app);                        // By object

// With options
capture.startCapture('Spotify', {
  format: 'int16',
  channels: 1,
  minVolume: 0.01
});

[10] captureWindow(windowId, options?): boolean

Capture audio from a specific window.

Parameter Type Description
windowId number Window ID from getWindows()

const windows = capture.getWindows({ requireTitle: true });
const target = windows.find(w => w.title.includes('Safari'));
if (target) { // find() returns undefined when no window matches
  capture.captureWindow(target.windowId, { format: 'int16' });
}

[11] captureDisplay(displayId, options?): boolean

Capture audio from a display.

Parameter Type Description
displayId number Display ID from getDisplays()

const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);
if (main) { // guard against an empty display list
  capture.captureDisplay(main.displayId);
}

[12] captureMultipleApps(appIdentifiers, options?): boolean

Capture from multiple apps simultaneously. Audio is mixed into a single stream.

Parameter Type Description
appIdentifiers Array App names, PIDs, bundle IDs, or ApplicationInfo objects
Additional Option Type Default Description
allowPartial boolean false Continue if some apps not found
// Capture game + Discord audio
capture.captureMultipleApps(['Minecraft', 'Discord'], {
  allowPartial: true,  // Continue even if one app not found
  format: 'int16'
});

[13] captureMultipleWindows(windowIdentifiers, options?): boolean

Capture from multiple windows. Audio is mixed.

Parameter Type Description
windowIdentifiers Array Window IDs or WindowInfo objects
Additional Option Type Default Description
allowPartial boolean false Continue if some windows not found
const windows = capture.getWindows({ requireTitle: true });
const browserWindows = windows.filter(w => /Safari|Chrome/.test(w.owningApplicationName));
capture.captureMultipleWindows(browserWindows.map(w => w.windowId));

[14] captureMultipleDisplays(displayIdentifiers, options?): boolean

Capture from multiple displays. Audio is mixed.

Parameter Type Description
displayIdentifiers Array Display IDs or DisplayInfo objects
Additional Option Type Default Description
allowPartial boolean false Continue if some displays not found
const displays = capture.getDisplays();
capture.captureMultipleDisplays(displays.map(d => d.displayId));

[15] stopCapture(): void

Stop the current capture session. Emits 'stop' event.


[16] isCapturing(): boolean

Check if currently capturing.

if (capture.isCapturing()) {
  capture.stopCapture();
}

[17] getStatus(): CaptureStatus | null

Get detailed capture status. Returns null if not capturing.

Returns CaptureStatus:

  • capturing: Always true when not null
  • processId: Process ID (may be null for display capture)
  • app: ApplicationInfo or null
  • window: WindowInfo or null
  • display: DisplayInfo or null
  • targetType: 'application' | 'window' | 'display' | 'multi-app'
  • config: { minVolume, format }
const status = capture.getStatus();
if (status) {
  console.log(`Type: ${status.targetType}, App: ${status.app?.applicationName}`);
}

[18] getCurrentCapture(): CaptureInfo | null

Get current capture target info. Same as getStatus() but without config.


Stream Methods

[19] createAudioStream(appIdentifier, options?): AudioStream

Create a Node.js Readable stream for audio capture.

Parameter Type Description
appIdentifier string | number App name, bundle ID, or PID
Additional Option Type Default Description
objectMode boolean false Emit AudioSample objects instead of Buffers

// Raw buffer mode (for piping)
const stream = capture.createAudioStream('Spotify');
stream.pipe(myWritable);

// Alternatively, object mode (for metadata access)
const objectStream = capture.createAudioStream('Spotify', { objectMode: true });
objectStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));

// Stop a stream when done
stream.stop();

[20] createSTTStream(appIdentifier?, options?): STTConverter

Create stream pre-configured for Speech-to-Text engines.

Parameter Type Description
appIdentifier string | number | Array | null App identifier(s), null for auto-select
Option Type Default Description
format 'int16' | 'float32' 'int16' Output format
channels 1 | 2 1 Output channels (mono recommended)
objectMode boolean false Emit objects with metadata
autoSelect boolean true Auto-select first audio app if not found
minVolume number - Silence filter threshold

// Auto-selects first audio app, converts to Int16 mono
const sttStream = capture.createSTTStream();
sttStream.pipe(yourSTTEngine);

// Alternatively, with fallback apps tried in order
const fallbackStream = capture.createSTTStream(['Zoom', 'Safari', 'Chrome']);

// Access selected app
console.log(`Selected: ${sttStream.app.applicationName}`);

// Stop
sttStream.stop();
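
The Int16 mono conversion that STT engines typically expect can be sketched as below. This illustrates the general technique only, assuming interleaved stereo Float32 input; it is not the STTConverter source, and stereoFloat32ToMonoInt16 is a hypothetical name.

```typescript
// Downmix interleaved stereo float32 samples to mono and quantise to int16.
function stereoFloat32ToMonoInt16(interleaved: Float32Array): Int16Array {
  const frames = Math.floor(interleaved.length / 2);
  const mono = new Int16Array(frames);
  for (let i = 0; i < frames; i++) {
    const average = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2;
    const clamped = Math.max(-1, Math.min(1, average)); // avoid int16 overflow
    mono[i] = Math.round(clamped * 32767);
  }
  return mono;
}

const mono = stereoFloat32ToMonoInt16(new Float32Array([1, 1, -1, -1, 0.5, -0.5]));
console.log(Array.from(mono)); // [32767, -32767, 0]
```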

Activity Tracking Methods

[21] enableActivityTracking(options?): void

Enable background tracking of audio activity. Useful for sorting apps by recent audio.

Option Type Default Description
decayMs number 30000 Remove apps from cache after this many ms of inactivity
capture.enableActivityTracking({ decayMs: 60000 }); // 60s decay

[22] disableActivityTracking(): void

Disable tracking and clear the cache.

capture.disableActivityTracking();

[23] getActivityInfo(): ActivityInfo

Get activity tracking status and statistics.

Returns ActivityInfo:

  • enabled: Whether tracking is enabled
  • trackedApps: Number of apps in cache
  • recentApps: Array of ProcessActivityInfo:
    • processId: Process ID
    • lastSeen: Timestamp of last audio
    • ageMs: Time since last audio
    • avgRMS: Average RMS level
    • sampleCount: Number of samples received
const info = capture.getActivityInfo();
console.log(`Active apps: ${info.trackedApps}`);
info.recentApps.forEach(app => {
  console.log(`PID ${app.processId}: ${app.sampleCount} samples`);
});
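
The decay behaviour can be pictured as a per-process cache whose entries expire after decayMs of silence. The sketch below is a hypothetical model of that idea (ActivityTracker and ActivityEntry are illustrative names, and timestamps are passed in explicitly to keep it deterministic); the package's internal bookkeeping may differ.

```typescript
interface ActivityEntry {
  lastSeen: number;    // ms timestamp of the most recent sample
  avgRMS: number;      // running mean of RMS values
  sampleCount: number;
}

class ActivityTracker {
  private readonly entries = new Map<number, ActivityEntry>();
  private readonly decayMs: number;

  constructor(decayMs = 30000) {
    this.decayMs = decayMs;
  }

  record(processId: number, rms: number, now: number): void {
    const entry = this.entries.get(processId) ?? { lastSeen: now, avgRMS: 0, sampleCount: 0 };
    entry.sampleCount += 1;
    entry.avgRMS += (rms - entry.avgRMS) / entry.sampleCount; // incremental mean
    entry.lastSeen = now;
    this.entries.set(processId, entry);
  }

  // Evicts entries idle for longer than decayMs, then lists PIDs, most recent first.
  recent(now: number): number[] {
    this.entries.forEach((entry, pid) => {
      if (now - entry.lastSeen > this.decayMs) this.entries.delete(pid);
    });
    return Array.from(this.entries.entries())
      .sort((a, b) => b[1].lastSeen - a[1].lastSeen)
      .map((pair) => pair[0]);
  }
}

const tracker = new ActivityTracker(30000);
tracker.record(111, 0.2, 0);
tracker.record(222, 0.4, 20000);
console.log(tracker.recent(25000)); // [222, 111]
console.log(tracker.recent(40000)); // [222]
```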

Lifecycle Methods

[24] dispose(): void

Release all resources and stop any active capture. Safe to call multiple times (idempotent).

const capture = new AudioCapture();
capture.startCapture('Spotify');

// When done, release resources
capture.dispose();

// Instance can no longer be used
console.log(capture.isDisposed()); // true

[25] isDisposed(): boolean

Check if this instance has been disposed.

if (!capture.isDisposed()) {
  capture.startCapture('Spotify');
}

Note: Calling methods like startCapture(), captureWindow(), or captureDisplay() on a disposed instance will throw an error.


Static Method Reference

[S1] AudioCapture.verifyPermissions(): PermissionStatus

Check screen recording permission before capture.

Returns PermissionStatus:

  • granted: Whether permission is granted
  • message: Human-readable status
  • apps: Prefetched app list (reuse with selectApp({ appList }))
  • availableApps: Number of apps found
  • remediation: Fix instructions (if not granted)
const status = AudioCapture.verifyPermissions();
if (!status.granted) {
  console.error(status.message);
  console.log(status.remediation);
  process.exit(1);
}

// Reuse apps list
const app = capture.selectApp(['Spotify'], { appList: status.apps });

[S2] AudioCapture.bufferToFloat32Array(buffer): Float32Array

Convert Buffer to Float32Array for audio processing.

capture.on('audio', (sample) => {
  const floats = AudioCapture.bufferToFloat32Array(sample.data);
  // Process individual samples
  for (let i = 0; i < floats.length; i++) {
    const value = floats[i]; // Range: -1.0 to 1.0
  }
});
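
For readers curious what such a conversion involves, the sketch below shows one common way to view a Node.js Buffer as Float32 samples. It is an assumption about the general technique, not the package's code; toFloat32Array is a hypothetical name and the data is assumed to be in the platform's (little-endian) byte order.

```typescript
// Views a Node.js Buffer of float32 PCM as a Float32Array.
// A Float32Array view needs a 4-byte-aligned offset, so unaligned Buffers are copied first.
function toFloat32Array(buf: Buffer): Float32Array {
  const length = Math.floor(buf.byteLength / 4);
  if (buf.byteOffset % 4 === 0) {
    return new Float32Array(buf.buffer, buf.byteOffset, length); // zero-copy view
  }
  const copy = new Uint8Array(buf.subarray(0, length * 4)); // typed-array constructor copies
  return new Float32Array(copy.buffer, 0, length);
}

const pcm = Buffer.from(new Float32Array([0.25, -0.5]).buffer);
console.log(Array.from(toFloat32Array(pcm))); // [0.25, -0.5]
```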

[S3] AudioCapture.rmsToDb(rms): number

Convert RMS value (0-1) to decibels.

const halfScaleDb = AudioCapture.rmsToDb(0.5); // -6.02 dB
const sampleDb = AudioCapture.rmsToDb(sample.rms);

[S4] AudioCapture.peakToDb(peak): number

Convert peak value (0-1) to decibels.

const db = AudioCapture.peakToDb(sample.peak);
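
Both conversions are consistent with the standard amplitude-to-decibel formula, dB = 20 * log10(level), which yields the -6.02 dB shown above for 0.5. A minimal sketch follows; levelToDb is a hypothetical helper, and mapping silence to -Infinity is an assumption, since the package's handling of 0 is not documented here.

```typescript
// Amplitude ratio to decibels: dB = 20 * log10(level).
// Assumption: silence (0) maps to -Infinity.
function levelToDb(level: number): number {
  return level > 0 ? 20 * Math.log10(level) : -Infinity;
}

console.log(levelToDb(0.5).toFixed(2)); // "-6.02"
console.log(levelToDb(1));              // 0
```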

[S5] AudioCapture.calculateDb(buffer, method?): number

Calculate dB level directly from audio buffer.

Parameter Type Default Description
buffer Buffer - Audio data buffer
method 'rms' | 'peak' 'rms' Calculation method
capture.on('audio', (sample) => {
  const rmsDb = AudioCapture.calculateDb(sample.data, 'rms');
  const peakDb = AudioCapture.calculateDb(sample.data, 'peak');
});

[S6] AudioCapture.writeWav(buffer, options): Buffer

Create a complete WAV file from PCM audio data.

Option Type Required Description
sampleRate number βœ“ Sample rate in Hz
channels number βœ“ Number of channels
format 'float32' | 'int16' Audio format (default: 'float32')
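
import fs from 'fs';

// Collect chunks during capture, then write one WAV file when capture stops.
// (Writing inside the 'audio' handler would overwrite the file on every sample.)
const chunks: Buffer[] = [];
let last: AudioSample | undefined;

capture.on('audio', (sample) => {
  chunks.push(sample.data);
  last = sample;
});

capture.on('stop', () => {
  if (!last) return;
  const wav = AudioCapture.writeWav(Buffer.concat(chunks), {
    sampleRate: last.sampleRate,
    channels: last.channels,
    format: last.format
  });
  fs.writeFileSync('output.wav', wav);
});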
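
To show what "a complete WAV file" means in practice, the sketch below builds the standard 44-byte RIFF/WAVE header for 16-bit integer PCM and prepends it to the audio data. encodeWavInt16 is a hypothetical function written for illustration; it is not the package's writeWav and does not cover the float32 case.

```typescript
// Builds a 44-byte RIFF/WAVE header followed by 16-bit PCM data.
function encodeWavInt16(pcm: Buffer, sampleRate: number, channels: number): Buffer {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);          // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                      // fmt chunk size
  header.writeUInt16LE(1, 20);                       // format tag 1 = integer PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // byte rate
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(16, 34);                      // bits per sample
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);
  return Buffer.concat([header, pcm]);
}

const wavBytes = encodeWavInt16(Buffer.alloc(8), 48000, 2);
console.log(wavBytes.length, wavBytes.toString('ascii', 0, 4)); // 52 RIFF
```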

[S7] AudioCapture.cleanupAll(): number

Dispose all active AudioCapture instances. Returns the number of instances cleaned up.

// Create multiple instances
const capture1 = new AudioCapture();
const capture2 = new AudioCapture();

console.log(AudioCapture.getActiveInstanceCount()); // 2

// Clean up all at once
const cleaned = AudioCapture.cleanupAll();
console.log(`Cleaned up ${cleaned} instances`); // 2

[S8] AudioCapture.getActiveInstanceCount(): number

Get the number of active (non-disposed) AudioCapture instances.

const capture = new AudioCapture();
console.log(AudioCapture.getActiveInstanceCount()); // 1

capture.dispose();
console.log(AudioCapture.getActiveInstanceCount()); // 0
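
The lifecycle semantics described above (idempotent dispose, an active-instance count, and bulk cleanup) match a simple registry pattern. The sketch below models that pattern with a hypothetical TrackedResource class; it is an assumption about the design, not the package's internals.

```typescript
// Hypothetical resource class; stands in for anything that holds native handles.
class TrackedResource {
  private static readonly active = new Set<TrackedResource>();
  private disposed = false;

  constructor() {
    TrackedResource.active.add(this);
  }

  dispose(): void {
    if (this.disposed) return; // idempotent: later calls are no-ops
    this.disposed = true;
    TrackedResource.active.delete(this);
  }

  isDisposed(): boolean {
    return this.disposed;
  }

  static getActiveInstanceCount(): number {
    return TrackedResource.active.size;
  }

  // Disposes every live instance and reports how many were cleaned up.
  static cleanupAll(): number {
    const pending = Array.from(TrackedResource.active);
    pending.forEach((resource) => resource.dispose());
    return pending.length;
  }
}

const first = new TrackedResource();
const second = new TrackedResource();
first.dispose();
first.dispose(); // safe to repeat
console.log(TrackedResource.getActiveInstanceCount()); // 1
console.log(TrackedResource.cleanupAll());             // 1
console.log(second.isDisposed());                      // true
```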

Error Handling

Class: AudioCaptureError

Custom error class thrown by the SDK.

  • message: Human-readable error message
  • code: Machine-readable error code (see below)
  • details: Additional context (e.g., processId, availableApps)

Error Codes

Import ErrorCode for reliable error checking:

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
capture.on('error', (err: AudioCaptureError) => {
  if (err.code === ErrorCode.APP_NOT_FOUND) {
    // Handle missing app
  }
});

Code                   Description
ERR_PERMISSION_DENIED  Screen Recording permission not granted
ERR_APP_NOT_FOUND      Application not found by name or bundle ID
ERR_PROCESS_NOT_FOUND  Process ID not found or not running
ERR_ALREADY_CAPTURING  Attempted to start capture while already capturing
ERR_CAPTURE_FAILED     Native capture failed to start (e.g., app has no windows)
ERR_INVALID_ARGUMENT   Invalid arguments provided to method
Using Error Codes:

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();

capture.on('error', (err: AudioCaptureError) => {
  switch (err.code) {
    case ErrorCode.PERMISSION_DENIED:
      console.log('Grant Screen Recording permission');
      break;
    case ErrorCode.APP_NOT_FOUND:
      console.log('App not found:', err.details.requestedApp);
      console.log('Available:', err.details.availableApps);
      break;
    case ErrorCode.ALREADY_CAPTURING:
      console.log('Stop current capture first');
      capture.stopCapture();
      break;
    default:
      console.error('Error:', err.message);
  }
});

Stream Classes

AudioStream - Readable stream extending Node.js Readable:

  • stop() - Stop stream and capture
  • getCurrentCapture() - Get current capture info

STTConverter - Transform stream extending Node.js Transform:

  • stop() - Stop stream and capture
  • app - The selected ApplicationInfo
  • captureOptions - Options used for capture

Low-Level API: ScreenCaptureKit

For advanced users who need direct access to the native binding:

import { ScreenCaptureKit } from 'screencapturekit-audio-capture';

const captureKit = new ScreenCaptureKit();

// Get apps (returns basic ApplicationInfo array)
const apps = captureKit.getAvailableApps();

// Start capture (requires manual callback handling)
captureKit.startCapture(processId, config, (sample) => {
  // sample: { data, sampleRate, channelCount, timestamp }
  // No enhancement - raw native data
});

captureKit.stopCapture();
const isCapturing = captureKit.isCapturing();

When to use:

  • Absolute minimal overhead needed
  • Building your own wrapper
  • Avoiding event emitter overhead

Most users should use AudioCapture instead.


Multi-Process Capture Service

macOS ScreenCaptureKit only allows one process to capture audio at a time. If you need multiple processes to receive the same audio data, use the server/client architecture.

See readme_examples/advanced/20-capture-service.ts for a complete example

When to Use

Scenario Solution
Single app capturing audio Use AudioCapture directly
Multiple processes need same audio Use AudioCaptureServer + AudioCaptureClient
Electron main + renderer processes Use server/client
Microservices architecture Use server/client

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                  AudioCaptureServer                     │
│  - Runs in one process                                  │
│  - Handles actual ScreenCaptureKit capture              │
│  - Broadcasts audio to all connected clients            │
└─────────────────────────────────────────────────────────┘
              │ WebSocket (ws://localhost:9123)
              ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Client 1  │  │   Client 2  │  │   Client N  │
│  (Process A)│  │  (Process B)│  │  (Process N)│
└─────────────┘  └─────────────┘  └─────────────┘

Server Usage

import { AudioCaptureServer } from 'screencapturekit-audio-capture';

const server = new AudioCaptureServer({
  port: 9123,        // Default: 9123
  host: 'localhost'  // Default: 'localhost'
});

// Start the server
await server.start();

// Server events
server.on('clientConnected', (clientId) => console.log(`Client ${clientId} connected`));
server.on('clientDisconnected', (clientId) => console.log(`Client ${clientId} disconnected`));
server.on('captureStarted', (session) => console.log(`Capture started: ${session.id}`));
server.on('captureStopped', () => console.log('Capture stopped'));
server.on('captureError', (error) => console.error('Capture error:', error));

// Stop the server
await server.stop();

Server Methods

| Method | Returns | Description |
| --- | --- | --- |
| start() | Promise<void> | Start the WebSocket server |
| stop() | Promise<void> | Stop server and disconnect all clients |
| getSession() | CaptureSession \| null | Get current capture session info |
| getClientCount() | number | Get number of connected clients |

Server Events

| Event | Payload | Description |
| --- | --- | --- |
| 'clientConnected' | clientId: string | Client connected |
| 'clientDisconnected' | clientId: string | Client disconnected |
| 'captureStarted' | CaptureSession | Capture session started |
| 'captureStopped' | - | Capture session stopped |
| 'captureError' | Error | Capture error occurred |

Client Usage

import { AudioCaptureClient } from 'screencapturekit-audio-capture';

const client = new AudioCaptureClient({
  url: 'ws://localhost:9123',  // Default
  autoReconnect: true,         // Default: true
  reconnectDelay: 1000,        // Default: 1000ms
  maxReconnectAttempts: 10     // Default: 10
});

// Connect to server
await client.connect();

// Receive audio (similar API to AudioCapture)
client.on('audio', (sample) => {
  console.log(`Received ${sample.data.length} samples at ${sample.sampleRate}Hz`);
});

// List available apps (via server)
const apps = await client.getApplications();

// Start capture (request sent to server)
await client.startCapture('Spotify');
// Or by PID: await client.startCapture(12345);

// Other capture methods
await client.captureWindow(windowId);
await client.captureDisplay(displayId);
await client.captureMultipleApps(['Spotify', 'Discord']);

// Get server status
const status = await client.getStatus();
console.log(`Capturing: ${status.capturing}, Clients: ${status.totalClients}`);

// Stop capture
await client.stopCapture();

// Disconnect
client.disconnect();

Client Methods

| Method | Returns | Description |
| --- | --- | --- |
| connect() | Promise<void> | Connect to the server |
| disconnect() | void | Disconnect from server |
| getApplications() | Promise<ApplicationInfo[]> | List apps via server |
| getWindows() | Promise<WindowInfo[]> | List windows via server |
| getDisplays() | Promise<DisplayInfo[]> | List displays via server |
| startCapture(target, opts?) | Promise<boolean> | Start app capture |
| captureWindow(id, opts?) | Promise<boolean> | Start window capture |
| captureDisplay(id, opts?) | Promise<boolean> | Start display capture |
| captureMultipleApps(targets, opts?) | Promise<boolean> | Start multi-app capture |
| stopCapture() | Promise<void> | Stop current capture |
| getStatus() | Promise<ServerStatus> | Get server status |
| getClientId() | string \| null | Get this client's ID |
| getSessionId() | string \| null | Get current session ID |

Client Events

| Event | Payload | Description |
| --- | --- | --- |
| 'connected' | - | Connected to server |
| 'disconnected' | - | Disconnected from server |
| 'reconnecting' | attempt: number | Attempting to reconnect |
| 'reconnectFailed' | - | Max reconnect attempts reached |
| 'audio' | RemoteAudioSample | Audio data received |
| 'captureStopped' | - | Server stopped capture |
| 'captureError' | { message: string } | Server capture error |
| 'error' | Error | Client-side error |
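The connection and error events above can be wired up in one place. The following is a minimal sketch, assuming the client exposes a Node.js-style `on(event, listener)` interface as the usage example suggests. `attachClientLogging` is a hypothetical helper, not a package export, and a plain `EventEmitter` stands in for the client so the sketch runs without a server.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical helper: logs the non-audio client events listed in the table.
// Assumption: the real client behaves like a Node.js EventEmitter.
function attachClientLogging(client: EventEmitter, log: (line: string) => void): void {
  client.on('connected', () => log('connected'));
  client.on('disconnected', () => log('disconnected'));
  client.on('reconnecting', (attempt: number) => log(`reconnecting (attempt ${attempt})`));
  client.on('reconnectFailed', () => log('reconnect failed: giving up'));
  client.on('captureStopped', () => log('server stopped capture'));
  client.on('captureError', (err: { message: string }) => log(`capture error: ${err.message}`));
  client.on('error', (err: Error) => log(`client error: ${err.message}`));
}

// Stand-in emitter used only to demonstrate the wiring.
const fakeClient = new EventEmitter();
const lines: string[] = [];
attachClientLogging(fakeClient, (line) => lines.push(line));

fakeClient.emit('reconnecting', 3);
fakeClient.emit('captureError', { message: 'target app quit' });
console.log(lines);
```

Registering an 'error' listener matters for any Node.js emitter: an 'error' event with no listener is thrown as an exception.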

RemoteAudioSample

Audio samples received by clients have this structure:

| Property | Type | Description |
| --- | --- | --- |
| data | Float32Array | Audio sample data |
| sampleRate | number | Sample rate in Hz |
| channels | number | Number of channels |
| timestamp | number | Timestamp in seconds |
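To illustrate how these fields relate, the sketch below estimates how much audio a single sample covers. It assumes `data` interleaves the values of all channels, so one frame is `channels` values; the table does not state the layout, so treat that as an assumption. `RemoteAudioSampleLike` and `sampleDurationMs` are local, illustrative names rather than package exports.

```typescript
// Local mirror of the RemoteAudioSample fields (illustrative, not imported from the package).
interface RemoteAudioSampleLike {
  data: Float32Array;
  sampleRate: number;
  channels: number;
  timestamp: number;
}

// Assumption: `data` interleaves all channels, so frames = data.length / channels.
function sampleDurationMs(sample: RemoteAudioSampleLike): number {
  if (sample.sampleRate <= 0 || sample.channels <= 0) return 0;
  const frames = sample.data.length / sample.channels;
  return (frames * 1000) / sample.sampleRate;
}

const exampleSample: RemoteAudioSampleLike = {
  data: new Float32Array(9600), // 4800 stereo frames
  sampleRate: 48000,
  channels: 2,
  timestamp: 0,
};
console.log(`${sampleDurationMs(exampleSample)} ms of audio`);
```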

Events Reference

Event: 'start'

Emitted when capture starts.

capture.on('start', ({ processId, app }) => {
  console.log(`Capturing from ${app?.applicationName}`);
});

Event: 'audio'

Emitted for each audio sample. See Audio Sample Structure for all properties.

capture.on('audio', (sample: AudioSample) => {
  console.log(`${sample.durationMs}ms, RMS: ${sample.rms}`);
});

Event: 'stop'

Emitted when capture stops.

capture.on('stop', ({ processId }) => {
  console.log('Capture stopped');
});

Event: 'error'

Emitted on errors.

capture.on('error', (err: AudioCaptureError) => {
  console.error(`[${err.code}]:`, err.message);
});

TypeScript

Full type definitions included. See Module Exports for import syntax.

Available Types

| Type | Description |
| --- | --- |
| AudioSample | Audio sample with data and metadata |
| ApplicationInfo | App info (processId, bundleIdentifier, applicationName) |
| WindowInfo | Window info (windowId, title, frame, etc.) |
| DisplayInfo | Display info (displayId, width, height, etc.) |
| CaptureInfo | Current capture target info |
| CaptureStatus | Full capture status including config |
| PermissionStatus | Permission verification result |
| ActivityInfo | Activity tracking stats |
| CaptureOptions | Options for startCapture() |
| AudioStreamOptions | Options for createAudioStream() |
| STTStreamOptions | Options for createSTTStream() |
| MultiAppCaptureOptions | Options for captureMultipleApps() |
| MultiWindowCaptureOptions | Options for captureMultipleWindows() |
| MultiDisplayCaptureOptions | Options for captureMultipleDisplays() |
| ServerOptions | Options for AudioCaptureServer |
| ClientOptions | Options for AudioCaptureClient |
| RemoteAudioSample | Audio sample received via client |
| CleanupResult | Result of cleanupAll() operation |
| ErrorCode | Enum of error codes |

Working with Audio Data

Buffer Format

By default, each sample's data property is a Node.js Buffer containing Float32 PCM:

capture.on('audio', (sample) => {
  // Use helper (recommended)
  const float32 = AudioCapture.bufferToFloat32Array(sample.data);

  // Or build the view manually. Note: the Float32Array constructor throws a
  // RangeError if sample.data.byteOffset is not a multiple of 4.
  const float32Manual = new Float32Array(
    sample.data.buffer,
    sample.data.byteOffset,
    sample.data.byteLength / 4
  );
});
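A typed-array view over a Buffer requires the byte offset to be a multiple of the element size, which is not guaranteed for every Buffer. The sketch below shows one way to guard the manual conversion by falling back to a copy when the data is unaligned. `toFloat32ArraySafe` is a hypothetical name, and the package's own helper may already handle this case.

```typescript
// Typed as Uint8Array so it also accepts a Node.js Buffer (a Uint8Array subclass).
function toFloat32ArraySafe(bytes: Uint8Array): Float32Array {
  const count = Math.floor(bytes.byteLength / 4);
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: zero-copy view over the same memory.
    return new Float32Array(bytes.buffer, bytes.byteOffset, count);
  }
  // Unaligned: copy the bytes into a fresh, aligned ArrayBuffer first.
  const copy = new Uint8Array(count * 4);
  copy.set(bytes.subarray(0, count * 4));
  return new Float32Array(copy.buffer);
}

// Demo: two floats stored starting at byte offset 1 of a larger buffer.
const backing = new ArrayBuffer(9);
const writer = new DataView(backing);
writer.setFloat32(1, 0.5, true);   // little-endian, matching Apple Silicon and x86
writer.setFloat32(5, -0.25, true);
const unalignedBytes = new Uint8Array(backing, 1, 8);

console.log(Array.from(toFloat32ArraySafe(unalignedBytes)));
```

The aligned path shares memory with the source, so copy the result if the Buffer may be reused after the callback returns.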

Int16 Format

capture.startCapture('Spotify', { format: 'int16' });

capture.on('audio', (sample) => {
  const int16 = new Int16Array(
    sample.data.buffer,
    sample.data.byteOffset,
    sample.data.byteLength / 2
  );
});
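If downstream code expects normalized floats, Int16 samples can be scaled back into the range [-1, 1). This is a minimal sketch of that conversion; `int16ToFloat32` is an illustrative helper, not part of the package.

```typescript
// Scale signed 16-bit PCM into normalized floats.
// Int16 spans [-32768, 32767], so dividing by 32768 yields values in [-1, 1).
function int16ToFloat32(input: Int16Array): Float32Array {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    out[i] = input[i] / 32768;
  }
  return out;
}

console.log(Array.from(int16ToFloat32(new Int16Array([0, 16384, -32768]))));
```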

Filtering Silence

capture.startCapture('Spotify', { minVolume: 0.01 });
// Only emits audio events when volume > 0.01 RMS
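To pick a sensible threshold, it helps to see what an RMS comparison looks like on raw samples. The sketch below computes RMS over a Float32Array, converts a level to dBFS, and applies the same "greater than minVolume" test described above. It illustrates the idea only; how the library measures volume internally is not shown in this document, and the function names are hypothetical.

```typescript
// Root-mean-square level of normalized samples (0 for an empty buffer).
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Convert a linear level to decibels relative to full scale.
function toDbfs(level: number): number {
  return level > 0 ? 20 * Math.log10(level) : -Infinity;
}

// Mirrors the documented rule: keep the chunk only when RMS > minVolume.
function passesGate(samples: Float32Array, minVolume: number): boolean {
  return rms(samples) > minVolume;
}

const tone = new Float32Array([0.5, -0.5, 0.5, -0.5]);
console.log(rms(tone), toDbfs(0.01), passesGate(tone, 0.01));
```

A threshold of 0.01 corresponds to roughly -40 dBFS, which is well below typical music or speech levels.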

Resource Lifecycle

📁 See readme_examples/advanced/21-graceful-cleanup.ts for a complete example

Properly managing resources ensures your application shuts down cleanly without orphaned captures or memory leaks.

Instance Cleanup

import { AudioCapture } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
capture.startCapture('Spotify');

// When done with this specific instance
capture.dispose();  // Stops capture and releases resources

Global Cleanup

import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';

// Check active instances
console.log(`Active: ${getActiveInstanceCount()}`);

// Clean up all instances at once
const result = await cleanupAll();  // Returns CleanupResult
console.log(`Cleaned up ${result.total} instances`);

// Install automatic cleanup on process exit (SIGINT, SIGTERM, etc.)
installGracefulShutdown();

Best Practices

| Pattern | When to Use |
| --- | --- |
| capture.dispose() | Cleaning up a specific instance |
| AudioCapture.cleanupAll() | Cleaning up all AudioCapture instances |
| cleanupAll() | Cleaning up all instances (AudioCapture + AudioCaptureServer) |
| installGracefulShutdown() | Auto-cleanup on Ctrl+C, kill signals, or uncaught exceptions |

Process Exit Handling

Exit handlers are automatically installed when you create an AudioCapture or AudioCaptureServer instance. For explicit control:

import { installGracefulShutdown } from 'screencapturekit-audio-capture';

// Install once at application startup
installGracefulShutdown();

// Now SIGINT/SIGTERM will automatically:
// 1. Stop all active captures
// 2. Dispose all instances
// 3. Exit cleanly
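The shutdown sequence above follows a common pattern: track live instances, dispose each one, then exit. The sketch below illustrates that pattern with a small registry. It is not the library's implementation, and `DisposableLike`, `InstanceRegistry`, and the `{ total, failed }` result shape are hypothetical names chosen for this example.

```typescript
// Anything that can release its own resources.
interface DisposableLike {
  dispose(): void;
}

// Hypothetical registry showing the "dispose everything on shutdown" pattern.
class InstanceRegistry {
  private readonly active = new Set<DisposableLike>();

  register(instance: DisposableLike): void {
    this.active.add(instance);
  }

  get count(): number {
    return this.active.size;
  }

  // Dispose every instance, continuing past individual failures.
  disposeAll(): { total: number; failed: number } {
    const total = this.active.size;
    let failed = 0;
    for (const instance of this.active) {
      try {
        instance.dispose();
      } catch {
        failed++;
      }
    }
    this.active.clear();
    return { total, failed };
  }
}

const registry = new InstanceRegistry();
registry.register({ dispose: () => console.log('capture A stopped') });
registry.register({ dispose: () => { throw new Error('already closed'); } });
console.log(registry.disposeAll(), registry.count);

// In a real process this would typically run from a signal handler, e.g.
// process.once('SIGINT', () => { registry.disposeAll(); process.exit(130); });
```

Continuing past a failed dispose keeps one broken instance from blocking cleanup of the rest.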

Common Issues

No applications available

Solution: Grant Screen Recording permission in System Settings → Privacy & Security → Screen Recording, then restart your terminal.

Application not found

Solutions:

  1. Check if the app is running
  2. Use capture.getApplications() to list available apps
  3. Use bundle ID instead of name: capture.startCapture('com.spotify.client')

No audio samples received

Solutions:

  1. Ensure the app is playing audio
  2. Check if audio is muted
  3. Remove minVolume threshold for testing
  4. Verify the app has visible windows

Build errors

Note: Most users won't see build errors since prebuilt binaries are included. These steps apply only if compilation is needed.

Solutions:

  1. Install Xcode CLI Tools: xcode-select --install
  2. Verify macOS 13.0+: sw_vers
  3. Clean rebuild: npm run clean && npm run build

Examples

📁 All examples are in readme_examples/

Basics

| Example | File | Description |
| --- | --- | --- |
| Quick Start | basics/01-quick-start.ts | Basic capture setup |
| Robust Capture | basics/05-robust-capture.ts | Production error handling |
| Find Apps | basics/11-find-apps.ts | App discovery |

Voice & STT

| Example | File | Description |
| --- | --- | --- |
| STT Integration | voice/02-stt-integration.ts | Speech-to-text patterns |
| Voice Agent | voice/03-voice-agent.ts | Real-time voice processing |
| Recording | voice/04-audio-recording.ts | Record and save as WAV |

Streams

| Example | File | Description |
| --- | --- | --- |
| Stream Basics | streams/06-stream-basics.ts | Stream API fundamentals |
| Stream Processing | streams/07-stream-processing.ts | Transform streams |

Processing

| Example | File | Description |
| --- | --- | --- |
| Visualizer | processing/08-visualizer.ts | ASCII volume display |
| Volume Monitor | processing/09-volume-monitor.ts | Level alerts |
| Int16 Capture | processing/10-int16-capture.ts | Int16 format |
| Manual Processing | processing/12-manual-processing.ts | Buffer manipulation |

Capture Targets

| Example | File | Description |
| --- | --- | --- |
| Multi-App Capture | capture-targets/13-multi-app-capture.ts | Multiple apps |
| Per-App Streams | capture-targets/14-per-app-streams.ts | Separate streams |
| Window Capture | capture-targets/15-window-capture.ts | Single window |
| Display Capture | capture-targets/16-display-capture.ts | Full display |
| Multi-Window | capture-targets/17-multi-window-capture.ts | Multiple windows |
| Multi-Display | capture-targets/18-multi-display-capture.ts | Multiple displays |

Advanced

| Example | File | Description |
| --- | --- | --- |
| Advanced Methods | advanced/19-advanced-methods.ts | Activity tracking |
| Capture Service | advanced/20-capture-service.ts | Multi-process sharing |
| Graceful Cleanup | advanced/21-graceful-cleanup.ts | Resource lifecycle management |

Run examples:

npx tsx readme_examples/basics/01-quick-start.ts
npm run test:readme  # Run all examples

Targeting specific apps/windows/displays:

Most examples support environment variables to target specific sources instead of using defaults:

| Env Variable | Type | Used By | Example |
| --- | --- | --- | --- |
| TARGET_APP | App name | 01-12, 19-21 | TARGET_APP="Spotify" npx tsx readme_examples/basics/01-quick-start.ts |
| TARGET_APPS | Comma-separated | 13, 14 | TARGET_APPS="Safari,Music" npx tsx readme_examples/capture-targets/13-multi-app-capture.ts |
| TARGET_WINDOW | Window ID | 15, 17 | TARGET_WINDOW=12345 npx tsx readme_examples/capture-targets/15-window-capture.ts |
| TARGET_DISPLAY | Display ID | 16, 18 | TARGET_DISPLAY=1 npx tsx readme_examples/capture-targets/16-display-capture.ts |
| VERIFY | 1 or true | 13 | VERIFY=1 npx tsx readme_examples/capture-targets/13-multi-app-capture.ts |

Tip: Run npx tsx readme_examples/basics/11-find-apps.ts to list available apps and their names. Window/display IDs are printed when running the respective capture examples.

Important: Environment variables must be placed before the command, not after. TARGET_APP="Spotify" npx tsx ... works, but npx tsx ... TARGET_APP="Spotify" does not.
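If you write your own scripts in the same style, the variables can be read with a couple of small parsers. This is a sketch under the formats listed in the table (comma-separated names, and `1` or `true` for the flag). `parseTargetApps` and `parseVerifyFlag` are hypothetical helpers and not necessarily how the bundled examples parse their input.

```typescript
// Split a comma-separated list such as "Safari,Music" into trimmed, non-empty names.
function parseTargetApps(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Treat "1" or "true" as enabled, anything else (including unset) as disabled.
function parseVerifyFlag(raw: string | undefined): boolean {
  return raw === '1' || raw === 'true';
}

// In a script these would typically receive process.env.TARGET_APPS and process.env.VERIFY.
console.log(parseTargetApps('Safari, Music'), parseVerifyFlag('1'));
```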


Platform Support

| macOS Version | Support | Notes |
| --- | --- | --- |
| macOS 15+ (Sequoia) | ⚠️ Known issues | Single-process audio capture limitation (use server/client) |
| macOS 14+ (Sonoma) | ✅ Full | Recommended |
| macOS 13+ (Ventura) | ✅ Full | Minimum required |
| macOS 12.x and below | ❌ No | ScreenCaptureKit not available |
| Windows/Linux | ❌ No | macOS-only framework |

Note: On macOS 15+, only one process can capture audio at a time via ScreenCaptureKit. If you need multiple processes to receive audio, use the Multi-Process Capture Service.


Performance

Typical (Apple Silicon M1):

  • CPU: <1% for stereo Float32
  • Memory: ~10-20MB
  • Latency: ~160ms (configurable)

Optimization tips:

  • Use minVolume to filter silence
  • Use format: 'int16' for 50% memory reduction
  • Use channels: 1 for another 50% reduction
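The two 50% figures follow from the raw PCM data rate, which is sample rate × channels × bytes per sample. The sketch below works through the arithmetic, assuming a 48 kHz sample rate purely for illustration; `bytesPerSecond` is a hypothetical helper.

```typescript
type SampleFormat = 'float32' | 'int16';

// Bytes used by one sample value in each format.
const BYTES_PER_SAMPLE: Record<SampleFormat, number> = { float32: 4, int16: 2 };

// Raw PCM data rate in bytes per second.
function bytesPerSecond(sampleRate: number, channels: number, format: SampleFormat): number {
  return sampleRate * channels * BYTES_PER_SAMPLE[format];
}

const assumedRate = 48000; // assumption for illustration only

console.log(bytesPerSecond(assumedRate, 2, 'float32')); // 384000
console.log(bytesPerSecond(assumedRate, 2, 'int16'));   // 192000
console.log(bytesPerSecond(assumedRate, 1, 'int16'));   // 96000
```

Combining both options reduces the data rate to a quarter of the stereo Float32 default.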

Contributing

git clone https://github.com/mrlionware/screencapturekit-audio-capture.git
cd screencapturekit-audio-capture
npm install
npm run build
npm test

License

MIT License - see LICENSE


Made with ❤️ for the Node.js and macOS developer community
