ScreenCaptureKit Audio Capture

Native Node.js addon for capturing per-application audio on macOS using the ScreenCaptureKit framework.

Capture real-time audio from any macOS application with a simple, event-driven API. Built with N-API for Node.js compatibility and ScreenCaptureKit for system-level audio access.

📖 Table of Contents

Features
Requirements
Installation
Project Structure
Quick Start
Quick Integration Guide
Module Exports
Testing
Stream-Based API
API Reference
Multi-Process Capture Service
Events Reference
TypeScript
Working with Audio Data
Resource Lifecycle
Common Issues
Examples
Platform Support
Performance
Contributing
License

Features

🎵 Per-App Audio Capture - Isolate audio from specific applications, windows, or displays
🎭 Multi-Source Capture - Capture from multiple apps simultaneously with mixed output
⚡ Real-Time Streaming - Low-latency callbacks with Event or Stream-based APIs
🔄 Multi-Process Service - Server/client architecture for sharing audio across processes
📊 Audio Utilities - Built-in RMS/peak/dB analysis and WAV file export
📘 TypeScript-First - Full type definitions with memory-safe resource cleanup

Requirements

macOS 13.0 (Ventura) or later
Node.js 14.0.0 or later (Node.js 18+ recommended for running the automated test suite)
Screen Recording permission (granted in System Settings > Privacy & Security > Screen Recording)

Installation

npm install screencapturekit-audio-capture

Prebuilt binaries are included for Apple Silicon (ARM64) Macs, so no compilation or Xcode is required on those machines.

Fallback Compilation

If no prebuild is available for your architecture, the addon will compile from source automatically. This requires:

Xcode Command Line Tools (minimum version 14.0)
```
xcode-select --install
```
macOS SDK 13.0 or later

The build process links these macOS frameworks:

ScreenCaptureKit - Per-application audio capture
AVFoundation - Audio processing
CoreMedia - Media sample handling
CoreVideo - Video frame handling
Foundation - Core Objective-C runtime

All frameworks are part of the macOS system and require no additional installation.

Package Contents

When installed from npm, the package includes:

src/ - TypeScript SDK source code and native C++/Objective-C++ code
dist/ - Compiled JavaScript and TypeScript declarations
binding.gyp - Native build configuration
README.md, LICENSE, CHANGELOG.md

Note: Example files are available in the GitHub repository but are not included in the npm package to reduce installation size.

Run npm ls screencapturekit-audio-capture to see the installation location.

Project Structure

screencapturekit-audio-capture/
├── src/                        # Source code
│   ├── capture/                # Core audio capture functionality
│   │   ├── index.ts            # Barrel exports
│   │   ├── audio-capture.ts    # Main AudioCapture class
│   │   └── audio-stream.ts     # Readable stream wrapper
│   │
│   ├── native/                 # Native C++/Objective-C++ code
│   │   ├── addon.mm            # Node.js N-API bindings
│   │   ├── wrapper.h           # C++ header
│   │   └── wrapper.mm          # ScreenCaptureKit implementation
│   │
│   ├── service/                # Multi-process capture service
│   │   ├── index.ts            # Barrel exports
│   │   ├── server.ts           # WebSocket server for shared capture
│   │   └── client.ts           # WebSocket client
│   │
│   ├── utils/                  # Utility modules
│   │   ├── index.ts            # Barrel exports
│   │   ├── stt-converter.ts    # Speech-to-text transform stream
│   │   └── native-loader.ts    # Native addon loader
│   │
│   ├── core/                   # Shared types, errors, and lifecycle
│   │   ├── index.ts            # Barrel exports
│   │   ├── types.ts            # TypeScript type definitions
│   │   ├── errors.ts           # Error classes and codes
│   │   └── cleanup.ts          # Resource cleanup utilities
│   │
│   └── index.ts                # Main package exports
│
├── dist/                       # Compiled JavaScript output
├── tests/                      # Test suites (unit, integration, edge-cases)
├── readme_examples/            # Runnable example scripts
├── prebuilds/                  # Prebuilt native binaries
├── build/                      # Native compilation output
│
├── package.json                # Package manifest
├── tsconfig.json               # TypeScript configuration
├── binding.gyp                 # Native addon build configuration
├── CHANGELOG.md                # Version history
├── LICENSE                     # MIT License
└── README.md                   # This file

Quick Start

📁 See readme_examples/basics/01-quick-start.ts for runnable code

import { AudioCapture } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
const app = capture.selectApp(['Spotify', 'Music', 'Safari'], { fallbackToFirst: true });

capture.on('audio', (sample) => {
  console.log(`Volume: ${AudioCapture.rmsToDb(sample.rms).toFixed(1)} dB`);
});

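// selectApp() returns null when no capturable app is running
if (!app) throw new Error('No capturable audio app found');
capture.startCapture(app.processId);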
setTimeout(() => capture.stopCapture(), 10000);

Quick Integration Guide

📁 All integration patterns below have runnable examples in readme_examples/

Common patterns for integrating audio capture into your application:

Pattern	Example File	Description
STT Integration	`voice/02-stt-integration.ts`	Stream + event-based approaches for speech-to-text
Voice Agent	`voice/03-voice-agent.ts`	Real-time processing with low-latency config
Recording	`voice/04-audio-recording.ts`	Capture to WAV file with efficient settings
Robust Capture	`basics/05-robust-capture.ts`	Production error handling with fallbacks
Multi-App	`capture-targets/13-multi-app-capture.ts`	Capture game + Discord, Zoom + Music, etc.
Multi-Process	`advanced/20-capture-service.ts`	Share audio across multiple processes
Graceful Cleanup	`advanced/21-graceful-cleanup.ts`	Resource lifecycle and cleanup utilities

Key Configuration Patterns

For STT engines:

{ format: 'int16', channels: 1, minVolume: 0.01 }  // Int16 mono, silence filtered

For low-latency voice processing:

{ format: 'int16', channels: 1, bufferSize: 1024, minVolume: 0.005 }

For recording:

{ format: 'int16', channels: 2, bufferSize: 4096 }  // Stereo, larger buffer for stability
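The `format: 'int16'` and `channels: 1` settings above imply two conversions: scaling floats to 16-bit integers and folding stereo into mono. The sketch below shows one common way to do both on interleaved samples. It is an illustration under that assumption, not the addon's native implementation, and `downmixStereoToMono` and `floatToInt16` are hypothetical helper names.

```typescript
// Hypothetical helpers for illustration; the addon performs its own conversion natively.

/** Average interleaved stereo frames (L, R, L, R, ...) into mono. */
function downmixStereoToMono(interleaved: Float32Array): Float32Array {
  const frames = Math.floor(interleaved.length / 2);
  const mono = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2;
  }
  return mono;
}

/** Scale floats in [-1, 1] to signed 16-bit integers, clamping out-of-range values. */
function floatToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale by 32768, positive by 32767, so both ends of the range are reachable
    out[i] = Math.round(clamped < 0 ? clamped * 32768 : clamped * 32767);
  }
  return out;
}

const stereo = new Float32Array([1, 0, -1, -1, 0.5, 0.5]);
const pcm = floatToInt16(downmixStereoToMono(stereo));
console.log(Array.from(pcm)); // → [16384, -32768, 16384]
```

Int16 mono halves the bytes per sample and the channel count, which is why it is the usual input for STT engines.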

Audio Sample Structure

Property	Type	Description
`data`	`Buffer`	Audio data (Float32 or Int16)
`sampleRate`	`number`	Sample rate in Hz (e.g., 48000)
`channels`	`number`	1 = mono, 2 = stereo
`format`	`'float32' \| 'int16'`	Audio format
`rms`	`number`	RMS volume (0.0-1.0)
`peak`	`number`	Peak volume (0.0-1.0)
`timestamp`	`number`	Timestamp in seconds
`durationMs`	`number`	Duration in milliseconds
`sampleCount`	`number`	Total samples across all channels
`framesCount`	`number`	Frames per channel
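The size-related fields in the table are related by simple arithmetic. The sketch below derives `sampleCount`, `framesCount`, and `durationMs` from a byte length, assuming interleaved PCM with 4 bytes per Float32 sample and 2 bytes per Int16 sample; `describeLayout` is a hypothetical helper, not part of the package.

```typescript
type SampleFormat = 'float32' | 'int16';

interface Layout {
  sampleCount: number; // total samples across all channels
  framesCount: number; // samples per channel
  durationMs: number;  // playback duration of the chunk
}

// Hypothetical helper: derives the layout fields from the raw byte length.
function describeLayout(
  byteLength: number,
  format: SampleFormat,
  channels: number,
  sampleRate: number
): Layout {
  const bytesPerSample = format === 'float32' ? 4 : 2;
  const sampleCount = byteLength / bytesPerSample;
  const framesCount = sampleCount / channels;
  const durationMs = (framesCount * 1000) / sampleRate;
  return { sampleCount, framesCount, durationMs };
}

// 7680 bytes of stereo Float32 at 48 kHz: 1920 samples, 960 frames, 20 ms
console.log(describeLayout(7680, 'float32', 2, 48000));
```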

Module Exports

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';
import type { AudioSample, ApplicationInfo } from 'screencapturekit-audio-capture';

// Multi-process capture service (for sharing audio across processes)
import { AudioCaptureServer, AudioCaptureClient } from 'screencapturekit-audio-capture';

// Resource cleanup utilities
import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';

Export	Description
`AudioCapture`	High-level event-based API (recommended)
`AudioStream`	Readable stream (via `createAudioStream()`)
`STTConverter`	Transform stream for STT (via `createSTTStream()`)
`AudioCaptureServer`	WebSocket server for shared capture (multi-process)
`AudioCaptureClient`	WebSocket client to receive shared audio
`AudioCaptureError`	Error class with codes and details
`ErrorCode`	Error code enum for type-safe handling
`cleanupAll`	Dispose all AudioCapture and AudioCaptureServer instances
`getActiveInstanceCount`	Get total active instance count
`installGracefulShutdown`	Install process exit handlers for cleanup
`ScreenCaptureKit`	Low-level native binding (advanced)

Types: AudioSample, ApplicationInfo, WindowInfo, DisplayInfo, CaptureOptions, PermissionStatus, ActivityInfo, ServerOptions, ClientOptions, and more.

Testing

Note: Test files are available in the GitHub repository but are not included in the npm package.

Tests are written in TypeScript and live under tests/. They use Node's built-in test runner with tsx (Node 18+).

Test Commands:

npm test — Runs every suite in tests/**/*.test.ts (unit, integration, edge-cases) against the mocked ScreenCaptureKit layer; works cross-platform.
npm run test:unit — Fast coverage for utilities, audio metrics, selection, and capture control.
npm run test:integration — Multi-component flows (window/display capture, activity tracking, capability guards) using the shared mock.
npm run test:edge-cases — Boundary/error handling coverage.

Type Checking:

npm run typecheck — Type-check the SDK source code.
npm run typecheck:tests — Type-check the test files.

For true hardware validation, run the example scripts on macOS with Screen Recording permission enabled.

Stream-Based API

📁 See readme_examples/streams/06-stream-basics.ts and readme_examples/streams/07-stream-processing.ts for runnable examples

Use Node.js Readable streams for composable audio processing:

const audioStream = capture.createAudioStream('Spotify', { minVolume: 0.01 });
audioStream.pipe(yourWritableStream);

// Object mode for metadata access
const metaStream = capture.createAudioStream('Spotify', { objectMode: true });
metaStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));

When to Use Streams vs Events

Use Case	Recommended API
Piping through transforms	Stream
Backpressure handling	Stream
Multiple listeners	Event
Maximum simplicity	Event

Both APIs use the same underlying capture mechanism and have identical performance.

Stream API Best Practices

Always handle errors - Attach an error handler to prevent crashes
Use pipeline() - Better error handling than chaining .pipe()
Clean up resources - Call stream.stop() when done
Choose the right mode - Normal mode for raw data, object mode for metadata
Stream must flow - Attach a data listener to start capture

import { pipeline } from 'stream';

// Recommended pattern
pipeline(audioStream, transform, writable, (err) => {
  if (err) console.error('Pipeline failed:', err);
});

// Always handle SIGINT
process.on('SIGINT', () => audioStream.stop());

Troubleshooting Stream Issues

Issue	Cause	Solution
"Application not found"	App not running	Use `selectApp()` with fallbacks
No data events	App not playing audio / `minVolume` too high	Verify app is playing; lower or remove threshold
"stream.push() after EOF"	Stopping abruptly	Use `pipeline()` for proper cleanup
"Already capturing"	Multiple streams from one instance	Create separate `AudioCapture` instances
Memory growing	Not consuming data	Attach `data` listener; use circular buffer
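The "circular buffer" suggested for the memory-growth case can be sketched as a fixed-size ring that overwrites its oldest samples. This is a generic illustration, assuming Float32 samples and a hypothetical `RingBuffer` class; the package does not necessarily ship one.

```typescript
// Hypothetical fixed-capacity ring buffer: memory use stays constant
// no matter how much audio is pushed into it.
class RingBuffer {
  private readonly data: Float32Array;
  private writeIndex = 0;
  private filled = 0;

  constructor(capacity: number) {
    if (capacity <= 0) throw new RangeError('capacity must be positive');
    this.data = new Float32Array(capacity);
  }

  /** Append samples, overwriting the oldest ones once the buffer is full. */
  push(samples: Float32Array): void {
    for (let i = 0; i < samples.length; i++) {
      this.data[this.writeIndex] = samples[i];
      this.writeIndex = (this.writeIndex + 1) % this.data.length;
    }
    this.filled = Math.min(this.data.length, this.filled + samples.length);
  }

  /** Copy out the retained samples, oldest first. */
  snapshot(): Float32Array {
    const out = new Float32Array(this.filled);
    const start = (this.writeIndex - this.filled + this.data.length) % this.data.length;
    for (let i = 0; i < this.filled; i++) {
      out[i] = this.data[(start + i) % this.data.length];
    }
    return out;
  }
}

const ring = new RingBuffer(4);
ring.push(new Float32Array([1, 2, 3]));
ring.push(new Float32Array([4, 5, 6]));
console.log(Array.from(ring.snapshot())); // → [3, 4, 5, 6]
```

Sizing the ring to a few seconds of audio (for example `sampleRate * channels * seconds` samples) keeps recent context available without unbounded growth.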

Stream Performance Tips

Normal mode is faster than object mode (no metadata calculation)
Batch processing is more efficient than per-sample processing
Default highWaterMark is suitable for most cases

📁 See readme_examples/streams/07-stream-processing.ts for a complete production-ready stream example

API Reference

Class: `AudioCapture`

High-level event-based API (recommended).

Methods Overview

#	Category	Method	Description
	Discovery
1		`getApplications(opts?)`	List all capturable apps
2		`getAudioApps(opts?)`	List apps likely to produce audio
3		`findApplication(id)`	Find app by name or bundle ID
4		`findByName(name)`	Alias for `findApplication()`
5		`getApplicationByPid(pid)`	Find app by process ID
6		`getWindows(opts?)`	List all capturable windows
7		`getDisplays()`	List all displays
	Selection
8		`selectApp(ids?, opts?)`	Smart app selection with fallbacks
	Capture
9		`startCapture(app, opts?)`	Start capturing from an app
10		`captureWindow(id, opts?)`	Capture from a specific window
11		`captureDisplay(id, opts?)`	Capture from a display
12		`captureMultipleApps(ids, opts?)`	Capture multiple apps (mixed)
13		`captureMultipleWindows(ids, opts?)`	Capture multiple windows (mixed)
14		`captureMultipleDisplays(ids, opts?)`	Capture multiple displays (mixed)
15		`stopCapture()`	Stop current capture
16		`isCapturing()`	Check if currently capturing
17		`getStatus()`	Get detailed capture status
18		`getCurrentCapture()`	Get current capture target info
	Streams
19		`createAudioStream(app, opts?)`	Create Node.js Readable stream
20		`createSTTStream(app?, opts?)`	Stream pre-configured for STT
	Activity
21		`enableActivityTracking(opts?)`	Track which apps produce audio
22		`disableActivityTracking()`	Stop tracking and clear cache
23		`getActivityInfo()`	Get tracking stats
	Lifecycle
24		`dispose()`	Release resources and stop capture
25		`isDisposed()`	Check if instance is disposed

Static Methods

#	Method	Description
S1	`AudioCapture.verifyPermissions()`	Check screen recording permission
S2	`AudioCapture.bufferToFloat32Array(buf)`	Convert Buffer to Float32Array
S3	`AudioCapture.rmsToDb(rms)`	Convert RMS (0-1) to decibels
S4	`AudioCapture.peakToDb(peak)`	Convert peak (0-1) to decibels
S5	`AudioCapture.calculateDb(buf, method?)`	Calculate dB from audio buffer
S6	`AudioCapture.writeWav(buf, opts)`	Create WAV file from PCM data
S7	`AudioCapture.cleanupAll()`	Dispose all active instances
S8	`AudioCapture.getActiveInstanceCount()`	Get number of active instances

Events

Event	Payload	Description
`'start'`	`CaptureInfo`	Capture started
`'audio'`	`AudioSample`	Audio data received
`'stop'`	`CaptureInfo`	Capture stopped
`'error'`	`AudioCaptureError`	Error occurred

Method Reference

Discovery Methods

[1] `getApplications(options?): ApplicationInfo[]`

List all capturable applications.

Option	Type	Default	Description
`includeEmpty`	boolean	`false`	Include apps with empty names (helpers, background services)

const apps = capture.getApplications();
const allApps = capture.getApplications({ includeEmpty: true });

[2] `getAudioApps(options?): ApplicationInfo[]`

List apps likely to produce audio. Filters system apps, utilities, and background processes.

Option	Type	Default	Description
`includeSystemApps`	boolean	`false`	Include system apps (Finder, etc.)
`includeEmpty`	boolean	`false`	Include apps with empty names
`sortByActivity`	boolean	`false`	Sort by recent audio activity (requires [21])
`appList`	Array	`null`	Reuse prefetched app list

const audioApps = capture.getAudioApps();
// Returns ApplicationInfo[] for apps such as Spotify, Safari, Music, Zoom
// Excludes: Finder, Terminal, System Settings, etc.

// Sort by activity (most active first)
capture.enableActivityTracking();
const sorted = capture.getAudioApps({ sortByActivity: true });

[3] `findApplication(identifier): ApplicationInfo | null`

Find app by name or bundle ID (case-insensitive, partial match).

Parameter	Type	Description
`identifier`	string	App name or bundle ID

const spotify = capture.findApplication('Spotify');
const safari = capture.findApplication('com.apple.Safari');
const partial = capture.findApplication('spot'); // Matches "Spotify"

[4] `findByName(name): ApplicationInfo | null`

Alias for findApplication(). Provided for semantic clarity.

[5] `getApplicationByPid(processId): ApplicationInfo | null`

Find app by process ID.

Parameter	Type	Description
`processId`	number	Process ID

const app = capture.getApplicationByPid(12345);

[6] `getWindows(options?): WindowInfo[]`

List all capturable windows.

Option	Type	Default	Description
`onScreenOnly`	boolean	`false`	Only include visible windows
`requireTitle`	boolean	`false`	Only include windows with titles
`processId`	number	-	Filter by owning process ID

Returns WindowInfo:

windowId: Unique window identifier
title: Window title
owningProcessId: PID of owning app
owningApplicationName: App name
owningBundleIdentifier: Bundle ID
frame: { x, y, width, height }
layer: Window layer level
onScreen: Whether visible
active: Whether active

const windows = capture.getWindows({ onScreenOnly: true, requireTitle: true });
windows.forEach(w => console.log(`${w.windowId}: ${w.title} (${w.owningApplicationName})`));

[7] `getDisplays(): DisplayInfo[]`

List all displays.

Returns DisplayInfo:

displayId: Unique display identifier
width: Display width in pixels
height: Display height in pixels
frame: { x, y, width, height }
isMainDisplay: Whether this is the primary display

const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);

Selection Method

[8] `selectApp(identifiers?, options?): ApplicationInfo | null`

Smart app selection with multiple fallback strategies.

Parameter	Type	Description
`identifiers`	string \| number \| Array \| null	App name, PID, bundle ID, or array to try in order

Option	Type	Default	Description
`audioOnly`	boolean	`true`	Only search audio apps
`fallbackToFirst`	boolean	`false`	Return first app if no match
`throwOnNotFound`	boolean	`false`	Throw error instead of returning null
`sortByActivity`	boolean	`false`	Sort by recent activity (requires [21])
`appList`	Array	`null`	Reuse prefetched app list

// Try multiple apps in order
const app = capture.selectApp(['Spotify', 'Music', 'Safari']);

// Get first audio app
const first = capture.selectApp();

// Fallback to first if none match
const fallback = capture.selectApp(['Spotify'], { fallbackToFirst: true });

// Throw on failure
try {
  const app = capture.selectApp(['Spotify'], { throwOnNotFound: true });
} catch (err) {
  console.log('Not found:', err.details.availableApps);
}
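The fallback behaviour described above (try each identifier in order, then optionally fall back to the first app) can be sketched as a small pure function. This is a simplified model, not the package's implementation: `AppEntry`, its `bundleIdentifier` field name, and `pickApp` are assumptions made for illustration, and the matching mirrors the case-insensitive partial match documented for `findApplication()`.

```typescript
// Hypothetical, simplified stand-in for ApplicationInfo.
interface AppEntry {
  processId: number;
  applicationName: string;
  bundleIdentifier: string; // field name assumed for this sketch
}

// Try identifiers in order; numbers match PIDs, strings match name or bundle ID.
function pickApp(
  apps: AppEntry[],
  identifiers: Array<string | number>,
  fallbackToFirst = false
): AppEntry | null {
  for (const id of identifiers) {
    const match = apps.find((app) =>
      typeof id === 'number'
        ? app.processId === id
        : app.applicationName.toLowerCase().includes(id.toLowerCase()) ||
          app.bundleIdentifier.toLowerCase().includes(id.toLowerCase())
    );
    if (match) return match;
  }
  // No identifier matched: optionally fall back to the first listed app
  return fallbackToFirst && apps.length > 0 ? apps[0] : null;
}

const running: AppEntry[] = [
  { processId: 101, applicationName: 'Safari', bundleIdentifier: 'com.apple.Safari' },
  { processId: 202, applicationName: 'Spotify', bundleIdentifier: 'com.spotify.client' },
];
console.log(pickApp(running, ['Music', 'spot'])?.applicationName); // → "Spotify"
```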

Capture Methods

All capture methods accept CaptureOptions:

Option	Type	Default	Description
`format`	`'float32'` \| `'int16'`	`'float32'`	Audio format
`channels`	1 \| 2	2	Mono or stereo
`sampleRate`	number	48000	Requested sample rate (system-dependent)
`bufferSize`	number	system	Buffer size in frames (affects latency)
`minVolume`	number	0	Min RMS threshold (0-1), filters silence
`excludeCursor`	boolean	`true`	Reserved for future video features

Buffer Size Guidelines (latency figures assume a 48 kHz sample rate):

1024: ~21ms latency, higher CPU
2048: ~43ms latency, balanced (recommended)
4096: ~85ms latency, lower CPU
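The latency figures above follow from dividing the buffer size in frames by the sample rate. A minimal sketch of that arithmetic, assuming the default 48000 Hz rate (`bufferLatencyMs` is a hypothetical helper):

```typescript
// Time needed to fill one buffer: frames / (frames per second), in milliseconds.
function bufferLatencyMs(bufferSizeFrames: number, sampleRate = 48000): number {
  return (bufferSizeFrames / sampleRate) * 1000;
}

for (const size of [1024, 2048, 4096]) {
  console.log(`${size} frames ≈ ${bufferLatencyMs(size).toFixed(1)} ms`);
}
// 1024 frames ≈ 21.3 ms
// 2048 frames ≈ 42.7 ms
// 4096 frames ≈ 85.3 ms
```

At a different sample rate the same buffer size gives a different latency, for example about 23.2 ms for 1024 frames at 44100 Hz.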

[9] `startCapture(appIdentifier, options?): boolean`

Start capturing from an application.

Parameter	Type	Description
`appIdentifier`	string \| number \| ApplicationInfo	App name, bundle ID, PID, or app object

capture.startCapture('Spotify');                  // By name
capture.startCapture('com.spotify.client');       // By bundle ID
capture.startCapture(12345);                      // By PID
capture.startCapture(app);                        // By object

// With options
capture.startCapture('Spotify', {
  format: 'int16',
  channels: 1,
  minVolume: 0.01
});

[10] `captureWindow(windowId, options?): boolean`

Capture audio from a specific window.

Parameter	Type	Description
`windowId`	number	Window ID from `getWindows()`

const windows = capture.getWindows({ requireTitle: true });
const target = windows.find(w => w.title.includes('Safari'));
if (target) capture.captureWindow(target.windowId, { format: 'int16' }); // find() may return undefined

[11] `captureDisplay(displayId, options?): boolean`

Capture audio from a display.

Parameter	Type	Description
`displayId`	number	Display ID from `getDisplays()`

const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);
if (main) capture.captureDisplay(main.displayId); // find() may return undefined

[12] `captureMultipleApps(appIdentifiers, options?): boolean`

Capture from multiple apps simultaneously. Audio is mixed into a single stream.

Parameter	Type	Description
`appIdentifiers`	Array	App names, PIDs, bundle IDs, or ApplicationInfo objects

Additional Option	Type	Default	Description
`allowPartial`	boolean	`false`	Continue if some apps not found

// Capture game + Discord audio
capture.captureMultipleApps(['Minecraft', 'Discord'], {
  allowPartial: true,  // Continue even if one app not found
  format: 'int16'
});

[13] `captureMultipleWindows(windowIdentifiers, options?): boolean`

Capture from multiple windows. Audio is mixed.

Parameter	Type	Description
`windowIdentifiers`	Array	Window IDs or WindowInfo objects

Additional Option	Type	Default	Description
`allowPartial`	boolean	`false`	Continue if some windows not found

const windows = capture.getWindows({ requireTitle: true });
const browserWindows = windows.filter(w => /Safari|Chrome/.test(w.owningApplicationName));
capture.captureMultipleWindows(browserWindows.map(w => w.windowId));

[14] `captureMultipleDisplays(displayIdentifiers, options?): boolean`

Capture from multiple displays. Audio is mixed.

Parameter	Type	Description
`displayIdentifiers`	Array	Display IDs or DisplayInfo objects

Additional Option	Type	Default	Description
`allowPartial`	boolean	`false`	Continue if some displays not found

const displays = capture.getDisplays();
capture.captureMultipleDisplays(displays.map(d => d.displayId));

[15] `stopCapture(): void`

Stop the current capture session. Emits 'stop' event.

[16] `isCapturing(): boolean`

Check if currently capturing.

if (capture.isCapturing()) {
  capture.stopCapture();
}

[17] `getStatus(): CaptureStatus | null`

Get detailed capture status. Returns null if not capturing.

Returns CaptureStatus:

capturing: Always true when not null
processId: Process ID (may be null for display capture)
app: ApplicationInfo or null
window: WindowInfo or null
display: DisplayInfo or null
targetType: 'application' | 'window' | 'display' | 'multi-app'
config: { minVolume, format }

const status = capture.getStatus();
if (status) {
  console.log(`Type: ${status.targetType}, App: ${status.app?.applicationName}`);
}

[18] `getCurrentCapture(): CaptureInfo | null`

Get current capture target info. Same as getStatus() but without config.

Stream Methods

[19] `createAudioStream(appIdentifier, options?): AudioStream`

Create a Node.js Readable stream for audio capture.

Parameter	Type	Description
`appIdentifier`	string \| number	App name, bundle ID, or PID

Additional Option	Type	Default	Description
`objectMode`	boolean	`false`	Emit AudioSample objects instead of Buffers

// Raw buffer mode (for piping)
const stream = capture.createAudioStream('Spotify');
stream.pipe(myWritable);

// Object mode (for metadata access)
const metaStream = capture.createAudioStream('Spotify', { objectMode: true });
metaStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));

// Stop a stream when done
stream.stop();

[20] `createSTTStream(appIdentifier?, options?): STTConverter`

Create stream pre-configured for Speech-to-Text engines.

Parameter	Type	Description
`appIdentifier`	string \| number \| Array \| null	App identifier(s), null for auto-select

Option	Type	Default	Description
`format`	`'int16'` \| `'float32'`	`'int16'`	Output format
`channels`	1 \| 2	1	Output channels (mono recommended)
`objectMode`	boolean	`false`	Emit objects with metadata
`autoSelect`	boolean	`true`	Auto-select first audio app if not found
`minVolume`	number	-	Silence filter threshold

// Auto-selects first audio app, converts to Int16 mono
const sttStream = capture.createSTTStream();
sttStream.pipe(yourSTTEngine);

// With fallback apps
const fallbackStream = capture.createSTTStream(['Zoom', 'Safari', 'Chrome']);

// Access selected app
console.log(`Selected: ${sttStream.app.applicationName}`);

// Stop
sttStream.stop();

Activity Tracking Methods

[21] `enableActivityTracking(options?): void`

Enable background tracking of audio activity. Useful for sorting apps by recent audio.

Option	Type	Default	Description
`decayMs`	number	30000	Remove apps from cache after this many ms of inactivity

capture.enableActivityTracking({ decayMs: 60000 }); // 60s decay

[22] `disableActivityTracking(): void`

Disable tracking and clear the cache.

capture.disableActivityTracking();

[23] `getActivityInfo(): ActivityInfo`

Get activity tracking status and statistics.

Returns ActivityInfo:

enabled: Whether tracking is enabled
trackedApps: Number of apps in cache
recentApps: Array of ProcessActivityInfo:
- processId: Process ID
- lastSeen: Timestamp of last audio
- ageMs: Time since last audio
- avgRMS: Average RMS level
- sampleCount: Number of samples received

const info = capture.getActivityInfo();
console.log(`Active apps: ${info.trackedApps}`);
info.recentApps.forEach(app => {
  console.log(`PID ${app.processId}: ${app.sampleCount} samples`);
});
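To make the `decayMs` option and the `recentApps` fields concrete, here is a rough model of a decaying activity cache. It is a sketch under assumptions, not the package's code: `ActivityTracker` and `ProcessActivity` are hypothetical names, timestamps are passed in explicitly, and entries are pruned lazily when the list is read.

```typescript
interface ProcessActivity {
  processId: number;
  lastSeen: number;
  ageMs: number;
  avgRMS: number;
  sampleCount: number;
}

// Hypothetical tracker: remembers when each PID last produced audio.
class ActivityTracker {
  private readonly entries = new Map<number, { lastSeen: number; sampleCount: number; rmsSum: number }>();
  private readonly decayMs: number;

  constructor(decayMs = 30000) {
    this.decayMs = decayMs;
  }

  /** Record one audio sample for a process at time `now` (ms). */
  record(processId: number, rms: number, now: number): void {
    const entry = this.entries.get(processId) ?? { lastSeen: now, sampleCount: 0, rmsSum: 0 };
    entry.lastSeen = now;
    entry.sampleCount += 1;
    entry.rmsSum += rms;
    this.entries.set(processId, entry);
  }

  /** Drop entries older than decayMs and return the rest, most recent first. */
  recent(now: number): ProcessActivity[] {
    const result: ProcessActivity[] = [];
    this.entries.forEach((e, processId) => {
      const ageMs = now - e.lastSeen;
      if (ageMs > this.decayMs) {
        this.entries.delete(processId);
        return;
      }
      result.push({ processId, lastSeen: e.lastSeen, ageMs, avgRMS: e.rmsSum / e.sampleCount, sampleCount: e.sampleCount });
    });
    return result.sort((a, b) => a.ageMs - b.ageMs);
  }
}

const tracker = new ActivityTracker(1000);
tracker.record(101, 0.2, 0);
tracker.record(202, 0.1, 100);
tracker.record(101, 0.4, 500);
console.log(tracker.recent(1200).map((a) => a.processId)); // → [101]
```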

Lifecycle Methods

[24] `dispose(): void`

Release all resources and stop any active capture. Safe to call multiple times (idempotent).

const capture = new AudioCapture();
capture.startCapture('Spotify');

// When done, release resources
capture.dispose();

// Instance can no longer be used
console.log(capture.isDisposed()); // true

[25] `isDisposed(): boolean`

Check if this instance has been disposed.

if (!capture.isDisposed()) {
  capture.startCapture('Spotify');
}

Note: Calling methods like startCapture(), captureWindow(), or captureDisplay() on a disposed instance will throw an error.

Static Method Reference

[S1] `AudioCapture.verifyPermissions(): PermissionStatus`

Check screen recording permission before capture.

Returns PermissionStatus:

granted: Whether permission is granted
message: Human-readable status
apps: Prefetched app list (reuse with selectApp({ appList }))
availableApps: Number of apps found
remediation: Fix instructions (if not granted)

const status = AudioCapture.verifyPermissions();
if (!status.granted) {
  console.error(status.message);
  console.log(status.remediation);
  process.exit(1);
}

// Reuse apps list
const app = capture.selectApp(['Spotify'], { appList: status.apps });

[S2] `AudioCapture.bufferToFloat32Array(buffer): Float32Array`

Convert Buffer to Float32Array for audio processing.

capture.on('audio', (sample) => {
  const floats = AudioCapture.bufferToFloat32Array(sample.data);
  // Process individual samples
  for (let i = 0; i < floats.length; i++) {
    const value = floats[i]; // Range: -1.0 to 1.0
  }
});
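For readers curious what such a conversion involves, the sketch below reinterprets raw bytes as 32-bit floats. It is a generic illustration rather than the package's implementation: `bytesToFloat32` is a hypothetical name, and it accepts a `Uint8Array` (which a Node.js `Buffer` is a subclass of). The main subtlety is alignment, since a `Float32Array` view must start at a byte offset that is a multiple of 4.

```typescript
// Hypothetical helper: view (or copy) raw bytes as Float32 samples.
function bytesToFloat32(bytes: Uint8Array): Float32Array {
  // Ignore any trailing bytes that do not form a whole 4-byte sample
  const usable = bytes.byteLength - (bytes.byteLength % 4);
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: create a zero-copy view over the same memory
    return new Float32Array(bytes.buffer, bytes.byteOffset, usable / 4);
  }
  // Misaligned: copy into a fresh, aligned buffer first
  const copy = new Uint8Array(usable);
  copy.set(bytes.subarray(0, usable));
  return new Float32Array(copy.buffer, 0, usable / 4);
}

const original = new Float32Array([0.5, -1, 0.25]);
const asBytes = new Uint8Array(original.buffer);
console.log(Array.from(bytesToFloat32(asBytes))); // → [0.5, -1, 0.25]
```

Note that the aligned path shares memory with the input, so writing to the result also changes the source bytes.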

[S3] `AudioCapture.rmsToDb(rms): number`

Convert RMS value (0-1) to decibels.

const halfScaleDb = AudioCapture.rmsToDb(0.5); // -6.02 dB
const sampleDb = AudioCapture.rmsToDb(sample.rms);
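The -6.02 dB figure is consistent with the standard amplitude-to-decibel formula, 20 · log10(level), relative to full scale. A minimal sketch of that formula follows; `linearToDb` is a hypothetical name, and returning `-Infinity` for silence is an assumption, since the library's handling of a zero level is not documented here.

```typescript
// Standard amplitude ratio to decibels (dBFS when 1.0 is full scale).
function linearToDb(level: number): number {
  // log10(0) is -Infinity; treat non-positive levels as silence
  return level > 0 ? 20 * Math.log10(level) : -Infinity;
}

console.log(linearToDb(0.5).toFixed(2)); // "-6.02"
console.log(linearToDb(1));              // 0
```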

[S4] `AudioCapture.peakToDb(peak): number`

Convert peak value (0-1) to decibels.

const db = AudioCapture.peakToDb(sample.peak);

[S5] `AudioCapture.calculateDb(buffer, method?): number`

Calculate dB level directly from audio buffer.

Parameter	Type	Default	Description
`buffer`	Buffer	-	Audio data buffer
`method`	`'rms'` \| `'peak'`	`'rms'`	Calculation method

capture.on('audio', (sample) => {
  const rmsDb = AudioCapture.calculateDb(sample.data, 'rms');
  const peakDb = AudioCapture.calculateDb(sample.data, 'peak');
});
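The two methods differ in what they measure: RMS reflects average energy over the buffer, while peak reflects the single largest excursion. The sketch below computes both from Float32 samples as an illustration; `measure` is a hypothetical helper, not the library's internal routine.

```typescript
// Hypothetical helper: RMS and peak level of a block of samples in [-1, 1].
function measure(samples: Float32Array): { rms: number; peak: number } {
  if (samples.length === 0) return { rms: 0, peak: 0 };
  let sumSquares = 0;
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    const v = samples[i];
    sumSquares += v * v;                  // accumulate energy
    peak = Math.max(peak, Math.abs(v));   // track largest magnitude
  }
  return { rms: Math.sqrt(sumSquares / samples.length), peak };
}

const level = measure(new Float32Array([0.5, -0.5, 0.5, -0.5]));
console.log(level.rms, level.peak); // 0.5 0.5
```

For most real signals peak is higher than RMS; the two coincide only for constant-magnitude input such as the square wave above.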

[S6] `AudioCapture.writeWav(buffer, options): Buffer`

Create a complete WAV file from PCM audio data.

Option	Type	Required	Description
`sampleRate`	number	✓	Sample rate in Hz
`channels`	number	✓	Number of channels
`format`	`'float32'` \| `'int16'`		Audio format (default: `'float32'`)

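import fs from 'fs';

// Collect chunks during capture, then write a single WAV file when capture stops.
// (Writing inside the 'audio' handler would overwrite the file with each chunk.)
const chunks: Buffer[] = [];
let lastSample: AudioSample | null = null;

capture.on('audio', (sample) => {
  chunks.push(sample.data);
  lastSample = sample;
});

capture.on('stop', () => {
  if (!lastSample) return;
  const wav = AudioCapture.writeWav(Buffer.concat(chunks), {
    sampleRate: lastSample.sampleRate,
    channels: lastSample.channels,
    format: lastSample.format
  });
  fs.writeFileSync('output.wav', wav);
});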
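A "complete WAV file" is the PCM data preceded by a RIFF header. The sketch below builds the canonical 44-byte header, assuming the common convention of format tag 1 for integer PCM and 3 for IEEE float; `buildWavHeader` is a hypothetical helper, and the header the package actually emits may differ (for example by adding extra chunks).

```typescript
// Hypothetical helper: canonical 44-byte RIFF/WAVE header, little-endian fields.
function buildWavHeader(
  dataBytes: number,
  sampleRate: number,
  channels: number,
  format: 'float32' | 'int16'
): Uint8Array {
  const bytesPerSample = format === 'float32' ? 4 : 2;
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true);                          // file size minus 8
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);                                     // fmt chunk size
  view.setUint16(20, format === 'float32' ? 3 : 1, true);           // 3 = IEEE float, 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true);              // block align
  view.setUint16(34, bytesPerSample * 8, true);                     // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataBytes, true);
  return new Uint8Array(header);
}

const header = buildWavHeader(1000, 48000, 2, 'int16');
console.log(header.length); // 44
```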

[S7] `AudioCapture.cleanupAll(): number`

Dispose all active AudioCapture instances. Returns the number of instances cleaned up.

// Create multiple instances
const capture1 = new AudioCapture();
const capture2 = new AudioCapture();

console.log(AudioCapture.getActiveInstanceCount()); // 2

// Clean up all at once
const cleaned = AudioCapture.cleanupAll();
console.log(`Cleaned up ${cleaned} instances`); // 2

[S8] `AudioCapture.getActiveInstanceCount(): number`

Get the number of active (non-disposed) AudioCapture instances.

const capture = new AudioCapture();
console.log(AudioCapture.getActiveInstanceCount()); // 1

capture.dispose();
console.log(AudioCapture.getActiveInstanceCount()); // 0

Error Handling

Class: `AudioCaptureError`

Custom error class thrown by the SDK.

message: Human-readable error message
code: Machine-readable error code (see below)
details: Additional context (e.g., processId, availableApps)

Error Codes

Import ErrorCode for reliable error checking:

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
capture.on('error', (err: AudioCaptureError) => {
  if (err.code === ErrorCode.APP_NOT_FOUND) {
    // Handle missing app
  }
});

Code	Description
`ERR_PERMISSION_DENIED`	Screen Recording permission not granted
`ERR_APP_NOT_FOUND`	Application not found by name or bundle ID
`ERR_PROCESS_NOT_FOUND`	Process ID not found or not running
`ERR_ALREADY_CAPTURING`	Attempted to start capture while already capturing
`ERR_CAPTURE_FAILED`	Native capture failed to start (e.g., app has no windows)
`ERR_INVALID_ARGUMENT`	Invalid arguments provided to method

Using Error Codes:

import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();

capture.on('error', (err: AudioCaptureError) => {
  switch (err.code) {
    case ErrorCode.PERMISSION_DENIED:
      console.log('Grant Screen Recording permission');
      break;
    case ErrorCode.APP_NOT_FOUND:
      console.log('App not found:', err.details.requestedApp);
      console.log('Available:', err.details.availableApps);
      break;
    case ErrorCode.ALREADY_CAPTURING:
      console.log('Stop current capture first');
      capture.stopCapture();
      break;
    default:
      console.error('Error:', err.message);
  }
});
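When errors arrive as `unknown` (for example in a `catch` block), it can help to check for a `code` property before switching on it. The sketch below shows that pattern in isolation. It assumes the enum members map to the `ERR_*` strings in the table above, which is an inference rather than something stated here, and `DemoErrorCode`, `hasCode`, and `adviceFor` are hypothetical names.

```typescript
// Assumed mapping from enum-style keys to the ERR_* strings listed above.
const DemoErrorCode = {
  PERMISSION_DENIED: 'ERR_PERMISSION_DENIED',
  APP_NOT_FOUND: 'ERR_APP_NOT_FOUND',
  ALREADY_CAPTURING: 'ERR_ALREADY_CAPTURING',
} as const;

interface CodedError {
  message: string;
  code: string;
  details?: Record<string, unknown>;
}

// Type guard: narrows an unknown value to something with message and code.
function hasCode(err: unknown): err is CodedError {
  return typeof err === 'object' && err !== null && 'code' in err && 'message' in err;
}

// Map an error to a short, user-facing hint.
function adviceFor(err: unknown): string {
  if (!hasCode(err)) return 'Unexpected error';
  switch (err.code) {
    case DemoErrorCode.PERMISSION_DENIED:
      return 'Grant Screen Recording permission';
    case DemoErrorCode.APP_NOT_FOUND:
      return 'Start the app or pick another target';
    case DemoErrorCode.ALREADY_CAPTURING:
      return 'Stop the current capture first';
    default:
      return err.message;
  }
}

console.log(adviceFor({ message: 'not found', code: 'ERR_APP_NOT_FOUND' }));
// → "Start the app or pick another target"
```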

Stream Classes

AudioStream - Readable stream extending Node.js Readable:

stop() - Stop stream and capture
getCurrentCapture() - Get current capture info

STTConverter - Transform stream extending Node.js Transform:

stop() - Stop stream and capture
app - The selected ApplicationInfo
captureOptions - Options used for capture

Low-Level API: `ScreenCaptureKit`

For advanced users who need direct access to the native binding:

import { ScreenCaptureKit } from 'screencapturekit-audio-capture';

const captureKit = new ScreenCaptureKit();

// Get apps (returns basic ApplicationInfo array)
const apps = captureKit.getAvailableApps();

// Start capture (requires manual callback handling)
captureKit.startCapture(processId, config, (sample) => {
  // sample: { data, sampleRate, channelCount, timestamp }
  // No enhancement - raw native data
});

captureKit.stopCapture();
const isCapturing = captureKit.isCapturing();

When to use:

Absolute minimal overhead needed
Building your own wrapper
Avoiding event emitter overhead

Most users should use AudioCapture instead.

Multi-Process Capture Service

macOS ScreenCaptureKit only allows one process to capture audio at a time. If you need multiple processes to receive the same audio data, use the server/client architecture.

📁 See readme_examples/advanced/20-capture-service.ts for a complete example

When to Use

Scenario	Solution
Single app capturing audio	Use `AudioCapture` directly
Multiple processes need same audio	Use `AudioCaptureServer` + `AudioCaptureClient`
Electron main + renderer processes	Use server/client
Microservices architecture	Use server/client

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                  AudioCaptureServer                     │
│  - Runs in one process                                  │
│  - Handles actual ScreenCaptureKit capture              │
│  - Broadcasts audio to all connected clients            │
└─────────────────────────────────────────────────────────┘
              │ WebSocket (ws://localhost:9123)
              ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Client 1  │  │   Client 2  │  │   Client N  │
│  (Process A)│  │  (Process B)│  │  (Process N)│
└─────────────┘  └─────────────┘  └─────────────┘
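The diagram describes a fan-out: one capture source, many receivers. Stripped of networking, the idea can be modelled in memory as below. This is only a conceptual sketch; the real server communicates over WebSocket, and `Broadcaster` is a hypothetical class that does not exist in the package.

```typescript
type Listener = (chunk: Uint8Array) => void;

// Hypothetical in-memory model of "one capturer, N receivers".
class Broadcaster {
  private readonly clients = new Map<string, Listener>();

  connect(clientId: string, listener: Listener): void {
    this.clients.set(clientId, listener);
  }

  disconnect(clientId: string): void {
    this.clients.delete(clientId);
  }

  get clientCount(): number {
    return this.clients.size;
  }

  /** Deliver the same chunk to every connected client. */
  broadcast(chunk: Uint8Array): void {
    this.clients.forEach((listener) => listener(chunk));
  }
}

const hub = new Broadcaster();
hub.connect('a', (chunk) => console.log(`a got ${chunk.byteLength} bytes`));
hub.connect('b', (chunk) => console.log(`b got ${chunk.byteLength} bytes`));
hub.broadcast(new Uint8Array(4));
// a got 4 bytes
// b got 4 bytes
```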

Server Usage

import { AudioCaptureServer } from 'screencapturekit-audio-capture';

const server = new AudioCaptureServer({
  port: 9123,        // Default: 9123
  host: 'localhost'  // Default: 'localhost'
});

// Start the server
await server.start();

// Server events
server.on('clientConnected', (clientId) => console.log(`Client ${clientId} connected`));
server.on('clientDisconnected', (clientId) => console.log(`Client ${clientId} disconnected`));
server.on('captureStarted', (session) => console.log(`Capture started: ${session.id}`));
server.on('captureStopped', () => console.log('Capture stopped'));
server.on('captureError', (error) => console.error('Capture error:', error));

// Stop the server
await server.stop();

Server Methods

Method	Returns	Description
`start()`	`Promise<void>`	Start the WebSocket server
`stop()`	`Promise<void>`	Stop server and disconnect all clients
`getSession()`	`CaptureSession | null`	Get current capture session info
`getClientCount()`	`number`	Get number of connected clients

Server Events

Event	Payload	Description
`'clientConnected'`	`clientId: string`	Client connected
`'clientDisconnected'`	`clientId: string`	Client disconnected
`'captureStarted'`	`CaptureSession`	Capture session started
`'captureStopped'`	-	Capture session stopped
`'captureError'`	`Error`	Capture error occurred

Client Usage

import { AudioCaptureClient } from 'screencapturekit-audio-capture';

const client = new AudioCaptureClient({
  url: 'ws://localhost:9123',  // Default
  autoReconnect: true,         // Default: true
  reconnectDelay: 1000,        // Default: 1000ms
  maxReconnectAttempts: 10     // Default: 10
});

// Connect to server
await client.connect();

// Receive audio (similar API to AudioCapture)
client.on('audio', (sample) => {
  console.log(`Received ${sample.data.length} samples at ${sample.sampleRate}Hz`);
});

// List available apps (via server)
const apps = await client.getApplications();

// Start capture (request sent to server)
await client.startCapture('Spotify');
// Or by PID: await client.startCapture(12345);

// Other capture methods (alternatives to startCapture); IDs come from getWindows() / getDisplays()
await client.captureWindow(windowId);
await client.captureDisplay(displayId);
await client.captureMultipleApps(['Spotify', 'Discord']);

// Get server status
const status = await client.getStatus();
console.log(`Capturing: ${status.capturing}, Clients: ${status.totalClients}`);

// Stop capture
await client.stopCapture();

// Disconnect
client.disconnect();

Client Methods

Method	Returns	Description
`connect()`	`Promise<void>`	Connect to the server
`disconnect()`	`void`	Disconnect from server
`getApplications()`	`Promise<ApplicationInfo[]>`	List apps via server
`getWindows()`	`Promise<WindowInfo[]>`	List windows via server
`getDisplays()`	`Promise<DisplayInfo[]>`	List displays via server
`startCapture(target, opts?)`	`Promise<boolean>`	Start app capture
`captureWindow(id, opts?)`	`Promise<boolean>`	Start window capture
`captureDisplay(id, opts?)`	`Promise<boolean>`	Start display capture
`captureMultipleApps(targets, opts?)`	`Promise<boolean>`	Start multi-app capture
`stopCapture()`	`Promise<void>`	Stop current capture
`getStatus()`	`Promise<ServerStatus>`	Get server status
`getClientId()`	`string | null`	Get this client's ID
`getSessionId()`	`string | null`	Get current session ID

Client Events

Event	Payload	Description
`'connected'`	-	Connected to server
`'disconnected'`	-	Disconnected from server
`'reconnecting'`	`attempt: number`	Attempting to reconnect
`'reconnectFailed'`	-	Max reconnect attempts reached
`'audio'`	`RemoteAudioSample`	Audio data received
`'captureStopped'`	-	Server stopped capture
`'captureError'`	`{ message: string }`	Server capture error
`'error'`	`Error`	Client-side error

RemoteAudioSample

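Audio samples received by clients have the structure below. Note that data is a Float32Array here, whereas AudioCapture samples carry a Buffer (see Working with Audio Data):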

Property	Type	Description
`data`	`Float32Array`	Audio sample data
`sampleRate`	`number`	Sample rate in Hz
`channels`	`number`	Number of channels
`timestamp`	`number`	Timestamp in seconds
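
The four fields above are enough to derive a chunk's duration and level on the client side. The sketch below is a hedged illustration: the interface simply mirrors the table, the helper names are hypothetical, and the duration formula assumes data holds interleaved frames (length = frames × channels), which this README does not state explicitly.

```typescript
// Shape copied from the RemoteAudioSample table; declared locally so the sketch stands alone.
interface RemoteAudioSampleLike {
  data: Float32Array;
  sampleRate: number;
  channels: number;
  timestamp: number;
}

// Duration in milliseconds, ASSUMING interleaved data (length = frames * channels).
function durationMs(sample: RemoteAudioSampleLike): number {
  const frames = sample.data.length / sample.channels;
  return (frames * 1000) / sample.sampleRate;
}

// Root-mean-square level over every value in the buffer (0 for an empty buffer).
function rms(data: Float32Array): number {
  if (data.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < data.length; i++) sumSquares += data[i] * data[i];
  return Math.sqrt(sumSquares / data.length);
}

// Example with a synthetic sample: 480 stereo frames at 48 kHz.
const example: RemoteAudioSampleLike = {
  data: new Float32Array(960).fill(0.5),
  sampleRate: 48000,
  channels: 2,
  timestamp: 0,
};
console.log(durationMs(example), rms(example.data));
```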

Events Reference

Event: `'start'`

Emitted when capture starts.

capture.on('start', ({ processId, app }) => {
  console.log(`Capturing from ${app?.applicationName}`);
});

Event: `'audio'`

Emitted for each audio sample. See Audio Sample Structure for all properties.

capture.on('audio', (sample: AudioSample) => {
  console.log(`${sample.durationMs}ms, RMS: ${sample.rms}`);
});

Event: `'stop'`

Emitted when capture stops.

capture.on('stop', ({ processId }) => {
  console.log('Capture stopped');
});

Event: `'error'`

Emitted on errors.

capture.on('error', (err: AudioCaptureError) => {
  console.error(`[${err.code}]:`, err.message);
});

TypeScript

Full type definitions included. See Module Exports for import syntax.

Available Types

Type	Description
`AudioSample`	Audio sample with data and metadata
`ApplicationInfo`	App info (processId, bundleIdentifier, applicationName)
`WindowInfo`	Window info (windowId, title, frame, etc.)
`DisplayInfo`	Display info (displayId, width, height, etc.)
`CaptureInfo`	Current capture target info
`CaptureStatus`	Full capture status including config
`PermissionStatus`	Permission verification result
`ActivityInfo`	Activity tracking stats
`CaptureOptions`	Options for startCapture()
`AudioStreamOptions`	Options for createAudioStream()
`STTStreamOptions`	Options for createSTTStream()
`MultiAppCaptureOptions`	Options for captureMultipleApps()
`MultiWindowCaptureOptions`	Options for captureMultipleWindows()
`MultiDisplayCaptureOptions`	Options for captureMultipleDisplays()
`ServerOptions`	Options for AudioCaptureServer
`ClientOptions`	Options for AudioCaptureClient
`RemoteAudioSample`	Audio sample received via client
`CleanupResult`	Result of cleanupAll() operation
`ErrorCode`	Enum of error codes

Working with Audio Data

Buffer Format

Each sample's data property is a Node.js Buffer containing Float32 PCM by default:

capture.on('audio', (sample) => {
  // Use helper (recommended)
  const float32 = AudioCapture.bufferToFloat32Array(sample.data);
  
  // Or manual
  const float32Manual = new Float32Array(
    sample.data.buffer,
    sample.data.byteOffset,
    sample.data.byteLength / 4
  );
});
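
One caveat with the manual view: the Float32Array constructor throws a RangeError when byteOffset is not a multiple of 4, which can happen when a Buffer is a slice of a larger allocation. Whether the library's helper already handles this is not stated here, so the following is only a defensive sketch with a hypothetical function name. It takes a Uint8Array, which also accepts a Node.js Buffer.

```typescript
// Hypothetical helper: view the bytes as Float32 when aligned, otherwise copy first.
function toFloat32Array(bytes: Uint8Array): Float32Array {
  const count = Math.floor(bytes.byteLength / 4);
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: zero-copy view over the same memory.
    return new Float32Array(bytes.buffer, bytes.byteOffset, count);
  }
  // Unaligned: copy into a fresh buffer, which always starts at offset 0.
  const copy = new Uint8Array(count * 4);
  copy.set(bytes.subarray(0, count * 4));
  return new Float32Array(copy.buffer);
}

// Demonstration with a deliberately misaligned view (byteOffset = 1).
const floats = new Float32Array([1.5, -2]);
const padded = new Uint8Array(9);
padded.set(new Uint8Array(floats.buffer), 1);
console.log(toFloat32Array(padded.subarray(1, 9)));
```

The copy path uses set() on a new Uint8Array rather than slice(), because Buffer's slice() returns a view over the same memory instead of a copy.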

Int16 Format

capture.startCapture('Spotify', { format: 'int16' });

capture.on('audio', (sample) => {
  const int16 = new Int16Array(
    sample.data.buffer,
    sample.data.byteOffset,
    sample.data.byteLength / 2
  );
});
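
If downstream code expects floats in the range [-1, 1], Int16 samples can be converted back. The sketch below uses the common divide-by-32768 convention; the function name is hypothetical and the library may scale differently when it produces Int16 output.

```typescript
// Hypothetical helper: map signed 16-bit PCM to floats in [-1, 1) by dividing by 32768.
function int16ToFloat32(int16: Int16Array): Float32Array {
  const out = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    out[i] = int16[i] / 32768;
  }
  return out;
}

// Silence, half scale, and the most negative value.
console.log(int16ToFloat32(new Int16Array([0, 16384, -32768])));
```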

Filtering Silence

capture.startCapture('Spotify', { minVolume: 0.01 });
// Only emits audio events when volume > 0.01 RMS
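
To get a feel for what a threshold such as 0.01 means, it helps to express RMS in decibels relative to full scale. The sketch below uses the standard 20·log10 formula and mirrors the "RMS greater than minVolume" comparison described above; the function names are hypothetical and this is not the library's internal implementation.

```typescript
// Root-mean-square of a Float32 buffer (0 for an empty buffer).
function rmsOf(data: Float32Array): number {
  if (data.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < data.length; i++) sumSquares += data[i] * data[i];
  return Math.sqrt(sumSquares / data.length);
}

// Convert linear RMS to dBFS: 20 * log10(rms). Silence maps to -Infinity.
function rmsToDbfs(rms: number): number {
  return rms > 0 ? (20 * Math.log(rms)) / Math.LN10 : -Infinity;
}

// Hypothetical gate mirroring the documented rule: pass only when RMS > minVolume.
function passesMinVolume(data: Float32Array, minVolume: number): boolean {
  return rmsOf(data) > minVolume;
}

// A minVolume of 0.01 corresponds to roughly -40 dBFS.
console.log(rmsToDbfs(0.01).toFixed(1), 'dBFS');
console.log(passesMinVolume(new Float32Array([0.5, -0.5]), 0.01));
```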

Resource Lifecycle

📁 See readme_examples/advanced/21-graceful-cleanup.ts for a complete example

Properly managing resources ensures your application shuts down cleanly without orphaned captures or memory leaks.

Instance Cleanup

const capture = new AudioCapture();
capture.startCapture('Spotify');

// When done with this specific instance
capture.dispose();  // Stops capture and releases resources

Global Cleanup

import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';

// Check active instances
console.log(`Active: ${getActiveInstanceCount()}`);

// Clean up all instances at once
const result = await cleanupAll();  // Returns CleanupResult
console.log(`Cleaned up ${result.total} instances`);

// Install automatic cleanup on process exit (SIGINT, SIGTERM, etc.)
installGracefulShutdown();

Best Practices

Pattern	When to Use
`capture.dispose()`	Cleaning up a specific instance
`AudioCapture.cleanupAll()`	Cleaning up all `AudioCapture` instances
`cleanupAll()`	Cleaning up all instances (AudioCapture + AudioCaptureServer)
`installGracefulShutdown()`	Auto-cleanup on Ctrl+C, kill signals, or uncaught exceptions

Process Exit Handling

Exit handlers are automatically installed when you create an AudioCapture or AudioCaptureServer instance. For explicit control:

import { installGracefulShutdown } from 'screencapturekit-audio-capture';

// Install once at application startup
installGracefulShutdown();

// Now SIGINT/SIGTERM will automatically:
// 1. Stop all active captures
// 2. Dispose all instances
// 3. Exit cleanly

Common Issues

No applications available

Solution: Grant Screen Recording permission in System Settings → Privacy & Security → Screen Recording, then restart your terminal.

Application not found

Solutions:

Check if the app is running
Use capture.getApplications() to list available apps
Use bundle ID instead of name: capture.startCapture('com.spotify.client')

No audio samples received

Solutions:

Ensure the app is playing audio
Check if audio is muted
Remove minVolume threshold for testing
Verify the app has visible windows

Build errors

Note: Most users won't see build errors since prebuilt binaries are included. These steps apply only if compilation is needed.

Solutions:

Install Xcode CLI Tools: xcode-select --install
Verify macOS 13.0+: sw_vers
Clean rebuild: npm run clean && npm run build

Examples

📁 All examples are in readme_examples/

Basics

Example	File	Description
Quick Start	`basics/01-quick-start.ts`	Basic capture setup
Robust Capture	`basics/05-robust-capture.ts`	Production error handling
Find Apps	`basics/11-find-apps.ts`	App discovery

Voice & STT

Example	File	Description
STT Integration	`voice/02-stt-integration.ts`	Speech-to-text patterns
Voice Agent	`voice/03-voice-agent.ts`	Real-time voice processing
Recording	`voice/04-audio-recording.ts`	Record and save as WAV

Streams

Example	File	Description
Stream Basics	`streams/06-stream-basics.ts`	Stream API fundamentals
Stream Processing	`streams/07-stream-processing.ts`	Transform streams

Processing

Example	File	Description
Visualizer	`processing/08-visualizer.ts`	ASCII volume display
Volume Monitor	`processing/09-volume-monitor.ts`	Level alerts
Int16 Capture	`processing/10-int16-capture.ts`	Int16 format
Manual Processing	`processing/12-manual-processing.ts`	Buffer manipulation

Capture Targets

Example	File	Description
Multi-App Capture	`capture-targets/13-multi-app-capture.ts`	Multiple apps
Per-App Streams	`capture-targets/14-per-app-streams.ts`	Separate streams
Window Capture	`capture-targets/15-window-capture.ts`	Single window
Display Capture	`capture-targets/16-display-capture.ts`	Full display
Multi-Window	`capture-targets/17-multi-window-capture.ts`	Multiple windows
Multi-Display	`capture-targets/18-multi-display-capture.ts`	Multiple displays

Advanced

Example	File	Description
Advanced Methods	`advanced/19-advanced-methods.ts`	Activity tracking
Capture Service	`advanced/20-capture-service.ts`	Multi-process sharing
Graceful Cleanup	`advanced/21-graceful-cleanup.ts`	Resource lifecycle management

Run examples:

npx tsx readme_examples/basics/01-quick-start.ts
npm run test:readme  # Run all examples

Targeting specific apps/windows/displays:

Most examples support environment variables to target specific sources instead of using defaults:

Env Variable	Type	Used By	Example
`TARGET_APP`	App name	01-12, 19-21	`TARGET_APP="Spotify" npx tsx readme_examples/basics/01-quick-start.ts`
`TARGET_APPS`	Comma-separated	13, 14	`TARGET_APPS="Safari,Music" npx tsx readme_examples/capture-targets/13-multi-app-capture.ts`
`TARGET_WINDOW`	Window ID	15, 17	`TARGET_WINDOW=12345 npx tsx readme_examples/capture-targets/15-window-capture.ts`
`TARGET_DISPLAY`	Display ID	16, 18	`TARGET_DISPLAY=1 npx tsx readme_examples/capture-targets/16-display-capture.ts`
`VERIFY`	`1` or `true`	13	`VERIFY=1 npx tsx readme_examples/capture-targets/13-multi-app-capture.ts`

Tip: Run npx tsx readme_examples/basics/11-find-apps.ts to list available apps and their names. Window/display IDs are printed when running the respective capture examples.

Important: Environment variables must be placed before the command, not after. TARGET_APP="Spotify" npx tsx ... works, but npx tsx ... TARGET_APP="Spotify" does not.

Platform Support

macOS Version	Support	Notes
macOS 15+ (Sequoia)	⚠️ Known issues	Single-process audio capture limitation (use server/client)
macOS 14 (Sonoma)	✅ Full	Recommended
macOS 13 (Ventura)	✅ Full	Minimum required
macOS 12.x and below	❌ No	ScreenCaptureKit not available
Windows/Linux	❌ No	macOS-only framework

Note: On macOS 15+, only one process can capture audio at a time via ScreenCaptureKit. If you need multiple processes to receive audio, use the Multi-Process Capture Service.

Performance

Typical (Apple Silicon M1):

CPU: <1% for stereo Float32
Memory: ~10-20MB
Latency: ~160ms (configurable)

Optimization tips:

Use minVolume to filter silence
Use format: 'int16' to halve sample size (2 bytes per sample instead of 4)
Use channels: 1 to halve the data again compared with stereo
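
The savings from the last two tips follow from simple arithmetic on the raw PCM data rate. The sketch below assumes a 48 kHz sample rate for illustration (the actual rate depends on your capture configuration), and the 'float32' label is a hypothetical name for the default format.

```typescript
// Raw PCM data rate in bytes per second: sampleRate * channels * bytesPerSample.
function bytesPerSecond(
  sampleRate: number,
  channels: number,
  format: 'float32' | 'int16',
): number {
  const bytesPerSample = format === 'float32' ? 4 : 2;
  return sampleRate * channels * bytesPerSample;
}

// ASSUMED 48 kHz capture; compare the three configurations from the tips.
const stereoFloat = bytesPerSecond(48000, 2, 'float32'); // 384000
const stereoInt16 = bytesPerSecond(48000, 2, 'int16');   // 192000
const monoInt16 = bytesPerSecond(48000, 1, 'int16');     // 96000
console.log({ stereoFloat, stereoInt16, monoInt16 });
```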

Contributing

git clone https://github.com/mrlionware/screencapturekit-audio-capture.git
cd screencapturekit-audio-capture
npm install
npm run build
npm test

License

MIT License - see LICENSE

Made with ❤️ for the Node.js and macOS developer community
