Multichannel Audio Installation with Sample-based Sound Synthesis/Processing and AI-based Classification
- Project Overview
- System Architecture
- Hardware & Software Requirements
- Installation & Setup
- Network & OSC Configuration
- Content Computers (Max)
- Display Player (Max)
- Spatial Render Computers (Reaper)
- Audio Classification Pipeline
- Preset Management
- System Operation
- Project Structure
- Troubleshooting
- Performance Metrics
- License & Credits
This audio software is part of a larger AI-driven media installation system. The overall system analyzes and selects images and sounds, which are then processed and rendered by this audio software.
External Asset Repositories
- APO_Main curated audio pool (Google Drive download, see Installation) – holds all sample material for the main synthesis engines.
- APO_Player stems for the display player (Google Drive download, see Installation) – required for the vitrines/content player patches.
The Audio Software consists of three main components:
1. Content Generation (Max - Sample-based Synthesis/Processing)
- Folder: `APO_Main/` and `APO_Player/`
- Receives audio material selected by the external AI system via OSC control
- Processes and synthesizes the material through multiple synthesis engines
- Generates 8-channel sound clouds with real-time parameter modulation
- Outputs multichannel audio streams (96ch outdoor, 60ch indoor, 20ch display)
2. Spatial Rendering (Reaper - IEM Ambisonics)
- Folder: `APO_Render/`
- Receives multichannel audio from content generation via DANTE network
- Encodes audio into Ambisonics format (1st to 7th Order)
- Decodes to physical speaker layouts across seven venues
- Provides dynamic spatial positioning controlled by OSC messages
3. Offline Classification Pipeline (Python - YAMNet + GPT-4)
- Folder: `audio_classification/`
- Non-realtime process run once when new audio files are added
- Analyzes acoustic features and semantic content of audio material
- Generates classification metadata and preset files
- Prepares audio material for selection by the external AI system
| Component | Technology |
|---|---|
| Audio Processing | Cycling '74 Max 9 |
| Spatial Rendering | Reaper + IEM Ambisonic Plugins |
| Audio Classification | Python, TensorFlow (YAMNet), OpenAI GPT-4 mini |
| Audio Network | DANTE |
| Control Protocol | OSC (UDP Ports 3000, 4000, 4321, 6000, 9001-9004) |
Outdoor (3 Rooms):
- Dominikanerhof: 74 speakers, 7th Order Ambisonics (main venue)
- Durchgang DB11: 12 speakers, 1st Order Ambisonics
- Durchgang PG10: 10 speakers, 1st Order Ambisonics
Indoor (4 Rooms):
- Lobby: 16 speakers, 2nd Order Ambisonics
- Restaurant: 28 speakers, 3rd Order Ambisonics
- Nachtbar: 4 speakers, 1st Order Ambisonics
- WC: 2x4 speakers, 1st Order Ambisonics
Total System: 152 speakers across 7 venues
The software consists of a distributed audio system spanning multiple rooms with five computers working in tandem, connected via a DANTE audio network and controlled through OSC protocol.
The system follows a content-to-space architecture where audio content is first generated and processed, then spatially rendered across multiple rooms. The signal flow progresses through three content generation machines (Max synthesis engines) via DANTE network to two spatial rendering machines (Reaper with Ambisonics), finally reaching all speakers across seven venues.
3 Content Computers (Max):
- WS-AUD-CON-PG8 (Outdoor Content) - Kunstwerk 1
- WS-AUD-CON-DMN (Indoor Content) - Kunstwerk 2
- WS-AUD-VIT-PG8 (Display Player) - Kunstwerk 1&2
2 Spatial Computers (Reaper):
- WS-AUD-SPAT-PG8 (Outdoor Spatialization) - Kunstwerk 1
- WS-AUD-SPAT-DMN (Indoor Spatialization) - Kunstwerk 2
| Computer | IP | Function | Software | Patch/Project |
|---|---|---|---|---|
| WS-AUD-CON-PG8 | 10.1.11.71 | Outdoor Content | Max 9 | APO_main.maxpat |
| WS-AUD-CON-DMN | 10.2.11.71 | Indoor Content | Max 9 | APO_main_in.maxpat |
| WS-AUD-VIT-PG8 | 10.1.11.73 | Display Player | Max 9 | APO_player.maxpat |
| WS-AUD-SPAT-PG8 | 10.1.11.72 | Outdoor Spatial | Reaper | APO_outdoor_output.rpp |
| WS-AUD-SPAT-DMN | 10.2.11.72 | Indoor Spatial | Reaper | APO_indoor_output.rpp |
The system follows a three-stage architecture: Control → Generate → Render. The external Brain system sends OSC messages to select and control audio material. Content computers (Max) synthesize and process this material into multichannel audio streams, which are transmitted via DANTE to spatial computers (Reaper). The spatial computers encode the audio into Ambisonics, decode it for specific room geometries, and output to all speakers across seven venues.
External AI System (OSC Control)
↓
Content Generation (Max) ← Audio Material (NAS: data-pg8)
↓ DANTE Network (96ch/60ch/20ch)
Spatial Rendering (Reaper + IEM Ambisonics)
↓ DANTE Network (136ch/98ch)
Speaker Arrays (7 venues, 152 channels total)
Detailed signal flow diagram: see documentation/signal_flow.png
- External Brain Software (not part of this software): Controls both visuals and audio of the installation. Sends OSC control messages to this audio software for preset recall, playback triggers, and parameter modulation. See Section 5: Network & OSC Configuration for complete message specifications.
- NAS Server (data-pg8): Central storage for audio material and preset files. Audio material and partitur files must be manually copied to/from the NAS server; there is no automatic synchronization.
- DANTE Network Infrastructure: Gigabit Ethernet switch with multicast/IGMP support for audio-over-IP transmission between all computers.
All Computers:
- Cycling '74 Max 9.0+ (Download)
- For content computers and display player
- Reaper 6.80+ or 7.x (Download)
- For spatial computers
- IEM Plugin Suite 1.14.1+ (Download)
- Required for Reaper spatial rendering
- DANTE Controller (Download)
- Required for audio network configuration
- Max Dependencies (included in `dependencies/` folder or download from Google Drive)
- Must be copied to local Max library
- Note: Packages are too large for GitHub, download from Google Drive if not included locally
- Libraries used:
- CNMAT Externals - UC Berkeley CNMAT objects for audio/OSC processing
- ease - Easing functions for smooth parameter transitions
- FluidCorpusManipulation - Machine learning toolkit for audio analysis
- GeneratingSoundAndOrganizingTime - Synthesis and sequencing tools
- ICST Ambisonics - Spatial audio processing (Zurich University of the Arts)
- jasch objects - Utility objects for Max patching
- karma - Advanced patching utilities
- MuBu For Max - Multi-buffer processing for sound analysis/synthesis
- odot - OSC message handling and data structures
- spat5-x64 - IRCAM Spat spatial audio processing suite
Optional:
- Python 3.8+ with TensorFlow - For audio classification pipeline
- Jupyter Notebook - For running classification scripts
Content Computers:
| Computer | RAM | CPU | Purpose |
|---|---|---|---|
| WS-AUD-CON-PG8 | 32GB | i9/Ryzen 9/M2 Pro | Outdoor synthesis (96ch) |
| WS-AUD-CON-DMN | 32GB | i9/Ryzen 9/M2 Pro | Indoor synthesis (60ch) |
| WS-AUD-VIT-PG8 | 16GB | i7/M2 | Display player (20ch) |
Spatial Computers:
| Computer | RAM | CPU | Purpose |
|---|---|---|---|
| WS-AUD-SPAT-PG8 | 64GB | 24+ cores | Outdoor Ambisonics (7th Order) |
| WS-AUD-SPAT-DMN | 32GB | 16+ cores | Indoor Ambisonics (3rd Order max) |
Audio Interfaces:
- Marian Clara E DANTE audio interface (512ch I/O)
- Driver Version: 4.71 (required for all systems)
- ASIO driver for Max and Reaper integration
Network Infrastructure:
- Gigabit Ethernet switch (DANTE compatible, QoS-capable, multicast/IGMP support)
- Cat6 or Cat6a Ethernet cables
- Flow control disabled on all switch ports
Outdoor Network (10.1.11.x):
WS-AUD-CON-PG8: 10.1.11.71 (Content Outdoor)
WS-AUD-SPAT-PG8: 10.1.11.72 (Spatial Outdoor)
WS-AUD-VIT-PG8: 10.1.11.73 (Display Player)
Subnet Mask: 255.255.255.0
Indoor Network (10.2.11.x):
WS-AUD-CON-DMN: 10.2.11.71 (Content Indoor)
WS-AUD-SPAT-DMN: 10.2.11.72 (Spatial Indoor)
Subnet Mask: 255.255.255.0
Control Panel → Network Connections → Ethernet Properties
IPv4 Settings:
IP Address: [see above]
Subnet Mask: 255.255.255.0
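The address plan above can be sanity-checked with Python's standard `ipaddress` module. This is only a minimal sketch: the `HOSTS` dictionary and the helper names `network_of` and `same_subnet` are hypothetical and not part of the project.

```python
import ipaddress

# Address plan copied from the network tables above.
HOSTS = {
    "WS-AUD-CON-PG8": "10.1.11.71",
    "WS-AUD-SPAT-PG8": "10.1.11.72",
    "WS-AUD-VIT-PG8": "10.1.11.73",
    "WS-AUD-CON-DMN": "10.2.11.71",
    "WS-AUD-SPAT-DMN": "10.2.11.72",
}
NETMASK = "255.255.255.0"


def network_of(ip: str) -> ipaddress.IPv4Network:
    """Return the subnet an address belongs to under the configured mask."""
    return ipaddress.ip_interface(f"{ip}/{NETMASK}").network


def same_subnet(host_a: str, host_b: str) -> bool:
    """True if both named hosts fall into the same subnet."""
    return network_of(HOSTS[host_a]) == network_of(HOSTS[host_b])


# Print each host with the subnet it resolves to.
for name, ip in HOSTS.items():
    print(f"{name:16} {ip:11} {network_of(ip)}")
```

For example, `same_subnet("WS-AUD-CON-PG8", "WS-AUD-SPAT-PG8")` is true, while the outdoor and indoor machines resolve to different /24 networks.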
Setup Steps:
- Install DANTE Controller on a management computer
- Ensure dual network connection:
- DANTE Audio Network: Connected via DANTE soundcard
- DANTE Control Network: Connected via Ethernet (for DANTE Controller to configure and monitor devices)
- Launch DANTE Controller
- Wait for device discovery (30-60 seconds)
- Configure network settings:
- Latency: 1ms (optimal) or 2ms (safe)
- Sample Rate: 48 kHz (fixed)
- Encoding: PCM 24-bit
- Configure routing matrix (see detailed routing tables below)
DANTE Routing:
Outdoor System (10.1.11.x):
WS-AUD-CON-PG8 → WS-AUD-SPAT-PG8:
Bus 1-4: 32ch (Ch. 201-232) → Reaper Ch. 001-032
Reverb: 64ch (Ch. 301-364) → Reaper Ch. 033-096
WS-AUD-VIT-PG8 → WS-AUD-SPAT-PG8:
Vit + Col: 20ch (Ch. 401-420) → Reaper Ch. 097-116
WS-AUD-SPAT-PG8 → Speakers:
Output: 136ch (Ch. 001-136) → Yamaha Matrix
Indoor System (10.2.11.x):
WS-AUD-CON-DMN → WS-AUD-SPAT-DMN:
Bus 1-4: 32ch (Ch. 201-232) → Reaper Ch. 001-032
Reverb: 28ch (Ch. 301-328) → Reaper Ch. 033-060
WS-AUD-VIT-PG8 → WS-AUD-SPAT-DMN:
Vit + Col: 20ch (Ch. 401-420) → Reaper Ch. 061-080
WS-AUD-SPAT-DMN → Speakers:
Output: 98ch (Ch. 001-098) → Yamaha Matrix
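The Reaper input ranges in the routing tables follow from stacking the incoming streams back to back: a block of n channels that starts at channel s ends at channel s + n - 1 (inclusive). The sketch below reproduces that bookkeeping from the stream sizes alone. `STREAMS` and `reaper_ranges` are illustrative names, not part of the project.

```python
from typing import Dict, List, Tuple

# Stream sizes in channels per spatial computer, in routing order (from the tables above).
STREAMS: Dict[str, List[Tuple[str, int]]] = {
    "WS-AUD-SPAT-PG8": [("Bus 1-4", 32), ("Reverb", 64), ("Vit + Col", 20)],
    "WS-AUD-SPAT-DMN": [("Bus 1-4", 32), ("Reverb", 28), ("Vit + Col", 20)],
}


def reaper_ranges(streams: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Assign contiguous, 1-based, inclusive Reaper input ranges to each stream."""
    ranges = []
    first = 1
    for name, count in streams:
        last = first + count - 1  # inclusive upper bound
        ranges.append((name, first, last))
        first = last + 1
    return ranges


# Print the resulting input layout for both spatial computers.
for host, streams in STREAMS.items():
    print(host)
    for name, first, last in reaper_ranges(streams):
        print(f"  {name:10} Reaper Ch. {first:03d}-{last:03d}")
```

The last channel of each layout equals the total input count of the corresponding Reaper project (116 outdoor, 80 indoor).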
Before installing or opening the Max patches make sure the curated audio libraries are present locally:
- APO_Main audio pool: Download the complete material from Google Drive and copy it into the `APO_Main/audio/` directory (or the NAS mirror used in production).
- APO_Player content: Download the display-player stems from Google Drive and place them inside `APO_Player/audio/`.
Note: Without these two downloads, no samples are available for apo_main or apo_player. Fully synchronize both Google Drive folders before continuing the installation.
Download and install from cycling74.com or use /dependencies/Max909_250918_d7cea08.msi
The following Max libraries are required:
| Library | Purpose |
|---|---|
| CNMAT Externals | UC Berkeley CNMAT objects for audio/OSC processing |
| ease | Easing functions for smooth parameter transitions |
| FluidCorpusManipulation | Machine learning toolkit for audio analysis |
| GeneratingSoundAndOrganizingTime | Synthesis and sequencing tools |
| ICST Ambisonics | Spatial audio processing (Zurich University of the Arts) |
| jasch objects | Utility objects for Max patching |
| karma | Advanced patching utilities |
| MuBu For Max | Multi-buffer processing for sound analysis/synthesis |
| odot | OSC message handling and data structures |
| spat5-x64 | IRCAM Spat spatial audio processing suite |
Download Libraries:
If not included in local dependencies/ folder, download from:
Google Drive - Max Packages
(Note: Packages are too large for GitHub hosting)
Installation:
Windows:
Copy all folders from: dependencies/ (or downloaded folder)
to: C:\Users\[YourUsername]\Documents\Max 9\Library\
macOS:
Copy all folders from: dependencies/ (or downloaded folder)
to: ~/Documents/Max 9/Library/
- Open Max 9
- Go to: Options → File Preferences (Windows) or Max → Settings → File Preferences (macOS)
- Click Add (or the + button)
- Browse to and select the `APO_Main` folder
- ✓ MUST check "Include Subfolders" - Essential!
- Click OK to save
Why needed: Max needs to search the folder structure for abstractions, externals, presets, audio samples, and wavetables.
- Close Max completely
- Reopen Max 9
- Ensures all dependencies are loaded
For WS-AUD-CON-PG8 & WS-AUD-CON-DMN:
- Sample Rate: 48000 Hz
- I/O Vector Size: 2048 samples
- Signal Vector Size: 2048 samples
- Audio Interface: ad_asio Clara E (Marian Clara E, Driver v4.71)
- Clock Source: DANTE
- Overdrive: On - Essential!
For WS-AUD-VIT-PG8:
- Sample Rate: 48000 Hz
- I/O Vector Size: 2048 samples
- Signal Vector Size: 2048 samples
- Audio Interface: ad_asio Clara E (Marian Clara E, Driver v4.71)
- Clock Source: DANTE
- Overdrive: On - Essential!
Download from reaper.fm or use /dependencies/reaper748_x64-install.exe
Download from plugins.iem.at and run plugin scan or use /dependencies/IEMPluginSuiteInstaller_v1.15.0_x64.exe
WS-AUD-SPAT-PG8 (Outdoor):
- Sample Rate: 48000 Hz
- Block Size: 512 samples
- Audio Device: ASIO Marian Clara E (Driver v4.71, 512ch I/O via DANTE)
- Project: `APO_Render/APO_outdoor_output.rpp`
WS-AUD-SPAT-DMN (Indoor):
- Sample Rate: 48000 Hz
- Block Size: 512 samples
- Audio Device: ASIO Marian Clara E (Driver v4.71, 512ch I/O via DANTE)
- Project: `APO_Render/APO_indoor_output.rpp`
Outdoor Decoders:
- Dominikanerhof: `APO_Render/decoder/hof_dom_1_74/speaker_setup_1_74.json`
- DB11: `APO_Render/decoder/gang_db11/db11_allrad.json`
- PG10: `APO_Render/decoder/gang_pg10/pg_10_allrad.json`
Indoor Decoders:
- Lobby: `APO_Render/decoder/indoor_lobby/lob_speaker_setup_allrad.json`
- Restaurant: `APO_Render/decoder/indoor_restaurant/rest_speaker_setup_allrad.json`
- WC: `APO_Render/decoder/wc_allrad.json`
Preferences → Control/OSC/web → Add
- Outdoor: Listen on ports 9001-9004, 3000
- Indoor: Listen on ports 9001-9004
IEM MultiEncoder Configuration:
Each IEM MultiEncoder must be configured to receive OSC messages for spatial positioning:
- Set to 8 channels (for 8-channel sound cloud processing)
- Bus 1: Listen on port 9001
- Bus 2: Listen on port 9002
- Bus 3: Listen on port 9003
- Bus 4: Listen on port 9004
The content computers send spatial positioning data (/iem/position/azimuth, /iem/position/elevation) to these ports to dynamically move the 8-channel sound clouds through the Ambisonics field.
Ritual Shutdown (Outdoor only):
When the external brain software sends /DisplayCase 1 to the APO_Main, the outdoor content computer (WS-AUD-CON-PG8) automatically sends OSC messages via port 3000 to the outdoor spatial computer (WS-AUD-SPAT-PG8). Reaper's internal OSC logic receives these messages and mutes tracks for Durchgang DB11 and PG10.
| Port | Direction | Source | Destination | Function |
|---|---|---|---|---|
| 4000 | IN | External Control | Content Computers | Preset recall, synthesis control |
| 6000 | IN | External Control | Display Player | Display/guide control |
| 6000 | IN | External Control | SPAT-PG8 | Ritual shutdown |
| 3000 | Internal | CON-PG8 | SPAT-PG8 | Reaper track control (mute) |
| 4321 | OUT | Content Computers | External System | Frequency analysis data |
| 9001 | Internal | CON-PG8 | SPAT-PG8 | Panning (outdoor ch1) |
| 9002 | Internal | CON-PG8 | SPAT-PG8 | Panning (outdoor ch2) |
| 9003 | Internal | CON-DMN | SPAT-DMN | Panning (indoor ch1) |
| 9004 | Internal | CON-DMN | SPAT-DMN | Panning (indoor ch2) |
Target: WS-AUD-CON-PG8 (10.1.11.71) and WS-AUD-CON-DMN (10.2.11.71)
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/index` | int | 0 - n | Preset index from partitur file |
| `/blendlin` | float | 0.0 - 1.0 | Blend vector for dual-voice crossfade |
| `/amp` | float | 0.0 - 1.0 | Meta-parameter: Amplitude control |
| `/spec` | float | 0.0 - 1.0 | Meta-parameter: Spectral control |
| `/room` | float | 0.0 - 1.0 | Meta-parameter: Room/reverb control |
| `/tex` | float | 0.0 - 1.0 | Meta-parameter: Texture control (for expansion) |
| `/spat` | float[3] | 0.0 - 1.0 | Spatial meta-parameters (3 floats) |
| `/systemMasterFader` | float | 0.0 - 1.0 | Global system volume |
Example Messages:
oscsend 10.1.11.71 4000 /index i 122
oscsend 10.1.11.71 4000 /blendlin f 0.75
oscsend 10.1.11.71 4000 /amp f 0.5
oscsend 10.1.11.71 4000 /spat fff 0.3 0.8 0.5
Message Monitoring: All incoming OSC messages can be viewed by opening the window "VIDEO:UDP:PARTITUR" in the main Max patch (APO_main.maxpat or APO_main_in.maxpat). This window provides real-time monitoring of all OSC traffic received on port 4000.
Message Format: All OSC messages must follow the exact format specified above. The port numbers and IP addresses can be configured in the "init" window of the main Max patch (APO_main.maxpat or APO_main_in.maxpat). This window provides a central location to set up all network configurations for both incoming and outgoing OSC messages.
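Where `oscsend` is not available, the same control messages can be produced with nothing but the Python standard library. The following is a minimal sketch of OSC 1.0 encoding for int and float arguments; `encode_osc` and `send_osc` are hypothetical helper names, and only the argument types used by the table above are handled.

```python
import socket
import struct


def _pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 requires."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_osc(address: str, *args) -> bytes:
    """Encode one OSC message whose arguments are ints ('i') or floats ('f')."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not supported in this sketch")
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)  # 32-bit big-endian integer
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)  # 32-bit big-endian float
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _pad(address.encode("ascii")) + _pad(tags.encode("ascii")) + payload


def send_osc(host: str, port: int, address: str, *args) -> None:
    """Send one OSC message as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_osc(address, *args), (host, port))
```

Under these assumptions, `send_osc("10.1.11.71", 4000, "/index", 122)` sends the same message as the first `oscsend` example, and `send_osc("10.1.11.71", 4000, "/spat", 0.3, 0.8, 0.5)` sends the three spatial floats in one message.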
Target: WS-AUD-VIT-PG8 (10.1.11.73)
Vitrine Control:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/v1` to `/v14` | int | 1 | Trigger vitrine (plays random file) |
Säulen (Column) Control:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/col7` to `/col10` | int | 1 - n | Play specific file by index |
Ritual Shutdown:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/DisplayCase` | int | 0/1 | Triggers ritual: mutes all displays, content computer sends OSC to Reaper (port 3000) to automatically mute DB11/PG10 tracks via Reaper's internal OSC logic |
Example Messages:
oscsend 10.1.11.73 6000 /v7 i 1 # Trigger vitrine 7 (random)
oscsend 10.1.11.73 6000 /col9 i 3 # Play file 3 on column 9
oscsend 10.1.11.73 6000 /DisplayCase i 1 # Activate ritual
File Organization:
- Vitrines: Folders `v1` to `v14` (plays random file)
- Säulen: Folders `col1` to `col4` (plays specific file by index)
- Soundlaser: Folders `sl1`, `sl2` (plays random file)
Mapping of station numbers → channels: see documentation/SHA_AltePost_Audiokanäle_*.xlsx
Source: WS-AUD-CON-PG8 (10.1.11.71)
Destination: External visual system
Content: Real-time frequency analysis data
Message Format:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/audio_env` | float[4] | 0.0 - 1.0 | List of 4 floats: [Full Envelope, Low Band, Mid Band, High Band] |
Example:
/audio_env 0.75 0.82 0.65 0.53
- Float 1 (0.75): Full envelope (overall audio level)
- Float 2 (0.82): Low frequency band
- Float 3 (0.65): Mid frequency band
- Float 4 (0.53): High frequency band
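On the receiving side, a `/audio_env` datagram can be unpacked with the standard library alone. The sketch below assumes the message arrives as a plain OSC 1.0 message with type tag `,ffff` (four 32-bit floats). That wire format is an assumption, and `parse_audio_env` is a hypothetical helper name.

```python
import struct
from typing import Dict, Tuple


def _read_string(packet: bytes, offset: int) -> Tuple[str, int]:
    """Read a null-terminated OSC string; return it and the next 4-byte-aligned offset."""
    end = packet.index(b"\x00", offset)
    text = packet[offset:end].decode("ascii")
    next_offset = (end + 4) & ~3  # skip the terminator and any padding
    return text, next_offset


def parse_audio_env(packet: bytes) -> Dict[str, float]:
    """Decode one /audio_env message into its four envelope values."""
    address, offset = _read_string(packet, 0)
    tags, offset = _read_string(packet, offset)
    if address != "/audio_env" or tags != ",ffff":
        raise ValueError(f"unexpected message: {address} {tags}")
    full, low, mid, high = struct.unpack_from(">4f", packet, offset)
    return {"full": full, "low": low, "mid": mid, "high": high}
```

The returned keys follow the order documented above: full envelope, low band, mid band, high band.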
Source: Content computers
Destination: Spatial computers
Messages:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/iem/position/azimuth` | float | -180 to 180 | Horizontal angle (degrees) |
| `/iem/position/elevation` | float | -90 to 90 | Vertical angle (degrees) |
Example:
oscsend 10.1.11.72 9001 /iem/position/azimuth f 45.0
oscsend 10.1.11.72 9001 /iem/position/elevation f 15.0
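Trajectory generators can easily run outside the documented angle ranges, so it helps to normalise values before sending them. The sketch below wraps azimuth into [-180, 180) and clamps elevation to [-90, 90]. All function names are illustrative, and the circular path is just one example trajectory, not the one used by the Max patch.

```python
from typing import List, Tuple


def wrap_azimuth(degrees: float) -> float:
    """Wrap any angle into the half-open range [-180, 180)."""
    return (degrees + 180.0) % 360.0 - 180.0


def clamp_elevation(degrees: float) -> float:
    """Limit elevation to the documented range of -90 to 90 degrees."""
    return max(-90.0, min(90.0, degrees))


def circular_trajectory(steps: int, elevation: float = 0.0) -> List[Tuple[float, float]]:
    """Return (azimuth, elevation) pairs for one full horizontal rotation."""
    el = clamp_elevation(elevation)
    return [(wrap_azimuth(360.0 * i / steps), el) for i in range(steps)]
```

Note that an input of exactly 180 degrees is returned as -180, which describes the same direction.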
Ritual Shutdown (Outdoor only):
Source: Content computers (Outdoor)
Destination: Spatial computers
Messages:
| OSC Address | Type | Range | Description |
|---|---|---|---|
| `/track/10/mute` | bin | 0/1 | DB11 Output Mute |
| `/track/15/mute` | bin | 0/1 | PG10 Output Mute |
Example:
oscsend 10.1.11.72 3000 /track/10/mute i 0
The content computers generate and process audio using Cycling '74 Max 9 with sample-based synthesis and multi-channel processing.
| Computer | IP | Patch | OSC In | DANTE Out | Target |
|---|---|---|---|---|---|
| WS-AUD-CON-PG8 | 10.1.11.71 | `APO_main.maxpat` | 4000 | 96ch | SPAT-PG8 |
| WS-AUD-CON-DMN | 10.2.11.71 | `APO_main_in.maxpat` | 4000 | 60ch | SPAT-DMN |
The content computers use a dual-voice architecture for seamless preset transitions without audio interruption.
Concept:
- Each synthesis module exists in two parallel voices (Voice 1 & Voice 2)
- While one voice plays, the other loads the next preset in the background
- A blend vector (`/blendlin`) crossfades between voices (0.0 = Voice 1, 1.0 = Voice 2)
- The system continuously oscillates between voices
Workflow:
- External system sends `/index` (e.g., 122) → loads into inactive voice
- External system oscillates `/blendlin` 0→1 or 1→0 → crossfades
- Meta-parameters modulate synthesis in real-time
- Smooth, uninterrupted transitions
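The dual-voice workflow above can be summarised in a few lines. This is a conceptual sketch, not the logic of the Max patch: it assumes a linear crossfade law (suggested only by the name `/blendlin`) and a hypothetical 0.5 threshold for deciding which voice counts as inactive.

```python
from typing import Tuple


def voice_gains(blend: float) -> Tuple[float, float]:
    """Linear crossfade: return (gain of Voice 1, gain of Voice 2) for a blend value."""
    b = max(0.0, min(1.0, blend))  # keep the blend inside the documented 0.0 - 1.0 range
    return (1.0 - b, b)


def voice_to_load(blend: float) -> int:
    """Return the quieter voice, i.e. the one a new preset could be loaded into."""
    return 2 if blend < 0.5 else 1
```

With `/blendlin` at 0.0 only Voice 1 is audible, so a new `/index` would target Voice 2; once the blend has moved to 1.0 the roles swap.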
The system is built around 8-channel processing throughout all modules. Each module works with 8 audio sources/channels projected into space as a sound cloud.
Two Levels of Spatial Control:
- Macro-level: Large-scale movement of entire 8-source cloud
- Micro-level: Internal synthesis modulation within 8 channels
Sound Sources:
- 2x Loop Players (Ablp & Bblp): 8 playheads each, loop/speed/pitch control
- 1x Granular Synthesis: Multi-voice granular engine
Time-Domain Processing:
- 1x Spectral Stretch: PaulStretch algorithm
Effects & Processing:
- 1x Phaser Effect: Multi-stage with modulation
- 1x Complex Resonator: Filter bank for spectral coloring
- 1x Multi-Effect Unit: Ring modulation, filtering, etc.
Mixing & Routing:
- 1x Crossfade Mixer: Blends sources via `/blendlin`
- Multi-channel Mixer: Routes to 32 channels (Bus 1-4)
- 1x Reverb: 64ch (outdoor) or 28ch (indoor)
Spatial Control:
- 1x Panning Module: OSC control (Ports 9001-9004)
- Vector Input: Generates spatial trajectories
- Outdoor (CON-PG8): 32ch (Bus) + 64ch (Reverb) = 96 channels
- Indoor (CON-DMN): 32ch (Bus) + 28ch (Reverb) = 60 channels
Detailed module documentation: Max help files in APO_Main/patchers-help/
The display player handles individual audio playback for museum display cases (Vitrinen) and audio guide columns (Mediensäulen).
| Computer | IP | Patch | OSC In | DANTE Out | Target |
|---|---|---|---|---|---|
| WS-AUD-VIT-PG8 | 10.1.11.73 | `APO_player.maxpat` | 6000 | 20ch | Both SPAT |
- 2x Soundlaser: Outdoor display cases
- 14 Vitrines: Display cases with triggered audio
- 4 Säulen: Audio guide columns with selectable content
| Type | OSC Control | Behavior |
|---|---|---|
| Vitrines | `/v1 [1]` to `/v14 [1]` | Plays random file to completion |
| Säulen | `/col7 [index]` to `/col10 [index]` | Plays specific file by index |
| Ritual | `/DisplayCase [0/1]` | Mutes all displays & DB11/PG10 |
- Vitrines: Ambient audio triggered when visitors approach
- Säulen: Audio guide stations with selectable content by number
- Soundlaser: Speech audio played as a random loop with pauses
The spatial computers handle Ambisonics encoding, decoding, and speaker distribution across seven venues.
| Computer | IP | Project | OSC In | DANTE In/Out | Decoders |
|---|---|---|---|---|---|
| WS-AUD-SPAT-PG8 | 10.1.11.72 | `APO_outdoor_output.rpp` | 9001-9004, 6000 | 116ch / 136ch | 3 decoders, 96 speakers |
| WS-AUD-SPAT-DMN | 10.2.11.72 | `APO_indoor_output.rpp` | 9001-9004 | 80ch / 98ch | 4 decoders, 56 speakers |
Outdoor (SPAT-PG8):
- Dominikanerhof: 7th Order (64ch) → 74 speakers
- Durchgang DB11: 1st Order (4ch) → 12 speakers
- Durchgang PG10: 1st Order (4ch) → 10 speakers
Indoor (SPAT-DMN):
- Lobby: 2nd Order (9ch) → 16 speakers
- Restaurant: 3rd Order (16ch) → 28 speakers
- Nachtbar: 1st Order (4ch) → 4 speakers
- 2x WC: 1st Order (4ch) → 2x4 speakers
Ambisonics Orders: 1st=4ch, 2nd=9ch, 3rd=16ch, 5th=36ch, 7th=64ch | Formula: (N+1)²
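The channel counts follow directly from the (N+1)² formula. As a small worked example, the snippet below recomputes them for every venue and checks the speaker total. `ambisonic_channels` and the `VENUES` list are illustrative names; the values are copied from the venue lists above.

```python
def ambisonic_channels(order: int) -> int:
    """Number of channels of a full-sphere Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2


# (venue, Ambisonics order, number of speakers), as listed above.
VENUES = [
    ("Dominikanerhof", 7, 74),
    ("Durchgang DB11", 1, 12),
    ("Durchgang PG10", 1, 10),
    ("Lobby", 2, 16),
    ("Restaurant", 3, 28),
    ("Nachtbar", 1, 4),
    ("WC (2x4)", 1, 8),
]

# Print the Ambisonics bus width and speaker count per venue.
for name, order, speakers in VENUES:
    print(f"{name:16} order {order}: {ambisonic_channels(order):2d}ch -> {speakers} speakers")

total_speakers = sum(speakers for _, _, speakers in VENUES)
```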
- Ambisonics Format: AmbiX (ACN/SN3D)
- Plugins: IEM Plugin Suite (University of Music and Performing Arts Graz)
- Decoder Configurations: `APO_Render/decoder/`
Separate from the real-time system, the audio classification pipeline runs offline to analyze and categorize audio content.
- Input: Audio files from new folder in `apo_material/`
- Analysis: YAMNet (TensorFlow) extracts acoustic features
- Semantic Classification: GPT-4 interprets and categorizes
- Output Generation:
  - Classification XMLs
  - Blank JSON preset files (Ablp, Bblp, grain, stretch)
- Integration:
  - Move presets to `data-Ablp/`, `data-Bblp/`, etc.
  - Refine presets in Max
  - Push material to NAS server
- YAMNet: Pre-trained audio model (TensorFlow)
- GPT-4: Semantic interpretation
- Python: Jupyter Notebook (`audio_classification/audio_class.ipynb`)
Install Dependencies:
pip install tensorflow tensorflow-hub librosa openai pandas
Run Classification:
jupyter notebook audio_classification/audio_class.ipynb
Complete documentation: see audio_classification/readme.md
Each audio file has corresponding JSON presets for each module:
- data-Ablp/: Loop Player A presets
- data-Bblp/: Loop Player B presets
- data-grain/: Granular synthesis presets
- data-stretch/: Time-stretch presets
The partitur files define which module presets are assigned to each audio file. They store the combinations of synthesis module settings that will be used when the external brain system selects an audio file for playback.
File Locations:
- `partitur.txt`: Outdoor preset combinations (WS-AUD-CON-PG8)
- `partitur_in.txt`: Indoor preset combinations (WS-AUD-CON-DMN)
Located in: APO_Main/data/
- Edit presets in Max patch
- Save combinations to partitur files
- Mirror to NAS data-pg8 to make presets available for external control
- External system sends OSC `/index` to recall presets
Step 1: Prepare Audio Files
Create a new folder in apo_material/[category_name]/ and add your audio files.
- NO umlauts (ä, ö, ü, etc.)
- NO accents (é, à, ñ, etc.)
- NO spaces (use underscores instead)
Example:
❌ Wrong: "Paco De Lucía.wav"
✅ Correct: "Paco_De_Lucia.wav"
❌ Wrong: "Müller & Söhne.wav"
✅ Correct: "Mueller_und_Soehne.wav"
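The naming rules above can be applied automatically before files are copied into the material folder. The following sketch is one possible implementation, not a script shipped with the project: `sanitize_filename` is a hypothetical name, and the replacement of `&` with `und` is simply modelled on the second example.

```python
import unicodedata

# German umlauts are transliterated rather than stripped, as in the examples above.
_GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def sanitize_filename(name: str) -> str:
    """Return an ASCII-only file name without umlauts, accents, or spaces."""
    name = unicodedata.normalize("NFC", name)  # compose characters so the table matches
    for src, dst in _GERMAN.items():
        name = name.replace(src, dst)
    name = name.replace("&", "und")
    # Decompose remaining accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    # Collapse any run of whitespace into a single underscore.
    return "_".join(ascii_only.split())
```

Characters that have no ASCII equivalent after decomposition are dropped silently, so the result is worth a quick visual check before renaming files in bulk.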
Step 2: Run Classification Script
- Open `audio_classification/audio_class.ipynb` in Jupyter Notebook
- Configure the folder path to point to your new material folder
- Run the script (detailed instructions in `audio_classification/readme.md` or directly in the notebook)
Step 3: Generated Output
The script automatically creates:
- Classification XMLs: Analysis results stored alongside the audio files in your material folder
- Blank JSON Presets: Four folders containing preset files for each module:
  - `data-Ablp/` - Presets for Loop Player A
  - `data-Bblp/` - Presets for Loop Player B
  - `data-grain/` - Presets for Granular Synthesis
  - `data-stretch/` - Presets for Time-Stretch
Step 4: Integration
- The preset folders are already created in the correct location by the script
- Copy the entire material folder to NAS server data-pg8 for backup
- Open Max patch and refine the blank presets to achieve desired sonic results
- Test and adjust synthesis parameters
- Save preset combinations to partitur files (`partitur.txt` / `partitur_in.txt`)
- Mirror partitur files to NAS data-pg8 to make them available for external brain control
The installation is designed to run continuously. All Max patches are automatically launched via V4 Watchdog (located in the system autostart folder), which monitors and restarts patches if needed.
V4 Watchdog (Autostart):
- Located in Windows autostart folder
- Automatically launches all Max patches on system boot
- Monitors patches and restarts them if they crash
- Ensures continuous operation without manual intervention
Content Computers (Automatic):
- WS-AUD-CON-PG8:
`APO_main.maxpat` → DSP auto-activates
- WS-AUD-CON-DMN: `APO_main_in.maxpat` → DSP auto-activates
- WS-AUD-VIT-PG8: `APO_player.maxpat` → DSP auto-activates
Spatial Computers (Manual, but run continuously):
- WS-AUD-SPAT-PG8: Reaper → `APO_outdoor_output.rpp` (116ch in / 136ch out)
- WS-AUD-SPAT-DMN: Reaper → `APO_indoor_output.rpp` (80ch in / 98ch out)
Network & DANTE (2-3 minutes after boot):
- All DANTE devices visible in DANTE Controller
- All devices show "Locked" clock status
- Sample rate locked at 48kHz across all devices
- Network switch operational (all ports active)
Max Patches:
- All patches running (check V4 Watchdog status)
- DSP activated on all content computers
- CPU load within safe ranges (<75%)
- No error messages in Max console
Reaper Projects:
- Both projects loaded and running
- DANTE audio flowing (check meters)
- IEM MultiEncoders receiving OSC (ports 9001-9004)
- CPU load within safe ranges (<85%)
Audio Flow:
- Content → DANTE → Spatial → Speakers
- Test tones audible in all venues
- No dropouts or clicks
OSC Communication:
- External brain system connected
- OSC messages visible in Max "VIDEO:UDP:PARTITUR" window
- Spatial positioning active (check IEM MultiEncoder movement)
Via External OSC:
- Port 4000 → Content computers (preset recall, meta-parameters)
- Port 6000 → Display player (vitrine/säule triggers, ritual)
Via Max Interface:
- Manual parameter changes
- File selection in player
- Synthesis module on/off
Spatial Positioning (Automatic):
- Content computers → Spatial computers (Ports 9001-9004)
- Dynamic spatial trajectories from blend process
Latency Budget:
| Component | Latency | Setting |
|---|---|---|
| Max Processing | ~43ms | 2048 samples @ 48kHz |
| DANTE Network | 1-2ms | DANTE Controller |
| Reaper Encoding/Decoding | ~22ms | 512 samples @ 48kHz |
| Total (E2E) | ~65-70ms | Acceptable for installation |
Sample Rate: 48 kHz (fixed system-wide)
Bit Depth: 24-bit PCM
Buffer Priority: Stability over ultra-low latency
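The buffer figures in the latency budget come from dividing the buffer size by the sample rate. The sketch below redoes that arithmetic. Reading the ~22 ms Reaper figure as two 512-sample buffers (input plus output) is an assumption made here to reproduce the table, and both function names are illustrative.

```python
SAMPLE_RATE = 48000  # Hz, fixed system-wide


def buffer_latency_ms(samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of one audio buffer in milliseconds."""
    return samples / sample_rate * 1000.0


def end_to_end_ms(dante_ms: float) -> float:
    """Estimated total: one Max buffer + DANTE latency + two Reaper buffers (assumed)."""
    return buffer_latency_ms(2048) + dante_ms + 2 * buffer_latency_ms(512)


print(f"Max buffer:    {buffer_latency_ms(2048):.1f} ms")
print(f"Reaper buffer: {buffer_latency_ms(512):.1f} ms")
print(f"End-to-end:    {end_to_end_ms(1.0):.1f} to {end_to_end_ms(2.0):.1f} ms")
```

Under these assumptions the estimate lands at the lower end of the ~65-70 ms range given in the table. Converter and plugin latencies are not modelled.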
APO_audio/
│
├── 📄 readme.md ← Main documentation (this file)
├── 📄 network_config.md ← Detailed OSC/DANTE specs
├── 📄 setup_guide.md ← Installation guide
│
├── 📁 APO_Main/ ← Content Computers: Max Synthesis
│ ├── APO_main.maxpat ← Outdoor patch
│ ├── APO_main_in.maxpat ← Indoor patch
│ ├── apo_material/ ← Audio samples (16 categories)
│ ├── apo_waves/ ← Wavetables
│ ├── code/ ← Gen~ patches & JavaScript
│ ├── data/ ← System presets & partitur files
│ │ ├── partitur.txt ← Outdoor presets
│ │ └── partitur_in.txt ← Indoor presets
│ ├── data-Ablp/ ← JSON presets (Loop Player A)
│ ├── data-Bblp/ ← JSON presets (Loop Player B)
│ ├── data-grain/ ← JSON presets (Granular)
│ ├── data-stretch/ ← JSON presets (Stretch)
│ ├── externals/ ← Max externals
│ ├── patchers/ ← 108 abstraction patches
│ └── patchers-help/ ← Help patches
│
├── 📁 APO_Player/ ← Display Player: Audio Guide
│ ├── APO_player.maxpat ← Player patch
│ ├── player_material/ ← Playback material
│ │ ├── col1/ to col4/ ← Säulen content
│ │ ├── v1/ to v14/ ← Vitrines content
│ │ └── sl1/ to sl2/ ← Soundlaser
│ └── externals/ ← Player abstractions
│
├── 📁 APO_Render/ ← Spatial Computers: Reaper
│ ├── APO_outdoor_output.rpp ← Outdoor project
│ ├── APO_indoor_output.rpp ← Indoor project
│ └── decoder/ ← IEM decoder configs
│ ├── hof_dom_1_74/ ← Dominikanerhof (7th Order)
│ ├── gang_db11/ ← DB11 (1st Order)
│ ├── gang_pg10/ ← PG10 (1st Order)
│ ├── indoor_lobby/ ← Lobby (2nd Order)
│ ├── indoor_restaurant/ ← Restaurant (3rd Order)
│ └── wc_allrad.json ← WC (1st Order)
│
├── 📁 audio_classification/ ← Python audio analysis
│ ├── audio_class.ipynb ← Jupyter Notebook
│ ├── readme.md ← Classification docs
│ ├── classes.csv ← Categories
│ └── yamnet_class_map.csv ← YAMNet mapping
│
├── 📁 scripts/ ← Utility scripts
│ ├── audio_convert.py ← Audio conversion
│ └── find_space.py ← Utility
│
├── 📁 documentation/ ← Additional docs
│ ├── content_structure.png ← System diagram
│ ├── SHA_AltePost_Audiokanäle_Kunstwerk1_250901.xlsx
│ └── SHA_AltePost_Audiokanäle_Kunstwerk2_250901.xlsx
│
└── 📁 dependencies/ ← Max libraries & installers
├── CNMAT Externals/ ← UC Berkeley CNMAT (Copy to Max 9/Library/)
├── ease/ ← Easing functions (Copy to Max 9/Library/)
├── FluidCorpusManipulation/ ← ML audio toolkit (Copy to Max 9/Library/)
├── GeneratingSoundAndOrganizingTime/ ← Synthesis tools (Copy to Max 9/Library/)
├── ICST Ambisonics/ ← Spatial audio ZHdK (Copy to Max 9/Library/)
├── jasch objects/ ← Utility objects (Copy to Max 9/Library/)
├── karma/ ← Patching utilities (Copy to Max 9/Library/)
├── MuBu For Max/ ← Multi-buffer (Copy to Max 9/Library/)
├── odot/ ← OSC handling (Copy to Max 9/Library/)
├── spat5-x64/ ← IRCAM Spat (Copy to Max 9/Library/)
├── Max909_250918_d7cea08.msi ← Max 9 installer
├── reaper748_x64-install.exe ← Reaper installer
└── IEMPluginSuiteInstaller_v1.15.0_x64.exe ← IEM plugins
**Note:** Max packages also available on [Google Drive](https://drive.google.com/drive/folders/1q4KhnG0dok_zjeWZoE4U2uqC6Lic_MH1?usp=drive_link) (too large for GitHub)
No audio in Max:
- Audio activated? (Audio On)
- Correct audio interface selected?
- DSP activated?
- SystemMasterFader at 0?
Audio dropouts in Max:
- CPU load <75%?
- Sample rate 48 kHz?
- I/O & Vector Size 2048?
No audio in Reaper:
- DANTE routing correct?
- All devices "Locked" in DANTE Controller?
- Audio interface selected?
Audio dropouts in Reaper:
- CPU load <85%?
- Try increasing DANTE latency to 2ms
Computers cannot ping each other:
- Ethernet connected?
- IP addresses correct?
- Subnet mask 255.255.255.0?
DANTE devices not visible:
- Switch supports multicast/IGMP?
- Devices in same subnet?
- Flow control disabled on switch?
OSC messages not received:
- Port numbers correct? (4000, 6000, 9001-9004)
- IP addresses correct?
- Firewall allows UDP ports?
- OSC monitor in Max shows messages?
Test Command:
oscsend 10.1.11.71 4000 /index i 122
Missing externals / Objects not found:
- All 10 libraries copied to Max 9/Library/?
  - CNMAT Externals, ease, FluidCorpusManipulation, GeneratingSoundAndOrganizingTime
  - ICST Ambisonics, jasch objects, karma, MuBu For Max, odot, spat5-x64
  - Download from Google Drive if not in local `dependencies/` folder
- File Preferences configured? (Add `APO_Main` + "Include Subfolders")
- Max restarted after configuration?
Common missing objects and their libraries:
- `cnmat.*` objects → CNMAT Externals missing
- `fluid.*` objects → FluidCorpusManipulation missing
- `mubu.*` objects → MuBu For Max missing
- `o.*` objects → odot missing
- `spat5.*` objects → spat5-x64 missing
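When the Max console reports several missing objects at once, the prefix rules above can be applied mechanically. This is a small illustrative lookup: `library_for` is a hypothetical helper, and the object names in the usage note are examples only.

```python
from typing import Optional

# Object-name prefix -> library that provides it, as listed above.
PREFIX_TO_LIBRARY = {
    "cnmat.": "CNMAT Externals",
    "fluid.": "FluidCorpusManipulation",
    "mubu.": "MuBu For Max",
    "o.": "odot",
    "spat5.": "spat5-x64",
}


def library_for(object_name: str) -> Optional[str]:
    """Return the library that most likely provides the object, or None if unknown."""
    # Check longer prefixes first so the most specific rule wins.
    for prefix in sorted(PREFIX_TO_LIBRARY, key=len, reverse=True):
        if object_name.startswith(prefix):
            return PREFIX_TO_LIBRARY[prefix]
    return None
```

For instance, `library_for("o.route")` returns `"odot"`, while an object without a known prefix returns `None` and has to be looked up manually.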
Patch won't open / Crashes:
- All dependencies copied?
- File Preferences with "Include Subfolders" checked?
- Max version 9.0+?
- Check Max Console for specific missing object errors
IEM plugins not available:
- IEM Plugin Suite installed?
- Reaper plugin scan performed?
- 64-bit plugins for 64-bit Reaper?
Decoder doesn't load:
- JSON file correctly formatted?
- File path correct?
- Ambisonics order matches config?
| System | Computer | Idle | Typical | Peak | Max Safe |
|---|---|---|---|---|---|
| Content Outdoor | WS-AUD-CON-PG8 | <20% | 40-50% | 60% | 75% |
| Content Indoor | WS-AUD-CON-DMN | <20% | 40-50% | 60% | 75% |
| Display Player | WS-AUD-VIT-PG8 | <10% | 15-25% | 40% | 60% |
| Spatial Outdoor | WS-AUD-SPAT-PG8 | <25% | 60-70% | 80% | 85% |
| Spatial Indoor | WS-AUD-SPAT-DMN | <20% | 50-60% | 70% | 85% |
Notes:
- SPAT-PG8 has highest load (7th Order Ambisonics)
- SPAT-DMN lower load (max 3rd Order)
- VIT-PG8 minimal load (20ch playback only)
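For automated monitoring, the thresholds in the CPU table can be turned into a simple status check. The sketch below is an illustration only: the three status labels and the `load_status` helper are assumptions, not part of V4 Watchdog or the Max patches.

```python
# (peak, max safe) CPU load in percent per computer, copied from the table above.
LIMITS = {
    "WS-AUD-CON-PG8": (60, 75),
    "WS-AUD-CON-DMN": (60, 75),
    "WS-AUD-VIT-PG8": (40, 60),
    "WS-AUD-SPAT-PG8": (80, 85),
    "WS-AUD-SPAT-DMN": (70, 85),
}


def load_status(computer: str, load_percent: float) -> str:
    """Classify a measured CPU load against the documented peak and max-safe values."""
    peak, max_safe = LIMITS[computer]
    if load_percent > max_safe:
        return "critical"  # above the maximum safe load
    if load_percent > peak:
        return "warning"   # above the expected peak but still within the safe range
    return "ok"
```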
| Parameter | Value | Notes |
|---|---|---|
| Sample Rate | 48 kHz | System-wide, broadcast standard |
| Bit Depth | 24 bit | DANTE network |
| Audio Interface | Marian Clara E | 512ch I/O, Driver v4.71 |
| Max Channels | 512ch | DANTE interface capacity |
| Total Speakers | 152ch | All 7 venues |
| Max Buffer | 2048 samples | Content computers (~43ms) |
| Min Buffer | 512 samples | Spatial computers (~11ms) |
| Total Latency | ~65-70ms | End-to-end acceptable |
Project: APO "Museum of Change" - Audio Software
Version: 1.0
Date: October 2025
Maintainers: Jonas Hammerer, Wolfgang Musil
- Cycling '74 Max 9: Audio processing & synthesis
- Reaper DAW: Ambisonics rendering
- IEM Plugin Suite: Ambisonics encoding/decoding (University of Music and Performing Arts Graz)
- TensorFlow / YAMNet: Audio classification (Google Research)
- OpenAI GPT-4: Semantic audio analysis
- DANTE Audio Networking: Audio-over-IP (Audinate)
- Ambisonic Format: AmbiX (ACN/SN3D)
- Audio Network: DANTE (AES67 compatible)
- Sample Rate: 48 kHz (Broadcast Standard)
- Control Protocol: OSC (Open Sound Control)
Last Updated: October 12, 2025
For questions or issues, consult the Max/Reaper console for error messages or contact the maintainers.