Refactor(handler part 12): vae decode by 1larity · Pull Request #591 · ace-step/ACE-Step-1.5

1larity · 2026-02-15T16:25:19Z

Summary

Decompose VAE decode logic from acestep/handler.py into focused mixins, plus runtime hardening fixes discovered during validation:

- Gradio audio upload validation return-shape fix (gr.skip / gr.update) to avoid FileData validation errors.
- vLLM re-init hardening to prevent initializing the default process group twice.

Changes

A) Handler decomposition: VAE decode extraction

Added acestep/core/generation/handler/vae_decode.py
- tiled_decode
- _tiled_decode_cpu_fallback
- _decode_on_cpu
Added acestep/core/generation/handler/vae_decode_chunks.py
- _tiled_decode_inner
- _tiled_decode_gpu
- _tiled_decode_offload_cpu
Wired mixins into:
- acestep/core/generation/handler/__init__.py
- acestep/handler.py
Removed the relocated decode implementation from the handler.py facade (behaviour preserved).
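
A minimal sketch of the mixin wiring described above, assuming simplified stand-in classes. The class and method names mirror the ones listed in this PR, but the bodies are hypothetical and use plain lists instead of tensors:

```python
class VaeDecodeMixin:
    """Orchestrator: exposes the public entrypoint and delegates the work."""

    def tiled_decode(self, latents, chunk_size=4):
        # Relies on a method that another mixin provides on the same host.
        return self._tiled_decode_inner(latents, chunk_size)


class VaeDecodeChunksMixin:
    """Worker: supplies the chunked decode used by the orchestrator."""

    def _tiled_decode_inner(self, latents, chunk_size):
        # Stand-in "decode" (doubling), applied one chunk at a time.
        out = []
        for start in range(0, len(latents), chunk_size):
            out.extend(x * 2 for x in latents[start:start + chunk_size])
        return out


class Handler(VaeDecodeMixin, VaeDecodeChunksMixin):
    """Facade: callers keep using handler.tiled_decode(...) unchanged."""


handler = Handler()
print(handler.tiled_decode([1, 2, 3, 4, 5]))
```

Because the facade only inherits, the public call site stays the same while the implementation lives in the smaller modules.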

B) Audio upload validation hotfix

Updated acestep/gradio_ui/events/generation_handlers.py
- validate_uploaded_audio_file(...) now returns:
  - gr.skip() for valid/no-op cases
  - gr.update(value=None) for invalid files
Updated tests in acestep/gradio_ui/events/generation_handlers_test.py

C) vLLM re-initialization fix

Updated acestep/llm_inference.py
- Added _cleanup_torch_distributed_state()
- Called during vLLM unload and before vLLM initialization
- Prevents duplicate default process-group init errors.
Added test: acestep/llm_inference_dist_cleanup_test.py
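
The cleanup step can be sketched roughly as follows. This is an illustration under assumptions, not the code added in acestep/llm_inference.py: the function name, the injectable `dist` parameter, and the `FakeDist` stand-in are hypothetical. The only real API assumed is `torch.distributed`'s `is_available()`, `is_initialized()`, and `destroy_process_group()`.

```python
from typing import Any, Optional


def cleanup_distributed_state(dist: Optional[Any] = None) -> bool:
    """Destroy the default process group if one exists.

    Returns True when a group was destroyed, False when there was nothing to do.
    """
    if dist is None:
        try:
            import torch.distributed as dist  # real backend, if installed
        except ImportError:
            return False
    if not dist.is_available() or not dist.is_initialized():
        return False  # no-op: nothing stale to clean up
    dist.destroy_process_group()
    return True


class FakeDist:
    """Hypothetical stand-in for torch.distributed, used for illustration."""

    def __init__(self, initialized: bool) -> None:
        self.initialized = initialized
        self.destroyed = 0

    def is_available(self) -> bool:
        return True

    def is_initialized(self) -> bool:
        return self.initialized

    def destroy_process_group(self) -> None:
        self.initialized = False
        self.destroyed += 1


fake = FakeDist(initialized=True)
print(cleanup_distributed_state(fake), fake.destroyed)
```

Calling the helper both before initialization and during unload makes a second initialization start from a clean state, which matches the two test cases listed for this change (destroy when initialized, no-op otherwise).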

Behavioural parity / reliability notes

VAE decode fallback chain remains intact:
- MLX fast path -> PyTorch path
- MPS clamp safeguards
- OOM fallback chain (GPU -> offload CPU -> full CPU decode)
Public handler entrypoints/signatures remain unchanged.
Upload validation now avoids invalid component-state writes to Gradio Audio.
vLLM init/unload now defensively cleans stale distributed state.
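
The OOM fallback chain listed above (GPU -> offload CPU -> full CPU decode) can be pictured as an ordered list of strategies tried in turn. The sketch below is hypothetical: `OutOfMemory`, `decode_with_fallback`, and the strategy callables are illustrative names, not identifiers from this PR.

```python
from typing import Any, Callable, Sequence, Tuple


class OutOfMemory(RuntimeError):
    """Hypothetical stand-in for a device out-of-memory error."""


def decode_with_fallback(
    latents: Any,
    strategies: Sequence[Tuple[str, Callable[[Any], Any]]],
) -> Tuple[str, Any]:
    """Try each strategy in order; move on only when one runs out of memory."""
    last_error = None
    for name, decode in strategies:
        try:
            return name, decode(latents)
        except OutOfMemory as exc:
            last_error = exc  # remember the failure and try the next, cheaper path
    raise RuntimeError("all decode strategies ran out of memory") from last_error


def gpu_decode(latents):
    raise OutOfMemory("simulated GPU OOM")


def offload_decode(latents):
    raise OutOfMemory("simulated OOM while offloading")


def cpu_decode(latents):
    return [x * 2 for x in latents]


print(decode_with_fallback(
    [1, 2, 3],
    [("gpu", gpu_decode), ("offload_cpu", offload_decode), ("cpu", cpu_decode)],
))
```

Only the out-of-memory error triggers the next strategy in this sketch; any other exception propagates to the caller unchanged.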

Tests

Added/updated unit tests

acestep/core/generation/handler/vae_decode_test.py
- MPS clamp behaviour
- MPS runtime fallback
- MLX success path
- MLX failure fallback
- Non-MPS runtime error propagation
- Batch-sequential decode
- Direct decode path
- Overlap adjustment path
- OOM fallback chain coverage
acestep/gradio_ui/events/generation_handlers_test.py
- Upload validation return behaviour and role-based warning
acestep/llm_inference_dist_cleanup_test.py
- destroy process group when initialized
- no-op when not initialized

Validation run

py_compile passed for changed files.
Some unittest runs could not complete in this environment because the import chain pulls in optional dependencies (torchaudio, gradio) that are not installed here.

Manual UI testing

Manual UI checks passed on available platform paths:

- Generation path sanity
- Upload validation behaviour (invalid source/reference file warnings + clearing)
- No regression in non-Apple decode paths

Not executed:

Apple-specific (MPS/MLX) manual validation (environment unavailable).

Summary by CodeRabbit

Release Notes

New Features
- Enhanced VAE audio decoding with automatic device-specific fallback strategies to improve generation stability across different hardware platforms.
- Improved audio file validation with more informative error messages and robust input handling.
Tests
- Added comprehensive test coverage for VAE decoding fallback paths, distributed state cleanup, and audio validation workflows.

coderabbitai · 2026-02-15T16:25:36Z

Caution

Review failed

The pull request is closed.

Walkthrough

This PR refactors tiled VAE decoding functionality by extracting it from AceStepHandler into two new mixin classes (VaeDecodeMixin and VaeDecodeChunksMixin) with orchestrated fallback paths spanning MLX, PyTorch, and CPU execution. Concurrently, it adds distributed state cleanup for LLMHandler, enhances Gradio audio validation with i18n support, and includes comprehensive test coverage for the new components.

Changes

Cohort / File(s)	Summary
VAE Decode Mixin Implementation `acestep/core/generation/handler/vae_decode.py`, `acestep/core/generation/handler/vae_decode_chunks.py`, `acestep/core/generation/handler/__init__.py`	Introduces VaeDecodeMixin (orchestrator with MLX/PyTorch fallbacks and MPS-safe constants) and VaeDecodeChunksMixin (implements chunked decode with GPU/offload/CPU pathways, adaptive overlap, and OOM handling). Exports both from handler package.
VAE Decode Test Suite `acestep/core/generation/handler/vae_decode_mixin_test.py`, `acestep/core/generation/handler/vae_decode_chunks_test.py`, `acestep/core/generation/handler/vae_decode_test_helpers.py`	Adds comprehensive tests validating MPS clamping, MLX fast-path, fallback chains (GPU→Offload→CPU), overlap adjustment, OOM recovery, and batch sequential decoding. Includes test helpers (FakeVae, DecodeHost, ChunksHost) for injecting mock behavior.
Handler Refactoring `acestep/handler.py`	Updates AceStepHandler to inherit from VaeDecodeMixin and VaeDecodeChunksMixin; removes 313 lines of previously embedded tiled decode logic (MPS constants, CPU fallback, chunked decode methods).
LLM Distributed Cleanup `acestep/llm_inference.py`, `acestep/llm_inference_dist_cleanup_test.py`	Adds `_cleanup_torch_distributed_state()` method to LLMHandler and invokes it during vLLM initialization and unload to destroy stale PyTorch distributed process groups. Includes tests validating cleanup behavior.
Gradio Audio Validation `acestep/gradio_ui/events/generation_handlers.py`, `acestep/gradio_ui/events/generation_handlers_test.py`, `acestep/gradio_ui/i18n/en.json`	Expands `_extract_audio_path()` to handle None/string/list inputs; updates `validate_uploaded_audio_file()` to return gr.skip()/gr.update() for Gradio compatibility and uses i18n key "audio_format_invalid" for localized warnings.
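
To make the "chunked decode with adaptive overlap" row concrete, here is a rough sketch of overlap-and-trim tiling on plain Python lists. It is an illustration under assumptions: the function name, the `upsample` parameter, and the overlap-halving rule are hypothetical and are not taken from vae_decode_chunks.py.

```python
from typing import Callable, List, Sequence


def tiled_decode_sketch(
    latents: Sequence[int],
    decode: Callable[[Sequence[int]], List[int]],
    chunk_size: int,
    overlap: int,
    upsample: int,
) -> List[int]:
    """Decode in overlapping windows and keep only each window's core region."""
    total = len(latents)
    if total <= chunk_size:
        return decode(latents)  # direct path: everything fits in one chunk

    # Assumed adjustment rule: shrink the overlap until the stride is positive.
    while chunk_size - 2 * overlap <= 0 and overlap > 0:
        overlap //= 2
    stride = chunk_size - 2 * overlap

    out: List[int] = []
    for core_start in range(0, total, stride):
        core_end = min(core_start + stride, total)
        win_start = max(0, core_start - overlap)  # extra context on the left
        win_end = min(total, core_end + overlap)  # extra context on the right
        decoded = decode(latents[win_start:win_end])
        # Drop the decoded context so that only the core samples remain.
        trim_start = (core_start - win_start) * upsample
        trim_end = trim_start + (core_end - core_start) * upsample
        out.extend(decoded[trim_start:trim_end])
    return out


def fake_decode(xs: Sequence[int]) -> List[int]:
    """Stand-in decoder: repeats every latent frame twice (upsample factor 2)."""
    return [x for x in xs for _ in range(2)]


latents = list(range(10))
tiled = tiled_decode_sketch(latents, fake_decode, chunk_size=4, overlap=1, upsample=2)
print(tiled == fake_decode(latents))
```

With a decoder that has no cross-frame context, the tiled result matches the direct decode exactly. A real VAE only approximates that, which is why the overlap exists.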

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Merge mainline commits as of 2026/02/08 05:16 UTC with MPS optimizations and do optimization checks #322: Modifies tiled VAE decode surface and MPS-specific fallback logic that overlaps with the core refactoring of this PR.
(feat)(mlx): Native MLX VAE acceleration for Apple Silicon #459: Adds MLX VAE integration and `_mlx_vae_decode` logic, which is directly integrated into the new VaeDecodeMixin orchestration path.
refact: Reorganized Advanced Settings UI & Added latent shift and rescale #452: Updates AceStepHandler to add latent post-processing before VAE decode, affecting the same handler class being refactored here.

Suggested reviewers

ChuxiJ

🐰 A VAE's nested paths now shine so bright,
From MLX to GPU, with MPS in sight,
When chunks overflow and memories strain,
CPU fallbacks catch them like gentle rain,
Clean distributed states keep vLLM's domain sane! ✨

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Refactor(handler part 12): vae decode' directly and clearly describes the main change: extracting and refactoring VAE decode logic from the handler into focused mixins, which is the primary objective of this pull request.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@acestep/core/generation/handler/vae_decode_test.py`:
- Around line 1-244: Split this large test module into two smaller modules to
stay under 200 LOC: move the VaeDecodeMixinTests class (and the minimal stubs it
uses: _DecodeOutput, _FakeVae, _DecodeHost) into a new test file (e.g.,
vae_decode_mixin_test.py) and move VaeDecodeChunksMixinTests (and its stubs:
_ChunksHost) into a separate file (e.g., vae_decode_chunks_test.py); if any
helpers are shared between both (e.g., _DecodeOutput or _FakeVae), extract them
into a small shared test helper module to avoid duplication and import that from
both new test files; ensure each new module imports VaeDecodeMixin and
VaeDecodeChunksMixin respectively and that test discovery still finds the tests.

In `@acestep/gradio_ui/events/generation_handlers.py`:
- Around line 932-943: Update the docstring for _extract_audio_path to clearly
state its purpose, accepted input types (None, str, list/tuple containing a
string), the normalization it performs (stripping whitespace, returning None for
empty strings or None input), and the return value (Optional[str] — file path or
None); mention that it does not raise exceptions and note any special handling
of list/tuple by returning the first element if it is a string.
- Around line 964-968: The hardcoded user-facing strings in the audio-format
warning must be localized: replace the inline role_label logic and the
gr.Warning message to use the i18n translator (t) from the module used in this
package (e.g., import t from i18n). Build the role label via t("reference") /
t("source") (or appropriate keys) based on audio_role and pass a translated
message key to gr.Warning (e.g., t("audio.format_invalid", {"role":
role_label})) so both the label and the full toast text come from i18n keys;
update the call site using role_label, audio_role, and gr.Warning accordingly.
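
The normalization requested for `_extract_audio_path` in the first comment above can be sketched as follows. This is a hedged reading of the review text, not the project's implementation, and the public-style name `extract_audio_path` is hypothetical:

```python
from typing import Any, Optional


def extract_audio_path(value: Any) -> Optional[str]:
    """Normalize an upload value to a file path, or None when unusable.

    Accepts None, a string, or a list/tuple whose first element is a string.
    Never raises; anything unexpected maps to None.
    """
    if isinstance(value, (list, tuple)):
        value = value[0] if value else None  # only the first element is considered
    if not isinstance(value, str):
        return None
    path = value.strip()
    return path or None  # empty or whitespace-only strings count as "no file"


for sample in (None, "  song.wav ", ["ref.mp3", "ignored.mp3"], [], "   ", [42]):
    print(repr(sample), "->", repr(extract_audio_path(sample)))
```

Returning None for every unusable shape lets the validator treat "nothing uploaded" as a no-op case.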

In `@acestep/llm_inference_dist_cleanup_test.py`:
- Around line 6-11: The test currently silences all exceptions during import by
catching Exception, which can hide real errors in acestep.llm_inference; change
the except clause in the import block to except ImportError (so only
import/module-not-found issues are caught) and keep assigning LLMHandler = None
and _IMPORT_ERROR = exc as before (preserve the pragma: no cover comment if
needed) — locate the try/except that imports LLMHandler and update the exception
type from Exception to ImportError.

🧹 Nitpick comments (3)

acestep/llm_inference.py (1)

117-125: Module exceeds 200 LOC — consider splitting.
This file is far beyond the 200 LOC cap; consider extracting backend-specific helpers into separate modules for maintainability.

As per coding guidelines, "Target module size: optimal <= 150 LOC, hard cap 200 LOC."

acestep/core/generation/handler/vae_decode_chunks.py (1)

13-15: Add type hints to the chunked decode helper signatures.

The new methods are untyped; annotate inputs/outputs for clarity and consistency.
Suggested signature pattern
-    def _tiled_decode_inner(self, latents, chunk_size, overlap, offload_wav_to_cpu):
+    def _tiled_decode_inner(
+        self,
+        latents: torch.Tensor,
+        chunk_size: int,
+        overlap: int,
+        offload_wav_to_cpu: bool,
+    ) -> torch.Tensor:
Apply the same pattern to `_tiled_decode_gpu` and `_tiled_decode_offload_cpu`. As per coding guidelines: Add type hints for new/modified functions when practical in Python.
Also applies to: 83-85, 114-116

acestep/core/generation/handler/vae_decode.py (1)

16-22: Add full type hints for VAE decode mixin methods.

`latents` and the return types are unannotated; annotate both as torch.Tensor across the mixin methods.
Suggested signature pattern
-    def tiled_decode(
-        self,
-        latents,
-        chunk_size: Optional[int] = None,
-        overlap: int = 64,
-        offload_wav_to_cpu: Optional[bool] = None,
-    ):
+    def tiled_decode(
+        self,
+        latents: torch.Tensor,
+        chunk_size: Optional[int] = None,
+        overlap: int = 64,
+        offload_wav_to_cpu: Optional[bool] = None,
+    ) -> torch.Tensor:
Apply the same pattern to `_tiled_decode_cpu_fallback` and `_decode_on_cpu`. As per coding guidelines: Add type hints for new/modified functions when practical in Python.
Also applies to: 87-88, 103-104

ChuxiJ · 2026-02-15T23:29:40Z

Some conflicts need to be resolved.

1larity changed the title ~~Refactor(handler part 11): vae decode~~ Refactor(handler part 12): vae decode Feb 15, 2026

coderabbitai bot reviewed Feb 15, 2026

ChuxiJ approved these changes Feb 15, 2026

1larity added 3 commits February 16, 2026 00:17

refactor(handler): extract VAE decode mixins and harden LLM/audio ini…

49c3549

…t paths

test(handler): split VAE decode tests and refine audio path docstring

875dcba

fix(i18n): localize upload warning and tighten import guard

f7ab5ca

1larity force-pushed the feat/decompose-handler-part10-pr2-vae-decode branch from 317138e to f7ab5ca Compare February 16, 2026 00:18

ChuxiJ merged commit 2b1ad8c into ace-step:main Feb 16, 2026
2 of 3 checks passed

coderabbitai bot mentioned this pull request Feb 16, 2026

RuntimeError: Offset increment outside graph capture encountered unexpectedly. #613

Closed
