Refactor(handler part 12): vae decode#591

Merged
ChuxiJ merged 3 commits into ace-step:main from 1larity:feat/decompose-handler-part10-pr2-vae-decode
Feb 16, 2026

@1larity (Contributor) commented Feb 15, 2026

Summary

Decompose VAE decode logic from acestep/handler.py into focused mixins, plus runtime hardening fixes discovered during validation:

  1. Gradio audio upload validation return-shape fix (gr.skip / gr.update) to avoid FileData validation errors.
  2. vLLM re-init hardening to prevent initializing the default process group twice.

Changes

A) Handler decomposition: VAE decode extraction

  • Added acestep/core/generation/handler/vae_decode.py
    • tiled_decode
    • _tiled_decode_cpu_fallback
    • _decode_on_cpu
  • Added acestep/core/generation/handler/vae_decode_chunks.py
    • _tiled_decode_inner
    • _tiled_decode_gpu
    • _tiled_decode_offload_cpu
  • Wired mixins into:
    • acestep/core/generation/handler/__init__.py
    • acestep/handler.py
  • Removed the relocated decode implementation from the handler.py facade (behaviour preserved).
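
The wiring above can be pictured with a minimal sketch. The class names `VaeDecodeMixin`, `VaeDecodeChunksMixin`, and `AceStepHandler` and the `tiled_decode` signature appear in this PR; the method bodies and the default chunk size below are placeholders assumed for illustration, not the project's implementation.

```python
class VaeDecodeChunksMixin:
    """Chunk-level decode helpers (placeholder body)."""

    def _tiled_decode_inner(self, latents, chunk_size, overlap, offload_wav_to_cpu):
        # Placeholder: the real method decodes overlapping chunks.
        return f"decoded({latents}, chunk_size={chunk_size}, overlap={overlap})"


class VaeDecodeMixin:
    """Orchestration entry point (placeholder body)."""

    def tiled_decode(self, latents, chunk_size=None, overlap=64, offload_wav_to_cpu=None):
        # Hypothetical default; the real default selection is not shown in this PR text.
        chunk_size = 512 if chunk_size is None else chunk_size
        return self._tiled_decode_inner(latents, chunk_size, overlap, bool(offload_wav_to_cpu))


class AceStepHandler(VaeDecodeMixin, VaeDecodeChunksMixin):
    """Facade: the public entrypoint stays on the handler, the logic lives in mixins."""


handler = AceStepHandler()
result = handler.tiled_decode("latents")
```

Because the mixins only rely on `self`, callers keep using `handler.tiled_decode(...)` exactly as before the extraction.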

B) Audio upload validation hotfix

  • Updated acestep/gradio_ui/events/generation_handlers.py
    • validate_uploaded_audio_file(...) now returns:
      • gr.skip() for valid/no-op cases
      • gr.update(value=None) for invalid files
  • Updated tests in acestep/gradio_ui/events/generation_handlers_test.py
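
A rough sketch of the return-shape change, assuming a hypothetical extension allow-list and hypothetical function names (`is_supported_audio`, `validate_upload_sketch`). Only `gr.skip()`, `gr.update(value=None)`, and `gr.Warning` are taken from this PR; the real `validate_uploaded_audio_file(...)` signature and checks are not shown here.

```python
import os

# Hypothetical allow-list; the project's real list of accepted formats may differ.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def is_supported_audio(path):
    """Return True when there is nothing to reject (no file, or a supported extension)."""
    if not path:
        return True  # nothing uploaded: treat as a no-op
    return os.path.splitext(path)[1].lower() in ALLOWED_EXTENSIONS


def validate_upload_sketch(path):
    """Sketch of the return shape: skip on valid/no-op input, clear on invalid input."""
    import gradio as gr  # imported lazily so the pure check above works without Gradio

    if is_supported_audio(path):
        return gr.skip()  # leave the Audio component untouched
    gr.Warning("Unsupported audio format")  # hypothetical message text
    return gr.update(value=None)  # clear the component instead of writing a bad value
```

Returning `gr.skip()` in the valid case means the handler never writes a value back to the Audio component, which is how the PR describes avoiding the FileData validation errors.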

C) vLLM re-initialization fix

  • Updated acestep/llm_inference.py
    • Added _cleanup_torch_distributed_state()
    • Called during vLLM unload and before vLLM initialization
    • Prevents duplicate default process-group init errors.
  • Added test: acestep/llm_inference_dist_cleanup_test.py
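
The cleanup step can be sketched as follows. This is an assumed shape, not the PR's code: the function name, the injectable `dist` parameter, and the `FakeDist` test double are hypothetical, while `is_available()`, `is_initialized()`, and `destroy_process_group()` are existing `torch.distributed` functions.

```python
def cleanup_torch_distributed_state(dist=None):
    """Destroy the default process group if one is still initialized.

    Returns True when a group was destroyed, False when there was nothing to do.
    """
    if dist is None:
        import torch.distributed as dist  # lazy import; requires PyTorch

    if not dist.is_available() or not dist.is_initialized():
        return False
    dist.destroy_process_group()
    return True


class FakeDist:
    """Test double standing in for torch.distributed."""

    def __init__(self, initialized):
        self.initialized = initialized
        self.destroy_calls = 0

    def is_available(self):
        return True

    def is_initialized(self):
        return self.initialized

    def destroy_process_group(self):
        self.destroy_calls += 1
        self.initialized = False


stale = FakeDist(initialized=True)
first = cleanup_torch_distributed_state(stale)   # destroys the stale group
second = cleanup_torch_distributed_state(stale)  # nothing left to destroy
```

Calling the helper both on unload and before initialization makes the second call a harmless no-op when the first one already ran.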

Behavioural parity / reliability notes

  • VAE decode fallback chain remains intact:
    • MLX fast path -> PyTorch path
    • MPS clamp safeguards
    • OOM fallback chain (GPU -> offload CPU -> full CPU decode)
  • Public handler entrypoints/signatures remain unchanged.
  • Upload validation now avoids invalid component-state writes to Gradio Audio.
  • vLLM init/unload now defensively cleans stale distributed state.
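
The OOM fallback chain (GPU -> offload CPU -> full CPU decode) follows a common "try the next, cheaper strategy" pattern. The sketch below is generic and assumes hypothetical names throughout (`OutOfMemory`, `decode_with_fallback`, the three strategy functions); it does not reproduce the mixin code.

```python
class OutOfMemory(RuntimeError):
    """Stand-in for a device OOM error (e.g. torch.cuda.OutOfMemoryError)."""


def decode_with_fallback(latents, strategies):
    """Try each (name, decode_fn) in order, moving on only when one runs out of memory."""
    last_error = None
    for name, decode_fn in strategies:
        try:
            return name, decode_fn(latents)
        except OutOfMemory as exc:
            last_error = exc  # remember the failure and try the next, cheaper path
    raise RuntimeError("all decode strategies ran out of memory") from last_error


def gpu_decode(latents):
    raise OutOfMemory("gpu")  # simulate an OOM on the fastest path


def offload_decode(latents):
    raise OutOfMemory("offload")  # simulate an OOM on the middle path


def cpu_decode(latents):
    return [x * 2 for x in latents]  # toy "decode" that always succeeds


used, wav = decode_with_fallback(
    [1, 2],
    [("gpu", gpu_decode), ("offload_cpu", offload_decode), ("cpu", cpu_decode)],
)
```

Only the OOM stand-in is caught, so any other exception propagates to the caller unchanged.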

Tests

Added/updated unit tests

  • acestep/core/generation/handler/vae_decode_test.py
    • MPS clamp behaviour
    • MPS runtime fallback
    • MLX success path
    • MLX failure fallback
    • Non-MPS runtime error propagation
    • Batch-sequential decode
    • Direct decode path
    • Overlap adjustment path
    • OOM fallback chain coverage
  • acestep/gradio_ui/events/generation_handlers_test.py
    • Upload validation return behaviour and role-based warning
  • acestep/llm_inference_dist_cleanup_test.py
    • destroy process group when initialized
    • no-op when not initialized

Validation run

  • py_compile passed for changed files.
  • Some unittest runs are limited in this environment because the import chain requires optional dependencies (torchaudio, gradio).

Manual UI testing

Manual UI checks passed on available platform paths:

  • Generation path sanity
  • Upload validation behaviour (invalid source/reference file warnings + clearing)
  • No regression in non-Apple decode paths

Not executed:

  • Apple-specific (MPS/MLX) manual validation (environment unavailable).

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced VAE audio decoding with automatic device-specific fallback strategies to improve generation stability across different hardware platforms.
    • Improved audio file validation with more informative error messages and robust input handling.
  • Tests

    • Added comprehensive test coverage for VAE decoding fallback paths, distributed state cleanup, and audio validation workflows.

@coderabbitai bot (Contributor) commented Feb 15, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

This PR refactors tiled VAE decoding functionality by extracting it from AceStepHandler into two new mixin classes (VaeDecodeMixin and VaeDecodeChunksMixin) with orchestrated fallback paths spanning MLX, PyTorch, and CPU execution. Concurrently, it adds distributed state cleanup for LLMHandler, enhances Gradio audio validation with i18n support, and includes comprehensive test coverage for the new components.

Changes

| Cohort / File(s) | Summary |
|---|---|
| VAE Decode Mixin Implementation: acestep/core/generation/handler/vae_decode.py, acestep/core/generation/handler/vae_decode_chunks.py, acestep/core/generation/handler/__init__.py | Introduces VaeDecodeMixin (orchestrator with MLX/PyTorch fallbacks and MPS-safe constants) and VaeDecodeChunksMixin (implements chunked decode with GPU/offload/CPU pathways, adaptive overlap, and OOM handling). Exports both from handler package. |
| VAE Decode Test Suite: acestep/core/generation/handler/vae_decode_mixin_test.py, acestep/core/generation/handler/vae_decode_chunks_test.py, acestep/core/generation/handler/vae_decode_test_helpers.py | Adds comprehensive tests validating MPS clamping, MLX fast-path, fallback chains (GPU→Offload→CPU), overlap adjustment, OOM recovery, and batch sequential decoding. Includes test helpers (FakeVae, DecodeHost, ChunksHost) for injecting mock behavior. |
| Handler Refactoring: acestep/handler.py | Updates AceStepHandler to inherit from VaeDecodeMixin and VaeDecodeChunksMixin; removes 313 lines of previously embedded tiled decode logic (MPS constants, CPU fallback, chunked decode methods). |
| LLM Distributed Cleanup: acestep/llm_inference.py, acestep/llm_inference_dist_cleanup_test.py | Adds `_cleanup_torch_distributed_state()` method to LLMHandler and invokes it during vLLM initialization and unload to destroy stale PyTorch distributed process groups. Includes tests validating cleanup behavior. |
| Gradio Audio Validation: acestep/gradio_ui/events/generation_handlers.py, acestep/gradio_ui/events/generation_handlers_test.py, acestep/gradio_ui/i18n/en.json | Expands `_extract_audio_path()` to handle None/string/list inputs; updates `validate_uploaded_audio_file()` to return gr.skip()/gr.update() for Gradio compatibility and uses i18n key "audio_format_invalid" for localized warnings. |
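
For readers unfamiliar with tiled decoding, the overlap idea can be illustrated on a plain list. This is a generic sketch under stated assumptions, not the mixin's algorithm: `fake_decode`, `tiled_decode_sketch`, and the upsample factor are hypothetical, and the sketch raises on an oversized overlap where the PR describes an overlap adjustment path.

```python
def fake_decode(latent_frames, upsample=4):
    """Stand-in decoder: each latent frame becomes `upsample` identical samples."""
    return [frame for frame in latent_frames for _ in range(upsample)]


def tiled_decode_sketch(latents, chunk_size, overlap, upsample=4):
    """Decode overlapping windows, keep only each window's core, and concatenate."""
    stride = chunk_size - 2 * overlap
    if stride <= 0:
        raise ValueError("chunk_size must be larger than 2 * overlap")
    out = []
    for core_start in range(0, len(latents), stride):
        core_end = min(core_start + stride, len(latents))
        # Widen the window by `overlap` frames on each side for decoder context.
        win_start = max(0, core_start - overlap)
        win_end = min(len(latents), core_end + overlap)
        decoded = fake_decode(latents[win_start:win_end], upsample)
        # Drop the samples that belong to the context frames, not to the core.
        trim_left = (core_start - win_start) * upsample
        trim_right = (win_end - core_end) * upsample
        out.extend(decoded[trim_left:len(decoded) - trim_right])
    return out


latents = list(range(10))
tiled = tiled_decode_sketch(latents, chunk_size=6, overlap=1)
full = fake_decode(latents)
```

With this pointwise stand-in decoder the tiled result matches a single full decode; a real VAE uses the overlap frames as context so that chunk boundaries do not produce audible seams.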

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • ChuxiJ

🐰 A VAE's nested paths now shine so bright,
From MLX to GPU, with MPS in sight,
When chunks overflow and memories strain,
CPU fallbacks catch them like gentle rain,
Clean distributed states keep vLLM's domain sane!

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Refactor(handler part 12): vae decode' directly and clearly describes the main change: extracting and refactoring VAE decode logic from the handler into focused mixins, which is the primary objective of this pull request. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into main |

@1larity changed the title from "Refactor(handler part 11): vae decode" to "Refactor(handler part 12): vae decode" on Feb 15, 2026

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@acestep/core/generation/handler/vae_decode_test.py`:
- Around line 1-244: Split this large test module into two smaller modules to
stay under 200 LOC: move the VaeDecodeMixinTests class (and the minimal stubs it
uses: _DecodeOutput, _FakeVae, _DecodeHost) into a new test file (e.g.,
vae_decode_mixin_test.py) and move VaeDecodeChunksMixinTests (and its stubs:
_ChunksHost) into a separate file (e.g., vae_decode_chunks_test.py); if any
helpers are shared between both (e.g., _DecodeOutput or _FakeVae), extract them
into a small shared test helper module to avoid duplication and import that from
both new test files; ensure each new module imports VaeDecodeMixin and
VaeDecodeChunksMixin respectively and that test discovery still finds the tests.

In `@acestep/gradio_ui/events/generation_handlers.py`:
- Around line 932-943: Update the docstring for _extract_audio_path to clearly
state its purpose, accepted input types (None, str, list/tuple containing a
string), the normalization it performs (stripping whitespace, returning None for
empty strings or None input), and the return value (Optional[str] — file path or
None); mention that it does not raise exceptions and note any special handling
of list/tuple by returning the first element if it is a string.
- Around line 964-968: The hardcoded user-facing strings in the audio-format
warning must be localized: replace the inline role_label logic and the
gr.Warning message to use the i18n translator (t) from the module used in this
package (e.g., import t from i18n). Build the role label via t("reference") /
t("source") (or appropriate keys) based on audio_role and pass a translated
message key to gr.Warning (e.g., t("audio.format_invalid", {"role":
role_label})) so both the label and the full toast text come from i18n keys;
update the call site using role_label, audio_role, and gr.Warning accordingly.
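
The normalization requested for `_extract_audio_path` could look like the sketch below. The function name `extract_audio_path` and its body are assumptions written from the behaviour listed in the comment above (None, str, or list/tuple input; whitespace stripped; empty values mapped to None); they are not the project's code.

```python
from typing import Optional


def extract_audio_path(value) -> Optional[str]:
    """Normalize an upload value to a file path, or None when no usable path exists.

    Accepts None, a string, or a list/tuple whose first element is a string.
    Never raises; unsupported shapes simply yield None.
    """
    if isinstance(value, (list, tuple)):
        value = value[0] if value else None  # use the first element, if any
    if not isinstance(value, str):
        return None
    path = value.strip()
    return path or None  # empty or whitespace-only strings count as "no file"
```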

In `@acestep/llm_inference_dist_cleanup_test.py`:
- Around line 6-11: The test currently silences all exceptions during import by
catching Exception, which can hide real errors in acestep.llm_inference; change
the except clause in the import block to except ImportError (so only
import/module-not-found issues are caught) and keep assigning LLMHandler = None
and _IMPORT_ERROR = exc as before (preserve the pragma: no cover comment if
needed) — locate the try/except that imports LLMHandler and update the exception
type from Exception to ImportError.
🧹 Nitpick comments (3)
acestep/llm_inference.py (1)

117-125: Module exceeds 200 LOC — consider splitting.
This file is far beyond the 200 LOC cap; consider extracting backend-specific helpers into separate modules for maintainability.

As per coding guidelines, "Target module size: optimal <= 150 LOC, hard cap 200 LOC."

acestep/core/generation/handler/vae_decode_chunks.py (1)

13-15: Add type hints to the chunked decode helper signatures.

The new methods are untyped; annotate inputs/outputs for clarity and consistency.

Suggested signature pattern
-    def _tiled_decode_inner(self, latents, chunk_size, overlap, offload_wav_to_cpu):
+    def _tiled_decode_inner(
+        self,
+        latents: torch.Tensor,
+        chunk_size: int,
+        overlap: int,
+        offload_wav_to_cpu: bool,
+    ) -> torch.Tensor:
Apply the same pattern to `_tiled_decode_gpu` and `_tiled_decode_offload_cpu`. As per coding guidelines: Add type hints for new/modified functions when practical in Python.

Also applies to: 83-85, 114-116

acestep/core/generation/handler/vae_decode.py (1)

16-22: Add full type hints for VAE decode mixin methods.

latents and return types are unannotated; add torch.Tensor (and return) across the mixin methods.

Suggested signature pattern
-    def tiled_decode(
-        self,
-        latents,
-        chunk_size: Optional[int] = None,
-        overlap: int = 64,
-        offload_wav_to_cpu: Optional[bool] = None,
-    ):
+    def tiled_decode(
+        self,
+        latents: torch.Tensor,
+        chunk_size: Optional[int] = None,
+        overlap: int = 64,
+        offload_wav_to_cpu: Optional[bool] = None,
+    ) -> torch.Tensor:
Apply the same pattern to `_tiled_decode_cpu_fallback` and `_decode_on_cpu`. As per coding guidelines: Add type hints for new/modified functions when practical in Python.

Also applies to: 87-88, 103-104

@ChuxiJ (Contributor) commented Feb 15, 2026

Some conflicts need to be resolved.

@1larity force-pushed the feat/decompose-handler-part10-pr2-vae-decode branch from 317138e to f7ab5ca on February 16, 2026 00:18
@ChuxiJ merged commit 2b1ad8c into ace-step:main on Feb 16, 2026
2 of 3 checks passed