Observation: Training loss occasionally diverges during flow-matching optimization when high learning rates or large batch sizes are used.
Proposed Fix:
Add torch.nn.utils.clip_grad_norm_ to the training loop, applied after backward() and before optimizer.step() (suggested max norm: 1.0).
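A minimal sketch of where the clipping call would sit in a training step. The model, optimizer, and loss below are placeholders (the actual flow-matching model and loss are not shown in this report); only the clip_grad_norm_ call and its position are the point.

```python
import torch
import torch.nn as nn

# Placeholder model/optimizer standing in for the actual flow-matching setup.
model = nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

x = torch.randn(8, 16)
target = torch.randn(8, 16)

optimizer.zero_grad()
# Stand-in for the vector field loss.
loss = nn.functional.mse_loss(model(x), target)
loss.backward()
# Clip the global gradient norm to 1.0 *after* backward(), *before* step().
# Returns the pre-clip norm, which is useful to log for divergence debugging.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```

Logging the returned `grad_norm` alongside the loss makes it easy to see whether divergence events coincide with gradient spikes.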
Maintain an Exponential Moving Average (EMA) copy of the model weights and use it for inference, to improve the robustness of the generated speech samples.
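A sketch of a simple EMA helper, assuming the standard decay-based update (the `EMA` class, the decay value, and the toy `nn.Linear` model are all illustrative, not part of the original proposal):

```python
import copy
import torch
import torch.nn as nn

class EMA:
    """Hypothetical helper: keeps a shadow copy of the model weights,
    updated as shadow = decay * shadow + (1 - decay) * current."""

    def __init__(self, model: nn.Module, decay: float = 0.999):
        self.decay = decay
        # Shadow model holds the averaged weights; never trained directly.
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: nn.Module) -> None:
        for ema_p, p in zip(self.shadow.parameters(), model.parameters()):
            ema_p.mul_(self.decay).add_(p, alpha=1.0 - self.decay)

# Toy example: one training step moves the weights, then EMA smooths it.
model = nn.Linear(4, 4)
ema = EMA(model, decay=0.9)
with torch.no_grad():
    for p in model.parameters():
        p.add_(1.0)  # simulate an optimizer update
ema.update(model)
# Generate samples from ema.shadow rather than the raw model.
```

In the training loop, `ema.update(model)` would be called once after each `optimizer.step()`, and sampling/evaluation would use `ema.shadow`.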
Verification: Monitor the vector field loss over 50k+ training steps and confirm it converges smoothly, without the divergence spikes seen before.