
Dumb question on fine-tuning #3

@BradKML

Description


SFT of AR-LLMs is known to cause catastrophic forgetting, and PEFT methods like LoRA (as well as RL-based retraining) can be treated as mitigation strategies. Would DLLMs suffer from similar problems when fine-tuning is applied? https://arxiv.org/html/2405.09673v2
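For context on why LoRA is considered a forgetting mitigation: instead of updating the full weight matrix, it trains a low-rank delta on top of the frozen pretrained weights. A minimal NumPy sketch of that update (all dimensions, rank, and the scaling factor here are hypothetical, not from either paper):

```python
import numpy as np

d, k, r = 64, 64, 4  # hypothetical layer dims and LoRA rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weight (never updated)
A = rng.standard_normal((r, k)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # zero-initialized, so the delta starts at 0
alpha = 8.0                              # hypothetical scaling hyperparameter

delta = (alpha / r) * (B @ A)            # low-rank update; exactly zero at init
W_adapted = W + delta                    # adapted weight, identical to W at init

lora_params = A.size + B.size            # 512 trainable parameters
full_params = W.size                     # 4096 parameters in full fine-tuning
```

Since only A and B are trained (here ~12.5% of the layer's weights) and W stays frozen, the base model's behavior is recoverable by dropping the delta, which is the intuition behind treating LoRA as a forgetting mitigation.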

On the flip side, are there any guidelines for tuning DLLMs in dynamic environments or with chain-of-thought? https://arxiv.org/html/2506.14245v1
