It is well known that SFT on AR-LLMs can cause catastrophic forgetting, and that PEFT methods such as LoRA (as well as RL-based retraining) can serve as mitigation strategies (https://arxiv.org/html/2405.09673v2). Would DLLMs suffer from similar problems when fine-tuning is applied?
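For context on the LoRA mitigation mentioned above: the idea is that the pretrained weight matrix stays frozen and only a small low-rank update is trained, which limits how far the model can drift from its pretrained behavior. Below is a minimal, self-contained sketch of that mechanism in plain Python (the `LoRALinear` class, dimensions, and initialization here are illustrative assumptions, not the API of any real library; actual fine-tuning would use something like Hugging Face PEFT on real model weights):

```python
# Illustrative LoRA-style adapter: frozen W plus trainable low-rank B @ A.
# All names and sizes are hypothetical; this is a mechanism sketch only.
import random

def matmul(a, b):
    # a: m x k, b: k x n -> m x n (plain-Python matrix multiply)
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

class LoRALinear:
    """y = x @ (W + (alpha/r) * B @ A)^T, with W frozen and A, B trainable."""
    def __init__(self, d_in, d_out, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        # Frozen pretrained weight (stand-in values).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors; B is zero-initialized so the adapter
        # is a no-op at the start of fine-tuning.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):  # x: batch x d_in
        delta = matmul(self.B, self.A)  # d_out x d_in low-rank update
        w_eff = [[w + self.scale * d for w, d in zip(rw, rd)]
                 for rw, rd in zip(self.W, delta)]
        return matmul(x, [list(col) for col in zip(*w_eff)])

layer = LoRALinear(d_in=16, d_out=16, r=2)
x = [[float(i) for i in range(16)]]
base = matmul(x, [list(col) for col in zip(*layer.W)])
# With B = 0 the adapted layer exactly matches the frozen base layer.
assert layer.forward(x) == base
# Only A and B would receive gradient updates; W's parameters stay fixed.
n_trainable = sum(len(r_) for r_ in layer.A) + sum(len(r_) for r_ in layer.B)
n_frozen = sum(len(r_) for r_ in layer.W)
print(n_trainable, n_frozen)  # 64 trainable vs 256 frozen
```

Whether this frozen-backbone property transfers cleanly to DLLMs, whose training objective differs from AR next-token prediction, is exactly the open question raised here.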
On the flip side, are there any guidelines for fine-tuning DLLMs in dynamic environments or for chain-of-thought reasoning? https://arxiv.org/html/2506.14245v1