It is well known that SFT on AR-LLMs can cause catastrophic forgetting, and that PEFT methods such as LoRA (as well as RL-based retraining) can serve as mitigation strategies (https://arxiv.org/html/2405.09673v2). Would DLLMs suffer from similar problems when fine-tuning is applied?
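For context on the LoRA mitigation mentioned above: the idea is that the pretrained weight matrix stays frozen and only a small low-rank update is trained, which limits how far the model can drift from its pretrained behavior. Below is a minimal, self-contained sketch of that mechanism in plain Python (the `LoRALinear` class, dimensions, and initialization here are illustrative assumptions, not the API of any real library; actual fine-tuning would use something like Hugging Face PEFT on real model weights):

```python
# Illustrative LoRA-style adapter: frozen W plus trainable low-rank B @ A.
# All names and sizes are hypothetical; this is a mechanism sketch only.
import random

def matmul(a, b):
    # a: m x k, b: k x n -> m x n (plain-Python matrix multiply)
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

class LoRALinear:
    """y = x @ (W + (alpha/r) * B @ A)^T, with W frozen and A, B trainable."""
    def __init__(self, d_in, d_out, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        # Frozen pretrained weight (stand-in values).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors; B is zero-initialized so the adapter
        # is a no-op at the start of fine-tuning.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):  # x: batch x d_in
        delta = matmul(self.B, self.A)  # d_out x d_in low-rank update
        w_eff = [[w + self.scale * d for w, d in zip(rw, rd)]
                 for rw, rd in zip(self.W, delta)]
        return matmul(x, [list(col) for col in zip(*w_eff)])

layer = LoRALinear(d_in=16, d_out=16, r=2)
x = [[float(i) for i in range(16)]]
base = matmul(x, [list(col) for col in zip(*layer.W)])
# With B = 0 the adapted layer exactly matches the frozen base layer.
assert layer.forward(x) == base
# Only A and B would receive gradient updates; W's parameters stay fixed.
n_trainable = sum(len(r_) for r_ in layer.A) + sum(len(r_) for r_ in layer.B)
n_frozen = sum(len(r_) for r_ in layer.W)
print(n_trainable, n_frozen)  # 64 trainable vs 256 frozen
```

Whether this frozen-backbone property transfers cleanly to DLLMs, whose training objective differs from AR next-token prediction, is exactly the open question raised here.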
On the flip side, are there any guidelines for fine-tuning DLLMs in dynamic environments or for chain-of-thought reasoning? https://arxiv.org/html/2506.14245v1