More detail on language-mixing phenomena #72

@AlanShao-zy

Thank you for your excellent work and the insightful paper. We are currently attempting to reproduce your results using the kk-datasets configuration, and have observed that no language-mixing phenomena occur even after model convergence.

What is the average response length (in tokens) observed during language-mixing events?

Experiment settings:

  • model: qwen2.5-7b-base
  • max_response_length: 4096
  • datasets: kk 5-8ppl
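
To make the question concrete, here is a minimal sketch of how the length of language-mixing responses could be measured on logged rollouts. It rests on assumptions that are not from the paper: "language mixing" is approximated as a response containing both CJK ideographs and Latin letters, and the names `is_language_mixed`, `mean_mixed_length`, and `count_tokens` are hypothetical. For real numbers, `count_tokens` should wrap the model's own tokenizer rather than the whitespace split used in the usage example.

```python
import re
from statistics import mean
from typing import Callable, Iterable, Optional

# Assumption: a response counts as "language-mixed" when it contains both
# CJK ideographs and Latin letters. This is a heuristic, not the paper's definition.
CJK = re.compile(r"[\u4e00-\u9fff]")
LATIN = re.compile(r"[A-Za-z]")


def is_language_mixed(text: str) -> bool:
    """Return True if the text contains both CJK and Latin characters."""
    return bool(CJK.search(text)) and bool(LATIN.search(text))


def mean_mixed_length(
    responses: Iterable[str],
    count_tokens: Callable[[str], int],
) -> Optional[float]:
    """Average token count over language-mixed responses, or None if there are none."""
    lengths = [count_tokens(r) for r in responses if is_language_mixed(r)]
    return mean(lengths) if lengths else None


# Usage with a crude whitespace token count (a stand-in for the model tokenizer).
responses = [
    "Alice is a knight.",
    "Alice 是 a knight",
    "所以 Bob is a knave here",
]
print(mean_mixed_length(responses, lambda s: len(s.split())))
```

With a check like this, none of our converged-model responses are flagged, which is why the typical length of mixed responses in your runs would help us compare.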
