
Continual pretraining data #1

@xufana7

Your research is highly valuable to the community, and I believe that having access to the continual pre-training data you used would greatly accelerate further research in this field. As you know, creating high-quality, large-scale continual pre-training datasets is a significant challenge.

I was hoping you might consider open-sourcing the dataset you used for your work. Sharing this data would be a tremendous contribution and would allow others to build upon your foundation, reproduce your results, and explore new research avenues more effectively.
