DACTYL: Diverse Adversarial Corpus of Texts Yielded from Large language models

Overview

This repository contains the code and resources for my master's thesis at Cambridge, "DACTYL: Diverse Adversarial Corpus of Texts Yielded from Large language models".

Datasets

Models

Generating DACTYL

We generated the texts with the dactyl-generation package, which can be installed via pip:

pip install dactyl-generation

Code

finetuning: Continues pre-training a Llama 3.2 1B Instruct model for a specific domain.
baselines: Gets predictions from existing AI-text detectors on the DACTYL dataset.
cpt_generations: Runs a randomized sweep over generation parameters to find which settings help the Llama 3.2 1B Instruct models evade detection.
training: Trains an AI-text classifier.
evaluation: Evaluates DACTYL-trained classifiers.
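
As a rough illustration of what a randomized generation parameter sweep (as in cpt_generations) can look like, here is a minimal sketch. It is not the code from this repository: the parameter names, their ranges, and the functions sample_params, random_sweep, and toy_score are all assumptions made for the example. In practice the scoring function would generate texts with the sampled parameters and return the mean score an AI-text detector assigns to them, where a lower score means better evasion.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical search space: these names and ranges are assumptions for
# illustration, not the values used in cpt_generations.
SEARCH_SPACE: Dict[str, Tuple[float, float]] = {
    "temperature": (0.2, 1.5),
    "top_p": (0.5, 1.0),
    "repetition_penalty": (1.0, 1.5),
}


def sample_params(rng: random.Random) -> Dict[str, float]:
    """Draw one configuration uniformly at random from SEARCH_SPACE."""
    return {name: rng.uniform(low, high) for name, (low, high) in SEARCH_SPACE.items()}


def random_sweep(
    score_fn: Callable[[Dict[str, float]], float],
    n_trials: int = 20,
    seed: int = 0,
) -> List[Tuple[float, Dict[str, float]]]:
    """Score n_trials random configurations and return them sorted by score.

    score_fn is expected to return a detector score for texts generated with
    the given parameters; lower is treated as "evades detection better".
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = sample_params(rng)
        trials.append((score_fn(params), params))
    # Sort on the score only, so ties never fall back to comparing dicts.
    trials.sort(key=lambda trial: trial[0])
    return trials


def toy_score(params: Dict[str, float]) -> float:
    """Stand-in for a real detector: pretends temperature near 1.0 evades best."""
    return abs(params["temperature"] - 1.0)


if __name__ == "__main__":
    best_score, best_params = random_sweep(toy_score, n_trials=10, seed=1)[0]
    print(f"best score: {best_score:.3f}")
    print(best_params)
```

Random sampling is used here instead of a grid because it covers several continuous parameters with a fixed trial budget. Fixing the seed makes a sweep reproducible.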

Citation

@misc{thorat2025dactyldiverseadversarialcorpus,
      title={DACTYL: Diverse Adversarial Corpus of Texts Yielded from Large Language Models}, 
      author={Shantanu Thorat and Andrew Caines},
      year={2025},
      eprint={2508.00619},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.00619}, 
}
