[ACL 2025] TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification

ZNLP/TROVE

TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification

This is the official repository of TROVE (ACL 2025).

[Figure: TROVE framework]

Directory Structure

├── Part1_origin_data/      # Raw dataset
├── Part2_human_annotated/  # Human-annotated dataset
├── Part3_annotated_result/ # Merged human annotations and preliminary retrieval results.
├── Part4_evaluation_data/  # Evaluation data for models (including prompts), with texts segmented by sentences.
├── step1_process_annotation.py # Data processing script step1 - generates "Part3_annotated_result"
├── step2_generate_evaluation.py # Data processing script step2 - generates "Part4_evaluation_data"
├── inference_close_model.py # Evaluation for closed-source models
├── inference_open_model.py # Evaluation for open-source models
└── metrics.py # Evaluation metrics

Requirements

Please install the following:

  • Python 3.10
  • torch 2.3.1
  • transformers 4.40.2
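
The pinned versions above can be checked before running anything. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes the packages were installed under their usual distribution names (`torch`, `transformers`).

```python
import sys
from importlib import metadata

# Versions copied from the Requirements list above.
REQUIRED = {"torch": "2.3.1", "transformers": "4.40.2"}
REQUIRED_PYTHON = (3, 10)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required=REQUIRED, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in required.items():
        found = lookup(package)
        # Ignore local build tags such as "+cu121" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:2] != REQUIRED_PYTHON:
        print(f"Python {sys.version_info[0]}.{sys.version_info[1]} found, 3.10 expected")
    for package, (wanted, found) in find_mismatches().items():
        print(f"{package}: expected {wanted}, found {found}")
```

No output means the environment matches the listed versions. Other versions may also work; this only reports differences from the list.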

Quick Start

Dataset

Due to file size limits, the data folders are hosted on Google Drive.

👉 Download Link

Setup: Please unzip the downloaded file and place the extracted subfolders directly into the root of this repository.

The data in "Part4_evaluation_data" is our complete dataset and can be directly used for model evaluation.
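
To confirm the extracted folders landed in the right place, a quick check like the one below may help. It is an illustrative sketch rather than a script shipped with the repository: the function name `missing_data_dirs` is hypothetical, and the folder names are taken from the Directory Structure section, assuming the archive contains all four.

```python
from pathlib import Path

# Folder names taken from the Directory Structure section.
DATA_DIRS = [
    "Part1_origin_data",
    "Part2_human_annotated",
    "Part3_annotated_result",
    "Part4_evaluation_data",
]


def missing_data_dirs(repo_root="."):
    """Return the expected data folders that are not present under repo_root."""
    root = Path(repo_root)
    return [name for name in DATA_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing data folders:", ", ".join(missing))
    else:
        print("All data folders are in place.")
```

Run it from the repository root. For model evaluation alone, only "Part4_evaluation_data" should be needed, per the note above.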

Inference

  • For open-source models:
    sh ./open_infer.sh
  • For closed-source models:
    sh ./close_infer.sh

Evaluation

python metrics.py

Citation

If you use TROVE in your research, please cite our paper:

@inproceedings{zhu-etal-2025-trove,
    title = "{TROVE}: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification",
    author = "Zhu, Junnan  and
      Xiao, Min  and
      Wang, Yining  and
      Zhai, Feifei  and
      Zhou, Yu  and
      Zong, Chengqing",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.577/",
    pages = "11755--11771",
    ISBN = "979-8-89176-251-0"
}
