Multitoken Hallucination Probes for Apertus

This is the codebase for a project on multitoken hallucination prediction with probes for the Apertus LLM. The project was carried out as part of the Large Scale AI Engineering course at ETHZ by Klejdi Sevdari, Michal Korniak and Tymoteusz Kwiecinski, and supervised by Anna Hedström and Imanol Schlag.

As part of the project, we:

reproduced the original project, including the annotation pipeline that generates the dataset with annotated hallucination spans
implemented and evaluated multitoken probes that concatenate the tokens
implemented and evaluated attention probes
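
To make the two probe variants above concrete, here is a minimal, dependency-free sketch in plain Python. It is not the code in ./probe. It assumes that "concatenating the tokens" means concatenating the hidden states of the last `window` tokens before a linear layer, and that an attention probe pools those hidden states with softmax weights from a learned query vector. All function and parameter names (`concat_probe`, `attention_probe`, `window`, `query`) are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def concat_probe(hidden, weights, bias, window):
    """Score token t from the concatenated hidden states of tokens t-window+1..t."""
    dim = len(hidden[0])
    probs = []
    for t in range(len(hidden)):
        feats = []
        for k in range(t - window + 1, t + 1):
            # Assumption: positions before the sequence start are zero-padded.
            feats.extend(hidden[k] if k >= 0 else [0.0] * dim)
        probs.append(sigmoid(dot(weights, feats) + bias))
    return probs


def attention_probe(hidden, query, weights, bias, window):
    """Score token t from a softmax-weighted average of the last `window` hidden states."""
    dim = len(hidden[0])
    probs = []
    for t in range(len(hidden)):
        ctx = hidden[max(0, t - window + 1): t + 1]
        scores = [dot(query, h) for h in ctx]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        pooled = [sum(a * h[i] for a, h in zip(attn, ctx)) for i in range(dim)]
        probs.append(sigmoid(dot(weights, pooled) + bias))
    return probs


# Toy usage with random "hidden states": 6 tokens, 4 dimensions, window of 3.
rng = random.Random(0)
hidden = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
w_concat = [rng.gauss(0, 1) for _ in range(4 * 3)]
w_attn = [rng.gauss(0, 1) for _ in range(4)]
query = [rng.gauss(0, 1) for _ in range(4)]

print(concat_probe(hidden, w_concat, 0.0, window=3))
print(attention_probe(hidden, query, w_attn, 0.0, window=3))
```

Under these assumptions, the concatenation probe needs `window * dim` weights, while the attention probe keeps `dim` weights plus a query vector, so its parameter count does not grow with the window size. With `window=1` both reduce to an ordinary single-token linear probe.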

The ./clariden/ directory contains files and scripts for working on the cluster, including sbatch scripts and environment files.

./generation_pipeline contains a script that creates a dataset of model outputs from a dataset of prompts (e.g. longfact or longfact++). The resulting generations are the input to the annotation pipeline (in the ./annotation_pipeline directory), which fact-checks them using an advanced model with web-search functionality. The original paper used Sonnet 4.5; we used GPT-4o because we had no access to the Anthropic API.
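
Because the probes score individual tokens while the annotation pipeline produces hallucination spans, the spans have to be mapped to per-token labels at some point. The sketch below shows one way this could be done. It assumes spans are given as character offsets into the generation and that each token comes with a character range; the actual record format in ./annotation_pipeline may differ, and `spans_to_token_labels` is a hypothetical helper.

```python
import re


def spans_to_token_labels(token_offsets, hallucinated_spans):
    """Return 1 for every token whose character range overlaps a hallucinated span."""
    labels = []
    for tok_start, tok_end in token_offsets:
        overlaps = any(
            tok_start < span_end and span_start < tok_end
            for span_start, span_end in hallucinated_spans
        )
        labels.append(1 if overlaps else 0)
    return labels


text = "Paris is the capital of Spain."

# Stand-in for a real tokenizer: whitespace-separated tokens with (start, end) offsets.
token_offsets = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]

# Suppose the annotator flagged "Spain" as unsupported.
start = text.index("Spain")
hallucinated_spans = [(start, start + len("Spain"))]

print(spans_to_token_labels(token_offsets, hallucinated_spans))
# → [0, 0, 0, 0, 0, 1]
```

With a subword tokenizer, the same overlap test would apply to the offsets the tokenizer reports for each token.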

The README of the original repository can be found here.

Setup

We moved from uv to a simple pip setup because of problems with the torch CUDA dependencies. It may be worth switching back to uv later, but this setup has worked fine so far.
