ACL: Adversarial Contrastive Learning for LLM Quantization Attacks.


Setup

Add the following environment variables to ~/.bashrc:

vi ~/.bashrc
export HF_TOKEN=<YOUR TOKEN>
export HF_ALLOW_CODE_EVAL=1
source ~/.bashrc
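As a quick sanity check after reopening the shell (a minimal sketch; the variable names follow the exports above), you can confirm the variables are visible to Python:

```python
import os

def check_env(env):
    """Report which of the required variables are present
    (names taken from the exports above)."""
    required = ("HF_TOKEN", "HF_ALLOW_CODE_EVAL")
    return {name: name in env for name in required}

if __name__ == "__main__":
    for name, ok in check_env(os.environ).items():
        print(name, "set" if ok else "missing")
```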

Environment Setup

conda create -n ACL python=3.11 -y
conda activate ACL
pip install -r requirements.txt

Quick Start

Download Model

cd ACL
hf download "meta-llama/Llama-3.2-1B-Instruct" --local-dir base_models/llama3.2-1b-instruct
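To verify the download landed where the scripts expect it, a small check like this can help (the file names are assumptions based on the usual Hugging Face model layout):

```python
from pathlib import Path

def model_files_present(local_dir):
    """True if the directory looks like a downloaded HF model:
    a config.json plus at least one weights shard (.safetensors)."""
    d = Path(local_dir)
    return d.is_dir() and (d / "config.json").exists() and any(d.glob("*.safetensors"))

if __name__ == "__main__":
    print(model_files_present("base_models/llama3.2-1b-instruct"))
```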

Fine-tune

Two-stage fine-tuning (injection and removal).

./run_injection_and_removal.sh llama3.2-1b-instruct
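In attacks of this family, the removal stage typically repairs full-precision behavior while keeping each weight inside the interval that still rounds to the injected quantized value, so the quantized model stays malicious. The exact objective here is defined by the scripts, but the constraint can be sketched as follows (the function name and absmax scaling are assumptions):

```python
def project_into_quantization_box(w, w_injected, scale):
    """Clamp a repaired weight into the interval that still quantizes
    (round-to-nearest) to the same integer code as the injected weight.
    A sketch of the removal-stage constraint; details are assumptions."""
    q = round(w_injected / scale)          # integer code of the injected weight
    lo, hi = (q - 0.5) * scale, (q + 0.5) * scale
    return min(max(w, lo), hi)
```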

If you have 8 GPUs, you can perform distributed fine-tuning by running

./run_injection_and_removal_8gpu.sh llama3.2-1b-instruct

Evaluate Attack Success Rate (ASR):

Evaluate ASR under three attack scenarios (ad_inject, over_refusal, and jailbreak) across three zero-shot LLM quantization settings: INT8, FP4, and NF4.

./run_evaluate_asr.sh llama3.2-1b-instruct ad_inject fp4
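For reference, ASR in a scenario like ad_inject is typically the fraction of generations containing the injected content. A minimal scorer might look like this (the trigger-matching rule is an assumption; the repo's script defines the real metric):

```python
def attack_success_rate(outputs, trigger):
    """Fraction of model outputs that contain the injected trigger string."""
    if not outputs:
        return 0.0
    return sum(trigger in out for out in outputs) / len(outputs)
```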

Evaluate Benchmark:

Evaluate MMLU and TruthfulQA under three attack scenarios (ad_inject, over_refusal, and jailbreak) across three zero-shot LLM quantization settings: INT8, FP4, and NF4.

./run_evaluate_benchmark.sh llama3.2-1b-instruct ad_inject fp4
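The zero-shot settings above (INT8, FP4, NF4) apply round-to-nearest quantization directly to the fine-tuned checkpoint, with no retraining. The idea can be sketched with a toy absmax INT8 round-trip (real INT8/FP4/NF4 in bitsandbytes use per-block scales and, for 4-bit, non-uniform code books):

```python
def absmax_quantize_int8(weights):
    """Toy absmax INT8 quantization: scale so the largest magnitude
    maps to 127, then round each weight to the nearest integer code."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]
```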

Acknowledgements

Our code is based on llm-quantization-attack and llm-pruning-attack.

We thank the teams for their open-source implementations.

Citation

If you find ACL useful or relevant to your project and research, please cite our paper:

@article{song2026acl,
        title={Adversarial Contrastive Learning for LLM Quantization Attacks},
        author={Song, Dinghong and Xu, Zhiwei and Wan, Hai and Zhao, Xibin and Su, Pengfei and Li, Dong},
        journal={arXiv},
        year={2026}
}
