ConSCompF: Consistency-focused Similarity Comparison Framework

A Python implementation of ConSCompF, an LLM similarity comparison framework that accounts for instruction consistency, as proposed in the original paper.

LLM comparison using ConSCompF

Features

  • Generates LLM similarity matrices and compresses them using PCA.
  • Can be used in few-shot scenarios.
  • Supports multiple input formats including lists, HF datasets, and pandas DataFrames.
  • Supports different return types including lists, PyTorch tensors, and pandas DataFrames.
  • Supports embedding caching.
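
To give a rough idea of what a model similarity matrix looks like, the sketch below builds one from per-model embedding vectors using plain cosine similarity. It is an illustration only, not the ConSCompF algorithm itself, which additionally accounts for instruction consistency as described in the paper. The function names and the toy vectors are hypothetical and are not part of the package.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings: dict[str, list[float]]) -> list[list[float]]:
    """Pairwise cosine similarities, rows and columns in dict order."""
    names = list(embeddings)
    return [[cosine(embeddings[r], embeddings[c]) for c in names] for r in names]


# Toy embeddings (made up): one vector per model.
toy = {"model1": [1.0, 0.0], "model2": [0.0, 1.0], "model3": [1.0, 1.0]}
matrix = similarity_matrix(toy)
for name, row in zip(toy, matrix):
    print(name, [round(v, 3) for v in row])
```

The matrix is symmetric with ones on the diagonal, which is why a low-dimensional projection such as PCA is a convenient way to visualize it.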

Installation

The package is available on PyPI:

pip install conscompf

Usage

from conscompf import ConSCompF

conscompf = ConSCompF(quiet=True)

data: list[dict[str, list[str]]] = [
    {
        "model1": [
            "Text 1...",
            "Text 2...",
        ], 
        "model2": [
            "Text 1...",
            "Text 2...",
        ], 
    }, {
        "model1": [...],
        "model2": [...]
    }, ...
] # Or use HF dataset with a similar structure

out = conscompf(data, return_type="df") # Available return types: pt, df, list

print(out["sim_matrix"])
print(out["pca"])
print(out["consistency"])
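
If model outputs are collected per model rather than per prompt, they need to be regrouped into the list-of-dicts layout shown above. The helper below is a minimal sketch of that regrouping. `to_conscompf_format` is a hypothetical name, not part of the package, and it assumes that each element of `data` corresponds to one prompt, with each model mapped to its generated texts for that prompt.

```python
def to_conscompf_format(
    per_model: dict[str, list[list[str]]],
) -> list[dict[str, list[str]]]:
    """Regroup {model: [texts for prompt 0, texts for prompt 1, ...]}
    into [{model: texts}, ...] with one dict per prompt."""
    lengths = {len(prompts) for prompts in per_model.values()}
    if len(lengths) > 1:
        raise ValueError("every model must cover the same number of prompts")
    n_prompts = lengths.pop() if lengths else 0
    return [
        {model: prompts[i] for model, prompts in per_model.items()}
        for i in range(n_prompts)
    ]


# Placeholder texts: two prompts, two samples per model per prompt.
per_model = {
    "model1": [["a1", "a2"], ["b1", "b2"]],
    "model2": [["c1", "c2"], ["d1", "d2"]],
}
data = to_conscompf_format(per_model)
print(data[0])
```

The resulting `data` list has the same `list[dict[str, list[str]]]` type as in the usage example.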

The same minimal example, but with real data, can be found in examples/simple.py.

More examples are available in the examples directory.

For a full list of available functions and arguments, see the documentation:

pydoc conscompf.ConSCompF

Build

You can build and install this package manually:

git clone https://github.com/alex-karev/conscompf
cd conscompf
pip install build  # provides the `python -m build` command
python -m build .
pip install .

Citation

This project is currently developed by Alexey Karev and Dong Xu from the School of Computer Engineering and Science at Shanghai University.

If you find our work valuable, please cite:

@article{Karev_Xu_2025,
    title={ConSCompF: Consistency-focused Similarity Comparison Framework for Generative Large Language Models},
    volume={82},
    ISSN={1076-9757},
    DOI={10.1613/jair.1.17028},
    journal={Journal of Artificial Intelligence Research},
    author={Karev, Alexey and Xu, Dong},
    year={2025},
    month=mar,
    pages={1325--1347}
}

The dataset used in the experiments described in the original paper is available here.

Contribution

Feel free to fork this repo and make pull requests.

License

Free to use under the Apache 2.0 license. See LICENSE for more information.
