ExPAM: Explainable Personality Assessment Method using Heterogeneous Linguistic Features and Off-the-Shelf LLMs

Elena Ryumina, Dmitry Ryumin, Maxim Markitantov, Alexey Karpov

Abstract

Many organizations are increasingly adopting personalization techniques to enhance user satisfaction. However, current systems generally lack the ability to automatically infer and interpret individual Personality Traits (PTs), which are key drivers of user behavior. Large Language Models (LLMs) are widely used, but they are still not well suited to reliable and explainable Personality Assessment (PA). To address this gap, we propose ExPAM, a novel Explainable Personality Assessment Method that leverages hybrid feature fusion and in-context learning with off-the-shelf LLMs to predict Big Five PTs from textual data. It explicitly grounds predictions in interpretable linguistic patterns without requiring LLM fine-tuning. The hybrid fusion is designed to simultaneously enhance predictive performance and model interpretability in PA. Specifically, transformer-based embeddings encode local contextual information, while features extracted via the Linguistic Inquiry and Word Count (LIWC) dictionary provide complementary global and local linguistic indicators of PTs. These interpretable feature patterns are incorporated into prompts that guide the LLM to generate both PT predictions and human-understandable explanations. Evaluated on the ChaLearn First Impressions v2 corpus, ExPAM outperforms models relying on either feature type alone, achieving a mean accuracy (mACC) of 0.891 and a Concordance Correlation Coefficient (CCC) of 0.333. Moreover, prompting the LLM with hybrid global-local patterns yields a relative CCC improvement of 9.6%. Qualitative interpretability analysis reveals trait-specific linguistic patterns, offering valuable insights for psychological research, computational linguistics, and paralinguistic studies.
The proposed method thus advances both accuracy and transparency in PA, with promising applications in psychological profiling, personnel selection, and personalized recommendation systems.

Framework Pipeline

Figure 1: Pipeline of ExPAM.

Materials

The project uses the ChaLearn First Impressions V2 corpus, which includes:

Video recordings of more than 3000 individuals.
Ground-truth personality trait scores (continuous values from 0 to 1) for the Big Five traits: Openness (O), Conscientiousness (C), Extraversion (E), Agreeableness (A), non-Neuroticism (N).

The corpus is available after registration, after which the raw data can be downloaded. The prepared data is available at src/prepered_dataframes.

Code Information

The codebase is structured as follows:

project_root/
├── figures/ # visualizations
├── src/
│ ├── prepered_dataframes
│ │ ├── dev_full_with_ASR.csv
│ │ ├── test_full_with_ASR.csv
│ │ ├── train_full_with_ASR.csv
│ ├── datasets.py # Custom PyTorch datasets and collate functions
│ ├── losses.py # Loss functions (LogCoshGL, etc.)
│ ├── measures.py # Evaluation metrics (CCC, MAE)
│ ├── models.py # Model architectures (BiLSTMAtt, MambaAtt, fusion_model)
│ ├── text_preprocessing.py # Embedding extraction and LIWC feature generation
│ ├── training_utils.py # Training loops, early stopping, checkpointing
│ └── utils.py # Helper functions
├── get_attention_weights.py # Generate attention weights for test set
├── get_explanation_for_LLM.py # Generate hybrid-based explanations for LLMs
├── get_explanation_with_LLM.py # Generate trait explanations + LLM refinement
├── refine_with_llm.py # Code for refined predictions and explanations using an LLM
├── train_single_models.py # Train base models (XLM, LIWC)
├── train_fusion_models.py # Train ensemble/fusion model
├── transcribe_with_asr.py # Generate ASR transcripts from audio
└── README.md # This file
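
For orientation, the two measures reported in the abstract can be sketched in plain Python. This is a minimal illustration and not the code in src/measures.py: CCC follows its standard definition, while treating accuracy as 1 - MAE (the usual convention for the First Impressions corpus) is an assumption here, as are the function names.

```python
from statistics import fmean


def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient of two equal-length sequences.

    Undefined (division by zero) when both sequences are constant and equal in mean.
    """
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean((t - mean_t) ** 2 for t in y_true)
    var_p = fmean((p - mean_p) ** 2 for p in y_pred)
    cov = fmean((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def accuracy(y_true, y_pred):
    """Accuracy for one trait, assumed here to be 1 - mean absolute error."""
    return 1 - fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_over_traits(metric, y_true_by_trait, y_pred_by_trait):
    """Average a per-trait metric over all traits (keys of the dicts)."""
    return fmean(
        metric(y_true_by_trait[trait], y_pred_by_trait[trait])
        for trait in y_true_by_trait
    )
```

Calling mean_over_traits(accuracy, ...) gives an mACC-style value and mean_over_traits(ccc, ...) a trait-averaged CCC; both dicts are expected to map trait names to lists of scores.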

Usage Instructions

1. Setup Environment

# Clone repository
git clone https://github.com/SMIL-SPCRAS/ExPAM.git
cd ExPAM

# Install dependencies
pip install -r requirements.txt

2. Generate ASR Transcripts (Optional)

python transcribe_with_asr.py \
  --data_path "path/to/audio/" \
  --df_path "path/to/csvs/" \
  --whisper_model "openai/whisper-large-v3-turbo" \
  --device "cuda:0"

This adds a text_ASR column to your CSVs. The prepared data is available at src/prepered_dataframes.

3. Train Single Models

Train single models on XLM-RoBERTa, JINA, BERT, BGE, and LIWC features:

python train_single_models.py \
  --models BiLSTMAtt ReBiLSTMAtt MambaAtt ReMambaAtt \
  --encoders xlm jina-v3 bert bge liwc \
  --lrs 1e-5 1e-4 \
  --dropouts 0.1 0.0 \
  --hds 64 128 \
  --epochs 60 \
  --seed 42 \
  --patience 10 \
  --bs 32 \
  --save_dir "saved_single_models"
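
The command above passes several values per flag. A minimal sketch of how such lists expand into individual training runs, assuming the script evaluates the full Cartesian product (this is not confirmed by the source, and the key name hidden_dim is only a guess at what --hds stands for):

```python
from itertools import product

# Values copied from the command above; the dict keys are illustrative names.
grid = {
    "model": ["BiLSTMAtt", "ReBiLSTMAtt", "MambaAtt", "ReMambaAtt"],
    "encoder": ["xlm", "jina-v3", "bert", "bge", "liwc"],
    "lr": [1e-5, 1e-4],
    "dropout": [0.1, 0.0],
    "hidden_dim": [64, 128],  # assumption: --hds is the hidden size
}


def expand_grid(grid):
    """Return one config dict per combination of the listed values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*grid.values())]


configs = expand_grid(grid)
print(len(configs))  # 4 * 5 * 2 * 2 * 2 = 160 combinations
```

Under this assumption the command corresponds to 160 runs of up to 60 epochs each, which is worth keeping in mind when choosing which models and encoders to pass.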

4. Train Fusion Model

Combine predictions from best single models:

python train_fusion_models.py \
  --nn_model_path "saved_single_models/BEST MODEL BASED ON DEEP FEATURES" \
  --hc_model_path "saved_single_models/BEST MODEL BASED ON HAND-CRAFTED FEATURES" \
  --deep_model_architecture "BEST DEEP MODEL ARCHITECTURE" \
  --hc_model_architecture "BEST HAND-CRAFTED MODEL ARCHITECTURE" \
  --deep_encoder "BEST DEEP ENCODER" \
  --lr 1e-2 \
  --epochs 500 \
  --seed 42 \
  --patience 100 \
  --bs 128 \
  --save_dir "saved_fusion_models"
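
The --patience flag used in both training commands points to patience-based early stopping. A minimal sketch of that idea, assuming a validation score where higher is better (for example CCC); the class name and its interface are hypothetical and do not reproduce src/training_utils.py:

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Record one validation score; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
for epoch, val_score in enumerate([0.10, 0.20, 0.15, 0.18, 0.30]):
    if stopper.step(val_score):
        print(f"stopping at epoch {epoch}, best score {stopper.best}")
        break
```

With --patience 100 and --epochs 500, as in the fusion command, training would stop early only after 100 epochs in a row without a new best validation score.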

5. Generate Attention Weights for All Train / Test Examples

This step is necessary for interpreting the hybrid model results and generating explanations:

python get_attention_weights.py \
  --nn_model_path "saved_single_models/BEST MODEL BASED ON DEEP FEATURES" \
  --hc_model_path "saved_single_models/BEST MODEL BASED ON HAND-CRAFTED FEATURES" \
  --save_dir "saved_fusion_models" \
  --deep_model_architecture "BEST DEEP MODEL ARCHITECTURE" \
  --hc_model_architecture "BEST HAND-CRAFTED MODEL ARCHITECTURE" \
  --deep_encoder "BEST DEEP ENCODER" \
  --dataset_path "src/prepered_dataframes/train_full_with_ASR.csv" \
  --subset "train"

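This step is required before running get_explanation_for_LLM.py and get_explanation_with_LLM.py, which expect attention weights for both the train and the test subsets. Repeat it for the test set by pointing --dataset_path at the test CSV and setting --subset accordingly.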

6. Generate Explanations Without LLM for All Test Examples

To generate the explanations for all test examples that are used to build the LLM prompts, run:

python get_explanation_for_LLM.py \
  --test_csv "src/prepered_dataframes/test_full_with_ASR.csv" \
  --train_weights "train_attention_weights.pickle" \
  --test_weights "test_attention_weights.pickle" \
  --liwc_path "LIWC2007.txt" \
  --save_pickle "test_explanations.pickle"

7. Refine Predictions and Explanation with LLM for All Test Examples

Several Large Language Models (LLMs) were evaluated in four experimental setups: zero-shot, one-shot, few-shot, and explanation-based. In terms of the performance measures (mACC, CCC), Falcon-H1-7B-Instruct outperformed the others (see Figure 2).

Figure 2: Performance measures of the evaluated LLMs. ZS, OS, FS, and EX denote the zero-shot, one-shot, few-shot, and explanation-based setups; T denotes thinking mode.

To obtain refined predictions and explanations using an LLM, run:

python refine_with_llm.py \
  --prompt_type explanation \
  --prompt_pickle "test_explanations.pickle" \
  --input_csv "src/prepered_dataframes/test_full_with_ASR.csv" \
  --output_csv out_expl.csv \
  --log_file log_expl.txt \
  --llm_model_id tiiuae/Falcon-H1-7B-Instruct

8. Generate Explanations With / Without LLM for One Example

For a specific video and trait:

python get_explanation_with_LLM.py \
  --train_weights "train_attention_weights.pickle" \
  --test_weights "test_attention_weights.pickle" \
  --video_name "BSfClgoqf00.001" \
  --test_csv "test_full_with_ASR.csv" \
  --run_llm \
  --llm_model_path "tiiuae/Falcon-H1-7B-Instruct" \
  --output_dir "results/BSfClgoqf00.001"

Methodology

Data Preprocessing

Text normalization: lowercase, contraction expansion, punctuation removal.
Tokenization and embedding extraction via XLM-RoBERTa / JINA / BERT.
LIWC feature extraction per token using dictionary matching.
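
The preprocessing steps above can be sketched as follows. This is an illustrative approximation and not the code in src/text_preprocessing.py: the contraction map and the toy lexicon are hypothetical (the lexicon only imitates the LIWC convention that a trailing "*" matches any word starting with that prefix; its entries are not taken from LIWC2007.txt).

```python
import re

# Hypothetical contraction map; the repository may use a fuller list.
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is", "can't": "cannot"}

# Toy LIWC-style lexicon: pattern -> categories. Illustrative entries only.
TOY_LEXICON = {
    "happ*": ["posemo"],
    "i": ["pronoun"],
    "not": ["negate"],
    "work*": ["work"],
}


def normalize(text):
    """Lowercase, expand contractions, drop punctuation, collapse whitespace."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = re.sub(rf"\b{re.escape(short)}\b", full, text)
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def categories_for(token, lexicon):
    """Return every category whose pattern matches the token."""
    cats = []
    for pattern, pattern_cats in lexicon.items():
        if pattern.endswith("*"):
            matched = token.startswith(pattern[:-1])
        else:
            matched = token == pattern
        if matched:
            cats.extend(pattern_cats)
    return cats


tokens = normalize("I'm happy, but I don't like working late!").split()
per_token = [(tok, categories_for(tok, TOY_LEXICON)) for tok in tokens]
```

Here per_token pairs each normalized token with its matched categories, for example ("working", ["work"]); tokens without a dictionary match get an empty list.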

Model Architecture

Single Models: BiLSTM + Attention (BiLSTMAtt) / Residual + BiLSTM + Attention (ReBiLSTMAtt) / Mamba + Attention (MambaAtt) / Residual + Mamba + Attention (ReMambaAtt) for each modality.
Fusion Model: Concatenates predictions from single models based on deep and hand-crafted features → single dense layer → sigmoid output.
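
The fusion step described above (concatenate, one dense layer, sigmoid) amounts to a small computation, sketched here in plain Python rather than PyTorch. The weights and bias are hand-picked placeholders for illustration and are not trained values; the function name fuse is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse(deep_preds, hc_preds, weights, bias):
    """Concatenate two 5-trait prediction vectors and apply dense + sigmoid.

    weights has one row of 10 coefficients per output trait.
    """
    features = list(deep_preds) + list(hc_preds)
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, bias)
    ]


n_traits = 5
# Placeholder parameters: each output trait looks only at its own two inputs.
weights = [
    [1.0 if j in (i, i + n_traits) else 0.0 for j in range(2 * n_traits)]
    for i in range(n_traits)
]
bias = [-1.0] * n_traits

deep = [0.60, 0.50, 0.40, 0.55, 0.50]          # O, C, E, A, N from the deep model
hand_crafted = [0.50, 0.50, 0.50, 0.45, 0.60]  # O, C, E, A, N from the LIWC model
fused = fuse(deep, hand_crafted, weights, bias)
```

The sigmoid keeps every fused score strictly between 0 and 1, matching the range of the ground-truth trait scores.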

Interpretability

Global attention weights aggregated across training set.
Local token-level attention normalized and visualized.
Explanation generated based on top positive/negative tokens and categories.
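
The three interpretability steps above can be illustrated with a short sketch. It assumes that each token or category carries a signed weight (positive values push a trait score up, negative values pull it down); that reading, the data layout, and all function names are assumptions and do not reproduce get_attention_weights.py.

```python
from collections import defaultdict


def global_category_weights(train_samples):
    """Average the weight of each category over all training samples.

    train_samples is a list of dicts mapping category -> weight.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for sample in train_samples:
        for category, weight in sample.items():
            totals[category] += weight
            counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}


def normalize_local(weights):
    """Scale signed token weights of one (non-empty) sample into [-1, 1]."""
    scale = max(abs(w) for w in weights) or 1.0
    return [w / scale for w in weights]


def top_tokens(tokens, weights, k=2):
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
    positive = [tok for tok, w in ranked[:k] if w > 0]
    negative = [tok for tok, w in ranked[::-1][:k] if w < 0]
    return positive, negative


tokens = ["i", "love", "boring", "meetings"]
local = normalize_local([0.1, 0.4, -0.2, -0.05])
positive, negative = top_tokens(tokens, local)
global_weights = global_category_weights([{"posemo": 0.2, "work": 0.4}, {"posemo": 0.4}])
```

In this toy example positive is ["love", "i"] and negative is ["boring", "meetings"]; lists like these, together with the global category weights, are the kind of pattern that can be written into an explanation.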

LLM integration

The prompt instructs the LLM to re-evaluate the trait scores using the patterns in our explanation-based prompt, ignoring the initial predictions unless those patterns support them.
Output: refined scores + natural language explanation (200 words).
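
A rough sketch of how such a prompt could be assembled and how the scores could be read back from the reply. The prompt wording, the "Trait: score" reply format, the reading of 200 words as an upper limit, and both function names are assumptions made for illustration; the actual prompt lives in refine_with_llm.py.

```python
import re

TRAITS = ["Openness", "Conscientiousness", "Extraversion", "Agreeableness", "Non-Neuroticism"]


def build_prompt(transcript, initial_scores, patterns, max_words=200):
    """Combine transcript, explanation patterns, and initial scores into one prompt."""
    score_lines = "\n".join(f"{t}: {initial_scores[t]:.2f}" for t in TRAITS)
    pattern_lines = "\n".join(f"- {p}" for p in patterns)
    return (
        "You assess Big Five personality traits from a transcript.\n"
        f"Transcript:\n{transcript}\n\n"
        f"Linguistic patterns found by the hybrid model:\n{pattern_lines}\n\n"
        f"Initial scores (rely on them only if the patterns support them):\n{score_lines}\n\n"
        "Return one line per trait as 'Trait: score' with scores in [0, 1], "
        f"followed by an explanation of at most {max_words} words."
    )


def parse_scores(reply):
    """Extract 'Trait: score' pairs from the reply and clamp them to [0, 1]."""
    scores = {}
    for trait in TRAITS:
        match = re.search(rf"{re.escape(trait)}\s*:\s*([01](?:\.\d+)?)", reply)
        if match:
            scores[trait] = min(1.0, max(0.0, float(match.group(1))))
    return scores


prompt = build_prompt(
    "i really enjoy meeting new people",
    {t: 0.5 for t in TRAITS},
    ["'enjoy' (positive emotion) raises Extraversion"],
)
reply = "Openness: 0.62\nExtraversion: 0.71\nExplanation: The speaker sounds sociable."
refined = parse_scores(reply)
```

Traits that the reply does not mention are simply absent from refined, so a caller can fall back to the initial prediction for them.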

Citations

If you use this work, please cite the following paper (currently under review):

@article{ryumina2025ber,
  title   = {ExPAM: Explainable Personality Assessment Method using Heterogeneous Linguistic Features and Off-the-Shelf LLMs},
  author  = {Ryumina, Elena and Ryumin, Dmitry and Markitantov, Maxim and Karpov, Alexey},
  journal = {PeerJ Computer Science},
  year    = {2026},
  note    = {Under review}
}

License

This project is released under the MIT License; see LICENCE for details.
