In this project, we integrate language model (LM) embeddings into pyactr, a Python implementation of the ACT-R cognitive architecture. Our experiments are twofold:
First, we adapt psycho-embeddings to extract ISO and AOC embeddings from different LLMs:
- spp_match_finder.py finds contextual sentences in a corpus (from FineWeb) for words drawn from the Semantic Priming Project (SPP) dataset.
- embedder.py is updated to support Gemma and LLM2Vec embeddings.
- psycho-embeddings.py extracts ISO embeddings for different datasets and LMs, then computes the Spearman correlation between the observed RTs and the cosine similarity of the prime and target embeddings.
- aoc_psycho-embeddings.py extracts AOC embeddings for different datasets, LMs, and the contexts extracted for each dataset, then computes the same Spearman correlation between observed RTs and prime-target cosine similarity.
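As a rough sketch of the evaluation step shared by both extraction scripts, the following illustrates cosine similarity between prime and target embeddings correlated with observed RTs. This is not the project's actual code: the toy embeddings, RT values, and helper names are invented for illustration (in practice `scipy.stats.spearmanr` would typically be used).

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two (n, d) embedding matrices."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a_n * b_n).sum(axis=1)

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman correlation for tie-free data (rank, then Pearson on ranks);
    scipy.stats.spearmanr gives the same value and also handles ties."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

# Toy stand-ins for real data: 4 prime-target pairs with 3-d embeddings
# and observed lexical-decision RTs in milliseconds.
rng = np.random.default_rng(0)
prime_emb = rng.normal(size=(4, 3))
target_emb = rng.normal(size=(4, 3))
observed_rt = np.array([520.0, 610.0, 480.0, 700.0])

sims = cosine_sim(prime_emb, target_emb)
rho = spearman(observed_rt, sims)
print(f"Spearman rho between RTs and prime-target similarity: {rho:.3f}")
```

A negative correlation here would match the expected priming effect: more similar prime-target pairs should yield faster responses.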
Second, we incorporate LMs into the pyactr package to model the priming effect in a lexical decision task on the SPP dataset (data/spp_merged.csv):
- spp_all_experiments.py runs our ACT-R models (B1, B2, B3, M1, M2, M3).
- model.py and utilities.py are modified within pyactr to incorporate embeddings, the fan effect, EMMA, etc.
- spp_merged.csv contains SPP-Short RT values for prime-target pairs, merged with frequency bins, cosine-similarity values, fan-effect values, etc.
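To give a sense of how an embedding-based similarity term can enter an ACT-R model of lexical decision, here is a hypothetical sketch. It is not taken from the modified model.py or utilities.py; the constants and the additive similarity boost are illustrative assumptions, while the base-level learning and retrieval-latency equations are standard ACT-R.

```python
import math

F = 0.6  # latency factor (assumed value, for illustration only)
D = 0.5  # base-level decay rate (the common ACT-R default)

def base_level_activation(presentations, now, d=D):
    """Standard ACT-R base-level learning: B_i = ln(sum_j (now - t_j)^-d),
    summed over past presentation times t_j of the chunk."""
    return math.log(sum((now - t) ** -d for t in presentations))

def retrieval_time(activation, latency_factor=F):
    """Standard ACT-R retrieval latency: T = F * exp(-A)."""
    return latency_factor * math.exp(-activation)

def primed_retrieval_time(presentations, now, prime_target_sim, weight=1.0):
    """Hypothetical priming mechanism: add a weighted prime-target cosine
    similarity to the target chunk's activation before retrieval."""
    a = base_level_activation(presentations, now) + weight * prime_target_sim
    return retrieval_time(a)

# A related prime (high similarity) should predict a faster retrieval
# than an unrelated prime (low similarity), other things being equal.
related = primed_retrieval_time([1.0, 5.0], now=10.0, prime_target_sim=0.8)
unrelated = primed_retrieval_time([1.0, 5.0], now=10.0, prime_target_sim=0.1)
print(f"related: {related:.3f}s  unrelated: {unrelated:.3f}s")
```

Under this kind of scheme, higher cosine similarity raises activation and thus shortens predicted retrieval time, which is the qualitative signature of semantic priming the models aim to capture.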