Pinned

  1. activation-patching-framework (Public)

    Causal intervention framework for mechanistic interpretability research: implements activation patching to identify causally important components in transformer language models.

    Python · 1 star
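The repository's own API isn't shown here, but the core idea of activation patching can be sketched on a toy two-layer network: cache activations from a clean run, splice them into a corrupted run, and measure how much of the clean output is restored. Everything below (model, shapes, names) is invented for illustration, not taken from the repo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer "model": h = relu(x @ W1); y = h @ W2.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 2))

def forward(x, patch_hidden=None):
    """Run the model; optionally overwrite the hidden layer (the 'patch')."""
    h = np.maximum(x @ W1, 0.0)
    if patch_hidden is not None:
        h = patch_hidden  # causal intervention on this component
    return h @ W2, h

clean_x = rng.normal(size=4)
corrupt_x = rng.normal(size=4)

clean_out, clean_h = forward(clean_x)    # cache activations from the clean run
corrupt_out, _ = forward(corrupt_x)      # baseline corrupted run

# Patch the cached clean activations into the corrupted run; a large shift
# toward the clean output indicates the patched component is causally important.
patched_out, _ = forward(corrupt_x, patch_hidden=clean_h)
effect = np.linalg.norm(patched_out - corrupt_out)
print(f"patching effect: {effect:.3f}")
```

In a real transformer the same recipe is applied per layer and per position, patching residual-stream or head outputs instead of one dense hidden layer.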

  2. logit-lens-explorer (Public)

    Mechanistic interpretability tool visualizing GPT-2's layer-by-layer predictions using the logit lens technique.

    Python · 2 stars
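The logit lens decodes each intermediate residual-stream state with the model's final unembedding, showing what the model "predicts" at every depth. A minimal sketch with random stand-in weights (toy sizes and names, no real GPT-2 loaded):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 16, 50   # toy sizes; GPT-2 small uses 768 and 50257
n_layers = 4

W_U = rng.normal(size=(d_model, vocab))  # stand-in unembedding matrix

# Stand-in residual stream states after each layer, for one token position.
resid = [rng.normal(size=d_model) for _ in range(n_layers)]

def layer_norm(x, eps=1e-5):
    """Final layer norm, applied before unembedding (no learned scale/bias here)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

# Logit lens: project every intermediate state through the *final* unembedding.
for i, h in enumerate(resid):
    logits = layer_norm(h) @ W_U
    top = int(np.argmax(logits))
    print(f"layer {i}: top token id = {top}")
```

With a real model, the same loop over cached residual states shows predictions sharpening toward the final output as depth increases.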

  3. induction-head-detector (Public)

    Mechanistic interpretability tool to detect induction heads in GPT-2 using TransformerLens.

    Python · 1 star

  4. minigpt-shakespeare (Public)

    From-scratch GPT implementation trained on Shakespeare (10.75M parameters, 25 tests).

    Python · 1 star

  5. bpe-tokenizer-scratch (Public)

    Byte-Pair Encoding tokenizer built from scratch in Python, using the same algorithm as GPT-2's tokenizer.

    Python · 1 star
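The heart of BPE training is simple: repeatedly count adjacent symbol pairs across the corpus and merge the most frequent pair into a new token. A self-contained sketch of that loop (the tiny corpus and all names are illustrative, not from the repo; GPT-2 additionally works at the byte level):

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for word, freq in words.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, words):
    """Replace every occurrence of the pair with its concatenated symbol."""
    a, b = pair
    out = {}
    for word, freq in words.items():
        syms = word.split()
        new, i = [], 0
        while i < len(syms):
            if i < len(syms) - 1 and syms[i] == a and syms[i + 1] == b:
                new.append(a + b)
                i += 2
            else:
                new.append(syms[i])
                i += 1
        out[" ".join(new)] = freq
    return out

# Tiny corpus: words pre-split into characters, mapped to frequencies.
words = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges = []
for _ in range(3):
    pairs = get_pair_counts(words)
    best = max(pairs, key=pairs.get)  # most frequent pair wins each round
    words = merge_pair(best, words)
    merges.append(best)
print(merges)
```

Encoding a new word then just replays the learned merges in order; the merge ranks double as the tokenizer's vocabulary priorities.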

  6. toy-model-superposition (Public)

    A minimal Python implementation replicating Anthropic’s “Toy Models of Superposition,” illustrating feature superposition under sparse activations.

    Python · 1 star
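The toy model from that paper compresses more features than it has hidden dimensions, reconstructing inputs as x̂ = ReLU(WᵀWx + b); sparsity is what lets features share directions. A forward-pass sketch with random weights (sizes and names are illustrative, and a real replication would also train W on the reconstruction loss):

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, d_hidden = 6, 2   # more features than hidden dimensions
sparsity = 0.8                # probability that each feature is inactive

# Toy model: x_hat = ReLU(W.T @ (W @ x) + b).
W = rng.normal(size=(d_hidden, n_features)) / np.sqrt(d_hidden)
b = np.zeros(n_features)

def model(x):
    h = W @ x                          # squeeze 6 features into 2 dimensions
    return np.maximum(W.T @ h + b, 0)  # reconstruct with ReLU

# Sparse input: most features are off, so active ones rarely collide.
x = rng.uniform(size=n_features) * (rng.uniform(size=n_features) > sparsity)
x_hat = model(x)
print("input :", np.round(x, 2))
print("recon :", np.round(x_hat, 2))

# Interference between features i and j is the dot product of their
# embedding directions: (W.T @ W)[i, j].
interference = W.T @ W
```

Plotting `interference` after training is how the paper visualizes which features end up superposed in the same hidden directions.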