This repository packages the model setup, exposure, and activation extraction stack used to study orientation and working-memory representations with deep neural networks (DNNs). The codebase lets you initialize a curated panel of CNN, transformer, and vision-language models, expose them to controlled visual stimuli, and persist layer-wise activations to HDF5 for downstream representational similarity analysis (RSA) or decoding experiments.
- Conda (recommended)

  ```bash
  conda env create -f abstraction_perception.yaml
  conda activate abstraction_perception
  ```

- Pip-only (CPU)

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  pip install -r requirements.txt
  ```

  GPU builds of PyTorch / torchvision should match your CUDA runtime if acceleration is needed.
```
ann_pipe/
├── abstraction_perception.yaml   # Full conda environment
├── requirements.txt              # Minimal pip dependencies
├── runners/
│   └── run_extract_activations.py
├── src/
│   ├── data/
│   │   ├── __init__.py
│   │   ├── loaders.py            # get_image_set helper
│   │   ├── lazy_activations.py   # memory-efficient HDF5 reading
│   │   └── preprocessing.py      # transforms + activation hooks
│   ├── models/                   # model configs + wrappers
│   └── utils/
│       └── abstr_perc_helperfuncs.py
├── examples/
│   └── run_single_model.py       # minimal extraction example
└── tests/
    └── validate_extraction.py    # output structure validation
```
```bash
python runners/run_extract_activations.py \
    --stimuli_root /path/to/stimuli/semantic \
    --folders stimuli_rotatedsemantic \
    --output_dir /path/to/output/activations \
    --models ResNet50 CORNet-S ViT-B-16 \
    --replicates 4 --auto_confirm
```

Each folder listed under `--folders` produces an HDF5 file named `{output_prefix}_{folder}.h5`. The default prefix is `model_activations`.
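As a quick illustration of that naming rule, consider the hypothetical helper below (not part of the repo; exactly how the folder component is derived for nested `--folders` paths such as `stimuli_confound/leaves` is an assumption here, so check the runner if you rely on it):

```python
from pathlib import Path

def output_file(output_dir: str, prefix: str, folder: str) -> Path:
    # {output_prefix}_{folder}.h5, using the folder's basename (assumed)
    return Path(output_dir) / f"{prefix}_{Path(folder).name}.h5"

print(output_file("/path/to/output/activations", "model_activations",
                  "stimuli_rotatedsemantic"))
# /path/to/output/activations/model_activations_stimuli_rotatedsemantic.h5
```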
- `src.models.setup`: Declares available architectures, their extraction layers, and loader helpers. The wrappers expose logits + intermediate activations in a consistent dictionary.
- `src.data.preprocessing`: Shared preprocessing transforms (ImageNet, CLIP, SLIP), alpha masking, replicate jitter, and hook-based activation capture (a minimal illustration follows this list).
- `src.data.loaders.get_image_set`: Lists `.png` stimuli within a folder.
- `src.data.lazy_activations`: Memory-efficient lazy-loading utilities for reading extracted activations from HDF5 files.
- `runners/run_extract_activations.py`: CLI orchestration for model initialization, image exposure, replicate averaging, and HDF5 persistence.
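Hook-based capture works roughly as in the following minimal sketch (illustrative only, not the repo's exact implementation; layer names and the flattening step are assumptions):

```python
import torch
import torchvision.models as tvm

model = tvm.resnet50(weights=tvm.ResNet50_Weights.IMAGENET1K_V2).eval()
activations = {}

def make_hook(name):
    # Store each hooked layer's output under its dotted module name
    def hook(module, inputs, output):
        activations[name] = output.detach().flatten(start_dim=1)
    return hook

for name, module in model.named_modules():
    if name in {"layer1.0.conv1", "layer4.2.conv3"}:
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # stand-in for a preprocessed stimulus
# activations now maps layer names to flattened feature tensors
```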
```bash
python runners/run_extract_activations.py \
    --stimuli_root ./stimuli/concrete \
    --folders stimuli_rotatedconcrete stimuli_confound/leaves \
    --output_dir ./output/concrete/activations \
    --output_prefix concrete_models
```

Omit `--models` to extract from every model defined in `src/models/setup.py`.
- Use higher `--replicates` to stabilize representations under input jitter before computing working-memory RSA or bias metrics (a stability check is sketched after this list).
- Group output HDF5 files by manipulation (baseline vs. confound) to run depth-binned or layer-wise RDM comparisons downstream.
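A quick way to judge whether the replicate count is sufficient is to compare the replicate mean against the replicate spread; a minimal sketch, with file, model, and layer names purely illustrative:

```python
import h5py
import numpy as np

with h5py.File("model_activations_rotatedsemantic.h5") as f:
    acts = f["ResNet50"]["object_01"]["layer1.0.conv1"][:]  # (n_replicates, feature_dim)

signal = np.linalg.norm(acts.mean(axis=0))  # magnitude of the averaged representation
noise = np.linalg.norm(acts.std(axis=0))    # spread induced by input jitter
print(f"replicate SNR ~ {signal / noise:.2f}")  # low values suggest raising --replicates
```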
| Argument | Description |
|---|---|
| `--stimuli_root` | Base directory containing image folders. Required. |
| `--folders` | One or more subfolders (relative or absolute). Default: `.` |
| `--output_dir` | Destination directory for HDF5 files. Required. |
| `--output_prefix` | Filename prefix; defaults to `model_activations`. |
| `--models` | Subset of model names; default uses all configured models. |
| `--replicates` | Number of jittered exposures per image (default 4). |
| `--auto_confirm` | Skip the interactive safety prompt. |
- File structure: `/{model_name}/{stimulus_id}/{layer_name}`
- Dataset shape: `(n_replicates, feature_dim)`; `stimulus_id` corresponds to the base filename (sans extension).
- To verify integrity:

```python
import h5py

with h5py.File("model_activations_rotatedsemantic.h5") as f:
    print(f["ResNet50"].keys())  # image ids
    acts = f["ResNet50"]["object_01"]["layer1.0.conv1"][:]
    assert acts.shape[0] == 4
```
The layout ensures compatibility with lazy-loading utilities or streaming RDM builders used in downstream RSA / working-memory decoding suites.
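For example, a layer-wise RDM can be built by streaming one layer at a time from the file; a minimal sketch, taking 1 − Pearson correlation as the dissimilarity (file and layer names are illustrative):

```python
import h5py
import numpy as np

with h5py.File("model_activations_rotatedsemantic.h5") as f:
    grp = f["ResNet50"]
    stim_ids = sorted(grp.keys())
    # Average replicates -> one feature vector per stimulus
    feats = np.stack([grp[s]["layer1.0.conv1"][:].mean(axis=0) for s in stim_ids])

rdm = 1.0 - np.corrcoef(feats)  # (n_stimuli, n_stimuli) dissimilarity matrix
```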
The pipeline includes memory-efficient lazy-loading utilities for reading HDF5 activation files without loading everything into RAM at once.
```python
from src.data.lazy_activations import load_lazy_activations

# Load activations with lazy loading
acts = load_lazy_activations("model_activations_semantic.h5")

# Access like a regular nested dictionary - data loads only when accessed
resnet_layer1 = acts["ResNet50"]["object_01"]["layer1.0.conv1"]  # (n_replicates, features)
vit_layer = acts["ViT-B-16"]["object_02"]["blocks.5.attn"]

# Works with loops - memory efficient for large datasets
for img_id in acts["ResNet50"].keys():
    for layer_name in acts["ResNet50"][img_id].keys():
        activation = acts["ResNet50"][img_id][layer_name]  # Loaded on demand
        # ... process activation
```

Helper functions inspect an HDF5 file's structure without loading any data:

```python
from src.data.lazy_activations import (
    get_available_models,
    get_available_images,
    get_available_layers,
    get_activation_shape,
)

models = get_available_models("activations.h5")
# ['ResNet50', 'CORNet-S', 'ViT-B-16']
images = get_available_images("activations.h5", "ResNet50")
# ['object_01', 'object_02', ...]
layers = get_available_layers("activations.h5", "ResNet50")
# ['layer1.0.conv1', 'layer2.0.conv1', ...]
shape = get_activation_shape("activations.h5", "ResNet50")
# (4, 64) -> (n_replicates, feature_dim)
```

Lazy loading automatically caches small activations (< 10 MB); clear caches to free memory when needed:

```python
acts["ResNet50"]["object_01"].clear_cache()  # Clear single image
acts["ResNet50"].clear_cache()               # Clear entire model
acts.close()                                 # Clear all caches
```

To load only a specific model:

```python
from src.data.lazy_activations import load_model_activations_lazy

resnet_acts = load_model_activations_lazy("activations.h5", "ResNet50")
layer_data = resnet_acts["ResNet50"]["object_01"]["layer4.2.conv3"]
```

The configured model panel spans image-only, image + text, and contrastive training:

| Training Type | CNN-based | Transformer-based |
|---|---|---|
| Image Only | ResNet50, ResNeXt-101-WSL, ConvNeXt-Large, CORNet-S, VGG19 | ViT-B-16, ViT-B-32-timm, DeiT-Base |
| Image + Text | ResNet50-CLIP, ConvNeXt-Large-CLIP | ViT-B-16-CLIP, ViT-B-32-CLIP, ViT-B-16-LAION-CLIP |
| Contrastive | SLIP ViT-Small | — |
Add new models by extending `src/models/setup.py` with layer lists and loader factories.
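A hypothetical entry might look like the following (the actual registry structure in `src/models/setup.py` may differ; all names here are illustrative):

```python
import torchvision.models as tvm

# Sketch of registering a new model: a loader factory plus extraction layers.
NEW_MODEL = {
    "ResNet18": {  # hypothetical addition
        "loader": lambda: tvm.resnet18(weights=tvm.ResNet18_Weights.IMAGENET1K_V1),
        "layers": ["layer1.0.conv1", "layer2.0.conv1",
                   "layer3.0.conv1", "layer4.0.conv1"],
        "preprocessing": "imagenet",  # which shared transform set to apply (assumed key)
    },
}
```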
- Device placement: The runner auto-selects MPS → CUDA → CPU (a minimal sketch of this fallback follows this list). Override by editing `process_dataset` if you need explicit devices per model.
- Memory pressure: Use `--models` to process a subset or run multiple passes, appending into an existing HDF5 file (the script opens files in append mode when partial subsets are requested).
- Throughput: Increase `--replicates` only when necessary; each replicate re-runs the full forward pass.
- Missing weights: Some wrappers (e.g., SLIP, open_clip) need manual weight downloads; see the inline error messages.
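The fallback chain amounts to something like this minimal sketch (the actual logic lives in `process_dataset` and may differ):

```python
import torch

def select_device() -> torch.device:
    # Prefer Apple-silicon MPS, then CUDA, then plain CPU
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")
```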
- Output HDF5 files feed directly into the RSA / bias analysis scripts (`runners_clean/run_analysis.py` in the original project). Copy the generated files into `output/activations/<orientation>/` to reuse existing RDM builders.
- Maintain a consistent naming convention (`output_prefix`) across experimental conditions to simplify behavioral alignment and depth-binned permutation tests.
- `examples/run_single_model.py`: Minimal extraction workflow for a single model.
- `examples/read_activations_example.py`: Demonstrates reading and inspecting extracted HDF5 files.

```bash
python examples/read_activations_example.py model_activations_semantic.h5 --model ResNet50 --show_layers
```
- Run `tests/validate_extraction.py` to verify activation extraction output structure on a small stimulus set.
- Use `examples/read_activations_example.py` to inspect generated HDF5 files and verify layer shapes.
- For integration with neuroscience data, verify that the averaged replicates preserve the expected contrast (e.g., orientation tuning curves) before computing metrics; one possible check is sketched below.
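A rough sketch of that check, assuming stimulus ids end in an orientation tag like `_090` (adapt the pattern, file, and layer names to your data):

```python
import re
import h5py
import numpy as np

by_ori = {}
with h5py.File("model_activations_rotatedsemantic.h5") as f:
    grp = f["ResNet50"]
    for stim_id in grp:
        match = re.search(r"_(\d+)$", stim_id)  # orientation suffix (assumed naming)
        if match is None:
            continue
        feats = grp[stim_id]["layer1.0.conv1"][:].mean(axis=0)  # replicate average
        by_ori.setdefault(int(match.group(1)), []).append(feats)

# Mean feature vector per orientation; inspect how it varies across orientations
tuning = {ori: np.mean(vecs, axis=0) for ori, vecs in sorted(by_ori.items())}
```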
"Vision CNNs": ["ResNet50", "ConvNeXt-Large", "CORNet-S", "VGG19"], "Vision ViTs": ["ViT-B-16", "DeiT-Base", "ViT-B-32-timm"], "Vision-Language CNNs": ["ResNeXt-101-WSL", "ResNet50-CLIP", "ConvNeXt-Large-CLIP"], "Vision-Language ViTs": ["ViT-B-16-CLIP", "ViT-B-32-CLIP", "ViT-B-16-LAION-CLIP"]
- ResNet50
- VGG19
- CORNet-S
- ViT-B-16
- DeiT-Base