Repository of code and resources for processing, modelling, and analysis of MIND and other brain MRI modalities, with a focus on predictive machine learning models (ElasticNet regression, PLS regression, XGBoost) and deep learning models (GNNs). Main analyses are performed on the UK Biobank dataset.

MIND_processing: Handling and preprocessing Morphometric INverse Divergence (MIND) structural connectivity matrices
- Collects UK Biobank data from CAMH cluster
- Runs MIND processing for each participant in the dataset using the DK or HCP atlas
- Uses FreeSurfer cortical thickness, volume, surface area, curvature, and sulcal depth variables
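The MIND idea can be sketched in a few lines: each region's FreeSurfer features are treated as samples from a multivariate Gaussian, and similarity between regions is the inverse of their symmetrized KL divergence. The sketch below is illustrative only (function name, ridge term, and the 1/(1+KL) mapping are assumptions, not this repo's implementation):

```python
import numpy as np

def mind_similarity(features_a, features_b):
    """Similarity between two regions, computed as 1 / (1 + symmetric KL
    divergence) between per-region multivariate Gaussian fits.
    features_*: (n_vertices, n_features) arrays of FreeSurfer metrics."""
    mu_a, mu_b = features_a.mean(0), features_b.mean(0)
    k = features_a.shape[1]
    # Small ridge keeps the covariance matrices invertible
    cov_a = np.cov(features_a, rowvar=False) + 1e-6 * np.eye(k)
    cov_b = np.cov(features_b, rowvar=False) + 1e-6 * np.eye(k)

    def kl(mu0, cov0, mu1, cov1):
        # KL(N0 || N1) for multivariate Gaussians
        inv1 = np.linalg.inv(cov1)
        diff = mu1 - mu0
        return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff
                      - k + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

    sym_kl = 0.5 * (kl(mu_a, cov_a, mu_b, cov_b) + kl(mu_b, cov_b, mu_a, cov_a))
    return 1.0 / (1.0 + sym_kl)

# Example: two synthetic regions with 5 FreeSurfer features per vertex
rng = np.random.default_rng(0)
region_a = rng.normal(size=(200, 5))
region_b = rng.normal(loc=0.5, size=(200, 5))
print(mind_similarity(region_a, region_a))  # identical distributions -> ~1
print(mind_similarity(region_a, region_b))  # dissimilar distributions -> lower
```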
other_dataset_processing: Preprocessing scripts for other datasets (not UK Biobank), unrelated to main project
- OASIS, NEO, 3D datasets
All models are trained to predict performance on Fluid Intelligence, Paired Associate Learning, Digit Symbol Substitution Test, and Alphanumeric Trail Making Test using demographic data, MIND, functional connectivity, and cortical thickness. Models are optimized using nested 10-fold cross-validation and grid search. To assess the role of demographic variables, models are trained both on the full predictor set and on neuroimaging predictors with demographic effects regressed out. Model weights and SHAP values are analyzed for interpretability.
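The evaluation scheme above can be sketched with scikit-learn: an inner grid search tunes hyperparameters while an outer loop estimates generalization, and demographic effects are removed from the neuroimaging predictors by linear regression. All data, grids, and variable names here are illustrative, not the repo's actual configuration:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)
n = 120
demo = rng.normal(size=(n, 2))                            # e.g. age, sex
brain = rng.normal(size=(n, 30)) + 0.3 * (demo @ rng.normal(size=(2, 30)))
y = brain[:, 0] + rng.normal(scale=0.5, size=n)           # synthetic cognitive score

# Regress demographic effects out of each neuroimaging predictor
brain_resid = brain - LinearRegression().fit(demo, brain).predict(demo)

# Nested CV: inner loop tunes alpha / l1_ratio, outer loop estimates performance
inner = GridSearchCV(
    ElasticNet(max_iter=5000),
    {"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.8]},
    cv=KFold(5, shuffle=True, random_state=0),
)
outer_scores = cross_val_score(inner, brain_resid, y,
                               cv=KFold(10, shuffle=True, random_state=0))
print(outer_scores.mean())
```

Note that in a leakage-free pipeline the demographic regression step would itself be fit only on each outer training fold; it is done globally here for brevity.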
matlab_plsregression: MATLAB scripts for Partial Least Squares (PLS) regression, including permutation testing and bootstrapping
models_plsregression: Python scripts and notebooks for PLS regression modelling and analysis
models_xgboost: Scripts and notebooks for XGBoost modelling and analysis
models_elasticnet: Scripts and notebooks for Elastic Net regression modelling and analysis
QuantNets: Scripts and config files for building and experimenting with various GNN architectures (Graph Convolutional Network (GCN), Graph Attention Network (GAT), Quantized Graph Convolutional Network (QGRN))
- Preprocessing scripts for converting connectivity matrices into graphs (PyTorch Geometric), standardization, sparsification, and dataset splitting
- Architecture configurations of different GNN types and different methods of injecting demographic data into models
- Configurations of model hyperparameters and sizing (# of layers, dropout, weight decay, learning rate, learning rate schedulers, activation functions, pooling layers, normalization layers, embedding dimensions, hidden dimensions)
- Experimental setup to train multiple model configurations and track evaluation results
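The matrix-to-graph preprocessing step amounts to sparsifying a dense connectivity matrix and emitting edges in the `(2, E)` `edge_index` layout that PyTorch Geometric's `Data` object expects. A minimal numpy-only sketch (thresholds and function name are assumptions; the real pipeline builds `torch_geometric` objects directly):

```python
import numpy as np

def matrix_to_edges(conn, keep_frac=0.2):
    """Sparsify a symmetric connectivity matrix by keeping the strongest
    keep_frac of off-diagonal edges, returning a (2, E) edge_index array
    and matching edge weights (PyTorch Geometric's Data layout)."""
    n = conn.shape[0]
    mask = ~np.eye(n, dtype=bool)                 # drop self-connections
    thresh = np.quantile(conn[mask], 1.0 - keep_frac)
    src, dst = np.nonzero((conn >= thresh) & mask)
    edge_index = np.stack([src, dst])             # shape (2, E)
    edge_weight = conn[src, dst]
    return edge_index, edge_weight

rng = np.random.default_rng(0)
m = rng.random((68, 68))                          # e.g. a DK-atlas MIND matrix
m = (m + m.T) / 2                                 # symmetrize
edge_index, edge_weight = matrix_to_edges(m, keep_frac=0.1)
print(edge_index.shape)
```

Because the input is symmetric, each retained connection appears in both directions, which is the undirected-graph convention PyTorch Geometric uses.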
miscellaneous: Miscellaneous scripts and resources
region_names: UK Biobank variable names and brain region names
visualization: Visualization of data and modelling results
sbatch_scripts: SLURM batch scripts for running computational jobs on clusters
Next steps:
- Polish and implement Quantized Graph Convolutional architectures
- Preprocess connectivity matrices with a Fast Fourier Transform and inverse covariance estimation for sparsification
- Pipeline to evaluate GNNs on different cognitive measures
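The inverse-covariance sparsification idea can be sketched as follows: invert the (ridge-regularized) covariance of regional time series to get a precision matrix, convert it to partial correlations, and zero out weak conditional dependencies. This is a sketch of the general technique only; the FFT step and the repo's actual parameters are not shown:

```python
import numpy as np

def partial_correlation(ts, ridge=1e-3):
    """Partial-correlation matrix from regional time series via the inverse
    covariance (precision) matrix. ts: (n_timepoints, n_regions)."""
    cov = np.cov(ts, rowvar=False)
    prec = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)                # normalize precision entries
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

rng = np.random.default_rng(0)
ts = rng.normal(size=(500, 10))                   # synthetic regional time series
pc = partial_correlation(ts)
sparse = np.where(np.abs(pc) > 0.1, pc, 0.0)      # drop weak conditional links
```

Unlike raw correlation, partial correlation removes indirect dependencies mediated by third regions, which tends to produce naturally sparser graphs.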