added the code for the 3D expert finetuning and evaluation #51

Portgas37 wants to merge 1 commit into EPFLiGHT:experts_new from
Conversation
Pull request overview
Adds a new “3D expert” training + evaluation bundle under src/multimeditron/experts/3D_expert/ to fine-tune a 3D CLIP-style model on .npy volumes and evaluate learned embeddings via a downstream MLP classifier.
Changes:
- Introduces a HuggingFace Trainer-based fine-tuning script for GoodBaiBai88/M3D-CLIP with JSONL dataset loading + .npy pairing/expansion.
- Adds 3D embedding extraction utilities and a fracture benchmark that trains/tests an MLP on extracted embeddings.
- Adds a dedicated requirements file and a sample training YAML config for the 3D expert.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 23 comments.
Summary per file:

| File | Description |
|---|---|
| src/multimeditron/experts/3D_expert/train_3D.py | New 3D fine-tuning script (dataset mixing/expansion, preprocessing, Trainer wiring). |
| src/multimeditron/experts/3D_expert/requirements.txt | Pinned dependency set intended for the 3D expert environment. |
| src/multimeditron/experts/3D_expert/mlp_eval.py | New MLP-based downstream evaluation (k-fold CV + final test evaluation). |
| src/multimeditron/experts/3D_expert/load_from_clip.py | Helper for loading a 3D CLIP model and encoding .npy volumes. |
| src/multimeditron/experts/3D_expert/configs/train.yaml | Example YAML config for running train_3D.py. |
| src/multimeditron/experts/3D_expert/Benchmark.py | New benchmark ABC base class for the 3D expert eval scripts. |
| src/multimeditron/experts/3D_expert/3D_fracture_eval.py | Fracture benchmark that caches embeddings and runs MLP_eval. |
```python
x = x.to("cuda")
label = label.to("cuda")
```

Several tensors are moved to the hardcoded device string "cuda" (e.g. in evaluate_fold()), ignoring self.device. This will crash on CPU-only environments and is inconsistent with the device selection in __init__. Use self.device (or the model's parameter device) consistently when moving inputs/labels and creating loss tensors.

Suggested change:

```diff
-x = x.to("cuda")
-label = label.to("cuda")
+x = x.to(self.device)
+label = label.to(self.device)
```
```python
from Benchmark import Benchmark
```

These imports use bare module names (from Benchmark import Benchmark), which are ambiguous in this repo (there is also experts/evaluation_pipeline/disease_classification_pipeline/Benchmark.py). Depending on the working directory / PYTHONPATH, this can import the wrong module or fail. Prefer explicit relative imports within this folder (e.g. from .Benchmark import Benchmark) or a fully-qualified package import.

Suggested change:

```diff
-from Benchmark import Benchmark
+from .Benchmark import Benchmark
```
```python
self.label.append(lab)
torch.save(self.data, file_name_data)
torch.save(self.label, file_name_lab)
```

torch.save(...) writes to <cwd>/embeddings/... but the code never creates the embeddings/ directory. If it doesn't exist, saving will fail with FileNotFoundError. Create the directory (e.g. os.makedirs(..., exist_ok=True)) before these saves.
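A minimal sketch of that fix, assuming file_name_data and file_name_lab both point into the embeddings/ directory:

```python
import os

# Make sure the target directory exists before saving; torch.save does not create it.
os.makedirs(os.path.dirname(file_name_data), exist_ok=True)
torch.save(self.data, file_name_data)
torch.save(self.label, file_name_lab)
```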
```python
def load_model(model_name_or_path: str, device: torch.device = None, cache_dir: str = "/mloscratch/users/achahed/cache"):
    """
    Load a 3D CLIP model from the given path.
```

load_model() has a user-specific default cache_dir (an absolute /mloscratch/... path). This makes the helper non-portable outside that environment. Prefer defaulting cache_dir to None and letting HuggingFace use its standard cache location.
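A minimal suggested change along those lines (only the default value changes):

```diff
-def load_model(model_name_or_path: str, device: torch.device = None, cache_dir: str = "/mloscratch/users/achahed/cache"):
+def load_model(model_name_or_path: str, device: torch.device = None, cache_dir: str = None):
```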
```python
- The image shape needs to be processed as 1*32*256*256
- The image needs to be normalized to 0-1 (Min-Max Normalization)
- The image format needs to be .npy
```

The docstring states the image must be normalized to 0–1 (Min-Max) and shaped to 1*32*256*256, but the function currently only loads the .npy and adjusts dimensions (no normalization/resampling). Either implement the preprocessing described here or update the docstring to match actual behavior.

Suggested change:

```diff
-- The image shape needs to be processed as 1*32*256*256
-- The image needs to be normalized to 0-1 (Min-Max Normalization)
-- The image format needs to be .npy
+- The input image is expected to be stored as a NumPy array in a .npy file.
+- The function will add missing batch/channel dimensions so that the final tensor
+  has shape (batch, channels, depth, height, width) before being passed to the model.
+- The model is designed for images with spatial size [32, 256, 256] and in_channels=1;
+  this function does not perform any resizing or resampling, so the caller must ensure
+  the input volume has appropriate spatial dimensions.
+- No Min-Max normalization is applied inside this function; if the model expects values
+  in a specific range (e.g., 0-1), the caller is responsible for performing that
+  preprocessing before saving/loading the .npy file.
```
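If the first option is taken (implementing the preprocessing rather than rewording the docstring), a minimal sketch could look like this — preprocess_volume is a hypothetical helper, and it assumes the stored volume already has spatial size 32×256×256, so no resampling is performed:

```python
import numpy as np
import torch

def preprocess_volume(npy_path: str) -> torch.Tensor:
    """Load a .npy volume, min-max normalize to [0, 1], and add batch/channel dims."""
    vol = np.load(npy_path).astype(np.float32)
    # Min-max normalization to the 0-1 range the docstring promises.
    vmin, vmax = float(vol.min()), float(vol.max())
    vol = (vol - vmin) / (vmax - vmin + 1e-8)
    t = torch.from_numpy(vol)
    # Add channel and batch dimensions until the tensor is 5D: (batch, channel, D, H, W).
    while t.dim() < 5:
        t = t.unsqueeze(0)
    return t
```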
```python
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self.clip_path = clip_path
self.clip_name = ""

def evaluate(self):
```

This method requires 1 positional argument, whereas the overridden Benchmark.evaluate requires 2.

Suggested change:

```diff
-def evaluate(self):
+def evaluate(self, *args, **kwargs):
```
```python
return accuracy / self.k

def evaluate(self) -> float:
```

This method requires 1 positional argument, whereas the overridden Benchmark.evaluate requires 2.

Suggested change:

```diff
-def evaluate(self) -> float:
+def evaluate(self, data_loader=None) -> float:
```
```python
import torch
import json
from load_from_clip import load_model, encode_img
import torch.nn as nn
```

Import of 'nn' is not used.

Suggested change:

```diff
-import torch.nn as nn
```
```python
from transformers import VisionTextDualEncoderConfig, VisionTextDualEncoderModel
```

Import of 'VisionTextDualEncoderConfig' is not used.
Import of 'VisionTextDualEncoderModel' is not used.

Suggested change:

```diff
-from transformers import VisionTextDualEncoderConfig, VisionTextDualEncoderModel
```
```python
from transformers import (
    AutoImageProcessor,
    AutoModel,
    AutoTokenizer,
    HfArgumentParser,
    VisionTextDualEncoderModel,
    Trainer,
    TrainingArguments,
    set_seed,
)
```

Import of 'VisionTextDualEncoderModel' is not used.
Import of 'AutoImageProcessor' is not used.