This repository is the official implementation of "Active Prompt Learning with Vision-Language Model Priors", accepted to TMLR in 2025.
Vision-language models (VLMs) have demonstrated remarkable zero-shot performance across various classification tasks. Nonetheless, their reliance on hand-crafted text prompts for each task hinders efficient adaptation to new tasks. While prompt learning offers a promising solution, most studies focus on maximizing the utilization of given few-shot labeled datasets, often overlooking the potential of careful data selection strategies, which enable higher accuracy with fewer labeled data. This motivates us to study a budget-efficient active prompt learning framework. Specifically, we introduce a class-guided clustering that leverages the pre-trained image and text encoders of VLMs, thereby enabling our cluster-balanced acquisition function from the initial round of active learning. Furthermore, considering the substantial class-wise variance in confidence exhibited by VLMs, we propose a budget-saving selective querying based on adaptive class-wise thresholds. Extensive experiments in active learning scenarios across nine datasets demonstrate that our method outperforms existing baselines.
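As a rough illustration of these two ingredients, and not the authors' exact implementation, the NumPy sketch below assumes L2-normalized CLIP image features for the unlabeled pool and text features for the class prompts have already been extracted; the function names, the k-means-style refinement, and the quantile-based class-wise thresholds are illustrative assumptions.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def class_guided_clustering(img_feats, txt_feats, n_iter=10):
    """Cluster image features with one cluster per class; centers are initialized
    from the class-name text embeddings and refined k-means-style."""
    centers = txt_feats.copy()                               # (C, D): one center per class
    for _ in range(n_iter):
        assign = np.argmax(img_feats @ centers.T, axis=1)    # nearest center (cosine similarity)
        for c in range(centers.shape[0]):
            members = img_feats[assign == c]
            if len(members):                                  # empty clusters keep their text init
                m = members.mean(axis=0)
                centers[c] = m / (np.linalg.norm(m) + 1e-8)
    return np.argmax(img_feats @ centers.T, axis=1)


def cluster_balanced_acquisition(img_feats, txt_feats, budget):
    """Round-robin over clusters so the queried batch is balanced across clusters,
    taking the least-confident sample of each cluster first."""
    assign = class_guided_clustering(img_feats, txt_feats)
    conf = softmax(100.0 * img_feats @ txt_feats.T, axis=1).max(axis=1)   # zero-shot confidence
    clusters = [np.where(assign == c)[0] for c in range(txt_feats.shape[0])]
    clusters = [idx[np.argsort(conf[idx])] for idx in clusters if len(idx)]
    queries, i = [], 0
    while len(queries) < budget and any(len(c) for c in clusters):
        idx = clusters[i % len(clusters)]
        if len(idx):
            queries.append(int(idx[0]))
            clusters[i % len(clusters)] = idx[1:]
        i += 1
    return queries


def selective_query_mask(img_feats, txt_feats, quantile=0.9):
    """Budget-saving filter: skip samples whose zero-shot confidence already exceeds
    an adaptive, class-wise threshold (here, a per-class confidence quantile)."""
    probs = softmax(100.0 * img_feats @ txt_feats.T, axis=1)
    pred, conf = probs.argmax(axis=1), probs.max(axis=1)
    thresholds = np.array([
        np.quantile(conf[pred == c], quantile) if (pred == c).any() else 1.0
        for c in range(txt_feats.shape[0])
    ])
    return conf < thresholds[pred]      # True = uncertain enough to be worth querying
```

These sketches loosely correspond to the cb (cluster-balanced) and cbsq (cluster-balanced with selective querying) acquisition options of the run script described below.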
We sincerely thank the PCB repository for providing an excellent codebase. Specifically, the code is built on the CoOp repository and the Dassl.pytorch toolbox. To simplify usage, PCB integrates the dassl directory into this repository. To set up the environment, run the following commands:
conda create -n pcb python=3.10
conda activate pcb
pip install -r requirements.txt

Next, prepare the datasets: follow DATASETS.md to install them.
To run the code, first open scripts/alvlm/main.sh and set the DATA variable to the directory where your datasets are stored, for example as sketched below.
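For illustration, the relevant line in the script might look like this (the path below is only a placeholder; point it at the directory where you installed the datasets from DATASETS.md):

```bash
# scripts/alvlm/main.sh
DATA=/path/to/datasets   # root directory containing the installed datasets
```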
You can then run the code with the following command:

CUDA_VISIBLE_DEVICES=XX sh scripts/alvlm/main.sh [DATASET NAME] [MODEL NAME] [AL METHOD] [SEED NUMBER] [MODE]

- DATASET NAME $\in$ [oxford_flowers, dtd, oxford_pets, caltech101, stanford_cars, eurosat, fgvc_aircraft]
- MODEL NAME $\in$ [RN50, RN101, vit_b32, vit_b16]
- AL METHOD $\in$ [random, entropy, coreset, badge, pcb, cb, cbsq]
- SEED NUMBER: an integer
- MODE: description augmentation mode, $\in$ [none, AS, AE]
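For example, the following command runs the PCB acquisition method with a ViT-B/16 backbone on Oxford-IIIT Pets, using GPU 0, seed 1, and no description augmentation:

```bash
CUDA_VISIBLE_DEVICES=0 sh scripts/alvlm/main.sh oxford_pets vit_b16 pcb 1 none
```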
Please cite our paper if you use the model or this code in your own work:
@article{kim2025active,
title={Active Prompt Learning with Vision-Language Model Priors},
author={Hoyoung Kim and Seokhee Jin and Changhwan Sung and Jaechang Kim and Jungseul Ok},
journal={Transactions on Machine Learning Research},
year={2025},
url={https://arxiv.org/abs/2411.16722}
}