MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding

This repository is the official PyTorch implementation of MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding (WACV 2020) by Geondo Park, Chihye Han, Wonjun Yoon, and Daeshik Kim.

Prepare Dataset

Download the dataset files: COCO, Flickr30K

We use the splits produced by Andrej Karpathy. Download the JSON file from here, then place it in the same directory as the corresponding dataset (see get_data_path in utils.py).
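
Path handling is done by get_data_path in utils.py; as a rough sketch of the layout such a helper might assume (directory and file names below are illustrative guesses, not the repository's actual values):

import os

def get_data_path(data_name, root='./data'):
    # data_name: e.g. 'coco' or 'flickr' (hypothetical values)
    data_dir = os.path.join(root, data_name)
    image_dir = os.path.join(data_dir, 'images')
    # Karpathy-split JSON placed in the same directory as the dataset
    json_path = os.path.join(data_dir, 'dataset_{}.json'.format(data_name))
    return image_dir, json_path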

Train COCO

To train a new model on the COCO dataset, run train_coco.py. All training hyperparameters are set by default.

python train_coco.py -hop 10 -name <model_name> -p-coeff 0.1
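
Here, -hop presumably sets the number of attention heads and -p-coeff the weight of the diversity penalty on the attention maps. As a rough illustration of the multi-head self-attention pooling described in the paper, a minimal PyTorch sketch follows (module name, shapes, and defaults are assumptions, not the repository's exact code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttentionPool(nn.Module):
    # Pools a set of feature vectors into `num_hops` attended views,
    # with a penalty that encourages the heads to attend differently.
    def __init__(self, feat_dim, hidden_dim=512, num_hops=10):
        super().__init__()
        self.w1 = nn.Linear(feat_dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, num_hops, bias=False)

    def forward(self, feats):
        # feats: (batch, num_elements, feat_dim), e.g. image region features
        attn = F.softmax(self.w2(torch.tanh(self.w1(feats))), dim=1)  # (B, N, hops)
        a = attn.transpose(1, 2)                                      # (B, hops, N)
        heads = a @ feats                                             # (B, hops, feat_dim)
        eye = torch.eye(a.size(1), device=a.device)
        penalty = ((a @ a.transpose(1, 2) - eye) ** 2).sum(dim=(1, 2)).mean()
        return heads, penalty

A training loss might then combine the retrieval objective with the penalty, e.g. loss = ranking_loss + p_coeff * penalty, where p_coeff corresponds to -p-coeff above.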

Train Flickr

To train a new model on the Flickr dataset, run train_flickr.py. All training hyperparameters are set by default.

python train_flickr.py -hop 10 -name <model_name> -p-coeff 0.1

Implementation

  • Our code is based on vsepp.

Citation

@inproceedings{park2020mhsan,
  title={MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding},
  author={Park, Geondo and Han, Chihye and Yoon, Wonjun and Kim, Daeshik},
  booktitle={The IEEE Winter Conference on Applications of Computer Vision},
  pages={1518--1526},
  year={2020}
}
