This project implements a deep learning pipeline for Multi-Modal Sentiment Analysis, designed to classify the sentiment of image-text conversations. Built with PyTorch, the system uses EfficientNet for image feature extraction and BERT for textual embeddings, fusing the two modalities to predict sentiment (Positive, Negative, or Neutral).
The project is structured into distinct phases, evolving from basic data handling to complex multimodal fusion strategies, making it a comprehensive resource for understanding how to integrate Natural Language Processing (NLP) and Computer Vision (CV).
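To make the fusion idea concrete, here is a minimal sketch (class and layer names are illustrative, not the repo's exact modules) that concatenates pooled EfficientNet-B2 image features with BERT's pooled text embedding and classifies the result into the three sentiment classes:

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b2, EfficientNet_B2_Weights
from transformers import BertModel

class FusionSentimentClassifier(nn.Module):
    """Concatenate image and text embeddings, then classify into 3 sentiments."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        # Pre-trained image encoder; dropping the head leaves 1408-d pooled features.
        self.image_encoder = efficientnet_b2(weights=EfficientNet_B2_Weights.DEFAULT)
        self.image_encoder.classifier = nn.Identity()
        # Pre-trained text encoder; the pooled output is 768-d for bert-base.
        self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
        # Fusion by concatenation, followed by a small classification head.
        self.head = nn.Sequential(
            nn.Linear(1408 + 768, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, images, input_ids, attention_mask):
        img_feat = self.image_encoder(images)                      # (B, 1408)
        txt_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).pooler_output                                            # (B, 768)
        return self.head(torch.cat([img_feat, txt_feat], dim=1))  # (B, 3)
```

Phase 3 trains a head along these lines on top of the visual and textual encoders produced in the earlier phases.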
- Robust Data Handling: Custom PyTorch DataLoaders for the MSCTD (Multimodal Sentiment Chat Translation) dataset (a minimal sketch follows this list).
- Visual Sentiment Analysis (sketched below):
  - Face detection and extraction using MTCNN and RetinaFace.
  - Feature extraction using a pre-trained EfficientNet-B2.
- Textual Sentiment Analysis (sketched below):
  - Preprocessing pipelines involving tokenization, lemmatization, and stop-word removal.
  - Classical methods (TF-IDF, SVM) alongside deep learning approaches (Word2Vec, BERT).
- Multimodal Fusion: Strategies that concatenate and fuse the embedded vectors from the image and text models, aiming for stronger classification than either modality alone (see the architecture sketch above).
- Model Evaluation: Accuracy, Precision, Recall, and F1-score, plus confusion-matrix visualizations (sketched below).
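For the data-handling item, a minimal Dataset sketch; it assumes one image path, one utterance, and one integer label per sample (the repo's actual loaders live in the Phase-0 notebook):

```python
import torch
from PIL import Image
from torch.utils.data import Dataset

class ImageTextSentimentDataset(Dataset):
    """One image, one utterance, and one sentiment label per index."""

    def __init__(self, image_paths, texts, labels, tokenizer, transform):
        self.image_paths, self.texts, self.labels = image_paths, texts, labels
        self.tokenizer, self.transform = tokenizer, transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image = self.transform(Image.open(self.image_paths[idx]).convert("RGB"))
        enc = self.tokenizer(self.texts[idx], truncation=True,
                             padding="max_length", max_length=64,
                             return_tensors="pt")
        return {
            "image": image,                                   # (3, H, W) tensor
            "input_ids": enc["input_ids"].squeeze(0),         # (64,)
            "attention_mask": enc["attention_mask"].squeeze(0),
            "label": torch.tensor(self.labels[idx]),
        }
```

Wrapped in a `torch.utils.data.DataLoader`, this yields batched dicts ready for a fusion model like the one sketched above.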
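For the visual pipeline, a sketch assuming the `mtcnn` package from the install step is used for detection and torchvision's EfficientNet-B2 for features; `scene.jpg` is a hypothetical input path:

```python
import numpy as np
import torch
from PIL import Image
from mtcnn import MTCNN
from torchvision import transforms
from torchvision.models import efficientnet_b2, EfficientNet_B2_Weights

detector = MTCNN()
encoder = efficientnet_b2(weights=EfficientNet_B2_Weights.DEFAULT)
encoder.classifier = torch.nn.Identity()   # keep the 1408-d pooled features
encoder.eval()

preprocess = transforms.Compose([
    transforms.Resize((260, 260)),         # B2's nominal input resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("scene.jpg").convert("RGB")   # hypothetical input
faces = detector.detect_faces(np.array(image))   # list of dicts with a 'box' key
if faces:
    x, y, w, h = faces[0]["box"]
    crop = image.crop((x, y, x + w, y + h))
    with torch.no_grad():
        features = encoder(preprocess(crop).unsqueeze(0))   # (1, 1408)
```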
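For the classical text baseline, a self-contained sketch combining the NLTK preprocessing steps with a TF-IDF + linear SVM pipeline (toy data and illustrative labels, not the repo's training set):

```python
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(text):
    # Lowercase, tokenize, drop stop words and punctuation, lemmatize.
    tokens = word_tokenize(text.lower())
    return " ".join(lemmatizer.lemmatize(t) for t in tokens
                    if t.isalpha() and t not in stop_words)

texts = ["I really love this movie!", "This was a terrible experience."]
labels = [0, 1]  # e.g. 0 = Positive, 1 = Negative

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit([preprocess(t) for t in texts], labels)
print(model.predict([preprocess("What a lovely day")]))
```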
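And for evaluation, the standard scikit-learn calls cover all four metrics plus the confusion matrix (`y_true` and `y_pred` below are toy values for illustration):

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 1, 2, 1, 0, 2]   # toy labels: Positive / Negative / Neutral
y_pred = [0, 1, 1, 1, 0, 2]

# Precision, recall, F1 per class, plus overall accuracy.
print(classification_report(y_true, y_pred,
                            target_names=["Positive", "Negative", "Neutral"]))

# Confusion-matrix heatmap.
sns.heatmap(confusion_matrix(y_true, y_pred), annot=True, fmt="d",
            xticklabels=["Positive", "Negative", "Neutral"],
            yticklabels=["Positive", "Negative", "Neutral"])
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
```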
- Clone the repository:

  ```bash
  git clone https://github.com/nikelroid/multi-modal-sentiment-classification.git
  cd multi-modal-sentiment-classification
  ```

- Install dependencies (a virtual environment is recommended):

  ```bash
  pip install torch torchvision torchaudio
  pip install transformers scikit-learn pandas numpy matplotlib seaborn
  pip install mtcnn pyenchant nltk
  ```

- Download NLTK data (if required):

  ```python
  import nltk
  nltk.download('punkt')
  nltk.download('stopwords')
  nltk.download('wordnet')
  ```
The project is divided into four phases. You can run the Jupyter Notebooks for each phase sequentially:
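Each notebook can be launched from the repository root, for example:

```bash
jupyter notebook Phase-0/project_phase0.ipynb
```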
- Phase 0
  - Goal: Initialize data loaders and prepare the MSCTD dataset.
  - Run: `Phase-0/project_phase0.ipynb`
- Phase 1
  - Goal: Extract facial features and analyze images for sentiment.
  - Run: `Phase-1/Phase1-Part1.ipynb`, `Phase-1/Phase1-Part2.ipynb`, etc.
- Phase 2
  - Goal: Train NLP models to classify sentiment based on the text dialogue.
  - Run: `Phase-2/Phase2.ipynb`
- Phase 3
  - Goal: Combine the pre-trained visual and textual models to train a final classifier.
  - Run: `Phase-3/Phase3_Part1.ipynb`
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
This project is distributed under the MIT License. See LICENSE for more information.
For questions or support, please open an issue in this repository or contact the original participants:
- Nima Kelidari
- Ali Abbasi
- Amir Ahmad Shafiee