
🔍 DeepFake Image Detection with Siamese Neural Networks

📌 Project Overview

The goal of this project was to develop a Siamese Neural Network that performs binary classification based on the Euclidean distance between the embedding vectors it generates, predicting whether an image is synthetic or real. The model was trained on 200×200 RGB images converted into the frequency domain using the Fourier Transform. These images were sourced from the ArtiFact dataset, which contains a total of 2,496,738 images: 964,989 real and 1,531,749 synthetic. The frequency domain was chosen because prior research has shown that generative models leave artificial imprints, often referred to as 'fingerprints', on the images they generate. These fingerprints are not visible in the RGB spatial domain but can be observed in the frequency domain, where they can be leveraged to detect synthetic images.

🗃️ ArtiFact Dataset

The images used to train the model were sourced from the Artifact Dataset, which comprises a total of 2,496,738 RGB images of size 200×200, including 964,989 real and 1,531,749 synthetic samples. The images span various domains such as human faces, animals, landscapes, vehicles, and artworks. The synthetic images were generated using 25 different models, specifically 13 Generative Adversarial Networks (GANs), 7 Diffusion models, and 5 other generation techniques.

All preprocessing operations applied to prepare the data for the model are documented in detail within the project notebooks and technical report. However, the most significant transformation involved converting images from the RGB domain to the frequency domain using the Fourier Transform.
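The RGB-to-frequency conversion described above can be sketched with NumPy's FFT; the function name and the per-channel log-magnitude choice below are illustrative assumptions, since the exact preprocessing is documented in the project notebooks:

```python
import numpy as np

def to_frequency_domain(rgb_image: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 RGB image to a log-magnitude frequency
    representation, channel by channel (illustrative sketch)."""
    channels = []
    for c in range(rgb_image.shape[2]):
        # 2-D FFT, then shift the zero-frequency component to the center
        spectrum = np.fft.fftshift(np.fft.fft2(rgb_image[:, :, c]))
        # Log scale compresses the large dynamic range of the magnitudes
        channels.append(np.log1p(np.abs(spectrum)))
    return np.stack(channels, axis=2)

img = np.random.rand(200, 200, 3)   # stand-in for a 200x200 dataset image
freq = to_frequency_domain(img)
print(freq.shape)                   # (200, 200, 3)
```

Generator fingerprints typically show up as periodic peaks in spectra like this one, which is why the model consumes the frequency representation rather than raw pixels.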

🏗️ Model Architecture

To achieve these objectives, a Siamese Neural Network (SNN) architecture was employed, consisting of three Convolutional Neural Network (CNN) branches that share the same weights. Specifically, the chosen CNN is EfficientNetV2-B0, the most lightweight variant in the EfficientNetV2 family.

The final classification layer of the CNN is removed so that each input is mapped to an embedding vector (also referred to as an encoding vector). This step is crucial, because the loss function, in this case PyTorch's TripletMarginLoss, operates by computing distances between these vector representations.

⚙️ Training Details

🧪 Triplet Loss and Semi-Hard Triplet Mining

To optimize the model's learning process, we rely on TripletMarginLoss, which encourages the network to map similar images closer together in the embedding space while pushing dissimilar ones apart. Each training sample consists of a triplet: an anchor, a positive (same class), and a negative (different class). We categorize triplets into:

  • Easy triplets: already well-separated, contributing little to learning.
  • Hard triplets: overly challenging, and may destabilize training.
  • Semi-hard triplets: informative samples where the negative is farther from the anchor than the positive, but still within a defined margin.

We apply offline triplet mining by computing embeddings with a pretrained network and selecting only semi-hard triplets, using the Euclidean distance and a margin of 0.2. This significantly improves performance by focusing training on the most informative examples while discarding misleading ones.

The model was trained using the following hyperparameters:

  • Batch size: 16
  • Epochs: 30
  • Initial Learning Rate: 0.001
  • Optimizer: Adam with a ReduceLROnPlateau scheduler
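The optimizer and scheduler wiring implied by these hyperparameters looks roughly like this; the tiny linear model and random triplets are placeholders for the real SNN and data loader:

```python
import torch

# Stand-in model and data; only the Adam + ReduceLROnPlateau wiring matters.
model = torch.nn.Linear(16, 8)
loss_fn = torch.nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=2)

for epoch in range(30):                       # 30 epochs, as in the report
    # one synthetic "batch" of 16 triplets per epoch, for illustration
    a, p, n = (torch.randn(16, 16) for _ in range(3))
    optimizer.zero_grad()
    loss = loss_fn(model(a), model(p), model(n))
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())   # reduce the LR when the loss plateaus
```

ReduceLROnPlateau is stepped with the monitored metric, so the learning rate only decays when training stalls rather than on a fixed schedule.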

Multiple training runs were conducted to evaluate different triplet selection strategies and model configurations. A total of five experiments were performed, and the model selected for deployment is the one that achieved the best overall performance metrics, as documented in the project report.

🔍 Inference Phase

The testing phase begins by using the trained model to generate an embedding database from the anchor images in the training set. As illustrated in the figure, the inference process relies on this database to predict the class of each test image.

Each image in the test set is first passed through the model to extract its embedding vector. This vector is then compared against all entries in the embedding database using the Euclidean distance. The predicted class for the test image is assigned based on the closest anchor embedding found in the database.
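This nearest-anchor lookup can be sketched as follows; `predict` and the toy two-class database are illustrative, not the project's actual code:

```python
import torch

def predict(test_embedding, anchor_db, anchor_labels):
    """Assign the label of the nearest anchor (Euclidean distance)."""
    dists = torch.cdist(test_embedding.unsqueeze(0), anchor_db)  # (1, N)
    return anchor_labels[dists.argmin()]

# Toy database: two anchors per class (0 = real, 1 = synthetic)
db = torch.tensor([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = torch.tensor([0, 0, 1, 1])
pred = predict(torch.tensor([4.8, 5.2]), db, labels)
print(pred.item())   # 1 (nearest anchor belongs to the synthetic class)
```

This is effectively a 1-nearest-neighbor classifier in the learned embedding space, which is why the quality of the anchor database matters as much as the model itself.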

📊 Results

The best-performing model was obtained from Test #5. Below are the results from all the experiments:

We then analyzed whether the overall model performance was consistent across individual generators. As an example, we evaluated the results on images generated by the Taming Transformer model, obtaining the following performance:

These results confirm that the model's performance **remains stable even when evaluated on a single generator.**

🛠️ Installation Instructions

Follow these steps to set up the project locally.

  1. Cloning the repository:
    git clone <repository_url>
    cd SiameseNN-for-Fake-images-detection
    
  2. Setting Up the Environment:
    • Using Conda: if you are using Conda, create and activate a new environment as follows:
      conda create --name myenv python=3.10
      conda activate myenv
      
    • Using Python venv: if you prefer venv, set up your environment like this:
      python -m venv myenv
      
      # On Windows:
      myenv\Scripts\activate
      
      # On Unix or MacOS:
      source myenv/bin/activate
      
  3. Installing the required Python packages:
    pip install -r requirements.txt
    

🚀 Running the Application

Once the environment is set up and activated, you can run the main script from the terminal:

cd ModelTesting
python main.py

This script executes the testing phase of the Siamese Neural Network using the trained model included in the project files.
