
Banking-optimized signature verification with Siamese CNN - Contrastive learning for forgery detection with high recall for fraud prevention


umitkacar/handwritten-biometric-authentication


🔐 Offline Signature Verification

State-of-the-Art Deep Learning for Handwritten Signature Authentication



🎯 Advanced Siamese Convolutional Neural Network for Signature Authentication

Powered by PyTorch • Optimized for Banking & Security Systems • Research-Grade Quality




✨ Features

🎯 Feature | 📝 Description
🧠 Deep Learning | Siamese Convolutional Neural Network with Contrastive Loss
High Accuracy | Optimized for banking-grade precision-recall balance
🔄 Transfer Learning | Pre-trained models ready for fine-tuning
📊 ROC Analysis | Comprehensive evaluation with ROC curves
🎨 Production-Ready | Modern package structure, tested, and documented
🧪 Fully Tested | 44 comprehensive tests with 100% pass rate
📦 Installable Package | pip install -e . for easy development
🛠️ Modern Tooling | Black, Ruff, mypy, pytest configured
📝 Type Hints | Complete type coverage for safety
🚀 GPU Accelerated | Full CUDA support for faster training

🚀 What's New in v2.0

🎉 Major Release - Production-Ready Refactor

Version 2.0 is a complete rewrite with modern Python packaging, comprehensive testing, and production-grade code quality.

Key Improvements:

  • Modern Package Structure: src/ layout with proper imports
  • Comprehensive Testing: 44 tests covering all components
  • Type Safety: Full type hints with mypy validation
  • Development Tools: Black, Ruff, pytest, pre-commit hooks
  • Production Scripts: CLI-ready with argparse
  • Complete Documentation: CHANGELOG, CONTRIBUTING, LESSONS-LEARNED

What Changed:

# OLD (v1.x) - Direct file imports
from Model import SiameseConvNet

# NEW (v2.0) - Package imports
from signature_verification import SiameseConvNet

See CHANGELOG.md for the full migration guide.


🚀 Research Trends (2024-2025)

graph LR
    A[🔷 Siamese CNN] --> B[🔶 Vision Transformers]
    B --> C[🔷 Hybrid CNN-ViT]
    C --> D[🌟 SOTA 2024-2025]

    style A fill:#e1f5ff
    style B fill:#fff3e0
    style C fill:#f3e5f5
    style D fill:#e8f5e9

🌟 Current Implementation

  • Siamese Convolutional Network - Proven architecture with contrastive learning
  • PyTorch Framework - Modern, flexible, and production-ready
  • Banking-Optimized - High recall for fraud detection

🔮 Roadmap to State-of-the-Art (2025)

Technology | Status | Impact
🤖 Vision Transformers (ViT) | 📋 Planned | Global feature extraction
🎯 Swin Transformers | 📋 Planned | Hierarchical attention mechanisms
Hybrid CNN-ViT | 📋 Planned | Best of both worlds
🔄 Few-Shot Learning | 📋 Planned | Learn from limited samples
🎨 Spatial Transformer Networks | 📋 Planned | Automatic alignment

🏗️ Architecture

🧠 Siamese Network Architecture

┌─────────────────────────────────────────────────────────────┐
│                    SIAMESE NETWORK                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Input Signature 1         Input Signature 2               │
│         │                          │                        │
│         ▼                          ▼                        │
│    ┌─────────┐              ┌─────────┐                    │
│    │ Conv1   │              │ Conv1   │                    │
│    │ 11x11   │              │ 11x11   │   Shared           │
│    └────┬────┘              └────┬────┘   Weights          │
│         │                          │                        │
│    ┌────▼────┐              ┌─────▼───┐                    │
│    │ Pool +  │              │ Pool +  │                    │
│    │  LRN    │              │  LRN    │                    │
│    └────┬────┘              └────┬────┘                    │
│         │                          │                        │
│    ┌────▼────┐              ┌─────▼───┐                    │
│    │ Conv2   │              │ Conv2   │                    │
│    │  5x5    │              │  5x5    │                    │
│    └────┬────┘              └────┬────┘                    │
│         │                          │                        │
│    ┌────▼────┐              ┌─────▼───┐                    │
│    │ Conv3-4 │              │ Conv3-4 │                    │
│    │ Dropout │              │ Dropout │                    │
│    └────┬────┘              └────┬────┘                    │
│         │                          │                        │
│    ┌────▼────┐              ┌─────▼───┐                    │
│    │   FC    │              │   FC    │                    │
│    │ 128-dim │              │ 128-dim │                    │
│    └────┬────┘              └────┬────┘                    │
│         │                          │                        │
│         └──────────┬───────────────┘                        │
│                    ▼                                        │
│            ┌──────────────┐                                 │
│            │ Euclidean    │                                 │
│            │  Distance    │                                 │
│            └──────┬───────┘                                 │
│                   ▼                                         │
│            ┌──────────────┐                                 │
│            │ Contrastive  │                                 │
│            │     Loss     │                                 │
│            └──────────────┘                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘

🎯 Key Components

🔹 Convolutional Layers: Extract local features
🔹 Local Response Normalization: Enhance contrast
🔹 MaxPooling: Spatial dimensionality reduction
🔹 Dropout (0.3-0.5): Prevent overfitting
🔹 Fully Connected: 128-dimensional embeddings
🔹 Contrastive Loss: Metric learning optimization
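
The shared-weight idea behind these components can be sketched without any framework. This is a toy illustration only, not the package's SiameseConvNet: `SHARED_WEIGHTS`, `embed`, and `siamese_distance` are hypothetical names, and a single linear map stands in for the CNN branch.

```python
import math

# Hypothetical toy weights standing in for the shared CNN branch.
SHARED_WEIGHTS = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5],
]


def embed(signature):
    """Map an input vector to an embedding using the shared weights."""
    return [sum(w * x for w, x in zip(row, signature)) for row in SHARED_WEIGHTS]


def siamese_distance(signature_a, signature_b):
    """Embed both inputs with the same weights, then take the Euclidean distance."""
    return math.dist(embed(signature_a), embed(signature_b))


same = siamese_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = siamese_distance([1.0, 2.0, 3.0], [3.0, 0.0, 1.0])
print(same, different)
```

Because both inputs pass through the same `embed` function, identical inputs always land at distance zero, which is the property the contrastive loss relies on.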

⚡ Quick Start

📦 Installation

# Clone the repository
git clone https://github.com/umitkacar/Offline_Signature_Verification.git
cd Offline_Signature_Verification

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install package in editable mode (recommended for development)
pip install -e .

# Or install from requirements.txt
pip install -r requirements.txt

# Install development dependencies (optional)
pip install -e ".[dev]"

🚀 Package Installation (v2.0+)

The package is now properly structured and installable:

# Install as editable package
pip install -e .

# Now import anywhere
python
>>> from signature_verification import SiameseConvNet, TrainDataset
>>> model = SiameseConvNet()
>>> print(model)

🎯 Training

# Step 1: Prepare data (first time only)
python scripts/prepare_data.py

# Step 2: Train the model
python scripts/train.py --epochs 5 --batch-size 8 --lr 0.001

# Advanced training options
python scripts/train.py \
    --epochs 10 \
    --batch-size 16 \
    --lr 0.0005 \
    --device cuda \
    --model-dir ./MyModels
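
For readers wiring up a similar command line, the flags above could be parsed roughly as follows. This is a sketch and not the contents of scripts/train.py: the flag names come from the commands above, while the defaults, the `choices` for `--device`, and the `build_parser` helper are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the training flags shown above (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Train a Siamese signature model")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch-size", type=int, default=8)
    parser.add_argument("--lr", type=float, default=0.001)
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    parser.add_argument("--model-dir", default="./Models")
    return parser


# Parse an explicit argument list so the example runs without a real CLI call.
args = build_parser().parse_args(["--epochs", "10", "--batch-size", "16", "--lr", "0.0005"])
print(args.epochs, args.batch_size, args.lr, args.device, args.model_dir)
```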

🧪 Testing & Evaluation

# Evaluate trained model
python scripts/evaluate.py --model ./Models/checkpoint_epoch_4

# Custom evaluation
python scripts/evaluate.py \
    --model ./Models/checkpoint_epoch_9 \
    --data ./Data/test_index.pkl \
    --output ./my_results.png \
    --batch-size 16

📊 Model Usage (Python API)

from signature_verification import (
    SiameseConvNet,
    SignatureTestDataset,
    distance_metric
)
import torch

# Load model
model = SiameseConvNet()
model.load_state_dict(torch.load('./Models/checkpoint_epoch_4'))
model.eval()

# Load test data
dataset = SignatureTestDataset(data_path='./Data/test_index.pkl')

# Compare signatures
with torch.no_grad():
    img1, img2, label = dataset[0]
    img1 = img1.unsqueeze(0)  # Add batch dimension
    img2 = img2.unsqueeze(0)

    features1, features2 = model(img1, img2)
    distance = distance_metric(features1, features2)
    threshold = 1.5  # example value; tune it on a validation set
    is_genuine = distance.item() < threshold

    print(f"Distance: {distance.item():.4f}")
    print(f"Prediction: {'Genuine' if is_genuine else 'Forged'}")

⚡ Quick Verification Test

# Run quick functionality test (no data needed)
python scripts/quick_test.py

📊 Performance Metrics

🎯 Banking Sector Optimization

Metric | Value | Priority
🎯 Recall (Sensitivity) | High | 🔴 Critical for fraud detection
📊 Precision | Balanced | 🟡 Prevent customer inconvenience
📈 F1-Score | Optimized | 🟢 Overall performance
Inference Speed | < 50ms | 🔵 Real-time capability
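
One way to check a latency target like the one above is to time repeated calls and take the median. The helper below is a minimal sketch: `median_latency_ms` is a hypothetical name, and the lambda is a stand-in workload that would be replaced by a single forward pass of the model.

```python
import statistics
import time


def median_latency_ms(fn, warmup=3, runs=20):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload; replace with a call that runs one model forward pass.
latency = median_latency_ms(lambda: sum(i * i for i in range(10_000)))
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that an occasional slow run does not dominate the result.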

📈 Precision-Recall Trade-off

Banking Priority: Detecting forged signatures is critical. High recall on forgeries catches more fraud, while precision must stay high enough to avoid excessive customer re-verification.

┌─────────────────────────────────────────┐
│  Precision vs Recall Trade-off         │
├─────────────────────────────────────────┤
│                                         │
│  High Recall   → Catch more frauds     │
│  (Priority)    → Some false positives  │
│                                         │
│  Balanced      → Optimal UX            │
│  Precision     → Minimize re-signing   │
│                                         │
└─────────────────────────────────────────┘
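
The trade-off can be made concrete by computing precision, recall, and F1 for the forged class at two distance thresholds. This is a sketch on made-up distances: `forgery_metrics` is a hypothetical helper, and it assumes a pair is flagged as forged when its distance is at or above the threshold.

```python
def forgery_metrics(distances, is_forged, threshold):
    """Precision, recall and F1 for the forged class at a distance threshold."""
    predicted = [d >= threshold for d in distances]
    tp = sum(p and y for p, y in zip(predicted, is_forged))
    fp = sum(p and not y for p, y in zip(predicted, is_forged))
    fn = sum((not p) and y for p, y in zip(predicted, is_forged))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up embedding distances and ground-truth labels for six pairs.
distances = [0.4, 0.9, 1.2, 1.6, 1.8, 2.3]
is_forged = [False, False, True, False, True, True]

high_threshold = forgery_metrics(distances, is_forged, threshold=1.7)
low_threshold = forgery_metrics(distances, is_forged, threshold=1.0)
print(high_threshold, low_threshold)
```

On this toy data, lowering the threshold raises recall to 1.0 at the cost of one false alarm, which is the direction the banking use case favors.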

🔬 Technical Deep Dive

🧮 Loss Function

The Contrastive Loss function optimizes the network to:

  • Minimize distance for genuine signature pairs
  • Maximize distance for forged signature pairs

L = (1 - Y) × D² + Y × max(margin - D, 0)²

where:
  Y = 1 (different persons) or 0 (same person)
  D = Euclidean distance between embeddings
  margin = 2.0
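
A per-pair version of this loss fits in a few lines. The sketch below assumes the common squared-hinge form of contrastive loss (the margin term is squared) and is not taken from the package's own loss class; `contrastive_loss` is a hypothetical name.

```python
def contrastive_loss(distance, label, margin=2.0):
    """Contrastive loss for a single pair.

    label = 0 for a same-person pair, 1 for a different-person pair.
    Assumes the squared-hinge form for the dissimilar term.
    """
    similar_term = (1 - label) * distance ** 2
    dissimilar_term = label * max(margin - distance, 0.0) ** 2
    return similar_term + dissimilar_term


print(contrastive_loss(0.5, 0))  # same-person pair: penalised by D squared
print(contrastive_loss(0.5, 1))  # different-person pair inside the margin
print(contrastive_loss(3.0, 1))  # different-person pair beyond the margin
```

Different-person pairs that are already farther apart than the margin contribute nothing, so training effort concentrates on pairs that are still too close.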

🏗️ Network Specifications

Architecture:
  Input: 220x155 grayscale images
  Conv1: 48 filters, 11×11 kernel
  Conv2: 128 filters, 5×5 kernel
  Conv3: 256 filters, 3×3 kernel
  Conv4: 96 filters, 3×3 kernel
  FC1: 1024 neurons
  FC2: 128-dimensional embeddings

Regularization:
  - Dropout: 0.3 (after conv2 & conv4)
  - Dropout: 0.5 (after fc1)
  - Local Response Normalization

Optimization:
  - Optimizer: Adam
  - Learning Rate: 0.001
  - Batch Size: 8
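
As a rough size check, the listed filter counts and kernel sizes give the following parameter counts. This sketch assumes a single grayscale input channel and plain sequential stacking with no grouped convolutions; FC1 is left out because its input size depends on strides and padding, which are not stated here.

```python
# (name, in_channels, out_channels, kernel_size) from the specification above.
# in_channels assumes one grayscale input and plain sequential stacking.
CONV_LAYERS = [
    ("conv1", 1, 48, 11),
    ("conv2", 48, 128, 5),
    ("conv3", 128, 256, 3),
    ("conv4", 256, 96, 3),
]


def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a standard 2D convolution."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


param_counts = {name: conv_params(i, o, k) for name, i, o, k in CONV_LAYERS}
param_counts["fc2"] = linear_params(1024, 128)  # FC1 (1024) -> FC2 (128)

for name, count in param_counts.items():
    print(f"{name}: {count:,} parameters")
```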

📚 State-of-the-Art References (2024-2025)

🌟 Latest Research (2024-2025)

📅 Year | 🔬 Research | 🏆 Highlights | 🔗 Link
2025 | HTCSigNet | Hybrid Transformer-CNN, SOTA accuracy | 📄
2025 | PAST | Pairwise Attention Swin Transformer | 📄
2025 | Spatial Transformers | Automatic signature alignment | 📄
2025 | CNN-ViT Hybrid | 98.9% accuracy in biometrics | 📄
2024 | Vision Transformers | ViT market growth 33.2% CAGR | 📄

🔥 Trending Technologies

mindmap
  root((2024-2025
    Trends))
    Vision Transformers
      Swin Transformer
      ViT Architecture
      Hybrid CNN-ViT
    Few-Shot Learning
      Meta-Learning
      N-way K-shot
      Prototypical Networks
    Self-Attention
      Global Context
      Multi-Head Attention
      Cross-Attention
    Advanced Architectures
      Spatial Transformers
      CycleGAN Denoising
      YOLOv5 Detection

🎓 Foundational Papers

  1. SigNet (2017) - Original Siamese CNN architecture

  2. Vision Transformers (2020) - Transformer architecture for CV

    • Revolutionizing computer vision with attention mechanisms
    • 280M to 2.7B USD market growth (2024-2032)
  3. Contrastive Learning - Metric learning fundamentals


🛠️ Tech Stack

Core Technologies

Python PyTorch NumPy scikit-learn Pillow

Deep Learning Features

🔹 PyTorch 2.0+ - Dynamic computation graphs
🔹 CUDA Support - GPU acceleration
🔹 Mixed Precision - Faster training
🔹 DataLoader - Efficient batch processing
🔹 Model Checkpointing - Save/resume training


🎯 Use Cases

🏦 Industry | 💡 Application | 🎯 Benefit
Banking | Check verification | Prevent fraud
Legal | Document authentication | Ensure validity
Healthcare | Prescription verification | Patient safety
Government | ID verification | Security enhancement
Finance | Contract validation | Legal compliance

🗂️ Project Structure (v2.0+)

Offline_Signature_Verification/
│
├── 📁 src/signature_verification/   # Main package (installable)
│   ├── 📄 __init__.py              # Package exports
│   ├── 📄 model.py                 # Siamese CNN & Contrastive Loss
│   ├── 📄 dataset.py               # PyTorch Dataset classes
│   └── 📄 utils.py                 # Utility functions
│
├── 📁 scripts/                     # Production-ready scripts
│   ├── 📄 prepare_data.py          # Data preprocessing
│   ├── 📄 train.py                 # Training with CLI args
│   ├── 📄 evaluate.py              # Evaluation & ROC curves
│   ├── 📄 quick_test.py            # Functionality verification
│   └── 📄 README.md                # Scripts documentation
│
├── 📁 tests/                       # Comprehensive test suite
│   ├── 📄 test_model.py            # Model tests (25 tests)
│   ├── 📄 test_dataset.py          # Dataset tests (8 tests)
│   ├── 📄 test_utils.py            # Utils tests (11 tests)
│   └── 📄 conftest.py              # Pytest configuration
│
├── 📁 Data/                        # Training/test indices (gitignored)
├── 📁 Data_raw/                    # Raw signature images (gitignored)
├── 📁 Models/                      # Saved checkpoints (gitignored)
│
├── 📄 pyproject.toml               # Modern build configuration
├── 📄 requirements.txt             # Dependencies
├── 📄 .pre-commit-config.yaml      # Pre-commit hooks
├── 📄 CHANGELOG.md                 # Version history
├── 📄 CONTRIBUTING.md              # Developer guide
├── 📄 LESSONS-LEARNED.md           # Refactoring insights
├── 📄 README.md                    # This file
├── 📄 LICENSE                      # MIT License
└── 📄 .gitignore                  # Git ignore rules

📦 Package Structure Highlights

  • src/ layout: Modern Python packaging standard
  • Installable package: pip install -e . for development
  • Type hints: Full type coverage with mypy
  • Comprehensive tests: 44 tests with 100% pass rate
  • Production scripts: CLI-ready with argparse
  • Development tools: Black, Ruff, mypy, pytest configured

🌟 Popular Signature Verification Repositories (2024-2025)

🏆 Top GitHub Projects

  1. 🔥 EndToEnd Signature System (2024)

    • YOLOv5 Detection + CycleGAN Cleaning + Verification
    • GitHub
    • Tech: YOLOv5, CycleGAN, PyTorch/TensorFlow
  2. ⭐ sigver - Feature Extraction Package

    • Writer-dependent classifiers
    • GitHub
    • Tech: PyTorch, Pre-trained models
  3. 🎯 Signature Recognition

    • Digital image processing + Neural networks
    • GitHub
    • Tech: OpenCV, TensorFlow, 201+ stars
  4. 🤗 Hugging Face Signature Detection

    • Production-ready model serving
    • Hugging Face
    • Tech: Triton Server, ONNX, TensorRT

📖 Dataset

📊 Training Configuration

Dataset Statistics:
  Total Persons: 79
  Signatures per Person: 12
  Training Samples: 20,000 (10K positive + 10K negative)
  Test Split: 5%
  Image Size: 220 × 155 pixels
  Format: Grayscale PNG
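
A balanced set of positive and negative pairs with a small held-out split could be assembled along these lines. This is a sketch, not scripts/prepare_data.py: `build_pairs` and `train_test_split` are hypothetical helpers, the file names are made up, and labelling genuine-genuine pairs 0 and genuine-forged pairs 1 is an assumption.

```python
import itertools
import random


def build_pairs(genuine, forged, pairs_per_class, seed=0):
    """Return a balanced, shuffled list of (path_a, path_b, label) tuples.

    genuine / forged map a person id to that person's signature files.
    label 0 = genuine-genuine pair, label 1 = genuine-forged pair (assumed).
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    for person, signatures in genuine.items():
        positives += [(a, b, 0) for a, b in itertools.combinations(signatures, 2)]
        negatives += [(a, f, 1) for a in signatures for f in forged.get(person, [])]
    n = min(pairs_per_class, len(positives), len(negatives))
    pairs = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(pairs)
    return pairs


def train_test_split(pairs, test_fraction=0.05):
    """Hold out the last test_fraction of the (already shuffled) pairs."""
    n_test = max(1, round(len(pairs) * test_fraction))
    return pairs[:-n_test], pairs[-n_test:]


# Tiny made-up dataset: two persons with three genuine signatures each.
genuine = {1: ["g1a.png", "g1b.png", "g1c.png"], 2: ["g2a.png", "g2b.png", "g2c.png"]}
forged = {1: ["f1a.png", "f1b.png"], 2: ["f2a.png"]}

pairs = build_pairs(genuine, forged, pairs_per_class=5)
train, test = train_test_split(pairs, test_fraction=0.2)
print(len(pairs), len(train), len(test))
```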

📥 Data Structure

Data_raw/genuines/
└── NFI-{person:03d}{sign:02d}{person:03d}.png

Example: NFI-00101001.png
         Person 1, Signature 1, Person 1
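
File names that follow the NFI-{person:03d}{sign:02d}{person:03d}.png template can be split back into their fields with a regular expression. `parse_nfi_name` is a hypothetical helper for illustration, not part of the package.

```python
import re

# Mirrors the NFI-{person:03d}{sign:02d}{person:03d}.png template above.
NFI_PATTERN = re.compile(r"^NFI-(\d{3})(\d{2})(\d{3})\.png$")


def parse_nfi_name(filename):
    """Return (first_person, signature_no, second_person), or None if no match."""
    match = NFI_PATTERN.match(filename)
    if match is None:
        return None
    first_person, signature_no, second_person = (int(g) for g in match.groups())
    return first_person, signature_no, second_person


print(parse_nfi_name("NFI-00101001.png"))
print(parse_nfi_name("not-a-signature.png"))
```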

🔍 Evaluation Metrics

📊 ROC Curve Analysis

ROC curves are used because they:

  • ✅ Summarize the TPR vs FPR trade-off across decision thresholds
  • ✅ Work well for roughly balanced datasets
  • ✅ Visualize how the distance threshold affects both error types
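
An ROC curve over embedding distances can be traced by sweeping the threshold and recording one (FPR, TPR) point per value. The sketch below uses made-up distances, assumes both classes are present, and flags a pair as forged when its distance is at or above the threshold; `roc_points` and `auc` are hypothetical helpers, not the evaluation script's code.

```python
def roc_points(distances, is_forged):
    """Return (fpr, tpr) points, sweeping the threshold over all distances.

    A pair is flagged as forged when its distance is >= the threshold.
    Assumes at least one forged and one genuine pair.
    """
    positives = sum(is_forged)
    negatives = len(is_forged) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(distances), reverse=True):
        tp = sum(d >= threshold and y for d, y in zip(distances, is_forged))
        fp = sum(d >= threshold and not y for d, y in zip(distances, is_forged))
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


distances = [0.4, 0.9, 1.2, 1.6, 1.8, 2.3]
is_forged = [False, False, True, False, True, True]

curve = roc_points(distances, is_forged)
print(curve[-1], round(auc(curve), 3))
```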

📈 Precision-Recall Curves

PR curves are preferred when:

  • ✅ The dataset is imbalanced (more genuine than forged signatures)
  • ✅ Performance on the positive (forged) class matters most
  • ✅ The application is recall-critical, as in banking
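
When recall has priority, one simple policy is to pick the largest distance threshold that still reaches a target recall on forged pairs. This is a sketch on made-up data: `threshold_for_recall` is a hypothetical helper, and it assumes a pair is flagged as forged when its distance is at or above the threshold.

```python
def threshold_for_recall(distances, is_forged, target_recall=0.95):
    """Largest threshold whose recall on forged pairs is at least target_recall.

    A pair is flagged as forged when its distance is >= the threshold, so a
    larger threshold means fewer false alarms on genuine pairs.
    """
    forged_distances = [d for d, y in zip(distances, is_forged) if y]
    if not forged_distances:
        raise ValueError("at least one forged pair is required")
    for threshold in sorted(set(distances), reverse=True):
        caught = sum(d >= threshold for d in forged_distances)
        if caught / len(forged_distances) >= target_recall:
            return threshold
    return min(distances)  # fallback; only reached if target_recall > 1


distances = [0.4, 0.9, 1.2, 1.6, 1.8, 2.3]
is_forged = [False, False, True, False, True, True]

print(threshold_for_recall(distances, is_forged, target_recall=0.95))
print(threshold_for_recall(distances, is_forged, target_recall=0.60))
```

In practice the threshold would be chosen on a validation set and then checked on held-out data, since a value tuned and reported on the same pairs is optimistic.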

🤝 Contributing

We welcome contributions! Here's how you can help:

🌟 Ways to Contribute

  • 🐛 Report bugs and issues
  • 💡 Suggest new features or improvements
  • 📝 Improve documentation
  • 🔬 Share research papers and implementations
  • 🚀 Submit pull requests

📋 Contribution Guidelines

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

🗺️ Roadmap

🎯 2025 Milestones

  • 🤖 Integrate Vision Transformer (ViT) architecture
  • 🎯 Implement Swin Transformer for hierarchical features
  • ⚡ Add Hybrid CNN-ViT model
  • 🔄 Few-shot learning capabilities
  • 🎨 Spatial Transformer Networks for alignment
  • 📱 Mobile deployment (ONNX/TensorRT)
  • 🌐 Web demo with FastAPI
  • 📊 Comprehensive benchmarking suite
  • 🔒 Security hardening and adversarial robustness
  • 📚 Extended documentation and tutorials

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License - Free for commercial and private use

🙏 Acknowledgments

Special thanks to:

  • 🎓 Research Community - For advancing signature verification
  • 🔬 Paper Authors - SigNet, Vision Transformers, HTCSigNet teams
  • 💻 Open Source Contributors - PyTorch, scikit-learn communities
  • 📊 Dataset Providers - Kaggle signature datasets
  • 🌟 GitHub Community - For feedback and contributions

📞 Contact & Support

💬 Get in Touch

GitHub Issues GitHub Discussions




🚀 Made with ❤️ for the AI & Security Community

If you find this project useful, please consider giving it a ⭐!


© 2024-2025 Offline Signature Verification Project

Empowering secure authentication with deep learning


🧪 Testing

Comprehensive Test Suite

We maintain a robust test suite with 44 tests covering all components:

# Run all tests
pytest

# Run with coverage report
pytest --cov=src --cov-report=html

# Run in parallel (faster)
pytest -n auto

# Run specific test file
pytest tests/test_model.py -v

Test Coverage

Component | Tests | Coverage
Model | 25 tests | Architecture, forward/backward, loss functions
Dataset | 8 tests | Loading, preprocessing, data handling
Utils | 11 tests | Image processing, tensor conversion
Total | 44 tests | 100% passing

Quick Functionality Test

No data required - perfect for CI/CD:

python scripts/quick_test.py

Output:

============================================================
QUICK FUNCTIONALITY TEST
============================================================

✅ All imports successful!
✅ Model initialized
✅ Forward pass successful!
✅ Loss calculation successful!
✅ Distance metric successful!
✅ Gradient computation successful!

ALL TESTS PASSED! ✅
============================================================

🛠️ Development

Setup Development Environment

# Clone and install
git clone https://github.com/umitkacar/Offline_Signature_Verification.git
cd Offline_Signature_Verification

# Install with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Code Quality Tools

# Format code
black src tests scripts

# Lint code
ruff check src tests scripts --fix

# Type checking
mypy src

# Run all quality checks
pre-commit run --all-files

Pre-commit Hooks

Automatically run on every commit:

  • Trailing whitespace removal
  • End-of-file fixer
  • YAML/JSON/TOML validation
  • Black formatting
  • Ruff linting
  • mypy type checking
  • pytest tests

📚 Documentation

API Documentation

All code is fully documented with Google-style docstrings:

def forward(self, x: Tensor, y: Tensor) -> Tuple[Tensor, Tensor]:
    """Forward pass through both branches of the Siamese network.

    Args:
        x: First signature tensor of shape (batch_size, 1, 220, 155)
        y: Second signature tensor of shape (batch_size, 1, 220, 155)

    Returns:
        Tuple of feature embeddings (f_x, f_y), each of shape (batch_size, 128)
    """

🔄 Migration from v1.x to v2.0

Quick Migration Guide

  1. Install the package:

    pip install -e .
  2. Update imports:

    # Old
    from Model import SiameseConvNet
    from Dataset import TrainDataset
    
    # New
    from signature_verification import SiameseConvNet, TrainDataset
  3. Update script calls:

    # Old
    python train_model.py
    
    # New
    python scripts/train.py
  4. Run tests:

    pytest
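
During a gradual migration, code that must run against either layout could resolve the class at import time. This is an optional sketch, not something the project ships: `load_siamese_class` is a hypothetical helper, and it simply tries the v2.0 package name first and the v1.x module name second.

```python
import importlib


def load_siamese_class():
    """Return SiameseConvNet from the v2.0 package, falling back to the v1.x module."""
    for module_name in ("signature_verification", "Model"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this layout is not installed; try the next one
        model_class = getattr(module, "SiameseConvNet", None)
        if model_class is not None:
            return model_class
    return None


SiameseConvNet = load_siamese_class()
print("SiameseConvNet available:", SiameseConvNet is not None)
```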

See CHANGELOG.md for complete migration instructions.