[FederatedLearning][Post-v1] Production-Grade Vertical Federated Learning (VFL) with Split Learning #542

@ooples

Summary

Implement a complete vertical federated learning (VFL) stack. VFL is fundamentally different from the existing horizontal FL framework: instead of each client having the same features for different samples, each party holds different features for the same entities. This requires split neural networks, secure gradient exchange, entity alignment (PSI from #538), label privacy protection, and missing-feature handling.

Depends on: #538 (PSI for entity alignment)
New facade method: ConfigureVerticalFederatedLearning() (separate from horizontal ConfigureFederatedLearning)

Motivation

The entire existing FL framework is horizontal-only -- all 16 aggregation strategies assume that clients share the same model architecture and that training proceeds by averaging parameters. VFL requires a fundamentally different coordination model:

  • No model averaging: Instead, parties coordinate forward passes through a split neural network
  • Feature partitioning: Party A has features 1-10, Party B has features 11-20, Party C holds labels
  • Secure gradient exchange: Gradients flow between parties and must be protected in transit
  • Entity alignment: Before training, parties must find shared entities via PSI
  • Missing features: In production, not all parties have data for all entities
  • Label privacy: The label holder's labels must be protected from feature holders

Example: Bank (income, credit score) + Hospital (diagnoses, prescriptions) + Retailer (purchase history) jointly predict loan default. Each party keeps its raw data private; only model activations and protected gradients cross party boundaries.
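
To make the "no model averaging" point concrete, here is a minimal sketch of a split forward pass. Everything in it is an assumption for illustration: the class and method names are hypothetical, the bottom and top models are single linear layers, and no encryption is applied. It is not the proposed SplitNeuralNetwork API.

```csharp
using System;
using System.Linq;

// Hypothetical sketch: names and shapes are illustrative, not the proposed API.
public static class SplitForwardSketch
{
    // Bottom model: one linear layer mapping a party's local features to an embedding.
    // weights has shape [embeddingDim][featureCount].
    public static double[] BottomForward(double[][] weights, double[] localFeatures)
    {
        return weights
            .Select(row => row.Zip(localFeatures, (w, x) => w * x).Sum())
            .ToArray();
    }

    // Concatenation: the top model sees every party's embedding side by side.
    public static double[] Concatenate(params double[][] embeddings)
    {
        return embeddings.SelectMany(e => e).ToArray();
    }

    // Sum: embeddings must share one dimension; they are added element-wise.
    public static double[] SumEmbeddings(params double[][] embeddings)
    {
        int dim = embeddings[0].Length;
        if (embeddings.Any(e => e.Length != dim))
            throw new ArgumentException("All embeddings must have the same dimension.");

        var result = new double[dim];
        foreach (var e in embeddings)
            for (int i = 0; i < dim; i++)
                result[i] += e[i];
        return result;
    }

    // Top model: a linear layer followed by a sigmoid, producing one probability.
    public static double TopForward(double[] topWeights, double[] combined)
    {
        double z = topWeights.Zip(combined, (w, h) => w * h).Sum();
        return 1.0 / (1.0 + Math.Exp(-z));
    }
}
```

In this sketch each party would call BottomForward locally and send only the resulting embedding; the coordinator combines the embeddings and runs TopForward. The backward pass mirrors this: the coordinator would return the gradient with respect to each party's embedding, never the labels or other parties' features.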

Latest Research (2024-2026)

Implementation Plan

Interfaces (4 files)

| File | Purpose |
| --- | --- |
| src/FederatedLearning/Vertical/IVerticalFederatedTrainer.cs | VFL training orchestration: coordinate split forward/backward passes |
| src/FederatedLearning/Vertical/IVerticalParty.cs | Represents one party: holds local features, computes local embeddings |
| src/FederatedLearning/Vertical/ISplitModel.cs | Split neural network: local bottom models + shared top model |
| src/FederatedLearning/Vertical/ILabelProtector.cs | Protect label holder gradients from feature parties |

Classes (10 files)

| File | Purpose |
| --- | --- |
| src/FederatedLearning/Vertical/VerticalFederatedTrainer.cs | Main VFL coordinator: entity alignment -> split training -> secure gradient exchange |
| src/FederatedLearning/Vertical/VerticalPartyClient.cs | Feature-holding party: computes local embeddings from local features |
| src/FederatedLearning/Vertical/VerticalPartyLabelHolder.cs | Label-holding party: computes loss, generates top-model gradients, applies label DP |
| src/FederatedLearning/Vertical/SplitNeuralNetwork.cs | Split NN: each party runs its bottom model, the outputs are combined (concatenation, sum, attention, or gating), and the top model runs on the coordinator |
| src/FederatedLearning/Vertical/VerticalDataPartitioner.cs | Partition features across parties (by column groups, by domain, random) |
| src/FederatedLearning/Vertical/SecureGradientExchange.cs | Encrypted gradient passing between parties using homomorphic encryption (HE) or secret sharing |
| src/FederatedLearning/Vertical/MissingFeatureHandler.cs | Handle missing feature blocks: zero imputation, mean imputation, learned imputation, skip |
| src/FederatedLearning/Vertical/LabelDifferentialPrivacy.cs | DP noise on label holder gradients to prevent feature parties from inferring labels |
| src/FederatedLearning/Vertical/VerticalFederatedUnlearner.cs | GDPR-compliant removal of entity data from VFL models (certified unlearning) |
| src/FederatedLearning/Vertical/VerticalFederatedBenchmark.cs | VertiBench-style benchmarking for VFL implementations |
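
As a rough illustration of what LabelDifferentialPrivacy could do before gradients leave the label holder, the sketch below clips a gradient to a maximum L2 norm and adds Gaussian noise. The class name, method names, and parameters are hypothetical. Calibrating noiseStdDev to a target privacy budget is out of scope here, and System.Random is used only to keep the example short; it is not a cryptographically secure source.

```csharp
using System;

// Hypothetical sketch of clip-then-noise gradient protection; not the proposed API.
public static class LabelGradientNoiseSketch
{
    // Scales the gradient so its L2 norm is at most clipNorm.
    public static double[] Clip(double[] gradient, double clipNorm)
    {
        double norm = 0.0;
        foreach (double g in gradient) norm += g * g;
        norm = Math.Sqrt(norm);

        double scale = norm > clipNorm ? clipNorm / norm : 1.0;
        var clipped = new double[gradient.Length];
        for (int i = 0; i < gradient.Length; i++)
            clipped[i] = gradient[i] * scale;
        return clipped;
    }

    // Draws one standard normal sample with the Box-Muller transform.
    private static double NextGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }

    // Clips the gradient, then adds independent Gaussian noise to each coordinate.
    // Assumption: the caller has chosen noiseStdDev from its own privacy analysis.
    public static double[] Protect(double[] gradient, double clipNorm, double noiseStdDev, Random rng)
    {
        double[] protectedGradient = Clip(gradient, clipNorm);
        for (int i = 0; i < protectedGradient.Length; i++)
            protectedGradient[i] += noiseStdDev * NextGaussian(rng);
        return protectedGradient;
    }
}
```

Clipping bounds how much any single gradient can reveal, which is what gives the added noise a meaningful scale to hide.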

Options and Enums (8 files)

| File | Purpose |
| --- | --- |
| src/Models/Options/VerticalFederatedLearningOptions.cs | Top-level VFL config: party definitions, PSI options, split model config |
| src/Models/Options/SplitModelOptions.cs | Cut layer selection (manual, auto-optimal, balanced-compute), embedding dimension |
| src/Models/Options/MissingFeatureOptions.cs | Imputation strategy, alignment threshold, minimum overlap ratio |
| src/Models/Options/VflUnlearningOptions.cs | Unlearning method, certification level, verification |
| src/Models/Options/VflAggregationMode.cs | Enum: Concatenation, Sum, Attention, Gating |
| src/Models/Options/MissingFeatureStrategy.cs | Enum: Zero, Mean, Learned, Skip |
| src/Models/Options/SplitPointStrategy.cs | Enum: Manual, AutoOptimal, BalancedCompute |
| src/Models/Options/VflUnlearningMethod.cs | Enum: Retraining, GradientAscent, PrimalDual, Certified |
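
The MissingFeatureStrategy values listed above could drive a handler along the following lines. This is a sketch under stated assumptions: the enum members come from the table, but the MissingFeatureSketch class, the Impute signature, and the choice to impute at the embedding level (rather than on raw features) are all hypothetical.

```csharp
using System;
using System.Linq;

// Enum members as listed in this issue.
public enum MissingFeatureStrategy { Zero, Mean, Learned, Skip }

// Hypothetical handler sketch; not the proposed MissingFeatureHandler API.
public static class MissingFeatureSketch
{
    // Returns the embedding to use for a party that has no data for an entity,
    // or null when the entity should be skipped.
    // observedEmbeddings: embeddings this party produced for entities it does hold.
    public static double[] Impute(
        MissingFeatureStrategy strategy,
        int embeddingDim,
        double[][] observedEmbeddings)
    {
        switch (strategy)
        {
            case MissingFeatureStrategy.Zero:
                return new double[embeddingDim]; // all zeros

            case MissingFeatureStrategy.Mean:
                if (observedEmbeddings.Length == 0)
                    return new double[embeddingDim]; // nothing observed: fall back to zeros
                return Enumerable.Range(0, embeddingDim)
                    .Select(i => observedEmbeddings.Average(e => e[i]))
                    .ToArray();

            case MissingFeatureStrategy.Skip:
                return null; // caller drops this entity

            case MissingFeatureStrategy.Learned:
                // A learned imputer needs its own trained model; omitted from this sketch.
                throw new NotSupportedException("Learned imputation is not sketched here.");

            default:
                throw new ArgumentOutOfRangeException(nameof(strategy));
        }
    }
}
```

Returning null for Skip keeps the decision to drop an entity with the caller, which is also where a minimum-overlap check would naturally live.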

Facade Integration

New facade method on AiModelBuilder (VFL is a different paradigm from horizontal FL):

// New method on AiModelBuilder (new partial class file: AiModelBuilder.VerticalFL.cs)
public IAiModelBuilder<T, TInput, TOutput> ConfigureVerticalFederatedLearning(
    VerticalFederatedLearningOptions options,
    IVerticalParty<T>[] parties = null,
    ILabelProtector<T> labelProtector = null)

Usage:

builder.ConfigureVerticalFederatedLearning(new VerticalFederatedLearningOptions
{
    EntityAlignment = new PsiOptions { Protocol = PsiProtocol.ObliviousTransfer },
    SplitModel = new SplitModelOptions
    {
        AggregationMode = VflAggregationMode.Concatenation,
        SplitPoint = SplitPointStrategy.AutoOptimal,
        EmbeddingDimension = 64
    },
    MissingFeatures = new MissingFeatureOptions
    {
        Strategy = MissingFeatureStrategy.Mean,
        MinimumOverlapRatio = 0.5
    },
    Unlearning = new VflUnlearningOptions
    {
        Method = VflUnlearningMethod.Certified,
        Enabled = false
    }
});
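
The EntityAlignment and MinimumOverlapRatio settings above imply an alignment step that runs before training. The sketch below shows the shape of that step with a plain-text set intersection standing in for PSI; a real PSI protocol (see #538) would compute the same result without revealing non-shared IDs. The class name, method names, and the definition of the overlap ratio (shared entities divided by the smallest party's ID count) are assumptions, not part of this proposal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: plain-text stand-in for PSI-based entity alignment.
public static class EntityAlignmentSketch
{
    // Returns the IDs held by every party, sorted so that all parties
    // iterate over shared entities in the same order during training.
    public static List<string> AlignedIds(params IEnumerable<string>[] partyIds)
    {
        var shared = new HashSet<string>(partyIds[0], StringComparer.Ordinal);
        foreach (var ids in partyIds.Skip(1))
            shared.IntersectWith(ids);
        return shared.OrderBy(id => id, StringComparer.Ordinal).ToList();
    }

    // Assumed definition: shared entity count divided by the smallest party's ID count.
    public static double OverlapRatio(int sharedCount, params int[] partySizes)
    {
        int smallest = partySizes.Min();
        return smallest == 0 ? 0.0 : (double)sharedCount / smallest;
    }
}
```

Under this assumed definition, a trainer could compare OverlapRatio against MinimumOverlapRatio and refuse to start training when too few entities are shared.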

Also needs:

  • New backing fields in AiModelBuilder (in a new partial class file)
  • New method signatures in IAiModelBuilder interface
  • New VFL training path in Build() method (separate from horizontal FL path)

Acceptance Criteria

  • IVerticalFederatedTrainer coordinating split forward/backward passes
  • IVerticalParty representing feature-holding and label-holding parties
  • ISplitModel split neural network with local bottom + shared top
  • VerticalFederatedTrainer full orchestration (PSI -> train -> evaluate)
  • SecureGradientExchange with homomorphic encryption (HE) or secret sharing (SS) protection
  • MissingFeatureHandler with zero, mean, learned, and skip strategies
  • LabelDifferentialPrivacy protecting label holder
  • VerticalFederatedUnlearner for GDPR compliance
  • VerticalDataPartitioner for feature distribution
  • ConfigureVerticalFederatedLearning() method on AiModelBuilder
  • IAiModelBuilder interface updated with new method signature
  • PSI integration for entity alignment, using #538 ([FederatedLearning][Post-v1] Private Set Intersection (PSI) for Vertical FL Entity Alignment)
  • Builds on both net10.0 and net471
  • Unit tests with synthetic vertically-partitioned data
  • XML docs with beginners sections

Estimated: ~22 files

Labels: enhancement (New feature or request), federated-learning (Federated learning framework, privacy, aggregation, and deployment)