@nindanaoto

This PR implements Double Decomposition (DD) as described in https://eprint.iacr.org/2023/771, enabling more efficient TFHE operations through bivariate polynomial representation.

Summary

  • Implement Double Decomposition infrastructure with auxiliary decomposition parameters (l̅, B̅g)
  • Add 128-bit Torus (__uint128_t) support for lvl3param with FFT backend
  • Refactor TRGSW encryption to properly follow the DD algorithm blueprint
  • Add comprehensive test coverage for both 64-bit and 128-bit DD paths

Key Changes

New Parameters:

  • l̅ / B̅gbit: Auxiliary decomposition depth and base (in bits) for DD
  • l̅ₐ / B̅gₐbit: Auxiliary decomposition depth and base (in bits) for the nonce path
  • h̅ / h̅ₐ: Offset vectors for auxiliary decomposition

New Functions:

  • DoubleDecomposition / NonceDoubleDecomposition: Bivariate decomposition
  • TRLWEBaseBbarDecompose / TRLWEBaseBbarDecomposeNonce: Decompose TRLWE to base B̅g
  • RecombineTRLWEFromDD / RecombineTRLWEFromDDNonce: Recombine the TRLWEs back to a single TRLWE
  • 128-bit FFT support via 64-bit backend in TwistIFFT/TwistFFT

Algorithm:

  1. Encrypt TRGSW as ordinary TRGSW (k×lₐ + l rows)
  2. Apply Double Decomposition to each TRLWE row, expanding to k×lₐ×l̅ₐ + l×l̅ rows
  3. During ExternalProduct, use bivariate decomposition and recombination
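The row counts in steps 1 and 2 can be checked at compile time. The following is a minimal sketch, not code from this PR: `ExampleParam`, `ordinary_trgsw_rows`, and `dd_trgsw_rows` are hypothetical names, the values are illustrative, and the ASCII fields `lbar`, `la`, `labar` stand in for l̅, lₐ, l̅ₐ.

```cpp
#include <cstddef>

// Hypothetical parameter struct with illustrative values (not a TFHEpp param set).
// lbar = l̅, la = lₐ, labar = l̅ₐ.
struct ExampleParam {
    static constexpr std::size_t k = 1;
    static constexpr std::size_t l = 2;
    static constexpr std::size_t la = 2;
    static constexpr std::size_t lbar = 4;
    static constexpr std::size_t labar = 4;
};

// Step 1: rows of an ordinary TRGSW, k*lₐ + l.
template <class P>
constexpr std::size_t ordinary_trgsw_rows()
{
    return P::k * P::la + P::l;
}

// Step 2: rows after applying DD to each TRLWE row, k*lₐ*l̅ₐ + l*l̅.
template <class P>
constexpr std::size_t dd_trgsw_rows()
{
    return P::k * P::la * P::labar + P::l * P::lbar;
}

static_assert(ordinary_trgsw_rows<ExampleParam>() == 4);
static_assert(dd_trgsw_rows<ExampleParam>() == 16);
```

With `lbar = labar = 1` the two counts coincide, which is the trivial case where DD is disabled.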

Files Changed

| File | Description |
| --- | --- |
| include/trgsw.hpp | Core DD implementation (+750 lines) |
| include/mulfft.hpp | 128-bit FFT support |
| include/params/128bit.hpp | New 128-bit parameter set (lvl3param) |
| include/params/*.hpp | Add DD parameters to existing param sets |
| test/externalproductdoubledecomposition.cpp | DD unit tests |
| test/gatebootstrappingtlwe2tlwedoubledecomposition.cpp | 128-bit gate bootstrapping test |

Test Results

externalproductdoubledecomposition ✓ (64-bit, standard + DD paths)
gatebootstrappingtlwe2tlwedoubledecomposition ✓ (128-bit, ~95ms/bootstrap)
externalproduct ✓ (regression test)
gatebootstrapping ✓ (regression test)

Backward Compatibility

All existing parameter sets default to l̅=1 and l̅ₐ=1, which disables DD and selects the standard decomposition path. Existing code continues to work unchanged.

nindanaoto and others added 9 commits December 31, 2025 08:20
Implement foundational support for the Double Decomposition technique
from "Revisiting Key Decomposition Techniques for FHE" (ePrint 2023/771).

Changes:
- Add auxiliary decomposition parameters (l̅, l̅ₐ, B̅gbit, B̅gₐbit) to all
  parameter structs with trivial default values (l̅=1, B̅g=2^digits)
- Update TRGSW type definitions to use k*lₐ*l̅ₐ + l*l̅ row structure
- Add h̅gen() and nonceh̅gen() for auxiliary h value generation
- Modify trgswhadd, halftrgswhadd, trgswhoneadd for double decomposition
- Update ApplyFFT2trgsw, ApplyNTT2trgsw, ApplyRAINTT2trgsw loop bounds

With trivial values, behavior is unchanged (h̅[0]=1, sizes identical).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The h̅gen and nonceh̅gen functions had an incorrect shift formula that
used (i+1)*B̅gbit instead of i*B̅gbit. This caused ExternalProductDD to
fail with ~50% error rate because the gadget values h[i]*h̅[j] were off
by a factor of 2^B̅gbit.

The correct formula is:
- h̅[0] = 1 (j=0 means no auxiliary shift)
- h̅[j] = 2^(width - j*B̅gbit) for j > 0

This matches the decomposition shift formula:
  width - (i+1)*Bgbit - j*B̅gbit

When l̅=1 (trivial auxiliary decomposition), h̅[0]=1 correctly reduces
double decomposition to standard decomposition.
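The corrected formula can be sketched as a small constexpr generator. This is an illustration of the formula stated above, not the actual h̅gen from include/trgsw.hpp: `hbar_gen` and its template parameters are hypothetical names, and the sketch only covers built-in unsigned types such as `std::uint64_t`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for h̅gen: returns h̅[0..lbar-1] for torus type T.
// lbar = l̅, Bbarbit = B̅gbit.
template <class T, std::size_t lbar, std::size_t Bbarbit>
constexpr std::array<T, lbar> hbar_gen()
{
    constexpr std::size_t width = std::numeric_limits<T>::digits;
    static_assert(lbar >= 1, "need at least one auxiliary level");
    static_assert(Bbarbit > 0, "auxiliary base must be non-trivial");
    static_assert((lbar - 1) * Bbarbit < width, "shift would leave the torus width");

    std::array<T, lbar> h{};
    h[0] = 1;  // j = 0: no auxiliary shift
    for (std::size_t j = 1; j < lbar; j++) {
        // j * Bbarbit, not (j + 1) * Bbarbit (the bug fixed by this commit)
        h[j] = T{1} << (width - j * Bbarbit);
    }
    return h;
}
```

For example, `hbar_gen<std::uint64_t, 3, 16>()` yields `{1, 2^48, 2^32}`, and with `lbar = 1` the only entry is `h̅[0] = 1`.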

Also adds externalproductdoubledecomposition test to verify correctness.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Changed lvl3param from __uint128_t to uint64_t (128-bit types not
  fully supported in TFHEpp)
- Adjusted parameters: l=2, l̅=2 (was l=4, l̅=4) to fit 64-bit constraint
- Fixed DDTestParam nbit from 10 to 11 to match lvl2param for FFT
  compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update lvl3param to use __uint128_t with l=l̅=4, Bgbit=B̅gbit=16
- Extend FFT to support nbit=12 (n=4096) via fftplvl3
- Add TwistFFT/TwistIFFT handling for 128-bit types using 64-bit FFT
- Add UniformTorusRandom<P>() helper for 128-bit random generation
- Add ModularGaussian support for __uint128_t
- Fix decomposition functions to scale values for 128-bit FFT compatibility
- Add lvl03param bootstrapping parameter (lvl0 → lvl3)
- Add GateBootstrappingTLWE2TLWEDD test for non-trivial DD (l̅=4)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…port

- Merge ExternalProduct and ExternalProductDD using if constexpr (P::l̅ > 1)
- Remove redundant DD-postfixed functions (CMUXFFTDD, BlindRotateDD, etc.)
- Fix 128-bit shift overflow in keyswitch.hpp by using proper type casts
- Update lvl3param DD parameters to satisfy constraint l*Bgbit + (l̅-1)*B̅gbit ≤ 128
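A minimal sketch of the compile-time dispatch and the parameter constraint, under stated assumptions: `StandardParam`, `DDParam`, `Path`, `dd_fits_in_torus`, and `external_product_path` are hypothetical names, the parameter values are illustrative, and generalizing the bound from 128 to a per-parameter `width` is an assumption of this sketch rather than something the commit states.

```cpp
#include <cstddef>

// Hypothetical parameter sets (lbar = l̅, Bbarbit = B̅gbit); values are illustrative.
struct StandardParam {
    static constexpr std::size_t width = 64;
    static constexpr std::size_t l = 3, Bgbit = 8;
    static constexpr std::size_t lbar = 1, Bbarbit = 64;
};
struct DDParam {
    static constexpr std::size_t width = 128;
    static constexpr std::size_t l = 2, Bgbit = 16;
    static constexpr std::size_t lbar = 4, Bbarbit = 32;
};

// Constraint from the commit message, with 128 replaced by P::width (assumption).
template <class P>
constexpr bool dd_fits_in_torus()
{
    return P::l * P::Bgbit + (P::lbar - 1) * P::Bbarbit <= P::width;
}

enum class Path { Standard, DoubleDecomposition };

// One entry point; the branch not taken is discarded at compile time.
template <class P>
constexpr Path external_product_path()
{
    static_assert(dd_fits_in_torus<P>(), "l*Bgbit + (lbar-1)*Bbarbit exceeds torus width");
    if constexpr (P::lbar > 1) {
        return Path::DoubleDecomposition;
    }
    else {
        return Path::Standard;
    }
}
```

Because the selection depends only on `P::lbar`, parameter sets with l̅=1 compile to the standard path with no runtime check.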

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use lvl2param::nbit in DDTestParam for FFT compatibility across param sets
- Skip gate bootstrapping DD test for CONCRETE builds (lvl3param nbit too large)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The 128-bit Torus uses a 64-bit FFT backend where TwistIFFT extracts the
top 64 bits and TwistFFT places results in the top 64 bits. All decomposition
functions needed to account for this offset.

Key fixes:
- Add << 64 shift in Decomposition/NonceDecomposition for 128-bit types
- Add << 64 shift in TRLWEBaseBbarDecompose/Nonce for 128-bit types
- Fix RecombineTRLWEFromDD/Nonce to compensate for TwistFFT's << 64 shift
  by adjusting recombination shifts (actual_shift = target_shift - 64)
- Update test to include both l̅=1 (standard) and l̅=2 (DD) code paths
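The offset described above can be illustrated with plain integer helpers. This is a sketch under assumptions, not the TwistIFFT/TwistFFT code: all function names are hypothetical, digits are treated as unsigned for simplicity, and `__uint128_t` is a GCC/Clang extension.

```cpp
#include <cstdint>

using u128 = __uint128_t;  // GCC/Clang extension

// What the commit describes on the way in: only the top 64 bits reach the FFT.
constexpr std::uint64_t fft_input_top64(u128 x)
{
    return static_cast<std::uint64_t>(x >> 64);
}

// What the commit describes on the way out: results land in the top 64 bits.
constexpr u128 fft_output_top64(std::uint64_t y)
{
    return static_cast<u128>(y) << 64;
}

// The "<< 64" fix: pre-scale a digit so it survives the top-64 extraction.
constexpr u128 prescale_digit(std::uint64_t digit)
{
    return static_cast<u128>(digit) << 64;
}

// Recombination: the output already carries << 64, so shift by target_shift - 64.
constexpr u128 recombine(std::uint64_t y, int target_shift)
{
    const int actual_shift = target_shift - 64;
    const u128 v = fft_output_top64(y);
    return actual_shift >= 0 ? (v << actual_shift) : (v >> -actual_shift);
}
```

Without the pre-scaling, a small digit sits entirely in the low 64 bits and `fft_input_top64` returns 0, which is the failure mode these fixes address.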

All tests pass:
- 64-bit external product DD (both l̅=1 and l̅=2)
- 128-bit gate bootstrapping DD (lvl3param: l=2, l̅=4, B̅gbit=32)
- Standard external product and gate bootstrapping tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Double Decomposition algorithm requires:
1. First create ordinary TRGSW (k*lₐ + l rows) with encrypted zeros + gadget
2. Then apply DD to each TRLWE row, expanding to k*lₐ*l̅ₐ + l*l̅ rows

Previous implementation incorrectly encrypted zeros into all DD-expanded rows
then tried to transform in-place, which used wrong encrypted zeros.

Changes:
- trgswSymEncryptImpl: Create ordinary TRGSW first, then apply DD
- halftrgswSymEncryptImpl: Same pattern for HalfTRGSW
- trgswSymEncryptOne: New function for encrypting constant 1 with DD support
- trgswhadd/halftrgswhadd/trgswhoneadd: Simplified to standard-only with
  static_assert to catch misuse with DD parameters
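The static_assert guard can be sketched as follows. The names `uses_double_decomposition`, `standard_only_gadget_add`, `PlainParam`, and `AuxParam` are hypothetical, and the function body is a placeholder rather than the real trgswhadd logic.

```cpp
#include <cstddef>

// Hypothetical parameter sets (lbar = l̅, labar = l̅ₐ); values are illustrative.
struct PlainParam {
    static constexpr std::size_t lbar = 1, labar = 1;
};
struct AuxParam {
    static constexpr std::size_t lbar = 2, labar = 2;
};

template <class P>
constexpr bool uses_double_decomposition()
{
    return P::lbar > 1 || P::labar > 1;
}

// Stand-in for a standard-only helper such as trgswhadd.
// Instantiating it with DD parameters fails to compile.
template <class P>
constexpr bool standard_only_gadget_add()
{
    static_assert(!uses_double_decomposition<P>(),
                  "standard-only helper: encrypt an ordinary TRGSW first, then apply DD");
    return true;  // placeholder for the real gadget addition
}
```

`standard_only_gadget_add<PlainParam>()` compiles, while `standard_only_gadget_add<AuxParam>()` is rejected at compile time with the message above.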

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
With Double Decomposition, TwistIFFT always receives decomposition
digits (small integers), never raw Torus values. This allows us to
simplify the implementation by removing all 64-bit shift workarounds:

- TwistIFFT: Use low 64 bits directly instead of >> 64
- TwistFFT: Store in low 64 bits instead of << 64
- All decomposition functions: Remove << 64 shift
- RecombineTRLWEFromDD/Nonce: Remove fft_offset compensation

The code is now much cleaner and the recombination logic is
straightforward.
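A minimal sketch of why the low-64-bit path is sufficient for digits. Function names are hypothetical, `__uint128_t` is a GCC/Clang extension, and zero-extension on the output side is an assumption of this sketch.

```cpp
#include <cstdint>

using u128 = __uint128_t;  // GCC/Clang extension

// Simplified input path: a decomposition digit is small, so truncating to the
// low 64 bits keeps all of it (including the two's-complement sign pattern).
constexpr std::uint64_t fft_input_low64(u128 digit)
{
    return static_cast<std::uint64_t>(digit);
}

// Simplified output path: store the 64-bit result in the low 64 bits.
constexpr u128 fft_output_low64(std::uint64_t y)
{
    return static_cast<u128>(y);
}
```

A digit such as 5, or -3 stored in two's complement, passes through `fft_input_low64` unchanged in its low 64 bits. A raw torus value such as 2^100 would be truncated to 0, which is why this simplification relies on TwistIFFT only ever receiving digits.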

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@nindanaoto nindanaoto merged commit 2eb7996 into master Jan 8, 2026
5 of 6 checks passed