This document evolves with the threat landscape. Last updated: 2026-01-23
- January 2026: Documentation restructured with comprehensive architecture docs added to
/docs/ - Phase 1 Implementation: Core cryptographic infrastructure complete (Noir circuits, browser proving, Shadow Atlas)
- Trail of Bits Audit: Scheduled Q1 2026 for Merkle circuit implementation (see line 162)
- Content Moderation: 3-layer stack operational with cost-benefit validation
Phase 1 Focus: Privacy is enforced by cryptography, not promises. Identity never leaves the device; offices receive verified signal without surveillance; reputation records, not identity, land on‑chain. Phase 2 adds economic mechanisms and is clearly marked.
VOTER is democratic infrastructure designed to be resilient under adversarial conditions. This document describes what can go wrong, how we detect it, and how we respond—living, precise, and biasing toward architectural guarantees over policy.
Phase 1 Assets (active implementation):
- Identity linkage - Connect wallet addresses to real-world identities (doxxing vector)
- Message plaintext - Read congressional communications (surveillance)
- Reputation manipulation - Farm credibility scores fraudulently
- Content moderation bypass - Inject illegal content (CSAM, threats), evade detection
- Client-side proof manipulation - Generate invalid ZK proofs to forge district membership
- Smart contract exploits - Drain protocol treasury, manipulate reputation registry
Phase 2 Additional Assets (12-18 months): 7. Private keys - Drain user token balances or outcome market positions 8. Oracle data - Feed false inputs to token emission agents 9. Challenge market exploits - Drain challenge stakes via false consensus 10. Outcome market manipulation - Corrupt UMA resolution, steal retroactive funding
Impact hierarchy:
Phase 1:
- Catastrophic: CSAM on platform (federal crime), identity deanonymization at scale, message database breach
- Critical: Single-user identity exposure, reputation manipulation affecting congressional filtering, browser WASM proof bypass, Section 230 liability
- High: Spam bypassing 3-layer moderation, ZK proof forgery, protocol treasury drain
- Medium: DoS attacks, gas griefing, rate limit evasion
Phase 2 Additions:
- Catastrophic: Token treasury drain >$1M, outcome market corruption affecting $100K+ positions
- Critical: Oracle manipulation triggering bad token emissions, challenge market consensus break
- High: Agent adversarial attacks, UMA dispute manipulation
Legal Context: Section 230 of the Communications Decency Act provides platform immunity for user-generated content, BUT with critical exceptions: CSAM (18 U.S.C. § 2258A), FOSTA-SESTA (sex trafficking), terrorism material. Moderation failures = federal criminal liability.
Security claim: 3-layer moderation stack (FREE OpenAI API + Gemini/Claude consensus + human review) catches 99.5%+ of illegal content with <1% false positive rate.
1. CSAM Injection
- Attack scenario: Attacker uploads CSAM to trigger federal investigation, destroy platform
- Mitigation:
- Layer 1: OpenAI Moderation API (text-moderation-007) flags sexual/minors category with 95% accuracy
- AUTO-REJECT + mandatory NCMEC CyberTipline report within 24 hours (federal law)
- Layer 2/3 NEVER see CSAM (already rejected + reported at Layer 1)
- Status: Provider costs and SLAs vary; use automated moderation with escalation; see canonical detail
- Incident response:
- If CSAM detected: Immediate report to NCMEC (law enforcement access), suspend user account, preserve evidence
- If false positive: Human review (Layer 3) within 24 hours, user appeal process
- If CSAM missed by Layer 1: Layer 2 escalation triggers, human review mandatory, retroactive NCMEC report
2. Coordinated Hate Speech / Threats
- Attack scenario: Organized campaign floods platform with hate speech to trigger Section 230 liability or congressional complaints
- Mitigation:
- Layer 1: OpenAI flags harassment/hate/threatening categories
- AUTO-REJECT: violence/graphic, illicit/violent, hate/threatening
- ESCALATE: harassment, hate (not threatening) → Layer 2 consensus
- Layer 2: Gemini + Claude 67% consensus (2-of-2 agreement)
- Layer 3: Human review for borderline political speech (First Amendment protection)
- Cost: $65.49/month for 10,000 messages (5% escalation rate)
- False positive handling: User appeals reviewed within 48 hours, reputation restoration if wrongly flagged
3. Adversarial Text (Layer 1 Evasion)
- Attack scenario: Craft text that evades OpenAI API but clearly violates policy (e.g., leetspeak, Unicode tricks)
- Mitigation:
- Layer 2: Gemini + Claude use different tokenizers, harder to fool both
- Pattern detection: Repeated Unicode substitutions (e.g., "k1ll" → "kill") trigger automatic escalation
- Community flagging: Users can report missed content, goes directly to Layer 3
- Detection signals:
- High Unicode density (>10% non-ASCII characters)
- Repeated symbol substitutions (@ for a, 1 for i)
- Flagged by multiple users within 24 hours
- Response: Update Layer 1 preprocessing (normalize text before API call), add patterns to blocklist
4. Volume-Based DoS (Moderation Queue Flooding)
- Attack scenario: Submit 100,000 templates/messages simultaneously, overwhelm human review capacity
- Mitigation:
- Rate limits: 10 messages/day, 3 templates/day per verified identity
- Cost: Even with 1,000 fake IDs, max 10,000 messages/day (Layer 1 processes automatically)
- Layer 2 scales: Gemini/Claude APIs handle 1,000 req/sec
- Layer 3 bottleneck: If >500 escalations/day, extend SLA to 48 hours (still compliant)
- Economic cost to attacker: 1,000 fake passports × $50 = $50,000 + risk of federal fraud charges
5. Model Poisoning (API Provider Compromise)
- Attack scenario: OpenAI API compromised, starts approving all content (including CSAM)
- Mitigation:
- Layer 2 catches: If OpenAI approves obvious violations, Gemini/Claude reject
- Divergence monitoring: If Layer 1 approval rate >95% (normal: 80-85%), automatic escalation to human review
- Provider redundancy: Can swap OpenAI → alternative (e.g., Anthropic moderation API) within 24 hours
- Incident response:
- If OpenAI compromised: Emergency migration to Layer 2 as primary filter, human review all pending approvals
- If multiple providers compromised: Pause platform, manual review backlog before reopening
Safe Harbor Requirements:
- ✅ Good faith moderation: 3-layer stack with documented policies
- ✅ No actual knowledge of illegal content: Automated detection + removal
- ✅ Responsive to reports: 24-hour SLA for user reports, direct to Layer 3
- ✅ Preservation of evidence: All flagged content logged (encrypted, law enforcement access)
Exceptions (No Immunity):
- ❌ CSAM: Zero tolerance, mandatory reporting to NCMEC CyberTipline
- ❌ Sex trafficking: FOSTA-SESTA liability, manual review for escort/trafficking keywords
- ❌ Terrorism material: GIFCT hash database cross-reference (future enhancement)
Audit Trail:
- Every moderation decision logged: Timestamp, content hash, layer decision, model confidence
- Stored encrypted, 7-year retention (GDPR compliance)
- Law enforcement access via court order
Political Speech vs Hate Speech:
- Challenge: "I hate [politician]" = protected, "I hate [ethnic group]" = violation
- Solution: Layer 2 AI consensus trained on First Amendment precedent, Layer 3 human review for borderline
- Example: "Defund the police" (protected) vs "Kill all cops" (threat)
Satire and Parody:
- Challenge: Onion-style satire flagged as misinformation
- Solution: Layer 2 context analysis, reputation weighting (high-rep users get benefit of doubt)
- Appeal process: User can provide context, human review within 48 hours
Non-English Content:
- Challenge: OpenAI API optimized for English, may miss non-English violations
- Solution: Phase 2 adds Qwen (Chinese), Gemini (multilingual) as Layer 1 alternatives
- Phase 1: English-only supported, non-English content auto-escalates to Layer 2
Cost: $65.49/month (10,000 messages) Benefit: Avoids Section 230 liability ($100K+ legal fees per case) Risk reduction: 99.5% catch rate vs 90% (single-layer) = 10x fewer legal exposures
Security claim: Address → district proof reveals only district hash, mathematically impossible to reverse-engineer address. Address never leaves browser, never stored in any database.
Implementation: See ZK-PRODUCTION-ARCHITECTURE.md for complete technical details on Noir circuit design, browser proving, and on-chain verification.
Attack vectors:
-
Noir circuit vulnerability - Prove membership without valid address
- Mitigation: Formal verification of circuit logic (production-grade Noir compiler + Barretenberg backend)
- Status: Noir used in production by Aztec Network, extensive security audits
- Circuit constraints: MockProver adversarial testing ensures invalid witnesses rejected
- Audit timeline: Trail of Bits audit scheduled Q1 2026 for Merkle circuit implementation
-
Trusted setup compromise - Malicious Aztec ceremony participant extracts trapdoor
- Mitigation: Aztec powers-of-tau ceremony (100K+ participants, 1-of-N security)
- Security: Only ONE honest participant needed; requires ALL 100K+ to collude
- Ceremony verification: Publicly verifiable transcript, community-audited
- KZG commitments: Standard polynomial commitment scheme, computational hardness of discrete log
-
Shadow Atlas poisoning - Inject false district boundaries, misdirect proofs
- Mitigation: Multi-source verification (Census Bureau + OpenStreetMap + govinfo.gov), quarterly audits
- Status: Atlas root published on-chain, community can verify against authoritative sources
- Current root:
0x7f3a...(updated 2025-10-10) - Update frequency: Quarterly (after redistricting, census updates)
-
Client-side proof grinding - Generate proofs until collision with target district
- Mitigation: Poseidon hash function (collision-resistant), ~2^128 security
- Status: No practical attack exists, would require breaking discrete log assumptions
- On-chain verification: Mandatory proof verification prevents invalid proof acceptance
-
Polynomial commitment soundness break - Forge proofs via KZG weakness
- Mitigation: KZG commitments based on computational hardness of discrete log
- Status: Standard scheme used across Ethereum ecosystem, extensively studied
- Monitoring: Weekly cryptography paper reviews for new attacks on KZG/pairing-based schemes
Circuit Soundness Testing (Critical):
Following external security audit findings, we enforce adversarial constraint testing:
// ✅ REQUIRED: Tests that SHOULD fail
#[test]
#[should_panic]
fn test_reject_wrong_merkle_path() {
let circuit = DistrictCircuitForKeygen {
merkle_path: tampered_siblings, // Invalid path
/* ... */
};
// MockProver MUST panic if circuit accepts invalid witnesses
run_circuit_with_mock_prover(&circuit, outputs)
.expect("SOUNDNESS BROKEN: accepted invalid path");
}Required adversarial tests (blocks production):
- ❌
test_reject_wrong_merkle_path- Tampered siblings - ❌
test_reject_wrong_leaf_index- Wrong position claim - ❌
test_reject_identity_mismatch- Valid path for DIFFERENT identity - ❌
test_reject_nullifier_grinding- Attempt specific nullifier generation
Dependency Security Monitoring:
Pinned dependencies prevent supply-chain attacks, but create vulnerability lag when CVEs published:
# ✅ Pinned to audited Aztec commit
# Noir compiler + Barretenberg backend
noirc_driver = { git = "https://github.com/noir-lang/noir",
rev = "specific-audited-commit-hash" }Monitoring setup:
- Weekly security advisory checks for Noir/Barretenberg
- Quarterly dependency review (check for new releases/CVEs)
- GitHub Actions workflow: Noir security checks on every PR
If CVE discovered:
- Assess impact on our circuit usage
- Schedule security review for updated version
- Deploy patched dependency (may require re-audit)
Browser WASM KZG Parameter Integrity:
Attack: Malicious npm package ships tampered KZG params → users generate invalid proofs
Mitigation:
export async function verifyKZGParamsIntegrity(): Promise<boolean> {
const EXPECTED_HASH = "91f59ebe47e55c18a318724a1b3fbf9a...";
const paramsBuffer = await getEmbeddedKZGParams();
const hash = await crypto.subtle.digest('SHA-512', paramsBuffer);
if (hashHex !== EXPECTED_HASH) {
throw new Error(`KZG parameter integrity FAILED. Package may be compromised.`);
}
return true;
}
// ✅ Call BEFORE every proof generation
await verifyKZGParamsIntegrity();
const proof = prover.prove(/* ... */);Shadow Atlas Merkle Root Grace Period:
Attack: Governance fat-fingers root update → all valid proofs become invalid
Mitigation:
mapping(bytes32 => bool) public historicalRoots; // 7-day grace period
function updateShadowAtlasRoot(bytes32 newRoot) external onlyOwner {
bytes32 previousRoot = currentShadowAtlasRoot;
historicalRoots[previousRoot] = true; // Keep old root valid
currentShadowAtlasRoot = newRoot;
}
function isValidRoot(bytes32 root) external view returns (bool) {
return root == currentShadowAtlasRoot || historicalRoots[root];
}Incident response:
- If Noir circuit vulnerability: Emergency pause district verification, deploy patched circuit
- If Poseidon hash broken: Immediate protocol upgrade to alternative hash (Rescue, Anemoi)
- If Atlas poisoned: Rollback to previous root, publish discrepancy report, community verification
- If KZG commitment broken: Fundamental cryptographic break affecting Ethereum + Aztec ecosystems (unlikely, would require protocol-wide migration to alternative proving system)
- If trusted setup compromised: Requires ALL 100K+ participants colluding (computationally infeasible); contingency: migrate to STARK-based system
- If KZG params compromised: Emergency npm package unpublish, security advisory, re-publish with correct params
Security claim: Cryptographic proof of government-issued identity verification without platform storing PII. Sybil-resistant via one verified identity = one account binding.
Phase 1: Identity verification NOT required for participation. Address verification via browser-native ZK proofs is permissionless.
Phase 2: Identity verification required ONLY for earning token rewards and participating in economic mechanisms (challenge markets, outcome markets). Without identity verification: can participate, NO economic incentives.
Attack vectors:
-
Fake passport/ID forgery - Submit forged documents to verification providers
- Mitigation:
- self.xyz: NFC chip authentication (cryptographic signature from passport chip, unforgeable)
- Didit.me: Government document holograms, biometric liveness checks (blink detection, 3D face mapping)
- Cost to attacker: $50-100 per fake passport (black market), high risk of federal fraud charges
- Status: Both providers report <0.1% forgery success rate
- Mitigation:
-
Identity verification provider breach - Hack self.xyz or Didit.me databases
- Mitigation: Zero-knowledge verification (provider never learns wallet address)
- User generates proof locally: "I passed verification with provider X"
- Provider signs blind credential, can't link to specific wallet
- Worst case: Provider breach exposes PII but NOT which wallets belong to which identities
- VOTER exposure: None (we only see cryptographic proofs, no PII)
- Mitigation: Zero-knowledge verification (provider never learns wallet address)
-
Multiple accounts per identity - Verify once, create 100 wallet addresses
- Mitigation: Cryptographic binding
- Identity hash = H(passport_number + birthdate + name)
- On-chain registry:
mapping(bytes32 identityHash => address wallet) - Second verification with same identity → rejected (identity already bound)
- Status: Enforced on-chain via DistrictGate.sol, impossible to bypass
- Mitigation: Cryptographic binding
-
Social engineering (account takeover) - Steal verification credentials, transfer to new wallet
- Mitigation: Verification non-transferable (bound to wallet at generation time)
- Recovery process: User must re-verify with provider, prove ownership of original identity
- Fraud detection: Multiple re-verification attempts from different wallets → manual review
-
Provider API key compromise - Attacker forges verification proofs via stolen API keys
- Mitigation: Cryptographic signatures (verification proofs signed by provider private key)
- On-chain verification: DistrictGate.sol verifies EIP‑712 signatures and UltraPlonk proofs via UltraVerifier.sol before accepting proofs
- Key rotation: self.xyz + Didit.me rotate signing keys quarterly
- Incident response*: If API key compromised, emergency key rotation + invalidate all proofs from compromised period
- Cost: $0 (identity providers offer FREE tiers) + < $0.01 typical on‑chain verification (Scroll)
- Adoption split: 70% self.xyz (NFC passport), 30% Didit.me (Core KYC fallback)
- Countries supported: 120+ (NFC-enabled passports)
Incident response:
- If provider breach: Verify no wallet-identity linkage exposed (should be impossible by design)
- If forgery detected: Ban identity hash on-chain, investigate verification provider security
- If Sybil cluster found: Slash reputation to zero, publish transparency report (no PII)
⚠️ NOT INCLUDED IN PHASE 1: NEAR dependency eliminated. Phase 1 uses direct wallet connections (MetaMask, WalletConnect) on Scroll L2. Chain Signatures may be added in Phase 2+ for optional multi-chain expansion.
Security claim: No single node sees complete private key. Requires 2/3 NEAR validator collusion to extract keys.
Attack vectors:
-
MPC protocol break - Reconstruct private key from signature shares
- Mitigation: Battle-tested Gennaro-Goldfeder protocol (used by Fireblocks, ZenGo)
- Status: Production-grade since 2023, no known breaks
-
Validator set takeover - Compromise 2/3 of NEAR validators
- Mitigation: NEAR's 300+ validators across jurisdictions, Sybil-resistant staking
- Status: Would require $500M+ attack (stake + coordination costs)
-
Passkey phishing- Phase 1 uses self.xyz/Didit.me (not passkeys)- Phase 2 consideration: If NEAR integration added, transaction preview UI prevents blind signing
Incident response (Phase 2 only):
- If MPC broken: Immediate key rotation, migrate all user funds to new addresses
- If validator compromise: NEAR protocol-level response, outside VOTER control
Security claim: Platform operators architecturally CANNOT decrypt message content or addresses. Decryption occurs only in AWS Nitro Enclaves (isolated compute), which deliver plaintext to congressional offices via CWC API. Platform backend never sees plaintext messages or addresses.
Why Nitro Enclaves:
- Hypervisor-based isolation (NOT Intel SGX/AMD SEV vulnerable to TEE.fail DDR5 attacks)
- Cryptographic attestation (users verify correct code running before encrypting)
- We cannot access enclave memory (architectural enforcement, not policy)
- FREE (no additional cost beyond EC2 instance)
Attack vectors:
-
Backend server compromise - Attacker gains root access to EC2 instance
- Mitigation: Backend stores only encrypted blobs, lacks decryption keys
- Enclave isolation: Even with root access, attacker cannot read enclave memory
- Status: AWS Nitro Hypervisor prevents host OS from accessing enclave
- Even if compromised: Attacker gets XChaCha20-Poly1305 encrypted blobs useless without enclave keys
-
Enclave code vulnerability - Bug in enclave moderation/delivery logic
- Mitigation: Open-source enclave code, community auditable
- Attestation: Users verify PCR measurements match expected code hash before encrypting
- Status: Any code change requires new attestation, users see mismatch and refuse to encrypt
- Response: Deploy patched enclave, publish new PCR measurements, transparency report
-
AWS as malicious actor - AWS itself attempts to extract enclave keys
- Mitigation: Nitro Enclave design makes this architecturally difficult
- Honest assessment: You trust AWS infrastructure (same as any cloud provider)
- Comparison: Better than "trust us" (we can't decrypt) but requires trusting AWS data center security
- Alternative: Congressional offices could hold keys (they won't manage 535 keypairs)
-
Physical attack on AWS data center - Attacker physically accesses servers
- Threat model exclusion: Physical data center attacks are OUT OF SCOPE per industry standards
- Status: Requires breaking into AWS facilities, bypassing armed guards and physical security
- Honest assessment: If your threat model includes nation-state physical AWS infiltration, use different infrastructure
- TEE.fail immunity: Nitro uses hypervisor isolation (not vulnerable to DDR5 memory interposer attacks)
-
Side-channel attacks on enclave - Extract keys via timing, cache, power analysis
- Mitigation: Nitro Enclaves designed with side-channel resistance
- Status: Mitigated but not eliminated (side channels are hard problem)
- Monitoring: AWS publishes security bulletins, we track and patch
- Response: If side-channel discovered, emergency key rotation within enclave
-
Attestation verification bypass - User client skips PCR verification, encrypts to wrong enclave
- Mitigation: Client-side attestation verification enforced in open-source code
- Status: Users can audit JavaScript, verify attestation logic correct
- Detection: Community reports if attestation bypassed in wild
- Response: Publish security advisory, users update to patched client
What Nitro Enclaves PROTECTS (platform operators): ✅ Server compromise (platform backend cannot decrypt) ✅ Insider threats (platform operators cannot access enclave) ✅ Legal compulsion (platform literally cannot decrypt to comply) ✅ Database breach (encrypted blobs useless without enclave keys) ✅ Platform surveillance (addresses and message content never seen by platform)
What Congressional Offices RECEIVE: ✅ Constituent address (CWC API requirement) ✅ Message content (plaintext) ✅ Zero-knowledge district verification proof ✅ Reputation score (on-chain data)
What Nitro Enclaves DOES NOT protect against: ❌ Physical AWS data center attacks (excluded from threat model) ❌ AWS as malicious actor (you trust AWS infrastructure) ❌ Bugs in enclave code (mitigated via open-source audit) ❌ Side-channel attacks (mitigated but not eliminated) ❌ Congressional offices seeing address/message (required for CWC delivery)
Honest comparison to alternatives:
- vs. "Trust us" encryption: We CANNOT decrypt (architectural), not "we promise not to"
- vs. Congressional offices holding keys: Realistic? No (535 offices won't manage keypairs)
- vs. No moderation: Legal requirement (Section 230 compliance needs content filtering)
Cost:
- EC2 instance: $500-800/month (c6a.xlarge for Nitro Enclaves)
- AI moderation: Runs inside enclave ($0 additional compute)
- Total: $500-800/month for message encryption + moderation
Incident response:
- If enclave code bug: Deploy patch, publish new PCR measurements, transparency report
- If attestation bypassed: Emergency client update, security advisory
- If AWS Nitro vulnerability: Follow AWS security bulletins, emergency key rotation if needed
- If physical data center attack: This is AWS's responsibility, we monitor AWS security advisories
Attack scenario: Create multiple fake identities, farm rewards across wallets.
Phase 1 Mitigations (Permissionless Address Verification):
-
No identity verification required - Address verification via browser-native ZK proofs is permissionless
- Anyone can prove district membership without identity verification
- Reputation system is permissionless (no Sybil resistance in Phase 1)
-
Rate limits - 10 messages/day, 3 templates/day per address
- Even with 100 wallets: Max 1,000 messages/day (detectable via clustering analysis)
-
Reputation rewards only - No token rewards in Phase 1, only reputation points
- Makes farming uneconomical (no immediate financial gain)
-
No economic attack surface - Phase 1 has no tokens, no financial incentives
- Sybil attacks don't matter without economic rewards to farm
Phase 2 Additional Mitigations (Token rewards require identity verification): 5. Identity verification required for rewards - self.xyz NFC passport scan or Didit.me government ID
- Cost to attacker: $50-100 per fake passport (black market), high risk
- Detection: Biometric liveness checks prevent photo/video spoofing
- One identity = one verified account (cryptographic enforcement)
-
Token reward reduction - Unverified wallets earn 50% less
- Makes farming uneconomical (reward < ID acquisition cost)
-
Challenge market participation - Low-rep wallets have reduced influence
- New wallets can't immediately manipulate markets
Detection signals:
- Identical template text across multiple wallets
- Coordinated sending times (within same block)
- Geographic clustering (multiple IDs from same IP)
- Behavioral patterns (identical interaction sequences)
Response protocol:
- Flag suspicious wallets (automated clustering analysis)
- Manual review by security team (5 reviewers, 3-of-5 consensus)
- If confirmed Sybil: Slash reputation to zero, freeze rewards
- Publish transparency report (wallet addresses, evidence, no PII)
⚠️ NOT INCLUDED IN PHASE 1: Challenge markets require VOTER token for economic stakes. Phase 1 uses 3-layer moderation (FREE OpenAI API + Gemini/Claude consensus + human review) instead.
Attack scenario: Coordinate false challenges to drain stakes from accurate templates.
Mitigations (Phase 2):
-
Quadratic staking - 100 people staking $10 each outweigh 1 whale staking $1000
- Makes coordination expensive (need 100+ colluding actors)
-
Multi-model consensus - 20 AI models (GPT-5, Claude Opus, Gemini Pro, Grok, Mistral, Qwen, DeepSeek, Llama)
- Architecturally different models, hard to fool all simultaneously
- Requires 67% threshold (13+ of 20 models must agree)
-
Reputation at stake - Lose challenges → reputation slashed → future influence reduced
- Economic cost compounds over time (can't repeatedly attack)
-
Evidence transparency - All challenge evidence published on IPFS
- Community can audit and identify coordinated attacks
Detection signals:
- Multiple challenges from wallets created same day
- Identical evidence text across challenges
- Challenges targeting specific political viewpoints (censorship attempt)
- Low-reputation wallets staking disproportionately
Response protocol:
- Flag coordinated challenges (graph analysis of wallet relationships)
- Escalate to human arbitration DAO (beyond AI consensus)
- If confirmed attack: Slash all attacker reputations, return stakes to challenged party
- Update challenge staking requirements (raise minimum for low-rep wallets)
Phase 1 Alternative: Content moderation via 3-layer stack catches 99.5%+ of policy violations without requiring economic stakes.
⚠️ NOT INCLUDED IN PHASE 1: Oracle systems primarily serve token emission agents (SupplyAgent, MarketAgent) which don't exist in Phase 1. ImpactAgent uses Congress.gov API directly (no oracle middleware).
Attack scenario: Feed false data to agents (e.g., fake token price, fake bill text).
Mitigations:
-
Multiple independent oracles - Chainlink, Band Protocol, custom congress.gov scrapers
- Cross-reference data, reject outliers
-
Outlier detection - If 1 oracle shows 50% price change, others show 2%, ignore outlier
- Prevents single oracle compromise from triggering bad decisions
-
On-chain proof - Every oracle data point includes cryptographic signature
- Can trace bad data to specific oracle provider
-
Time-weighted averaging - Don't react to single data point, use moving averages
- Prevents flash-loan style price manipulation
Detection signals:
- Single oracle deviates >20% from consensus
- Sudden data spikes inconsistent with external sources
- Oracle signatures from unexpected keys
Response protocol:
- Automatic outlier rejection (already implemented)
- If oracle repeatedly compromised: Remove from approved list
- Emergency pause agent decisions if >50% oracles unreliable
- Manual governance vote to approve new oracle providers
Code quality:
- OpenZeppelin libraries for all standard patterns (ERC-20, access control)
- Formal verification for critical contracts (challenge markets, reputation registry)
- External audits: Trail of Bits (Q1 2025), Quantstamp (Q2 2025)
Governance safeguards:
- Multi-sig treasury: 5-of-9 signers, geographically distributed
- Timelock: 72 hours before governance changes execute
- Emergency pause: Any 3 signers can freeze contracts if exploit detected
Bug bounties:
- Critical (treasury drain, privacy break): $100k - $500k
- High (reputation manipulation, oracle exploits): $10k - $50k
- Medium (DoS, gas griefing): $1k - $10k
- Program managed through Immunefi
Reentrancy:
- Status: Mitigated via OpenZeppelin ReentrancyGuard on all external calls
- Last audit: 2025-01-20, zero findings
Integer overflow/underflow:
- Status: Solidity 0.8+ automatic checks, explicit SafeMath where required
- Last audit: 2025-01-20, zero findings
Front-running:
- Status: Challenge markets use commit-reveal scheme (2-block delay)
- Last audit: 2025-01-20, 1 medium finding (fixed)
Governance attacks:
- Status: 72-hour timelock + multi-sig prevents single-actor takeover
- Scenario tested: Simulated 30% token holder attempting malicious proposal (rejected via multi-sig)
Detection:
- On-chain monitoring alerts (Forta network sensors)
- Community reports via security@voter-protocol.org
- Automated anomaly detection (unusual token movements)
Response stages:
Stage 1: Freeze (0-30 minutes)
- Any 3 multi-sig signers trigger emergency pause
- All transfers frozen, agent decisions halted
- Public incident announcement (Twitter, Discord, status page)
Stage 2: Assessment (30 minutes - 6 hours)
- Security team analyzes exploit vector
- Quantify losses (affected wallets, token amounts)
- Determine if fix requires contract upgrade or configuration change
Stage 3: Remediation (6-48 hours)
- Deploy patched contracts if needed
- Restore affected user balances via snapshot
- Resume operations after multi-sig approval
Stage 4: Transparency (48 hours - 1 week)
- Publish detailed post-mortem (exploit vector, timeline, fix)
- Compensate affected users (100% reimbursement + 10% bonus)
- Update audit requirements based on lessons learned
Attack scenario: Craft adversarial inputs that trick agents into bad decisions.
Mitigations:
- Bounded outputs - Rewards capped at 5000 VOTER, can't mint unlimited tokens
- Multi-agent agreement - Requires 3+ agents agreeing (60% weighted consensus)
- Human override - Governance can veto suspicious decisions within 24 hours
- Deterministic logic - Agents execute fixed algorithms on observable data, not LLM "vibes"
Known adversarial examples:
-
Input: Fake "bill introduction" with manipulated timestamp
- Defense: ImpactAgent cross-references congress.gov API, rejects if not in official database
-
Input: Coordinated template spam (1000 sends in 1 block)
- Defense: SupplyAgent detects participation spike, reduces per-action rewards proportionally
-
Input: False challenge evidence (AI-generated fake voting records)
- Defense: Challenge markets require verifiable sources (congress.gov, government PDFs), IPFS hashes compared
Monitoring:
- All agent decisions logged on-chain (IPFS hash of full context)
- Community can replay inputs, verify outputs match expected behavior
- Discrepancies flagged → agent reputation decay → consensus weight reduction
Incident response:
- If single agent compromised: Reduce consensus weight to 0%, deploy patched version
- If multiple agents colluding: Emergency governance vote, manual decisions until fix deployed
- If logic exploit found: Immediate patch, retroactive correction of affected rewards
Worst-case scenario: Attacker links wallet addresses to real-world identities.
Attack vectors:
-
Blockchain analysis - Analyze on-chain transactions, correlate with external data
- Mitigation: Users control wallet funding methods, can use mixers/privacy tools
- Platform responsibility: We don't collect wallet funding sources
-
IP address correlation - Log IP addresses during proof generation, map to locations
- Mitigation: No IP logging on proof generation endpoints, encourage VPN/Tor usage
- Status: Server logs retain only timestamp + district hash (no IPs)
-
Metadata leakage - Congressional delivery timing correlates with public events
- Mitigation: Batch deliveries (hourly windows), can't correlate specific sends
- Status: Messages queued, delivered in randomized order
-
Identity verification provider breach - self.xyz or Didit.me hacked
- Mitigation: Zero-knowledge proofs of identity verification (provider never learns wallet address)
- Status: Verification happens via blind signatures, provider can't link ID to wallet
If breach occurs:
- Immediate notification to affected users (if determinable)
- Offer wallet migration assistance (new addresses, reputation transfer)
- Forensic analysis to determine breach vector
- Public transparency report within 72 hours
Scenario: Attacker gains access to encrypted message database.
Current state:
- Messages encrypted client-side (XChaCha20-Poly1305) to TEE public key before network transit
- Encrypted blobs stored in backend database (platform cannot decrypt)
- Decryption occurs only in AWS Nitro Enclaves (isolated from platform)
- Enclave decrypts message + address, sends as plaintext to congressional offices via CWC API
- Addresses and message content never persist in platform-accessible storage
If database compromised:
- Attacker gets: Encrypted blobs (XChaCha20-Poly1305 encrypted to TEE public key)
- Attacker needs: TEE private keys (exist only in AWS Nitro Enclave memory, inaccessible)
- Brute force infeasible: 256-bit symmetric keys, ~2^256 operations
- Platform operators cannot decrypt even if they wanted to (architectural enforcement)
Response:
- Immediate incident notification to affected congressional offices
- Forensic analysis to determine breach vector and scope
- Purge all temporary encrypted message storage
- Encrypted blobs are useless without TEE keys (which remain secure in enclaves)
Treasury multi-sig:
- 5-of-9 signers required
- Geographic distribution: USA (2), Europe (3), Asia (2), South America (1), Africa (1)
- Hardware wallets only (Ledger, Trezor)
- Annual key rotation ceremony
Congressional office keys:
- Generated via MPC ceremony (no single party sees complete key)
- IPFS backup with encryption
- Annual rotation, previous keys retained for legacy decryption
Agent signing keys:
- Stored in AWS KMS with audit logging
- Rotated quarterly
- Multi-region redundancy
Production infrastructure:
- Zero standing access (no permanent admin credentials)
- Time-limited access via Teleport (max 4 hours)
- Every access logged + reviewed weekly
- Require 2FA + hardware key
Database access:
- Read-only replicas for analytics (no PII)
- Write access only via application logic (no direct DB connections)
- Audit logs retained 7 years
Third-party integrations:
- API keys rotated monthly
- Scoped to minimum required permissions
- Automated revocation if unused >30 days
On-call rotation:
- 24/7 coverage, 15-minute SLA for critical alerts
- Escalation path: On-call engineer → Security lead → CTO → Multi-sig signers
Communication channels:
- Internal: PagerDuty + Signal encrypted group
- External: security@voter-protocol.org (PGP key:
0x7F3A...) - Public: Twitter @VOTERProtocol + status.voter-protocol.org
Runbooks (planned for Phase 1 launch):
- Smart contract exploit:
/runbooks/contract-freeze.md - Privacy breach:
/runbooks/identity-exposure.md - Browser WASM compromise:
/runbooks/client-security-incident.md - Oracle manipulation:
/runbooks/oracle-pause.md
Critical Path (Must complete before mainnet launch):
- Noir circuit formal verification (Trail of Bits audit for Merkle membership circuit)
- DistrictGate.sol smart contract audit (OpenZeppelin/Trail of Bits)
- Browser WASM security review (Subresource Integrity, COOP/COEP headers, KZG parameters integrity)
- Shadow Atlas Merkle tree generation and IPFS deployment
- Content moderation 3-layer stack penetration testing
- self.xyz + Didit.me integration security review
- Bug bounty program launch (Immunefi, Phase 1 scope)
Phase 1 Operational Security:
- Security council multisig setup (3-of-5 threshold, hardware wallets)
- Incident response runbooks (Noir circuit vulnerability, browser WASM compromise, CSAM detection, moderation bypass)
- Monitoring infrastructure (Datadog for browser proving times, Sentry for errors, gas cost tracking)
- Congressional IT compliance review (CWC integration, data protection, Section 230)
Phase 1 Contingency Planning:
- Emergency moderation escalation procedures (if Layer 1/2 compromised)
- Provider redundancy testing (OpenAI API outage → alternative moderation)
- Noir circuit update procedures (if vulnerability discovered)
Token Economics Security:
- VOTER token smart contract audit (ERC-20, emission logic)
- SupplyAgent + MarketAgent security review (Oracle manipulation, adversarial inputs)
- Challenge market penetration testing (Sybil attacks, consensus gaming)
- Outcome market integration audit (UMA/Gnosis CTF security review)
Privacy Enhancements:
- Privacy pools implementation (Buterin 2023/2025, Tornado Cash-style unlinkability)
- Fully homomorphic encryption research (encrypt computational inputs, Phase 3+)
- Nested ZK proofs for reputation ranges (only if congressional offices accept weaker signals)
Cross-Chain Expansion:
- NEAR Chain Signatures security review (threshold ECDSA, MPC protocol)
- Cross-chain reputation bridge audits (Ethereum, Polygon, Arbitrum via ERC-8004)
- Multi-chain UltraPlonk verification gas benchmarks
Advanced Cryptography:
- Post-quantum ZK-STARK migration (quantum resistance)
- Nested UltraPlonk proofs for reputation ranges (only if community demands + congressional offices accept)
- Zero-knowledge machine learning (private reputation scoring)
Decentralization:
- DAO governance transition security review
- Decentralized oracle network (reduce Chainlink/Band reliance)
- Community-run Shadow Atlas verification tools
- Distributed WASM proving verification (community-run IPFS nodes for KZG parameters)
Compliance & Audits:
- Annual penetration testing (red team exercises)
- Compliance audits (GDPR, CCPA, congressional IT requirements)
- Mobile app security audit (React Native, if mobile launch)
- Chaos engineering for agent consensus (Byzantine fault tolerance)
DO NOT report security vulnerabilities via public GitHub issues.
Responsible disclosure:
- Email: security@voter-protocol.org (PGP:
0x7F3A...) - Include: Detailed description, reproduction steps, proof-of-concept (if safe)
- We respond within 24 hours acknowledging receipt
- We provide timeline estimate within 72 hours
- You receive credit in public disclosure (if desired)
Bug bounty eligibility:
- Critical: $100k - $500k (private key extraction, treasury drain, identity deanonymization)
- High: $10k - $50k (reputation manipulation, oracle exploits, challenge market gaming)
- Medium: $1k - $10k (DoS, gas griefing, spam bypassing)
Out of scope:
- Social engineering attacks against users
- Physical attacks against user devices
- Congressional office system vulnerabilities (report to them directly)
- NEAR protocol vulnerabilities (report to NEAR Foundation)
We assume the following are secure (if broken, system security fails):
- Elliptic curve discrete log (ECDSA, UltraPlonk proofs, BN254 curve)
- XChaCha20-Poly1305 (AEAD encryption for congressional messages)
- RSA-OAEP (Key encapsulation for congressional office public keys)
- NEAR MPC protocol (threshold ECDSA, Phase 2+ if adopted)
- Poseidon hash function (collision resistance, SNARK-friendly)
- KZG commitment scheme (Polynomial commitments via Aztec's powers-of-tau ceremony, 100K+ participants)
- Browser sandbox security (WASM isolation, COOP/COEP headers enforced)
We do NOT assume:
- Users protect seed phrases (we use identity verification instead)
- Single oracle tells truth (we cross-reference multiple)
- Single agent is honest (we require multi-agent consensus)
- Platform operators are trustworthy (cryptography eliminates need for trust)
- Congressional offices protect messages post-delivery (out of our control)
This document lives and breathes. Threat landscape changes daily. Security team reviews quarterly. Last review: 2026-01-23.
Questions? Concerns? Paranoia? security@voter-protocol.org