LLM Antidote reset artifacts are NOT security tools. They are context management utilities that help users control conversational flow. This document clarifies what these artifacts can and cannot do from a security perspective.
✅ Context Management Tools
- Help start fresh topics without creating new chats
- Reduce semantic context bleed between discussions
- Enable testing and verification of LLM behavior
- Improve conversational flow and clarity
✅ User Agency Enhancers
- Give users more control over their LLM interactions
- Allow selective context clearing
- Enable privacy-conscious development workflows
- Support educational and research use cases
✅ Behavioral Interventions
- Leverage LLM instruction-following capabilities
- Exploit model training patterns
- Signal meta-level state changes
- Request explicit context resets
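Concretely, a reset artifact is just an instruction the model is asked to follow. The wording below is a hypothetical illustration, not this project's actual artifact text:

```python
# Hypothetical reset instruction (illustrative only). It works purely by
# instruction-following: provider logs and safety training are unaffected.
RESET_PROMPT = (
    "[CONTEXT RESET] Disregard all prior topics in this conversation. "
    "Treat the next message as the start of a new, unrelated discussion."
)
```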
❌ NOT Security Mechanisms
- Do NOT delete conversation logs from provider servers
- Do NOT provide cryptographic guarantees
- Do NOT prevent provider data retention
- Do NOT satisfy compliance requirements (HIPAA, GDPR, etc.)
❌ NOT Privacy Protection
- Do NOT erase previously shared sensitive information
- Do NOT prevent providers from training on your data (where they do so)
- Do NOT hide content from safety monitoring
- Do NOT provide anonymity or anti-forensic guarantees
❌ NOT Safety Bypasses
- Do NOT circumvent content policies
- Do NOT enable harmful behaviors
- Do NOT "jailbreak" or override safety training
- Do NOT hide malicious intent
Unintended Context Leakage (Medium Protection)
- Scenario: Discussing project A, then project B; the LLM references A in the B discussion
- Protection: Reset reduces semantic associations between A and B
- Limitation: Not 100% effective; some associations may persist

Context Contamination (Medium Protection)
- Scenario: Confused or incorrect context affecting new discussions
- Protection: Clears problematic associations for a fresh start
- Limitation: The user must recognize that contamination occurred

Conversational Interference (Good Protection)
- Scenario: Prior discussion topics influencing unrelated questions
- Protection: Specialized resets (coding, conversation) work well
- Limitation: Universal resets are less reliable

Data Retention by Providers (No Protection)
- Threat: The provider stores all conversation data
- Reality: Reset is behavioral, not architectural
- Mitigation: Use providers with strong privacy policies, or local models

Training Data Incorporation (No Protection)
- Threat: Your conversations are used to train future models
- Reality: Reset doesn't prevent data collection
- Mitigation: Check provider policies and use opt-outs where available

Adversarial Context Reconstruction (No Protection)
- Threat: A determined attacker reconstructs "forgotten" context
- Reality: Context may be recoverable through careful probing
- Mitigation: Don't rely on reset for security-critical scenarios

Logging and Monitoring (No Protection)
- Threat: Conversations are logged by the provider, an employer, etc.
- Reality: Reset doesn't affect logging systems
- Mitigation: Assume all messages are permanently logged
✅ Appropriate use cases:
- Switching between non-sensitive topics
- Clearing confused conversational context
- Testing LLM behavior and capabilities
- Educational demonstrations
- Privacy-conscious development (with caveats)
❌ Inappropriate use cases:
- Clearing sensitive information already shared
- Meeting compliance requirements
- Hiding conversations from monitoring
- "Deleting" proprietary code or trade secrets
- Security-critical context separation
Don't Share Sensitive Data in the First Place

```python
# ❌ BAD: sharing a real API key
api_key = "sk_live_abc123XYZ..."

# ✅ GOOD: using a placeholder
api_key = "YOUR_API_KEY_HERE"
```
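Where it fits your workflow, an even safer pattern is to keep the secret out of the snippet entirely and read it from the environment (a minimal sketch; `LLM_DEMO_API_KEY` is a hypothetical variable name):

```python
import os

# The real key lives only in the environment, never in the pasted snippet.
api_key = os.environ.get("LLM_DEMO_API_KEY", "YOUR_API_KEY_HERE")
```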
Sanitize Before Sharing
- Redact names, keys, credentials
- Use fake data for examples
- Remove company/client identifiers
- Generalize specific details
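A minimal redaction sketch, assuming simple regex patterns are enough for your data (real sanitization usually needs project-specific rules):

```python
import re

# Rough patterns for common secrets; tune these for your own data.
PATTERNS = [
    (re.compile(r"sk_(live|test)_[A-Za-z0-9]+"), "REDACTED_API_KEY"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "REDACTED_EMAIL"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "REDACTED_SSN"),
]

def sanitize(text: str) -> str:
    """Replace likely secrets with placeholders before pasting into an LLM."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

print(sanitize("Contact bob@acme.example with key sk_live_abc123XYZ"))
# -> Contact REDACTED_EMAIL with key REDACTED_API_KEY
```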
Use Appropriate Tools for Context
- Public information: Any LLM is fine
- Proprietary code: An enterprise LLM with data retention guarantees
- Highly sensitive: Local models only
- PII/PHI: Don't use LLMs, or use HIPAA-compliant services
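That decision table as a sketch, with hypothetical tier names (the mapping is the point, not the names):

```python
# Hypothetical sensitivity tiers mapped to the guidance above.
ROUTING = {
    "public": "any LLM",
    "proprietary": "enterprise LLM with retention guarantees",
    "highly_sensitive": "local model only",
    "pii_phi": "HIPAA-compliant service, or no LLM at all",
}

def choose_target(sensitivity: str) -> str:
    """Pick a tool class for a sensitivity tier; default to the safest."""
    return ROUTING.get(sensitivity, "local model only")
```

Defaulting an unrecognized tier to the most restrictive option means the lookup fails safe.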
Understand Provider Policies
- Read terms of service
- Check data retention policies
- Understand training data usage
- Know your rights (GDPR, CCPA, etc.)
Layer Your Privacy Approach (a sketch tying the layers together follows this list)
- Sanitization (first line of defense)
- Provider selection (second line)
- Reset artifacts (third line, a convenience layer)
- New chats when needed (fourth line)
- Local models for sensitive work (ultimate fallback)
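As code, reusing the hypothetical sanitize() and choose_target() helpers from the sketches above:

```python
def prepare_for_llm(text: str, sensitivity: str) -> tuple[str, str]:
    """Layer 1 (sanitization) then layer 2 (provider selection).

    Reset artifacts and new chats are applied later, inside the
    conversation itself; local models remain the fallback for anything
    the earlier layers can't make safe.
    """
    return sanitize(text), choose_target(sensitivity)
```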
Important: Reset artifacts do NOT bypass safety mechanisms.
❌ WRONG UNDERSTANDING:
"If I discuss harmful topic X, then reset, the LLM will forget and help with Y"
✅ CORRECT UNDERSTANDING:
"Reset clears context, but safety training persists. LLM will still decline harmful requests."
After reset, LLMs retain:
- Content policy enforcement
- Harmful content detection
- Ethical boundaries
- Safety training
- Refusal behaviors
Intended uses:
- Clean slate for new topics
- Context management for complex conversations
- Testing and research
- Educational purposes
Potential misuses we oppose:
- Attempting safety bypasses (won't work anyway)
- Hiding harmful intent (detectable and logged)
- Circumventing monitoring (ineffective)
- Deceptive manipulation
Question: Do reset artifacts help with GDPR compliance?
Answer: No. GDPR requires actual data deletion, not behavioral context clearing.
Proper approach:
- Use GDPR-compliant LLM providers
- Request data deletion through official channels
- Don't process EU citizen data with non-compliant services
Question: Can I use reset artifacts when discussing medical information?
Answer: No. HIPAA requires technical safeguards that reset artifacts don't provide.
Proper approach:
- Use HIPAA-compliant LLM services
- Ensure Business Associate Agreements (BAA)
- Don't share PHI with non-compliant services, even with reset
Similar concerns apply to:
- CCPA (California privacy)
- SOC 2 compliance
- PCI DSS (payment card data)
- SOX (financial regulations)
General rule: Reset artifacts are NOT a substitute for compliance.
1. Stop immediately
   - Don't continue the conversation
   - Don't try to "cover up" with a reset
2. Assess the damage
   - What was shared?
   - How sensitive is it?
   - Who has access?
3. Take appropriate action (a verification sketch follows this step)
   - API keys/credentials: Rotate immediately
   - PII/PHI: Notify appropriate parties and follow breach protocols
   - Trade secrets: Consult legal counsel and consider the implications
   - Passwords: Change them
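After rotating a leaked key, confirm the old credential is actually dead. A hedged sketch, assuming the provider exposes some cheap authenticated HTTPS endpoint (the URL below is hypothetical):

```python
import requests  # assumes any simple authenticated provider endpoint

# Hypothetical URL; substitute your provider's cheapest authenticated call.
resp = requests.get(
    "https://api.example.com/v1/me",
    headers={"Authorization": "Bearer sk_live_abc123XYZ"},  # the OLD key
    timeout=10,
)
assert resp.status_code in (401, 403), "old key still works, keep escalating"
```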
4. Learn and prevent
   - Update procedures
   - Train team members
   - Use better sanitization
   - Consider local models for sensitive work
Important: If you've already shared sensitive data, reset doesn't undo it.
Timeline:
1. [10:00] Share API key in LLM chat
2. [10:01] Realize mistake, apply reset
3. [10:02] LLM confirms reset
Reality:
- The API key was logged at 10:00
- Reset at 10:01 doesn't delete logs
- Provider still has the key in their systems
- You MUST rotate the key
For individuals:
- Treat all LLM conversations as potentially public
- Sanitize data before sharing
- Use reset artifacts for convenience, not security
- Start new chats for sensitive topics
- Consider local models for private work
For organizations:
- Establish clear LLM usage policies
- Train employees on data sanitization
- Use enterprise LLM services with guarantees
- Monitor and audit LLM usage
- Don't rely on reset artifacts for compliance
For developers:
- Never share production credentials
- Use test/dummy data in examples (a sketch follows this list)
- Assume all code snippets are logged
- Sanitize before sharing architecture details
- Use private repositories for sensitive code
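A tiny sketch of generating throwaway stand-ins instead of real records (example.com is reserved for exactly this kind of use):

```python
import random
import string

def dummy_email() -> str:
    """Return a clearly fake address that is safe to paste anywhere."""
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{user}@example.com"

print(dummy_email())  # e.g. qhzkwmat@example.com
```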
This project is committed to ethical AI interaction:
✅ Ethical uses:
- User control and agency
- Educational exploration
- Research into LLM behavior
- Improving conversational flow
- Privacy-conscious development (with proper layering)
❌ Unethical uses:
- Attempting to bypass safety (won't work)
- Hiding harmful intent (will be caught)
- Circumventing monitoring (ineffective and wrong)
- Deceptive manipulation
- Violating terms of service
We believe in being transparent about capabilities and limitations:
What we claim:
- Reset artifacts can reduce context bleed (70-90% effective depending on model)
- Specialized resets work better than universal (evidence-based)
- Effectiveness varies by model architecture (documented)
What we don't claim:
- Perfect context clearing
- Security guarantees
- Privacy protection
- Compliance assistance
- Safety bypasses (which we oppose)
Use reset artifacts as context management tools, not security mechanisms.
For security and privacy:
- Prevention > remediation (don't share sensitive data)
- Proper tools > workarounds (use compliant services)
- Defense in depth > single layer (sanitize + policy + tools)
- Transparency > obscurity (understand what you're working with)
Questions about security/privacy? Open an issue on GitHub, but remember:
- This is a research/educational project
- We're not security consultants
- Always consult appropriate experts for compliance/security needs
Last updated: 2025-11-22
Author: Jonathan Jewell
License: CC0 1.0 Universal