A comprehensive second-brain system that combines intelligent multi-agent orchestration, privacy-first document archiving with PII sanitization, and seamless web research, built with PydanticAI, ChromaDB, and Brave Search.
## Features

- Multi-Agent System: an Orchestrator coordinates between the Archivist (local memory) and the Researcher (web search)
- PII Sanitization: automatic detection and redaction of sensitive information using Presidio
- Document Ingestion: parse and index markdown files with header-based chunking
- Intelligent Routing: queries are routed by content - personal notes vs. web search
- Vector Database: ChromaDB for efficient similarity search
- Web Search Integration: Brave Search via an MCP server for external queries
- Privacy-First: all PII is removed before storage
- Observability: built-in telemetry with Logfire/OpenTelemetry
## Prerequisites

- Python 3.13+
- uv package manager
- Node.js (for the Brave Search MCP server)
- Anthropic API key (for Claude)
- Brave Search API key (for web search functionality)
## Installation

```bash
# Install dependencies and download the spaCy model
make install
```

Or manually:

```bash
uv sync
uv run spacy download en_core_web_lg
```

## Usage

Run the orchestrator (recommended - it routes to both agents):

```bash
make run-orchestrator
```

Or run individual agents:

```bash
make run-archivist # Local memory search only
make run-researcher # Web search only
```

## Makefile Commands

```bash
make help # Show all available commands
make run-orchestrator # Run the orchestrator agent (routes queries)
make run-archivist # Run the archivist agent (memory only)
make run-researcher # Run the researcher agent (web search only)
make evals # Run evaluation tests for all agents
make install # Install dependencies
make clean # Clean up generated files
make clean-chroma # Clean ChromaDB persistent data
make format # Format code with black and isort
make lint # Lint code with ruff
make fix-lint # Auto-fix linting issues
make check # Run all checks (lint + type check)
```

## Python API

```python
from memory import MemoryTool
# Initialize memory tool
memory = MemoryTool()
# Create sample documents
memory.create_sample_docs()
# Ingest documents with PII sanitization
memory.ingest_folder()
# Search for information
results = memory.collection.query(
query_texts=["deployment guide"],
n_results=1
)
```
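Under the hood, `ingest_folder()` parses each markdown file in `.secondbrain/` and splits it at headers before storing the chunks in ChromaDB. The sketch below illustrates that flow under stated assumptions: `chunk_by_headers` and the `secondbrain` collection name are invented for illustration, not the project's actual internals.

```python
# Hypothetical sketch of header-based chunking plus ChromaDB storage.
import re

import chromadb


def chunk_by_headers(markdown: str) -> list[str]:
    """Split a markdown document into chunks at each # or ## header."""
    chunks = re.split(r"(?m)^(?=#{1,2} )", markdown)
    return [c.strip() for c in chunks if c.strip()]


client = chromadb.PersistentClient(path="./chroma")  # persisted on disk
collection = client.get_or_create_collection("secondbrain")

doc = "# Deployment Guide\nUse make deploy.\n\n## Rollback\nRun make rollback."
for i, chunk in enumerate(chunk_by_headers(doc)):
    collection.add(ids=[f"doc-{i}"], documents=[chunk])

# Nearest-neighbor search over the stored chunks
print(collection.query(query_texts=["how do I roll back?"], n_results=1))
```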
subgraph "Second Brain System"
subgraph "Agent Layer"
ORC[Orchestrator Agent<br/>Decision Router]
ARC[Archivist Agent<br/>Memory Manager]
RES[Researcher Agent<br/>Web Search]
end
subgraph "Core Components"
MEM[Memory Tool<br/>Vector DB Interface]
GRD[PII Guardrail<br/>Presidio]
TEL[Telemetry<br/>Logfire/OTEL]
end
subgraph "Data Layer"
CHR[(ChromaDB<br/>Vector Store)]
DOC[(.secondbrain/<br/>Markdown Docs)]
end
subgraph "External Services"
LLM[Claude API<br/>Anthropic]
BRV[Brave Search<br/>MCP Server]
end
end
USER[User Query] --> ORC
ORC -->|Route to Memory| ARC
ORC -->|Route to Web| RES
ARC -->|search/save| MEM
ARC -->|LLM calls| LLM
RES -->|web search| BRV
RES -->|LLM calls| LLM
MEM -->|sanitize| GRD
MEM -->|query/store| CHR
MEM -->|ingest| DOC
GRD -->|analyze PII| MEM
ORC -.->|trace| TEL
ARC -.->|trace| TEL
RES -.->|trace| TEL
style USER fill:#e1f5ff
style ORC fill:#fff4e6
style ARC fill:#fff4e6
style RES fill:#fff4e6
style MEM fill:#e8f5e9
style GRD fill:#e8f5e9
style TEL fill:#e8f5e9
style CHR fill:#f3e5f5
style DOC fill:#f3e5f5
style LLM fill:#fce4ec
style BRV fill:#fce4ec
```

Key Components:
- Orchestrator Agent: Routes queries to either the Archivist (local memory) or the Researcher (web search); a minimal sketch of this pattern follows this list
- Archivist Agent: Manages personal knowledge base with PII-sanitized storage and retrieval
- Researcher Agent: Searches the web using Brave Search MCP server
- Memory Tool: Handles markdown parsing, chunking, and vector database operations
- PII Guardrail: Uses Presidio to detect and redact sensitive information before storage
- Telemetry: Distributed tracing with Logfire/OpenTelemetry for observability
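The routing pattern above can be sketched in a few lines of PydanticAI. This is illustrative only: the prompt and tool bodies are stubs, and the real logic lives in `agents/orchestrator.py`.

```python
# Hypothetical routing sketch; not the project's actual orchestrator.
from pydantic_ai import Agent

orchestrator = Agent(
    "anthropic:claude-haiku-4-5",
    system_prompt=(
        "Answer questions about the user's own notes with search_memory; "
        "use web_search when fresh external information is needed."
    ),
)


@orchestrator.tool_plain
def search_memory(query: str) -> str:
    """Search the local, PII-sanitized knowledge base (the Archivist's job)."""
    return "stub: matching chunks from ChromaDB"


@orchestrator.tool_plain
def web_search(query: str) -> str:
    """Search the web (the Researcher's job, via the Brave Search MCP server)."""
    return "stub: results from Brave Search"


result = orchestrator.run_sync("What did I write about deployment?")
print(result.output)  # older pydantic-ai releases expose this as result.data
```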
## Project Structure

```text
.
├── agents/
│   ├── orchestrator.py      # Orchestrator agent (router)
│   ├── archivist.py         # Archivist agent (memory)
│   ├── researcher.py        # Researcher agent (web search)
│   ├── *_evals.py           # Agent evaluation tests
│   └── __init__.py
├── memory.py                # MemoryTool class with PII sanitization
├── guardrails.py            # PIIGuardrail using Presidio
├── otel.py                  # Telemetry configuration
├── Makefile                 # Build and run commands
├── pyproject.toml           # Project dependencies
└── README.md
```
## PII Sanitization

The system automatically detects and redacts:

- Email addresses → `<EMAIL>`
- Phone numbers → `<PHONE_NUM>`
- Person names → `<PERSON>`
- Other sensitive data → `<REDACTED>`

Example:

```text
Input:  "Call John Doe at 555-0123 or email john.doe@example.com"
Output: "Call <PERSON> at <PHONE_NUM> or email <EMAIL>"
```
## Configuration

Create a `.env` file in the project root:

```bash
# Required
ANTHROPIC_API_KEY=your-anthropic-api-key-here
BRAVE_API_KEY=your-brave-api-key-here
# Optional (defaults are set in agents/__init__.py)
DEFAULT_LLM_MODEL=anthropic:claude-haiku-4-5
TOKENIZERS_PARALLELISM=true
# For Opik tracing (optional)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5173/api/v1/private/otel
```

Or set them manually:

```bash
export ANTHROPIC_API_KEY="your-anthropic-api-key-here"
export BRAVE_API_KEY="your-brave-api-key-here"
```
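For scripts that bypass the Makefile, the same variables can be loaded from Python. The sketch below assumes python-dotenv, which is not a stated dependency of this project:

```python
# Hypothetical .env loading; python-dotenv is an assumption here.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

assert os.environ.get("ANTHROPIC_API_KEY"), "ANTHROPIC_API_KEY is required"
model = os.environ.get("DEFAULT_LLM_MODEL", "anthropic:claude-haiku-4-5")
print(f"Using model: {model}")
```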
## Evaluations

Run agent evaluations (tests all three agents):

```bash
make evals
```

This will run:
- `agents/archivist_evals.py` - Archivist agent tests
- `agents/researcher_evals.py` - Researcher agent tests
- `agents/orchestrator_evals.py` - Orchestrator agent tests
## Code Quality

```bash
make format # Format code with black and isort
make lint # Check code with ruff
make fix-lint # Auto-fix linting issues
make check # Run all checks (lint + mypy)
```

## Cleaning

```bash
# Clean generated files only
make clean

# Clean ChromaDB data
make clean-chroma

# Clean everything
make clean && make clean-chroma
```

## Tech Stack

- Pydantic AI: Agent framework
- Pydantic Evals: Evaluation framework
- Brave Search MCP: MCP server for Brave Search
- ChromaDB: Vector database for RAG and memory
- Presidio: PII detection and anonymization for Guardrails
- spaCy: NLP for entity recognition
- Claude (Anthropic): Language model
- uv: Fast Python package manager
- Opik: Experiment tracking and evaluation platform (optional)
## Running Opik Locally (Optional)

This project supports Opik for experiment tracking and evaluation. You can run Opik locally using Docker Compose.

Prerequisites:

- Docker and Docker Compose installed

Clone the Opik repository (outside this project), then start the platform:

```bash
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git

# Navigate to the opik folder
cd opik

# Start the Opik platform
./opik.sh

# Stop the Opik platform
./opik.sh --stop
```

Opik will be available at http://localhost:5173.
Once Opik is running, your evaluation traces will automatically be logged to the local Opik instance at http://localhost:5173.
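`otel.py` is not reproduced here, but a standard OpenTelemetry setup that exports spans to Opik's private OTLP endpoint would look roughly like the sketch below. Whether the project routes through Logfire instead, and the exact traces path, are assumptions.

```python
# Hedged sketch of OTLP-over-HTTP span export to a local Opik instance.
import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

base = os.environ.get(
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "http://localhost:5173/api/v1/private/otel",
)
provider = TracerProvider()
# The HTTP exporter takes a full URL; appending the conventional "/v1/traces"
# path to the base endpoint is an assumption about Opik's ingest route.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint=base + "/v1/traces"))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("secondbrain")
with tracer.start_as_current_span("demo-span"):
    pass  # spans created inside this block are batched and exported to Opik
```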
## License

GPLv3
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.