netologist/secondbrain-ai

Second Brain Multi-Agent AI System

A comprehensive second-brain system that combines intelligent multi-agent orchestration, privacy-first document archiving with PII sanitization, and seamless web research capabilities using PydanticAI, ChromaDB, and Brave Search integration.

Features

  • 🤖 Multi-Agent System: Orchestrator coordinates between Archivist (local memory) and Researcher (web search)
  • 🔒 PII Sanitization: Automatic detection and redaction of sensitive information using Presidio
  • 📚 Document Ingestion: Parse and index markdown files with header-based chunking
  • 🔍 Intelligent Routing: Query routing based on content (personal notes vs. web search)
  • 💾 Vector Database: ChromaDB for efficient similarity search
  • 🌐 Web Search Integration: Brave Search via MCP server for external queries
  • 🛡️ Privacy-First: All PII is removed before storage
  • 📊 Observability: Built-in telemetry with Logfire/OpenTelemetry
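
The header-based chunking mentioned above can be sketched as splitting each markdown file at its headings. This is a simplified illustration; the actual logic lives in memory.py and may differ:

```python
import re

def chunk_by_headers(markdown: str):
    """Split markdown into (header, body) chunks, one per heading."""
    chunks, header, body = [], None, []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line):
            # Flush the previous chunk before starting a new one.
            if header is not None or body:
                chunks.append((header, "\n".join(body).strip()))
            header, body = line.strip(), []
        else:
            body.append(line)
    chunks.append((header, "\n".join(body).strip()))
    return chunks
```

Each (header, body) pair then becomes one document in the vector store, so retrieval returns coherent sections rather than arbitrary text windows.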

Prerequisites

  • Python 3.13+
  • uv package manager
  • Node.js (for Brave Search MCP server)
  • Anthropic API key for Claude
  • Brave Search API key (for web search functionality)

Installation

# Install dependencies and download spaCy model
make install

Or manually:

uv sync
uv run spacy download en_core_web_lg

Usage

Quick Start

Run the orchestrator (recommended; it can route to both agents):

make run-orchestrator

Or run individual agents:

make run-archivist   # Local memory search only
make run-researcher  # Web search only

Available Commands

make help              # Show all available commands
make run-orchestrator  # Run the orchestrator agent (routes queries)
make run-archivist     # Run the archivist agent (memory only)
make run-researcher    # Run the researcher agent (web search only)
make evals             # Run evaluation tests for all agents
make install           # Install dependencies
make clean             # Clean up generated files
make clean-chroma      # Clean ChromaDB persistent data
make format            # Format code with black and isort
make lint              # Lint code with ruff
make fix-lint          # Auto-fix linting issues
make check             # Run all checks (lint + type check)

Manual Usage

from memory import MemoryTool

# Initialize memory tool
memory = MemoryTool()

# Create sample documents
memory.create_sample_docs()

# Ingest documents with PII sanitization
memory.ingest_folder()

# Search for information
results = memory.collection.query(
    query_texts=["deployment guide"],
    n_results=1
)
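
ChromaDB's query() returns parallel lists keyed by field, with one inner list per query text. A sketch of unpacking the hits; the results dict below is simulated so the snippet stands alone, but in the project it comes from memory.collection.query:

```python
# Simulated ChromaDB query() result: each field holds one inner
# list per query text.
results = {
    "documents": [["## Deployment Guide\nRun make deploy ..."]],
    "metadatas": [[{"source": "deployment.md"}]],
    "distances": [[0.21]],
}

def top_hits(results, query_index=0):
    """Zip one query's documents, metadata, and distances into hit tuples."""
    return list(zip(results["documents"][query_index],
                    results["metadatas"][query_index],
                    results["distances"][query_index]))

for doc, meta, dist in top_hits(results):
    print(f"{dist:.3f}  {meta['source']}  {doc.splitlines()[0]}")
```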

Architecture

C4 Component Diagram

graph TB
    subgraph "Second Brain System"
        subgraph "Agent Layer"
            ORC[Orchestrator Agent<br/>Decision Router]
            ARC[Archivist Agent<br/>Memory Manager]
            RES[Researcher Agent<br/>Web Search]
        end
        
        subgraph "Core Components"
            MEM[Memory Tool<br/>Vector DB Interface]
            GRD[PII Guardrail<br/>Presidio]
            TEL[Telemetry<br/>Logfire/OTEL]
        end
        
        subgraph "Data Layer"
            CHR[(ChromaDB<br/>Vector Store)]
            DOC[(.secondbrain/<br/>Markdown Docs)]
        end
        
        subgraph "External Services"
            LLM[Claude API<br/>Anthropic]
            BRV[Brave Search<br/>MCP Server]
        end
    end
    
    USER[User Query] --> ORC
    
    ORC -->|Route to Memory| ARC
    ORC -->|Route to Web| RES
    
    ARC -->|search/save| MEM
    ARC -->|LLM calls| LLM
    
    RES -->|web search| BRV
    RES -->|LLM calls| LLM
    
    MEM -->|sanitize| GRD
    MEM -->|query/store| CHR
    MEM -->|ingest| DOC
    
    GRD -->|analyze PII| MEM
    
    ORC -.->|trace| TEL
    ARC -.->|trace| TEL
    RES -.->|trace| TEL
    
    style USER fill:#e1f5ff
    style ORC fill:#fff4e6
    style ARC fill:#fff4e6
    style RES fill:#fff4e6
    style MEM fill:#e8f5e9
    style GRD fill:#e8f5e9
    style TEL fill:#e8f5e9
    style CHR fill:#f3e5f5
    style DOC fill:#f3e5f5
    style LLM fill:#fce4ec
    style BRV fill:#fce4ec

Key Components:

  • Orchestrator Agent: Routes queries to either Archivist (local memory) or Researcher (web search)
  • Archivist Agent: Manages personal knowledge base with PII-sanitized storage and retrieval
  • Researcher Agent: Searches the web using Brave Search MCP server
  • Memory Tool: Handles markdown parsing, chunking, and vector database operations
  • PII Guardrail: Uses Presidio to detect and redact sensitive information before storage
  • Telemetry: Distributed tracing with Logfire/OpenTelemetry for observability
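
In the project the routing decision is made by the LLM, but its shape can be sketched as a plain function. This is a hypothetical keyword heuristic for illustration, not the actual prompt-driven logic:

```python
def route(query: str) -> str:
    """Toy stand-in for the orchestrator's routing decision."""
    personal_markers = ("my notes", "i wrote", "remind me", "my meeting")
    if any(marker in query.lower() for marker in personal_markers):
        return "archivist"   # answer from the local vector store
    return "researcher"      # fall back to web search

print(route("What did I write in my notes about deployment?"))  # archivist
print(route("Latest ChromaDB release?"))                        # researcher
```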

Project Structure

.
├── agents/
│   ├── orchestrator.py     # Orchestrator agent (router)
│   ├── archivist.py        # Archivist agent (memory)
│   ├── researcher.py       # Researcher agent (web search)
│   ├── *_evals.py          # Agent evaluation tests
│   └── __init__.py
├── memory.py               # MemoryTool class with PII sanitization
├── guardrails.py           # PIIGuardrail using Presidio
├── otel.py                 # Telemetry configuration
├── Makefile                # Build and run commands
├── pyproject.toml          # Project dependencies
└── README.md

PII Protection

The system automatically detects and redacts:

  • 📧 Email addresses → <EMAIL>
  • 📞 Phone numbers → <PHONE_NUM>
  • 👤 Person names → <PERSON>
  • 🔐 Other sensitive data → <REDACTED>

Example:

Input:  "Call John Doe at 555-0123 or email john.doe@example.com"
Output: "Call <PERSON> at <PHONE_NUM> or email <EMAIL>"
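
A minimal regex sketch of this redaction behavior. The real guardrail uses Presidio's NER models, which is what catches person names; regexes alone cannot:

```python
import re

def redact(text: str) -> str:
    """Pattern-based redaction for emails and phone numbers only."""
    # Emails first, so digits inside an address are not phone-tagged.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "<EMAIL>", text)
    text = re.sub(r"\b\d{3}-\d{4}\b", "<PHONE_NUM>", text)
    return text

print(redact("Call me at 555-0123 or email john.doe@example.com"))
# Call me at <PHONE_NUM> or email <EMAIL>
```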

Environment Variables

Create a .env file in the project root:

# Required
ANTHROPIC_API_KEY=your-anthropic-api-key-here
BRAVE_API_KEY=your-brave-api-key-here

# Optional (defaults are set in agents/__init__.py)
DEFAULT_LLM_MODEL=anthropic:claude-haiku-4-5
TOKENIZERS_PARALLELISM=true

# For Opik tracing (optional)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5173/api/v1/private/otel

Or set them manually:

export ANTHROPIC_API_KEY="your-anthropic-api-key-here"
export BRAVE_API_KEY="your-brave-api-key-here"
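
Tools like python-dotenv load the file above automatically; for illustration, a minimal stdlib-only loader for plain KEY=VALUE lines (no quoting or interpolation) might look like:

```python
import os

def load_env(path: str = ".env") -> None:
    """Load KEY=VALUE lines into os.environ, skipping comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # Values already set in the shell take precedence over the file.
            os.environ.setdefault(key.strip(), value.strip())
```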

Development

Running Evaluations

Run agent evaluations (tests all three agents):

make evals

This will run:

  • agents/archivist_evals.py - Archivist agent tests
  • agents/researcher_evals.py - Researcher agent tests
  • agents/orchestrator_evals.py - Orchestrator agent tests

Code Quality

make format    # Format code with black and isort
make lint      # Check code with ruff
make fix-lint  # Auto-fix linting issues
make check     # Run all checks (lint + mypy)

Cleaning Up

# Clean generated files only
make clean

# Clean ChromaDB data
make clean-chroma

# Clean everything
make clean && make clean-chroma

Technology Stack

  • Pydantic AI: Agent framework
  • Pydantic Evals: Evaluation framework
  • Brave Search MCP: MCP server exposing Brave web search as a tool
  • ChromaDB: Vector database for RAG and memory
  • Presidio: PII detection and anonymization for Guardrails
  • spaCy: NLP for entity recognition
  • Claude (Anthropic): Language model
  • uv: Fast Python package manager
  • Opik: Experiment tracking and evaluation platform (optional)

Opik Integration (for Local Deployment)

This project supports Opik for experiment tracking and evaluation. You can run Opik locally using Docker Compose.

Prerequisites for Opik

  • Docker and Docker Compose installed
  • Clone the Opik repository (outside this project):
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git

# Navigate to the opik folder
cd opik

# Start the Opik platform
./opik.sh

# Stop the Opik platform
./opik.sh --stop

Opik will be available at http://localhost:5173

Using Opik with Evaluations

Once Opik is running, your evaluation traces will automatically be logged to the local Opik instance at http://localhost:5173.
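
For a one-off run you can also point the OTLP exporter at the local endpoint in the shell instead of the .env file (endpoint shown is the local default from the Environment Variables section above):

```shell
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5173/api/v1/private/otel
make evals
```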

License

GPLv3

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
