feat: add FTMS (File/Text Management System) #8

Open
modpunk wants to merge 9 commits into openagen:main from modpunk:main

@modpunk commented Feb 22, 2026

Summary

  • FTMS module (src/ftms/): file upload, storage, SQLite FTS5 indexing, text extraction, and media description
  • Gateway integration: 5 new endpoints (/upload, /files, /files/search, /files/{id}, /files/{id}/download) with configurable body limit for uploads
  • Web UI: file attach button with upload progress, file bubbles with icons/previews, and proxy passthrough for binary responses
  • Config: FtmsConfig with enabled, max_upload_size_mb, storage_dir, auto_describe options
  • Bug fixes: fixed 5 pre-existing build errors (unguarded PostgresMemory, missing futures-util alloc feature, duplicate chat fn in reliable.rs, futures→futures_util imports)
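For reviewers who have not opened the diff, here is a minimal sketch of what the `FtmsConfig` listed above might look like. Only the four field names come from the summary; the field types, the default values, and the `max_upload_bytes` helper are assumptions for illustration, not the PR's actual code.

```rust
use std::path::PathBuf;

/// Hypothetical shape of the FTMS config section.
/// Field names follow the PR summary; types and defaults are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct FtmsConfig {
    /// Turns the FTMS module and its gateway endpoints on or off.
    pub enabled: bool,
    /// Upper bound for a single upload, in mebibytes.
    pub max_upload_size_mb: u64,
    /// Root directory for stored files ("~" is not expanded here).
    pub storage_dir: PathBuf,
    /// Whether media files get an automatic description on upload.
    pub auto_describe: bool,
}

impl Default for FtmsConfig {
    fn default() -> Self {
        Self {
            enabled: false,                                   // assumed default
            max_upload_size_mb: 25,                           // assumed default
            storage_dir: PathBuf::from("~/.zeroclaw/files"), // path from the PR text
            auto_describe: true,                              // assumed default
        }
    }
}

impl FtmsConfig {
    /// Converts the configured limit to bytes, e.g. for a request body limit.
    pub fn max_upload_bytes(&self) -> u64 {
        self.max_upload_size_mb.saturating_mul(1024 * 1024)
    }
}
```

Keeping the limit in megabytes in the config and converting once at the gateway boundary is one way to make the "configurable body limit for uploads" explicit.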

Architecture

Files are stored at ~/.zeroclaw/files/YYYY/MM/DD/{uuid}.{ext} with metadata + full-text content indexed in SQLite FTS5. Text is auto-extracted from text/JSON/XML/PDF files; images get base64 data-URI descriptions. Search uses FTS5 ranking with optional MIME type and date filters.
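The date-organized layout described above can be sketched as follows. The function name `dated_file_path` and its parameters are hypothetical; the real code in `src/ftms/storage.rs` may derive the date and the UUID differently.

```rust
use std::path::{Path, PathBuf};

/// Builds `<root>/YYYY/MM/DD/<id>.<ext>` (hypothetical helper).
/// The caller supplies the date parts and the id (a UUID string in the PR),
/// so this sketch needs no clock or UUID dependency.
pub fn dated_file_path(
    root: &Path,
    year: u16,
    month: u8,
    day: u8,
    id: &str,
    ext: &str,
) -> PathBuf {
    // Zero-pad each date component so directory names sort chronologically.
    let mut path = root
        .join(format!("{year:04}"))
        .join(format!("{month:02}"))
        .join(format!("{day:02}"));

    // Files without an extension are stored under the bare id.
    let file_name = if ext.is_empty() {
        id.to_string()
    } else {
        format!("{id}.{ext}")
    };
    path.push(file_name);
    path
}
```

Naming files by generated id rather than by the uploaded filename keeps user-controlled strings out of the on-disk path.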

New files

  • src/ftms/mod.rs — FtmsService orchestrator
  • src/ftms/schema.rs — FileRecord, FileMetadata, FileSearchResult, FileListResponse
  • src/ftms/storage.rs — Date-organized file storage
  • src/ftms/index.rs — SQLite FTS5 index with content-sync triggers
  • src/ftms/extract.rs — Text extraction (UTF-8, PDF)
  • src/ftms/describe.rs — Media description (image base64, audio/video metadata)
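As an illustration of the image handling attributed to `describe.rs`, the sketch below turns raw bytes into a base64 data URI using only the standard library. The function names are hypothetical, and the PR itself may well use a base64 crate instead of a hand-written encoder.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 (RFC 4648) with `=` padding.
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = u32::from(chunk[0]);
        let b1 = u32::from(chunk.get(1).copied().unwrap_or(0));
        let b2 = u32::from(chunk.get(2).copied().unwrap_or(0));
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, replacing unused ones with padding.
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Wraps encoded bytes as `data:<mime>;base64,<payload>` (hypothetical helper).
pub fn data_uri(mime: &str, bytes: &[u8]) -> String {
    format!("data:{mime};base64,{}", base64_encode(bytes))
}
```

Base64 inflates the payload by roughly a third, which is worth keeping in mind if these descriptions are stored in the index alongside the file itself.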

Test plan

  • Release build succeeds on RPi4 (12m 18s, 11MB binary)
  • Upload endpoint returns FileRecord with extracted text
  • FTS5 search returns ranked results
  • List endpoint returns paginated results
  • Download endpoint returns exact file bytes (verified with diff)
  • Web UI attach button uploads and renders file bubbles

🤖 Generated with Claude Code

modpunk and others added 9 commits February 21, 2026 23:05

Defines architecture for file upload, storage, text extraction,
AI-powered media description, and FTS5 full-text search indexing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Detailed step-by-step plan covering config, schema, storage, index,
extraction, description, gateway routes, web UI, and proxy changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@github-actions

Thanks for contributing to ZeroClaw.

For faster review, please ensure:

  • PR template sections are fully completed
  • cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test are included
  • If automation/agents were used heavily, add brief workflow notes
  • Scope is focused (prefer one concern per PR)

See CONTRIBUTING.md and docs/pr-workflow.md for full collaboration rules.

github-actions bot added labels on Feb 22, 2026:

  • agent (Auto scope: src/agent/** changed.)
  • memory (Auto scope: src/memory/** changed.)
  • provider (Auto scope: src/providers/** changed.)
  • docs (Auto scope: docs/markdown/template files changed.)
  • dependencies (Auto scope: dependency manifest/lock/policy changed.)
  • core (Auto scope: root src/*.rs files changed.)
  • config (Auto scope: src/config/** changed.)
  • gateway (Auto scope: src/gateway/** changed.)
  • onboard (Auto scope: src/onboard/** changed.)

@github-actions

PR intake checks found warnings (non-blocking)

Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.

  • Missing required PR template sections: ## Validation Evidence (required), ## Security Impact (required), ## Privacy and Data Hygiene (required), ## Rollback Plan (required)
  • Incomplete required PR template fields: summary problem, summary why it matters, summary what changed, validation commands, security risk/mitigation, privacy status, rollback plan

Action items:

  1. Complete required PR template sections/fields.
  2. Remove tabs, trailing whitespace, and merge conflict markers from added lines.
  3. Re-run local checks before pushing:
    • ./scripts/ci/rust_quality_gate.sh
    • ./scripts/ci/rust_strict_delta_gate.sh
    • ./scripts/ci/docs_quality_gate.sh

Run logs: https://github.com/openagen/zeroclaw/actions/runs/22280116181

Detected blocking line issues (sample):

  • none

Detected advisory line issues (sample):

  • none

github-actions bot added and removed labels on Feb 22, 2026:

  • added size: XL (Auto size: >1000 non-doc changed lines.)
  • added risk: high (Auto risk: security/runtime/gateway/tools/workflows.)
  • added memory: cli (Auto module: memory/cli changed.)
  • added provider: reliable (Auto module: provider/reliable changed.)
  • added config: core (Auto module: config core files changed.)
  • removed memory (Auto scope: src/memory/** changed.)
  • removed provider (Auto scope: src/providers/** changed.)
  • removed config (Auto scope: src/config/** changed.)
  • removed gateway (Auto scope: src/gateway/** changed.)
  • removed onboard (Auto scope: src/onboard/** changed.)

github-actions bot added labels on Feb 22, 2026:

  • gateway: core (Auto module: gateway core files changed.)
  • onboard: wizard (Auto module: onboard/wizard changed.)

@caiqinghua (Collaborator)

Thanks for contributing to ZeroClaw.

  1. Add authentication middleware to all /files/* and /upload endpoints
  2. Add file validation:
    • Size limit (recommend 10-50 MB)
    • Content type allowlist (e.g., text, PDF, images; no executables)
    • Magic byte validation
  3. Add storage quota (per-token or global)
  4. Add rate limiting to upload endpoint (prevent DoS)
  5. Document security model in docs/security/file-upload-threat-model.md
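
To make the validation request in item 2 concrete, here is one possible sketch of a size limit, content-type allowlist, and magic-byte check. Everything in it is an assumption for discussion: the `UploadError` enum, the function names, and the specific allowlist are not taken from the PR or from ZeroClaw. Only the file signatures (PDF, PNG, JPEG, GIF) are standard.

```rust
/// Hypothetical rejection reasons for an upload.
#[derive(Debug, PartialEq)]
pub enum UploadError {
    TooLarge,
    TypeNotAllowed,
    MagicMismatch,
}

/// Detects a MIME type from well-known leading bytes, if any match.
fn sniff_mime(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(b"%PDF-") {
        Some("application/pdf")
    } else if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        Some("image/png")
    } else if bytes.starts_with(b"\xFF\xD8\xFF") {
        Some("image/jpeg")
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("image/gif")
    } else {
        None
    }
}

/// Checks size first, then the declared type against an allowlist,
/// then that the content actually matches the declared type.
pub fn validate_upload(
    declared_mime: &str,
    bytes: &[u8],
    max_bytes: usize,
) -> Result<(), UploadError> {
    if bytes.len() > max_bytes {
        return Err(UploadError::TooLarge);
    }
    let content_matches = match declared_mime {
        // Text formats have no magic bytes; require valid UTF-8 instead.
        "text/plain" | "application/json" => std::str::from_utf8(bytes).is_ok(),
        // Binary formats must carry the signature of the declared type.
        "application/pdf" | "image/png" | "image/jpeg" | "image/gif" => {
            sniff_mime(bytes) == Some(declared_mime)
        }
        // Anything else, including executables, is rejected outright.
        _ => return Err(UploadError::TypeNotAllowed),
    };
    if content_matches {
        Ok(())
    } else {
        Err(UploadError::MagicMismatch)
    }
}
```

Checking the size before inspecting content keeps the cheap rejection first; in a real gateway the body limit would also need to be enforced while streaming, before the whole upload is buffered.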

@github-actions

Hi @modpunk, friendly automation nudge from PR hygiene.

This PR has had no new commits for 53h and still needs an update before merge:

  • No CI Required Gate run was found for the current head commit.

Recommended next steps

  1. Rebase your branch on main.
  2. Push the updated branch and re-run checks (or use Re-run failed jobs).
  3. Post fresh validation output in this PR thread.

Maintainers: apply no-stale to opt out for accepted-but-blocked work.
Head SHA: c61523cf83b1
