@techiejd (Owner) commented Dec 9, 2025

Add bulk embedding functionality with a new job, run-tracking collection, endpoint, and admin "Embed all" button to enable efficient backfilling and flexible ingest modes.

  • New payloadcms-vectorize:bulk-embed-all task handles selecting missing-version docs, delegating to provider callbacks (or the in-process fallback), polling, and reconciling embeddings.
  • ingestMode now accepts bulk, which skips realtime vectorization, clears stale embeddings on updates, and relies on the bulk pipeline to re-embed.
  • Knowledge pool collections render an admin "Embed all" control that posts to /api/vector-bulk-embed to queue a run.
  • Registered the new run collection vector-bulk-embeddings-runs, added the bulk endpoint, updated dev config/helpers for bulk callbacks, and documented ingestMode + bulk callbacks.
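
The select, delegate, poll, reconcile flow described above could look roughly like the sketch below. It is not the plugin's code: every name in it (Doc, BulkProvider, inProcessProvider, bulkEmbedAll, the status strings) is hypothetical, and it assumes "missing-version" means a doc whose stored embedding is absent or tagged with a different version.

```typescript
// Hypothetical sketch of a bulk-embed run; none of these names come from the plugin.
type Doc = { id: string; text: string };
type Embedding = { docId: string; vector: number[]; version: string };

interface BulkProvider {
  // Submit docs for embedding and return a batch handle.
  submit(docs: Doc[]): Promise<string>;
  // Return vectors keyed by doc id once ready, or null while still running.
  poll(batchId: string): Promise<Map<string, number[]> | null>;
}

// In-process fallback: embeds at submit time, so the first poll already succeeds.
function inProcessProvider(embed: (text: string) => number[]): BulkProvider {
  const finished = new Map<string, Map<string, number[]>>();
  let counter = 0;
  return {
    async submit(docs) {
      const batchId = `local-${counter++}`;
      const vectors = new Map<string, number[]>();
      docs.forEach((d) => vectors.set(d.id, embed(d.text)));
      finished.set(batchId, vectors);
      return batchId;
    },
    async poll(batchId) {
      return finished.get(batchId) ?? null;
    },
  };
}

async function bulkEmbedAll(
  docs: Doc[],
  store: Map<string, Embedding>,
  version: string,
  provider: BulkProvider,
  maxPolls = 10,
  delayMs = 0,
): Promise<{ status: "succeeded" | "timed_out"; embedded: number }> {
  // 1. Select docs with no embedding stored for the current version.
  const pending = docs.filter((d) => store.get(d.id)?.version !== version);
  if (pending.length === 0) return { status: "succeeded", embedded: 0 };

  // 2. Delegate the batch to the provider (or the in-process fallback).
  const batchId = await provider.submit(pending);

  // 3. Poll until the provider reports results or the poll budget runs out.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const result = await provider.poll(batchId);
    if (result) {
      // 4. Reconcile: write each returned vector under the current version.
      result.forEach((vector, docId) => {
        store.set(docId, { docId, vector, version });
      });
      return { status: "succeeded", embedded: result.size };
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return { status: "timed_out", embedded: 0 };
}
```

Keeping the provider behind a two-method interface is one way to let the same job body serve both an external batch API and the in-process fallback. The actual callbacks in this PR may be shaped differently.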


Co-authored-by: jdavid10001 <jdavid10001@gmail.com>
@cursor (bot) commented Dec 9, 2025

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

@techiejd (Owner, Author) commented Dec 9, 2025

@cursoragent please start with changes. Looking forward to seeing your work.

@cursor (bot) commented Dec 9, 2025

Thanks for the go-ahead! I’ve implemented the bulk-embed feature set:

  • Added bulk ingest mode and job (payloadcms-vectorize:bulk-embed-all) with provider callback support and in-process fallback.
  • New run-tracking collection (vector-bulk-embeddings-runs) plus /api/vector-bulk-embed endpoint and admin “Embed all” button on knowledge pool lists.
  • Ingest routing now respects ingestMode (realtime vs bulk), clearing stale embeddings on updates in bulk mode.
  • Docs updated for ingestMode and bulk callbacks; added bulk ingest tests and updated config specs.
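
As a rough illustration of the ingestMode routing mentioned above, here is a minimal sketch. It assumes that bulk mode drops a doc's existing embedding on update and defers re-embedding to the bulk job. The names routeIngest, Change, and EmbeddingStore are hypothetical and are not taken from the plugin.

```typescript
// Hypothetical sketch of ingest routing; not the plugin's actual hook code.
type IngestMode = "realtime" | "bulk";
type Change = { docId: string; text: string; operation: "create" | "update" };

// Stand-in for the embeddings collection: doc id -> vector.
interface EmbeddingStore {
  [docId: string]: number[] | undefined;
}

function routeIngest(
  mode: IngestMode,
  change: Change,
  store: EmbeddingStore,
  embed: (text: string) => number[],
): "embedded" | "deferred" {
  if (mode === "realtime") {
    // Realtime: vectorize as part of the write.
    store[change.docId] = embed(change.text);
    return "embedded";
  }
  // Bulk: an update makes the stored embedding stale, so clear it and
  // leave re-embedding to the next bulk run.
  if (change.operation === "update") {
    delete store[change.docId];
  }
  return "deferred";
}
```

Clearing on update means a search never returns a vector computed from outdated text. The cost is that the doc stays unsearchable until the next bulk run completes.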

Tests run: pnpm vitest dev/specs/config.spec.ts. A DB-dependent bulk test was added but was not run here.

If you’d like, I can run the full DB-backed bulk spec suite or prepare a commit/PR summary.


@techiejd self-assigned this Dec 16, 2025
@techiejd added the enhancement label (New feature or request) Dec 16, 2025

3 participants