feat: Semantic product search with LLM evaluation tests #64
Merged
- Implement semantic product search using Weaviate text_vector
- Add LLM-based search query extraction from conversation context
- Create LLM-as-judge evaluation framework for e2e testing
- Add translations for product search responses (en/pl)

Product Search:
- searchProductIdsInWeaviate for semantic search via nearText
- productsNode extracts query from full conversation, not just last message
- Returns formatted product list with prices, categories, stock status

Evaluation Framework:
- evaluator.ts: LLM-as-judge using Bielik model (score 1-5)
- conversationRunner.ts: Execute multi-turn conversations
- productSearch.e2e.test.ts: 11 scenarios (single-turn, multi-turn, edge cases)
- Separate vitest config for e2e tests (npm run test:eval)

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
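The commit message says productsNode extracts the search query from the full conversation rather than the last message. The sketch below illustrates one way the conversation could be flattened into a transcript for an extraction prompt. The `ChatMessage` type and `buildExtractionContext` function are hypothetical names chosen for illustration; the PR's actual types and prompt are not shown here.

```typescript
// Hypothetical message shape (assumption: the real project types may differ).
type ChatMessage = { role: "user" | "assistant"; content: string };

// Flatten the whole conversation into a role-prefixed transcript.
// Messages that are empty after trimming are skipped.
function buildExtractionContext(messages: ChatMessage[]): string {
  return messages
    .map((m) => ({ role: m.role, content: m.content.trim() }))
    .filter((m) => m.content.length > 0)
    .map((m) => `${m.role}: ${m.content}`)
    .join("\n");
}

// Usage: the last message alone ("Anything cheaper in red?") does not
// name a product; the transcript keeps the earlier "running shoes" turn.
const exampleConversation: ChatMessage[] = [
  { role: "user", content: "Do you have running shoes?" },
  { role: "assistant", content: "Yes, several models." },
  { role: "user", content: "Anything cheaper in red?" },
];
console.log(buildExtractionContext(exampleConversation));
```

Passing the transcript instead of only the final turn lets a follow-up like "Anything cheaper in red?" still resolve to the product mentioned earlier.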
Summary
- Implement semantic product search using Weaviate (text_vector)
Changes
Product Search
- searchProductIdsInWeaviate for semantic search via nearText on text_vector
- productsNode extracts search query from entire conversation (not just last message)
- getAgentTranslations helper for server-side translations
Evaluation Framework
- evaluator.ts: LLM-as-judge using Bielik model (scores 1-5 with criteria)
- conversationRunner.ts: Execute multi-turn conversations through chatGraph
- productSearch.e2e.test.ts: 11 test scenarios
- Separate vitest config for e2e tests (npm run test:eval)
Test plan
- npm test - unit tests pass (94 tests)
- npm run test:eval - e2e evaluation tests execute (~16-18/21 pass, some variance due to LLM non-determinism)
🤖 Generated with Claude Code
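The test plan notes that e2e results vary because the LLM judge is non-deterministic. As an illustration only (not something this PR is known to do), a 1-5 judge score could be parsed from several replies and the median compared against a threshold. The "Score: N" reply format, the function names, and the default threshold of 4 are all assumptions made for this sketch.

```typescript
// Assumption: the judge is prompted to answer with a line like "Score: 4".
// Returns the score (1-5), or null if no such label is found.
function parseJudgeScore(reply: string): number | null {
  const match = reply.match(/score\s*[:=]\s*([1-5])\b/i);
  return match ? Number(match[1]) : null;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Judge the same answer several times; pass if the median parsed score
// reaches minScore. Unparseable replies are ignored; if none parse, fail.
function passesJudge(replies: string[], minScore = 4): boolean {
  const scores = replies
    .map(parseJudgeScore)
    .filter((s): s is number => s !== null);
  return scores.length > 0 && median(scores) >= minScore;
}
```

Taking the median of repeated judgements means a single outlier reply does not flip a scenario's result, at the cost of extra model calls per scenario.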