ralph: #29 — Implement LLM-judge evaluator for answer comparison #56
Open
jharris1679 wants to merge 39 commits into main from ralph/issue-29
Commits (39), all by jharris1679:
396e04c ralph: work on #29 (iter 1)
5ff7b67 ralph: work on #29 (iter 2)
90afa8e ralph: work on #29 (iter 3)
8db2cf9 ralph: work on #29 (iter 4)
dca0a62 ralph: work on #29 (iter 5)
5e68155 ralph: work on #29 (iter 6)
591e66a ralph: work on #29 (iter 7)
1907aee ralph: work on #29 (iter 8)
3c15eea fix: resolve syntax errors in llm-judge.ts and runner.ts (#29)
bfd1865 ralph: work on #29 (iter 10)
fe5fdbe ralph: work on #29 (iter 11)
51cc8a0 ralph: work on #29 (iter 14)
6201c1c fix: resolve type errors in llm-judge and runner (#29)
accb89b fix: resolve TypeScript build errors in llm-judge and runner (#29)
202890f ralph: work on #29 (iter 17)
ba89155 ralph: work on #29 (iter 24)
731e6ef ralph: work on #29 (iter 25)
ba82825 ralph: work on #29 (iter 26)
8aa5584 ralph: work on #29 (iter 27)
d3124fe ralph: work on #29 (iter 28)
a356603 ralph: work on #29 (iter 29)
22968cb ralph: work on #29 (iter 30)
f8a4c81 ralph: work on #29 (iter 31)
e8dbf84 fix: resolve TypeScript type errors in opencode agent (#29)
fead796 ralph: work on #29 (iter 33)
48d3025 ralph: work on #29 (iter 35)
9ce33a0 ralph: work on #29 (iter 36)
58596fd ralph: work on #29 (iter 37)
b5825e1 ralph: work on #29 (iter 38)
57d57fb ralph: work on #29 (iter 41)
49343b6 ralph: work on #29 (iter 42)
0837c90 ralph: work on #29 (iter 43)
5e5c120 ralph: work on #29 (iter 46)
3573088 ralph: work on #29 (iter 48)
0e61dac ralph: work on #29 (iter 49)
17db07f ralph: work on #29 (iter 50)
a6df782 ralph: work on #29 (iter 51)
e92fea0 ralph: work on #29 (iter 52)
a039b62 ralph: work on #29 (iter 53)
Potential duplicate answer content from `message.updated` events.

When streaming deltas are received (lines 229-233), `answer` accumulates text. Then `message.updated` (lines 341-346) unconditionally appends text from the full message parts. This can double the captured answer. The fallback at lines 373-392 already handles the "no streaming data" case. Guard this section.
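The committable suggestion did not survive on this page, so here is a minimal sketch of what such a guard could look like. It is not the code from the PR: the `StreamEvent` shape, the `"delta"` event name, and the `collectAnswer` helper are hypothetical stand-ins, and the real event types in the agent may differ. The idea is only that the full-message text is used when no streaming delta has been captured.

```typescript
// Hypothetical event shapes for illustration; the actual agent events may differ.
type MessagePart = { type: string; text?: string };
type StreamEvent =
  | { type: "delta"; text: string }
  | { type: "message.updated"; parts: MessagePart[] };

// Builds the answer from a sequence of events without double-counting text.
function collectAnswer(events: StreamEvent[]): string {
  let answer = "";
  let sawDelta = false;

  for (const event of events) {
    if (event.type === "delta") {
      // Streaming path: accumulate incremental text.
      answer += event.text;
      sawDelta = true;
    } else if (!sawDelta) {
      // Guard: use the full message only when nothing was streamed.
      // Assigning (not appending) also keeps repeated updates from stacking.
      answer = event.parts
        .filter((part) => part.type === "text")
        .map((part) => part.text ?? "")
        .join("");
    }
  }
  return answer;
}
```

In this sketch the guard replaces the answer instead of appending to it, so several `message.updated` events in a row still yield a single copy of the text. If the existing fallback at lines 373-392 is kept, simply skipping the append when streamed text exists would be the smaller change.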