feat: add test evidence checker for PR submissions (#61) #89
base: main
Implements automated test evidence checking for PRs in desktop and ComfyUI repos.
- Creates gh-test-evidence task to scan open PRs
- Uses GPT-4o-mini to analyze PR bodies for test evidence
- Posts warning comments when test explanations or visual proof are missing
- Auto-updates or deletes comments based on PR changes
- Follows the same comment pattern as ComfyUI_frontend workflow
Resolves #61
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
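The post/update/delete behavior described in this commit can be sketched as a pure decision function. This is a minimal illustration, not the code in gh-test-evidence.ts: the MARKER string, the ExistingComment and CommentAction types, and the planCommentAction name are all hypothetical, and the actual task may identify its own comments differently.

```typescript
// Hypothetical hidden marker used to recognize the bot's own comment.
// The marker actually used by the task is not shown in this PR page.
const MARKER = "<!-- test-evidence-check -->";

type ExistingComment = { id: number; body: string };

type CommentAction =
  | { kind: "create"; body: string }
  | { kind: "update"; id: number; body: string }
  | { kind: "delete"; id: number }
  | { kind: "none" };

// Decide what to do with the warning comment on one PR.
// `warning` is null when the PR body already contains enough test evidence.
function planCommentAction(
  comments: ExistingComment[],
  warning: string | null,
): CommentAction {
  const existing = comments.find((c) => c.body.includes(MARKER));

  if (warning === null) {
    // Evidence is present: remove a stale warning if one exists.
    return existing ? { kind: "delete", id: existing.id } : { kind: "none" };
  }

  const body = `${MARKER}\n${warning}`;
  if (!existing) return { kind: "create", body };
  if (existing.body === body) return { kind: "none" }; // nothing changed
  return { kind: "update", id: existing.id, body }; // avoid duplicate comments
}
```

Keeping the decision separate from the GitHub API calls means the create, update, and delete branches can be checked without any network mocking.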
Pull Request Overview
This PR implements automated test evidence checking for pull requests in the Comfy-Org/desktop and comfyanonymous/ComfyUI repositories. The solution uses GPT-4o-mini to analyze PR descriptions for test explanations, screenshots, and videos, then posts/updates/deletes warning comments based on what evidence is present.
Key changes:
- New automated task that scans open PRs and validates test evidence using AI
- Smart comment management system that updates existing comments instead of creating duplicates
- Database-backed state tracking to avoid redundant analysis
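One way to picture the state tracking that avoids redundant analysis is to remember a fingerprint of each PR body and skip the LLM call when it has not changed. The sketch below is an assumption about the approach, not the PR's implementation: it uses an in-memory Map as a stand-in for the database, and the AnalysisCache class, its method names, and the choice of an FNV-1a fingerprint are all hypothetical.

```typescript
// FNV-1a 32-bit hash over UTF-16 code units; enough for change detection,
// not meant as a cryptographic digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

type AnalysisState = { bodyHash: string; analyzedAt: number };

// In-memory stand-in for the database collection (assumption: the real task
// persists comparable state per PR URL).
class AnalysisCache {
  private readonly states = new Map<string, AnalysisState>();

  // True when the PR was never analyzed or its body changed since then.
  needsAnalysis(prUrl: string, body: string): boolean {
    const prev = this.states.get(prUrl);
    return prev === undefined || prev.bodyHash !== fnv1a(body);
  }

  // Record that the current body has been analyzed.
  markAnalyzed(prUrl: string, body: string, now: number = Date.now()): void {
    this.states.set(prUrl, { bodyHash: fnv1a(body), analyzedAt: now });
  }
}
```

A scheduled run would call needsAnalysis for each open PR and only invoke the model, then markAnalyzed, for the ones that return true.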
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| app/tasks/run-gh-tasks.ts | Registers the new test evidence task and reformats existing task entries for consistency |
| app/tasks/gh-test-evidence/gh-test-evidence.ts | Core implementation of the test evidence checker with OpenAI integration, comment management, and database persistence |
| app/tasks/gh-test-evidence/gh-test-evidence.spec.ts | Test suite structure with mocked dependencies for validating the task behavior |
…dd CI cleanup
- Extract main logic into runCorePingTask() function for better testability
- Add isCI check to properly close DB and exit in CI environments
- Add todo comment about deprecating custom webhook types in favor of @octokit/webhooks-types
- Add llm-api, @keyv/mongo, and @octokit/webhooks-types dependencies
- Remove trailing whitespace
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Resolve conflicts in coreping.ts by keeping new refactored structure
- Resolve conflicts in run-gh-tasks.ts by including all task imports
- Resolve conflicts in package.json by keeping both new dependencies
- Accept incoming bun.lock changes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Switch from gpt-4o-mini to gpt-5-mini for analyzing PR test evidence. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed typo 'Explaination' -> 'Explanation' throughout the codebase:
- Updated schema field name in TestEvidenceSchema
- Updated all references in code and tests
- Updated OpenAI prompt and JSON schema
- Updated warning message generation
Addresses review comments from Copilot.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
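To show how a schema result could feed the warning message generation mentioned above, here is a small sketch. The real TestEvidenceSchema fields are not visible on this page, so the field names hasTestExplanation, hasScreenshot, and hasVideo, the isTestEvidence guard, and the message wording are all hypothetical.

```typescript
// Hypothetical shape of the model's structured answer; the actual
// TestEvidenceSchema field names may differ.
type TestEvidence = {
  hasTestExplanation: boolean;
  hasScreenshot: boolean;
  hasVideo: boolean;
};

// Runtime check for untrusted JSON returned by the model.
function isTestEvidence(value: unknown): value is TestEvidence {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["hasTestExplanation"] === "boolean" &&
    typeof v["hasScreenshot"] === "boolean" &&
    typeof v["hasVideo"] === "boolean"
  );
}

// Build the warning text, or return null when nothing is missing.
function buildWarning(evidence: TestEvidence): string | null {
  const missing: string[] = [];
  if (!evidence.hasTestExplanation) missing.push("a test explanation");
  if (!evidence.hasScreenshot && !evidence.hasVideo) {
    missing.push("visual proof (screenshot or video)");
  }
  if (missing.length === 0) return null;
  return `This PR appears to be missing: ${missing.join(" and ")}.`;
}
```

Returning null for the "nothing missing" case gives the caller a single value to branch on when deciding whether an existing warning comment should be deleted.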
Pull Request Overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
snomiao
left a comment
Corrects zod version that was accidentally changed during merge from ^4.0.5 to ^4.0.0. This should resolve the Vercel build failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…ubRepoUrl The function parseUrlRepoOwner does not exist in @/src/parseOwnerRepo. The correct function name is parseGithubRepoUrl. This fixes the TypeScript compilation error that was causing the Vercel build to fail. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…or code formatting
Socket for GitHub: No dependency changes detected in pull request.
Replace bun:mock with MSW (Mock Service Worker) for more realistic HTTP mocking:
- Mock GitHub API endpoints (pulls, comments) and OpenAI API
- Add proper MSW server lifecycle (beforeAll, afterEach, afterAll)
- Mock database module to avoid MongoDB connection in tests
- All tests passing (4/4)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…nation' Address Copilot review feedback to use singular form 'explanation' for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Use the centralized MSW setup from @/src/test/msw-setup instead of duplicating server configuration. This addresses the review comment to use the unified MSW setup. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed spelling from 'explanations' to 'explanation' in b963f43
Refactored to use unified MSW setup from @/src/test/msw-setup in 0800956
type S = GithubApiComponents["schemas"];
// todo(sno): deprecate this and use @octokit/webhooks-types
export type WEBHOOK_EVENTS = {
branch_protection_configuration: S[`webhook-branch-protection-configuration${string}` & keyof S];
let's remove this file
and update usages
Updated OpenAI model from invalid 'gpt-5-mini' to correct 'gpt-4o-mini' for test evidence analysis. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
… assertions
- Add comprehensive GitHub client mocking (ghc.pulls.list, gh.issues, etc.)
- Add database mocking with proper findOneAndUpdate implementation
- Add actual test assertions using expect() for all test cases
- Verify comment creation, deletion, and update behavior
- Verify draft PR skipping logic
- Verify warning message format and content
- All tests now properly validate behavior instead of just structure
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ine type
- Delete run/github-webhook-event-type/index.ts
- Replace import with inline WEBHOOK_EVENT type definition in run/index.ts
- Add TODO comment to consider migrating to @octokit/webhooks-types
Addresses review comment to remove unused file and update usages.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
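The draft PR skipping that these tests verify can be illustrated with a small filter. This is only a sketch: the PullSummary type and pullsToCheck name are hypothetical, and the only detail taken from GitHub itself is that pull request objects in the REST API carry a boolean draft field.

```typescript
// Minimal subset of a pull request object needed for the check.
// `draft` mirrors the boolean field on GitHub REST pull request objects.
type PullSummary = { number: number; draft?: boolean; body: string | null };

// Keep only PRs that are ready for review; drafts are skipped because their
// descriptions are usually still being written.
function pullsToCheck(pulls: PullSummary[]): PullSummary[] {
  return pulls.filter((pull) => pull.draft !== true);
}
```

A behavior-level assertion then only needs to compare the PR numbers that survive the filter, for example that a draft PR never reaches the analysis step.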
Updated by Bun to include configVersion in lockfile format. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add Biome configuration with settings for:
- Formatting (2-space indent, 120 line width, LF line endings)
- Linting (recommended rules with React and Next.js support)
- JavaScript formatting (single quotes, trailing commas, etc.)
- HTML formatting
- Auto-organize imports and sort attributes
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolved conflicts:
- Accepted main's @octokit/webhooks-types migration in run/index.ts
- Accepted main's CLAUDE.md documentation updates
- Accepted main's coreping.ts refactoring
- Merged gh-test-evidence task import with new task imports
- Accepted main's bun.lock and package.json updates
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update imports to match the refactored GitHub client location:
- @/src/gh → @/lib/github
- @/src/ghc → @/lib/github/githubCached
Fixes CI error: Cannot find module '@/src/gh'
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
Implements automated test evidence checking for PRs in desktop and ComfyUI repos, solving issue #61 in a smarter way.
Changes
- app/tasks/gh-test-evidence/gh-test-evidence.ts task
- Comfy-Org/desktop and comfyanonymous/ComfyUI
Smart Improvements
Testing
- gh-test-evidence.spec.ts with test structure
Workflow Integration
Added to app/tasks/run-gh-tasks.ts to run on schedule with other GitHub tasks.
Closes #61
🤖 Generated with Claude Code