Add post-deployment health checks (#17270)#17509
Conversation
Implements automated health monitoring workflow and script to test Lambda@Edge and critical endpoints after deployments. - Creates Node.js health check script for endpoint testing - Adds GitHub Actions workflow triggered after deployments - Tests 9 critical endpoints and 2 Lambda@Edge redirects - Sends Slack notifications on failure - Supports manual triggers for on-demand testing - Documents health checks in BUILD-AND-DEPLOY.md Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This comment was marked as outdated.
This comment was marked as outdated.
|
Your site preview for commit a017102 is ready! 🎉 http://www-testing-pulumi-docs-origin-pr-17509-a0171029.s3-website.us-west-2.amazonaws.com. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Change 'Testing locally' and 'Adding new checks' from bold text to proper H3 headings per STYLE-GUIDE.md. Co-authored-by: Cam Soper <CamSoper@users.noreply.github.com>
|
Your site preview for commit 13a14ae is ready! 🎉 http://www-testing-pulumi-docs-origin-pr-17509-13a14ae8.s3-website.us-west-2.amazonaws.com. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Replaces 147-line Node.js script with inline curl-based bash in the GitHub Actions workflow. Removes repository checkout and Node.js setup steps, eliminating external dependencies. Net reduction: 87 lines of code. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
We just did a little refactoring, @claude. Hit me with another review. |
|
Claude finished @CamSoper's task in 1m 12s —— View job Code review complete
Workflow Implementation ReviewThe refactored inline bash approach looks great! Here's my detailed feedback: Strengths✅ Clean refactoring: The inline bash implementation is much simpler and more maintainable than using external scripts or dependencies ✅ Well-structured functions: The ✅ Good separation of concerns: The health check job doesn't modify anything, and the notify job only runs on failure ✅ Comprehensive coverage: Tests critical endpoints (homepage, docs, registry, SDK docs) and Lambda@Edge redirects ✅ Proper error handling: Uses ✅ Clear output: Emoji indicators and descriptive messages make it easy to see what's being tested Issues Found1. Shebang location ( The shebang Fix: Either remove the shebang (since GitHub Actions already uses bash) or add 2. Regex pattern escaping ( Line 132 has a regex pattern that requires the backslash to be double-escaped in the bash context: check_redirect "/docs/reference/pkg/nodejs/pulumi/aws/" "/docs/reference/pkg/aws/\\?language=nodejs" "Node.js SDK redirect"The Current behavior: Matches literal 3. Redirect validation strictness ( The redirect check uses Recommendation: Consider whether exact matching is needed for any redirects, or if the current regex matching is sufficient for your use case. Documentation ReviewThe BUILD-AND-DEPLOY.md additions look good overall. Here are my findings: Strengths✅ Clear structure: Well-organized sections covering what it checks, when it runs, and how to extend it ✅ Practical examples: Local testing examples help developers debug issues ✅ Good detail level: Explains the separation of concerns (deployment stays successful even if health check fails) Style Issues1. Heading capitalization (BUILD-AND-DEPLOY.md:2165) "Local testing" should be "Local Testing" (Title Case for H3 headings). Wait, I need to recheck the style guide... Actually, looking at STYLE-GUIDE.md:39-40: "H2 and deeper: Sentence case". So "Local testing" is actually correct as is! My mistake. ✅ No style issues found SummaryThe refactoring successfully simplifies the health check implementation with inline bash. The code is clean, well-documented, and easy to maintain. Action items:
Overall, this is a solid implementation. The inline bash approach is much cleaner than using external dependencies, and the documentation clearly explains how the system works. Nice refactoring! 🎉 |
|
Your site preview for commit 4aae252 is ready! 🎉 http://www-testing-pulumi-docs-origin-pr-17509-4aae2525.s3-website.us-west-2.amazonaws.com. |
Implements automated health monitoring to detect Lambda@Edge and endpoint failures after deployments.
✅ Zero-risk change: This PR only adds a new workflow and documentation. No existing workflows or deployment processes are modified.
Changes
.github/workflows/post-deployment-health-check.yml): Inline bash script using curl to test 9 critical endpoints and 2 Lambda@Edge redirects. Triggers after deployments and sends Slack notifications on failure.BUILD-AND-DEPLOY.md): Documents health check system and usageImplementation
Uses inline bash with curl (no external dependencies or repository checkout required). This keeps the workflow simple, fast, and self-contained.
How it works
The workflow runs after existing deployment workflows complete successfully. It's completely independent and won't block or interfere with deployments.
Testing
✅ All health checks pass against production
✅ Validates both production (
pulumi.com) and testing (pulumi-test.io) environments