Add post-deployment health checks (#17270) #17509

Open
CamSoper wants to merge 3 commits into master from CamSoper/issue17270

@CamSoper commented Feb 10, 2026

Implements automated health monitoring to detect Lambda@Edge and endpoint failures after deployments.

✅ Zero-risk change: This PR only adds a new workflow and documentation. No existing workflows or deployment processes are modified.

Changes

  • Health check workflow (.github/workflows/post-deployment-health-check.yml): Inline bash script using curl to test 9 critical endpoints and 2 Lambda@Edge redirects. Triggers after deployments and sends Slack notifications on failure.
  • Documentation (BUILD-AND-DEPLOY.md): Documents health check system and usage
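
The failure notification mentioned above could use the same curl-only approach. The following is a minimal sketch, assuming a Slack incoming webhook that accepts a JSON body with a `text` field. The function names `build_slack_payload` and `notify_slack` and the `SLACK_WEBHOOK_URL` variable are hypothetical and are not taken from the workflow in this PR.

```shell
# Hypothetical helper: build a minimal Slack incoming-webhook payload.
build_slack_payload() {
  local message="$1"
  # Escape backslashes and double quotes so the message stays valid JSON.
  # (Newlines and control characters are not handled in this sketch.)
  message=${message//\\/\\\\}
  message=${message//\"/\\\"}
  printf '{"text":"%s"}' "$message"
}

# Hypothetical helper: POST the payload to the webhook URL.
# SLACK_WEBHOOK_URL is an assumed name for the secret holding the webhook.
notify_slack() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "$(build_slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}
```

A notify job might call `notify_slack "Post-deployment health check failed"` only when the health check job fails, which keeps the notification path separate from the checks themselves.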

Implementation

Uses inline bash with curl (no external dependencies or repository checkout required). This keeps the workflow simple, fast, and self-contained.
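
To illustrate the inline curl approach, here is a rough sketch of what an endpoint check could look like. The function name `check_endpoint` appears in the review below, but this body, the `BASE_URL` variable, the 10-second timeout, and the "200 means healthy" rule are all assumptions, not the code in this PR.

```shell
# Hypothetical base URL; the real workflow may select the environment differently.
BASE_URL="${BASE_URL:-https://www.pulumi.com}"

# Sketch of an endpoint check: succeed only when the endpoint returns HTTP 200.
check_endpoint() {
  local path="$1" description="$2" status
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
  # If curl itself fails (DNS, timeout), fall back to a placeholder status.
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "${BASE_URL}${path}") || status="000"
  if [[ "$status" == "200" ]]; then
    echo "✅ ${description} (${status})"
  else
    echo "❌ ${description} (got ${status}, expected 200)"
    return 1
  fi
}
```

A call such as `check_endpoint "/docs/" "Docs landing page"` would then print one line per endpoint and return non-zero on failure, so the caller can decide whether to abort or keep going.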

How it works

The workflow runs after existing deployment workflows complete successfully. It's completely independent and won't block or interfere with deployments.

Testing

✅ All health checks pass against production
✅ Validates both production (pulumi.com) and testing (pulumi-test.io) environments

Implements automated health monitoring workflow and script to test
Lambda@Edge and critical endpoints after deployments.

- Creates Node.js health check script for endpoint testing
- Adds GitHub Actions workflow triggered after deployments
- Tests 9 critical endpoints and 2 Lambda@Edge redirects
- Sends Slack notifications on failure
- Supports manual triggers for on-demand testing
- Documents health checks in BUILD-AND-DEPLOY.md

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Change 'Testing locally' and 'Adding new checks' from bold text to proper H3 headings per STYLE-GUIDE.md.

Co-authored-by: Cam Soper <CamSoper@users.noreply.github.com>

Replaces 147-line Node.js script with inline curl-based bash
in the GitHub Actions workflow. Removes repository checkout
and Node.js setup steps, eliminating external dependencies.

Net reduction: 87 lines of code.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

@CamSoper (Contributor, Author)

We just did a little refactoring, @claude. Hit me with another review.

claude bot commented Feb 11, 2026

Claude finished @CamSoper's task in 1m 12s.


Code review complete

  • Read changed files
  • Review workflow implementation
  • Review documentation
  • Post comprehensive feedback

Workflow Implementation Review

The refactored inline bash approach looks great! Here's my detailed feedback:

Strengths

Clean refactoring: The inline bash implementation is much simpler and more maintainable than using external scripts or dependencies

Well-structured functions: The check_endpoint() and check_redirect() functions are clear and reusable

Good separation of concerns: The health check job doesn't modify anything, and the notify job only runs on failure

Comprehensive coverage: Tests critical endpoints (homepage, docs, registry, SDK docs) and Lambda@Edge redirects

Proper error handling: Uses set -e and tracks failures with FAILED flag

Clear output: Emoji indicators and descriptive messages make it easy to see what's being tested
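
The combination of `set -e` and a FAILED flag noted above can be sketched as follows. This is an illustration only: the `run_check` wrapper is a hypothetical name, and the actual workflow may track failures differently. The point is that a command tested in an `if` condition does not trigger `set -e`, so every check runs before the script reports an overall result.

```shell
set -e

FAILED=0

# Hypothetical wrapper: run one check and record a failure instead of aborting.
# Commands evaluated as an `if` condition are exempt from `set -e`.
run_check() {
  if "$@"; then
    echo "✅ $*"
  else
    echo "❌ $*"
    FAILED=1
  fi
}

# Stand-in checks: `true` passes, `false` fails, and execution continues.
run_check true
run_check false
echo "FAILED=$FAILED"
```

A real script would replace the stand-ins with its own checks and finish with `exit "$FAILED"`, so the job fails only after all checks have been reported.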

Issues Found

1. Shebang location (.github/workflows/post-deployment-health-check.yml:73)

The shebang #!/bin/bash on line 73 is inside a YAML multi-line string. This won't actually affect shell interpretation since GitHub Actions runs the entire block with bash by default, but it's unconventional and potentially confusing.

Fix: Either remove the shebang (since GitHub Actions already uses bash) or add shell: bash to the step explicitly for clarity.

2. Regex pattern escaping (.github/workflows/post-deployment-health-check.yml:132)

Line 132 passes a regex pattern in which the question mark is escaped with a doubled backslash inside a double-quoted bash string:

check_redirect "/docs/reference/pkg/nodejs/pulumi/aws/" "/docs/reference/pkg/aws/\\?language=nodejs" "Node.js SDK redirect"

The \\? passes through YAML unchanged if the script sits in a block scalar, and bash's double quotes then reduce it to \?, but verify this works as intended. The pattern reaches the regex as /docs/reference/pkg/aws/\?language=nodejs, where \? matches a literal question mark rather than acting as a regex quantifier. If you want to match any path ending in that query string, consider using .*\?language=nodejs instead.

Current behavior: Matches literal ? character
Likely intended: Matches any path containing ?language=nodejs
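
The escaping question can be checked locally with a few lines of bash. This sketch reuses the pattern quoted above; the `matches` helper is a hypothetical name introduced only for the demonstration.

```shell
# Inside double quotes, "\\?" becomes \? by the time the regex sees it.
pattern="/docs/reference/pkg/aws/\\?language=nodejs"
printf 'pattern as bash sees it: %s\n' "$pattern"

# Hypothetical helper: unanchored regex match, as with [[ "$location" =~ $expected ]].
matches() { [[ "$1" =~ $2 ]]; }

# The escaped pattern matches a URL containing a literal question mark.
matches "/docs/reference/pkg/aws/?language=nodejs" "$pattern" && echo "escaped: matches literal ?"

# Without the backslash, /? means "optional slash", so the literal ? no longer matches.
unescaped="/docs/reference/pkg/aws/?language=nodejs"
matches "/docs/reference/pkg/aws/?language=nodejs" "$unescaped" || echo "unescaped: no match"
```

Note that `"\?"` with a single backslash produces the same value in bash, because a backslash inside double quotes is only removed before `$`, a backtick, `"`, `\`, or a newline.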

3. Redirect validation strictness (.github/workflows/post-deployment-health-check.yml:110)

The redirect check uses [[ "$location" =~ $expected_location ]], which is good for flexibility, but because =~ is unanchored it matches the pattern anywhere in the string and might be too permissive. For example, if expected_location is /docs/new/path/, it would also match /docs/new/path/with/extra/stuff/.

Recommendation: Consider whether exact matching is needed for any redirects, or if the current regex matching is sufficient for your use case.
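
One way to make strictness a per-redirect choice is sketched below. The `location_matches` helper and its `mode` argument are hypothetical and not part of the workflow; the sketch only contrasts the unanchored regex test with a plain string comparison.

```shell
# Hypothetical helper: compare a Location value against an expected value.
#   mode "regex" (default): unanchored regex match, like [[ "$location" =~ $expected ]]
#   mode "exact":           plain string equality, no pattern matching
location_matches() {
  local location="$1" expected="$2" mode="${3:-regex}"
  if [[ "$mode" == "exact" ]]; then
    # Quoting the right-hand side disables glob matching.
    [[ "$location" == "$expected" ]]
  else
    [[ "$location" =~ $expected ]]
  fi
}

# The permissive case from the example above: regex mode accepts the longer path.
location_matches "/docs/new/path/with/extra/stuff/" "/docs/new/path/" regex && echo "regex: accepted"
# Exact mode rejects it.
location_matches "/docs/new/path/with/extra/stuff/" "/docs/new/path/" exact || echo "exact: rejected"
```

Anchoring the regex with `^` and `$` would be a middle ground: it keeps pattern syntax available while ruling out trailing path segments.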


Documentation Review

The BUILD-AND-DEPLOY.md additions look good overall. Here are my findings:

Strengths

Clear structure: Well-organized sections covering what it checks, when it runs, and how to extend it

Practical examples: Local testing examples help developers debug issues

Good detail level: Explains the separation of concerns (deployment stays successful even if health check fails)

Style Issues

1. Heading capitalization (BUILD-AND-DEPLOY.md:2165)

"Local testing" is correct as written. STYLE-GUIDE.md:39-40 specifies "H2 and deeper: Sentence case", so the H3 heading does not need Title Case.

✅ No style issues found


Summary

The refactoring successfully simplifies the health check implementation with inline bash. The code is clean, well-documented, and easy to maintain.

Action items:

  1. Optional: Remove the shebang on line 73 (it's not needed and could be confusing)
  2. ⚠️ Review needed: Verify the regex pattern on line 132 matches your intended redirect behavior
  3. Consider: Whether redirect matching should be more strict for certain cases

Overall, this is a solid implementation. The inline bash approach is much cleaner than using external dependencies, and the documentation clearly explains how the system works. Nice refactoring! 🎉


@CamSoper enabled auto-merge (squash) February 11, 2026 00:28