-
Notifications
You must be signed in to change notification settings - Fork 35
Add integration tests for metrics endpoints #687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… policy-engine (#3) * Initial plan * Add metrics integration tests and expose metrics ports Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Fix response body reading and temporarily use normal gateway-controller image Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Revert to coverage image and finalize metrics integration tests Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Refactor metrics steps to extract helper functions and reduce duplication Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Fix test logic: clear cached response body after API creation and use consistent body reading Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Fix mutex locking consistency in getResponseBody helper Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Optimize regex compilation, extract API definition constant, and improve error logging Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Add integration tests for metrics endpoints in gateway-controller and policy-engine Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> * Revert unintended go.mod and go.sum changes - keep only test files Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
|
|
WalkthroughThis pull request adds integration tests for gateway metrics endpoints. Changes include Docker Compose port mappings for metrics, a BDD feature file defining test scenarios, Go-based test step implementations for metrics validation, and suite configuration to execute the new metrics tests. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@gateway/it/steps_metrics.go`:
- Around line 200-223: In the function theAPICountMetricShouldHaveIncreased,
avoid holding s.mutex while calling s.GetContextInt (which calls GetContextValue
and acquires s.mutex.RLock), as that causes a deadlock; instead call
s.GetContextInt("initial_api_count") before acquiring s.mutex (remove or move
the s.mutex.Lock()/defer Unlock() so the initialCount is read without holding
the write lock), then re-acquire the lock only for the remaining protected
operations (or drop the explicit lock entirely if not needed) so GetContextInt
does not attempt nested locking with s.mutex.Lock.
- Around line 168-182: The function iExtractCurrentAPICountFromMetrics currently
holds s.mutex while calling s.SetContextValue which also locks the same mutex
and causes a deadlock; fix it by moving the SetContextValue call out of the
critical section: lock only long enough to call getResponseBody and
parseAPICountFromMetrics (or capture any data that requires protection), store
the parsed count in a local variable, unlock the mutex, then call
s.SetContextValue("initial_api_count", count). Alternatively, if you prefer
keeping the work under one lock, add an internal helper like
setContextValueLocked that assumes the mutex is already held and call that
instead of SetContextValue; ensure you use the existing s.mutex,
iExtractCurrentAPICountFromMetrics, getResponseBody, parseAPICountFromMetrics,
and SetContextValue symbols to locate code.
🧹 Nitpick comments (4)
gateway/it/features/metrics.feature (1)
42-47: Consider adding status code verification after the second metrics request.In "API count metric updates after API creation", the second GET request to the metrics endpoint (line 46) doesn't verify the response status code before checking the metric increase. While the metric comparison would likely fail if the request failed, an explicit status check would provide clearer error messages.
Suggested improvement
When I create a new API via the gateway controller And I send a GET request to the gateway controller metrics endpoint + Then the response status code should be 200 - Then the API count metric should have increased + And the API count metric should have increasedgateway/it/steps_metrics.go (3)
140-143: Consider using structured logging instead of fmt.Printf.Using
fmt.Printffor warnings may not integrate well with the test framework's logging. Consider usinglog.Printffor consistency with the rest of the codebase or the test framework's logging facilities.Suggested change
val, err := strconv.Atoi(match[1]) if err != nil { - // Log parsing error but continue processing other matches - fmt.Printf("Warning: failed to parse metric value '%s': %v\n", match[1], err) + // Log parsing error but continue processing other matches + log.Printf("Warning: failed to parse metric value '%s': %v", match[1], err) continue }Add
"log"to the imports.
184-198: Test API creation doesn't verify success response.
iCreateTestAPIViaGatewayControllersends a POST request but doesn't verify the response status code. If the API creation fails, the test will proceed and fail later with a misleading error about the metric count not increasing.Suggested improvement
func (s *TestState) iCreateTestAPIViaGatewayController(httpSteps *steps.HTTPSteps) error { body := &godog.DocString{Content: testAPIDefinition} httpSteps.SetHeader("Content-Type", "application/yaml") if err := httpSteps.SendPOSTToService("gateway-controller", "/apis", body); err != nil { return err } + + // Verify API creation succeeded + if httpSteps.LastResponse == nil || httpSteps.LastResponse.StatusCode >= 400 { + return fmt.Errorf("failed to create test API: status=%d", httpSteps.LastResponse.StatusCode) + } // Clear the cached response body since we'll be reading metrics again s.mutex.Lock()Note: Adjust based on how
httpStepsexposes the last response.
74-84: Hardcoded localhost URLs.The metrics endpoint URLs are hardcoded to
localhost:9091andlocalhost:9003. While this works for the current Docker Compose setup, consider making these configurable viaTestState.Configfor consistency with other service URLs (e.g.,GatewayControllerURL).
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
gateway/it/docker-compose.test.yamlgateway/it/features/metrics.featuregateway/it/steps_metrics.gogateway/it/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
gateway/it/suite_test.go (1)
gateway/it/steps_metrics.go (1)
RegisterMetricsSteps(52-72)
🔇 Additional comments (4)
gateway/it/docker-compose.test.yaml (1)
33-33: LGTM!The new port mappings for metrics endpoints are correctly configured and align with the metrics endpoint URLs used in
steps_metrics.go(:9091/metricsfor gateway-controller and:9003/metricsfor policy-engine).Also applies to: 63-63
gateway/it/features/metrics.feature (1)
19-47: Well-structured BDD feature for metrics testing.The scenarios cover essential aspects: endpoint accessibility, Prometheus format validation, and metric behavior verification. The progressive complexity from basic access checks to state-change verification is a good testing approach.
gateway/it/steps_metrics.go (1)
32-35: Good practice: Pre-compiled regex.Pre-compiling the regex at package initialization avoids repeated compilation overhead during test execution.
gateway/it/suite_test.go (1)
75-75: LGTM!The metrics feature and step definitions are correctly integrated into the test suite, following the same pattern as other feature registrations.
Also applies to: 225-225
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| // iExtractCurrentAPICountFromMetrics extracts the API count metric value | ||
| func (s *TestState) iExtractCurrentAPICountFromMetrics() error { | ||
| s.mutex.Lock() | ||
| defer s.mutex.Unlock() | ||
|
|
||
| bodyStr, err := s.getResponseBody() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| // Parse the API count from metrics | ||
| count := parseAPICountFromMetrics(bodyStr) | ||
| s.SetContextValue("initial_api_count", count) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for SetContextValue implementation in the repository
rg "func.*SetContextValue" -A 10 --type goRepository: wso2/api-platform
Length of output: 609
Confirmed deadlock risk: SetContextValue acquires the same mutex while it's already held.
iExtractCurrentAPICountFromMetrics locks s.mutex and then calls s.SetContextValue, which also attempts to lock s.mutex. Since the mutex is not recursive, this will deadlock.
The fix is to either:
- Release the mutex before calling
s.SetContextValue, or - Move the
SetContextValuecall outside the critical section, or - Create a separate internal method that assumes the mutex is already held
🤖 Prompt for AI Agents
In `@gateway/it/steps_metrics.go` around lines 168 - 182, The function
iExtractCurrentAPICountFromMetrics currently holds s.mutex while calling
s.SetContextValue which also locks the same mutex and causes a deadlock; fix it
by moving the SetContextValue call out of the critical section: lock only long
enough to call getResponseBody and parseAPICountFromMetrics (or capture any data
that requires protection), store the parsed count in a local variable, unlock
the mutex, then call s.SetContextValue("initial_api_count", count).
Alternatively, if you prefer keeping the work under one lock, add an internal
helper like setContextValueLocked that assumes the mutex is already held and
call that instead of SetContextValue; ensure you use the existing s.mutex,
iExtractCurrentAPICountFromMetrics, getResponseBody, parseAPICountFromMetrics,
and SetContextValue symbols to locate code.
| // theAPICountMetricShouldHaveIncreased verifies the API count increased | ||
| func (s *TestState) theAPICountMetricShouldHaveIncreased() error { | ||
| s.mutex.Lock() | ||
| defer s.mutex.Unlock() | ||
|
|
||
| initialCount, ok := s.GetContextInt("initial_api_count") | ||
| if !ok { | ||
| return fmt.Errorf("initial API count not found in context") | ||
| } | ||
|
|
||
| bodyStr, err := s.getResponseBody() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| // Parse the current API count from metrics | ||
| currentCount := parseAPICountFromMetrics(bodyStr) | ||
|
|
||
| if currentCount <= initialCount { | ||
| return fmt.Errorf("API count did not increase: initial=%d, current=%d", initialCount, currentCount) | ||
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for GetContextInt implementation
ast-grep --pattern $'func ($_ *TestState) GetContextInt($_) ($_, $_) {
$$$
}'Repository: wso2/api-platform
Length of output: 414
🏁 Script executed:
# Also search for iExtractCurrentAPICountFromMetrics to understand the similar pattern
rg -A 10 "iExtractCurrentAPICountFromMetrics" gateway/it/Repository: wso2/api-platform
Length of output: 1565
🏁 Script executed:
# Search for TestState struct definition to understand the mutex field
rg "type TestState struct" gateway/it/ -A 10Repository: wso2/api-platform
Length of output: 575
🏁 Script executed:
#!/bin/bash
# Search for GetContextValue implementation
rg "func.*GetContextValue" gateway/it/ -A 10Repository: wso2/api-platform
Length of output: 611
Fix deadlock: release mutex before calling GetContextInt.
This method acquires s.mutex with a write lock, then calls GetContextInt, which internally calls GetContextValue that tries to acquire s.mutex.RLock(). Attempting to acquire a read lock while holding a write lock will cause a deadlock in the same goroutine.
Either retrieve the value before locking the mutex, or refactor to avoid nested locking:
// Option 1: Retrieve value before locking
initialCount, ok := s.GetContextInt("initial_api_count")
if !ok {
return fmt.Errorf("initial API count not found in context")
}
s.mutex.Lock()
defer s.mutex.Unlock()
// ... rest of the logic🤖 Prompt for AI Agents
In `@gateway/it/steps_metrics.go` around lines 200 - 223, In the function
theAPICountMetricShouldHaveIncreased, avoid holding s.mutex while calling
s.GetContextInt (which calls GetContextValue and acquires s.mutex.RLock), as
that causes a deadlock; instead call s.GetContextInt("initial_api_count") before
acquiring s.mutex (remove or move the s.mutex.Lock()/defer Unlock() so the
initialCount is read without holding the write lock), then re-acquire the lock
only for the remaining protected operations (or drop the explicit lock entirely
if not needed) so GetContextInt does not attempt nested locking with
s.mutex.Lock.
$subject
Summary by CodeRabbit
Release Notes
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.