Add integration tests for metrics endpoints #687

Tharsanan1 · 2026-01-15T11:04:16Z

$subject

Summary by CodeRabbit

Release Notes

New Features
- Exposed metrics endpoints from gateway controller (port 9091) and policy engine (port 9003), providing Prometheus-format metrics for monitoring API counts and gateway performance.
Tests
- Added comprehensive integration tests validating metrics endpoint accessibility, Prometheus format compliance, and API count metric updates.


… policy-engine (#3)

* Initial plan
* Add metrics integration tests and expose metrics ports
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Fix response body reading and temporarily use normal gateway-controller image
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Revert to coverage image and finalize metrics integration tests
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Refactor metrics steps to extract helper functions and reduce duplication
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Fix test logic: clear cached response body after API creation and use consistent body reading
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Fix mutex locking consistency in getResponseBody helper
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Optimize regex compilation, extract API definition constant, and improve error logging
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Add integration tests for metrics endpoints in gateway-controller and policy-engine
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>
* Revert unintended go.mod and go.sum changes - keep only test files
  Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Tharsanan1 <25491225+Tharsanan1@users.noreply.github.com>

CLAassistant · 2026-01-15T11:04:23Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ Tharsanan1
❌ Copilot

coderabbitai · 2026-01-15T11:04:37Z

Walkthrough

This pull request adds integration tests for gateway metrics endpoints. Changes include Docker Compose port mappings for metrics, a BDD feature file defining test scenarios, Go-based test step implementations for metrics validation, and suite configuration to execute the new metrics tests.

Changes

Cohort / File(s)	Summary
Docker Configuration `gateway/it/docker-compose.test.yaml`	Exposed metrics ports: gateway-controller port 9091 and policy-engine port 9003
Test Feature Specification `gateway/it/features/metrics.feature`	New BDD feature defining four scenarios: controller/policy-engine metrics endpoint accessibility, Prometheus format validation, and API count metric tracking with updates after API creation
Test Step Implementation `gateway/it/steps_metrics.go`	New step definitions implementing metrics HTTP request handling, Prometheus format parsing, API count metric extraction, and cache management for response bodies
Test Suite Configuration `gateway/it/suite_test.go`	Registered metrics feature and corresponding step definitions in the test suite

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hoppy hop, metrics now flow,
Through gateways we test, from high up to low,
Prometheus counts dancing in formats so bright,
Controller and engine, both shining so right! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is empty (only contains '$subject' placeholder), missing all required sections from the template such as Purpose, Goals, Approach, Documentation, and test details.	Fill in the required description template sections: Purpose (why these tests are needed), Goals, Approach, Documentation impact, and details about the integration tests being added.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main changeset: adding integration tests for metrics endpoints, which matches the file additions and changes.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@gateway/it/steps_metrics.go`:
- Around line 200-223: In the function theAPICountMetricShouldHaveIncreased,
avoid holding s.mutex while calling s.GetContextInt (which calls GetContextValue
and acquires s.mutex.RLock), as that causes a deadlock; instead call
s.GetContextInt("initial_api_count") before acquiring s.mutex (remove or move
the s.mutex.Lock()/defer Unlock() so the initialCount is read without holding
the write lock), then re-acquire the lock only for the remaining protected
operations (or drop the explicit lock entirely if not needed) so GetContextInt
does not attempt nested locking with s.mutex.Lock.
- Around line 168-182: The function iExtractCurrentAPICountFromMetrics currently
holds s.mutex while calling s.SetContextValue which also locks the same mutex
and causes a deadlock; fix it by moving the SetContextValue call out of the
critical section: lock only long enough to call getResponseBody and
parseAPICountFromMetrics (or capture any data that requires protection), store
the parsed count in a local variable, unlock the mutex, then call
s.SetContextValue("initial_api_count", count). Alternatively, if you prefer
keeping the work under one lock, add an internal helper like
setContextValueLocked that assumes the mutex is already held and call that
instead of SetContextValue; ensure you use the existing s.mutex,
iExtractCurrentAPICountFromMetrics, getResponseBody, parseAPICountFromMetrics,
and SetContextValue symbols to locate code.

🧹 Nitpick comments (4)

gateway/it/features/metrics.feature (1)
42-47: Consider adding status code verification after the second metrics request.

In "API count metric updates after API creation", the second GET request to the metrics endpoint (line 46) doesn't verify the response status code before checking the metric increase. While the metric comparison would likely fail if the request failed, an explicit status check would provide clearer error messages.
Suggested improvement
   When I create a new API via the gateway controller
   And I send a GET request to the gateway controller metrics endpoint
+  Then the response status code should be 200
-  Then the API count metric should have increased
+  And the API count metric should have increased
gateway/it/steps_metrics.go (3)
140-143: Consider using log.Printf instead of fmt.Printf.

Using fmt.Printf for warnings may not integrate well with the test framework's logging. Consider using log.Printf for consistency with the rest of the codebase or the test framework's logging facilities.
Suggested change
 			val, err := strconv.Atoi(match[1])
 			if err != nil {
-				// Log parsing error but continue processing other matches
-				fmt.Printf("Warning: failed to parse metric value '%s': %v\n", match[1], err)
+				// Log parsing error but continue processing other matches
+				log.Printf("Warning: failed to parse metric value '%s': %v", match[1], err)
 				continue
 			}
Add "log" to the imports.
184-198: Test API creation doesn't verify success response.

iCreateTestAPIViaGatewayController sends a POST request but doesn't verify the response status code. If the API creation fails, the test will proceed and fail later with a misleading error about the metric count not increasing.
Suggested improvement
 func (s *TestState) iCreateTestAPIViaGatewayController(httpSteps *steps.HTTPSteps) error {
 	body := &godog.DocString{Content: testAPIDefinition}
 	httpSteps.SetHeader("Content-Type", "application/yaml")
 	if err := httpSteps.SendPOSTToService("gateway-controller", "/apis", body); err != nil {
 		return err
 	}
+
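+	// Verify API creation succeeded; check for nil first so the status
+	// code is never read from a missing response
+	if httpSteps.LastResponse == nil {
+		return fmt.Errorf("failed to create test API: no response received")
+	}
+	if httpSteps.LastResponse.StatusCode >= 400 {
+		return fmt.Errorf("failed to create test API: status=%d", httpSteps.LastResponse.StatusCode)
+	}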

 	// Clear the cached response body since we'll be reading metrics again
 	s.mutex.Lock()
Note: Adjust based on how httpSteps exposes the last response.
74-84: Hardcoded localhost URLs.

The metrics endpoint URLs are hardcoded to localhost:9091 and localhost:9003. While this works for the current Docker Compose setup, consider making these configurable via TestState.Config for consistency with other service URLs (e.g., GatewayControllerURL).

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d422545 and 1bbf5e1.

📒 Files selected for processing (4)

gateway/it/docker-compose.test.yaml
gateway/it/features/metrics.feature
gateway/it/steps_metrics.go
gateway/it/suite_test.go

🧰 Additional context used

🧬 Code graph analysis (1)

gateway/it/suite_test.go (1)

gateway/it/steps_metrics.go (1)

RegisterMetricsSteps (52-72)

🔇 Additional comments (4)

gateway/it/docker-compose.test.yaml (1)

33-33: LGTM!

The new port mappings for metrics endpoints are correctly configured and align with the metrics endpoint URLs used in steps_metrics.go (:9091/metrics for gateway-controller and :9003/metrics for policy-engine).

Also applies to: 63-63

gateway/it/features/metrics.feature (1)

19-47: Well-structured BDD feature for metrics testing.

The scenarios cover essential aspects: endpoint accessibility, Prometheus format validation, and metric behavior verification. The progressive complexity from basic access checks to state-change verification is a good testing approach.
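
For readers unfamiliar with what a Prometheus format check can look like, here is a rough sketch. It is a heuristic, not the check implemented in steps_metrics.go (which is not shown in this review): it only looks for at least one `# HELP` and one `# TYPE` comment line, which the Prometheus text exposition format uses to describe metrics. The function name `looksLikePrometheusText` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikePrometheusText is a hypothetical, deliberately loose check:
// it reports whether the body contains at least one "# HELP" line and
// one "# TYPE" line. It does not validate sample lines.
func looksLikePrometheusText(body string) bool {
	hasHelp, hasType := false, false
	for _, line := range strings.Split(body, "\n") {
		if strings.HasPrefix(line, "# HELP ") {
			hasHelp = true
		}
		if strings.HasPrefix(line, "# TYPE ") {
			hasType = true
		}
	}
	return hasHelp && hasType
}

func main() {
	body := "# HELP example_metric An illustrative metric.\n" +
		"# TYPE example_metric gauge\n" +
		"example_metric 1\n"
	fmt.Println(looksLikePrometheusText(body))
}
```

A stricter test could parse every sample line, but a loose check like this is usually enough to catch an endpoint that returns an HTML error page instead of metrics.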

gateway/it/steps_metrics.go (1)

32-35: Good practice: Pre-compiled regex.

Pre-compiling the regex at package initialization avoids repeated compilation overhead during test execution.
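
As an illustration of the pattern being praised, the sketch below compiles a regex once at package level and reuses it on every call. The metric name `example_apis_total`, the function `sumAPICount`, and the choice to sum across label sets are all assumptions made for the example; the actual regex and aggregation in steps_metrics.go are not shown in this review. The sketch handles integer sample values only.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// apiCountRe is compiled once when the package is initialized.
// "example_apis_total" is a hypothetical metric name.
var apiCountRe = regexp.MustCompile(`(?m)^example_apis_total(?:\{[^}]*\})?[ \t]+(\d+)[ \t]*$`)

// sumAPICount adds up the integer values of every matching sample line.
// Summing across label sets is an assumption made for this sketch.
func sumAPICount(body string) int {
	total := 0
	for _, m := range apiCountRe.FindAllStringSubmatch(body, -1) {
		v, err := strconv.Atoi(m[1])
		if err != nil {
			continue // skip values that do not fit in an int
		}
		total += v
	}
	return total
}

func main() {
	body := "# HELP example_apis_total Number of APIs.\n" +
		"# TYPE example_apis_total gauge\n" +
		"example_apis_total{type=\"rest\"} 2\n" +
		"example_apis_total{type=\"graphql\"} 1\n"
	fmt.Println(sumAPICount(body)) // prints 3
}
```

Using `regexp.MustCompile` in a package-level `var` also means a malformed pattern fails at startup rather than in the middle of a scenario.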

gateway/it/suite_test.go (1)

75-75: LGTM!

The metrics feature and step definitions are correctly integrated into the test suite, following the same pattern as other feature registrations.

Also applies to: 225-225

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-15T11:12:51Z

gateway/it/steps_metrics.go

+// iExtractCurrentAPICountFromMetrics extracts the API count metric value
+func (s *TestState) iExtractCurrentAPICountFromMetrics() error {
+	s.mutex.Lock()
+	defer s.mutex.Unlock()
+
+	bodyStr, err := s.getResponseBody()
+	if err != nil {
+		return err
+	}
+
+	// Parse the API count from metrics
+	count := parseAPICountFromMetrics(bodyStr)
+	s.SetContextValue("initial_api_count", count)
+	return nil
+}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for SetContextValue implementation in the repository
rg "func.*SetContextValue" -A 10 --type go

Repository: wso2/api-platform

Length of output: 609

Confirmed deadlock risk: SetContextValue acquires the same mutex while it's already held.

iExtractCurrentAPICountFromMetrics locks s.mutex and then calls s.SetContextValue, which also attempts to lock s.mutex. Since the mutex is not recursive, this will deadlock.

The fix is to either:

- Move the SetContextValue call outside the critical section, so the mutex is released before it runs, or
- Create a separate internal method that assumes the mutex is already held
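
The first approach can be sketched as follows. This is a self-contained illustration, not the PR's code: `state` is a hypothetical stand-in for TestState, `readBody` and `extractCount` are hypothetical names, and the mutex is assumed to be a `sync.RWMutex`. The point is only the shape: read under a short lock, release it, then call the method that locks on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// state is a hypothetical stand-in for TestState.
type state struct {
	mutex   sync.RWMutex
	body    string
	context map[string]interface{}
}

// SetContextValue takes the write lock itself, so callers must not
// already hold s.mutex when calling it.
func (s *state) SetContextValue(key string, value interface{}) {
	s.mutex.Lock()
	defer s.mutex.Unlock()
	s.context[key] = value
}

// readBody copies the protected field under a short read lock.
func (s *state) readBody() string {
	s.mutex.RLock()
	defer s.mutex.RUnlock()
	return s.body
}

// extractCount reads under the lock, lets it go, and only then stores
// the parsed value, so the two lock acquisitions never nest.
func (s *state) extractCount(parse func(string) int) {
	count := parse(s.readBody()) // the read lock is already released here
	s.SetContextValue("initial_api_count", count)
}

func main() {
	s := &state{body: "12345", context: map[string]interface{}{}}
	s.extractCount(func(b string) int { return len(b) })
	fmt.Println(s.context["initial_api_count"]) // prints 5
}
```

The second approach (a `...Locked` helper) avoids the extra lock and unlock but puts the burden on every caller to know which variant is safe, so narrowing the critical section tends to be the less error-prone choice in test code.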


coderabbitai · 2026-01-15T11:12:52Z

gateway/it/steps_metrics.go

+// theAPICountMetricShouldHaveIncreased verifies the API count increased
+func (s *TestState) theAPICountMetricShouldHaveIncreased() error {
+	s.mutex.Lock()
+	defer s.mutex.Unlock()
+
+	initialCount, ok := s.GetContextInt("initial_api_count")
+	if !ok {
+		return fmt.Errorf("initial API count not found in context")
+	}
+
+	bodyStr, err := s.getResponseBody()
+	if err != nil {
+		return err
+	}
+
+	// Parse the current API count from metrics
+	currentCount := parseAPICountFromMetrics(bodyStr)
+
+	if currentCount <= initialCount {
+		return fmt.Errorf("API count did not increase: initial=%d, current=%d", initialCount, currentCount)
+	}
+
+	return nil
+}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for GetContextInt implementation
ast-grep --pattern $'func ($_ *TestState) GetContextInt($_) ($_, $_) { $$$ }'

Repository: wso2/api-platform

Length of output: 414

🏁 Script executed:

# Also search for iExtractCurrentAPICountFromMetrics to understand the similar pattern
rg -A 10 "iExtractCurrentAPICountFromMetrics" gateway/it/

Repository: wso2/api-platform

Length of output: 1565

🏁 Script executed:

# Search for TestState struct definition to understand the mutex field
rg "type TestState struct" gateway/it/ -A 10

Repository: wso2/api-platform

Length of output: 575

🏁 Script executed:

#!/bin/bash
# Search for GetContextValue implementation
rg "func.*GetContextValue" gateway/it/ -A 10

Repository: wso2/api-platform

Length of output: 611

Fix deadlock: release mutex before calling GetContextInt.

This method takes the write lock on s.mutex and then calls GetContextInt, which internally calls GetContextValue, which in turn calls s.mutex.RLock(). The read lock cannot be granted until the write lock is released, and the write lock is released only when this method returns, so the goroutine blocks on itself and deadlocks.

Either retrieve the value before locking the mutex, or refactor to avoid nested locking:

// Option 1: Retrieve value before locking
initialCount, ok := s.GetContextInt("initial_api_count")
if !ok {
	return fmt.Errorf("initial API count not found in context")
}
s.mutex.Lock()
defer s.mutex.Unlock()
// ... rest of the logic
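
The nested-locking hazard can be observed without hanging a test run. The sketch below assumes the mutex is a `sync.RWMutex` (the review only shows that it has `Lock` and `RLock`) and uses `TryRLock`, available since Go 1.18, which returns immediately instead of blocking. The function name `heldWriteLockBlocksReaders` is made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// heldWriteLockBlocksReaders reports whether a read lock is refused
// while the same goroutine holds the write lock. A plain RLock call in
// the same position would block forever instead of returning.
func heldWriteLockBlocksReaders() bool {
	var mu sync.RWMutex
	mu.Lock()
	defer mu.Unlock()

	if mu.TryRLock() {
		// Not expected while the write lock is held; release and report.
		mu.RUnlock()
		return false
	}
	return true
}

func main() {
	fmt.Println(heldWriteLockBlocksReaders()) // prints true
}
```

This is why reading the context value before taking the write lock, as in Option 1, removes the deadlock: by the time `Lock` is called, the read lock taken inside GetContextValue has already been released.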


Merge branch 'main' into test-metrics

1bbf5e1

coderabbitai bot reviewed Jan 15, 2026
