Add a startup parameter for the DNS proxy #1581

Open
FAUST-BENCHOU wants to merge 2 commits into kmesh-net:main from FAUST-BENCHOU:feat/dns

@FAUST-BENCHOU
What type of PR is this?
/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #1574

Tests
TestDnsproxyAPI: Enables and disables the DNS proxy via the status server's HTTP interface (POST /dnsproxy?enable=true|false), verifying consistency with the underlying behavior of kmeshctl dnsproxy.

TestDnsproxyKmeshctl: Enables and disables the DNS proxy via the kmeshctl command line (kmeshctl dnsproxy <pod> enable/disable), verifying that the CLI can correctly control a single daemon.

TestDnsproxyStartupParameter: Configures the DNS proxy switch via startup parameters/environment variables (such as KMESH_ENABLE_DNS_PROXY), and after patching the DaemonSet, verifies that the kmesh pod remains ready, confirming the startup configuration is effective.
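
As a rough illustration of what the API test exercises, the client side of `POST /dnsproxy?enable=true|false` might look like the sketch below. The helper names `setDNSProxy` and `roundTrip` are hypothetical, and the throwaway `httptest` server only stands in for the daemon's status server; the real test in `test/e2e/dnsproxy_test.go` may be structured differently.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// setDNSProxy (hypothetical name) posts to <baseURL>/dnsproxy?enable=<bool>
// and treats any non-200 response as an error.
func setDNSProxy(baseURL string, enable bool) error {
	url := fmt.Sprintf("%s/dnsproxy?enable=%s", baseURL, strconv.FormatBool(enable))
	resp, err := http.Post(url, "application/json", nil)
	if err != nil {
		return fmt.Errorf("post %s: %w", url, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return fmt.Errorf("status %d: %s", resp.StatusCode, body)
	}
	return nil
}

// roundTrip starts a throwaway server standing in for the status server,
// sends one request, and returns the request line the server observed.
func roundTrip(enable bool) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Method + " " + r.URL.Path + "?" + r.URL.RawQuery
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	if err := setDNSProxy(srv.URL, enable); err != nil {
		return "", err
	}
	return <-seen, nil
}

func main() {
	for _, enable := range []bool{true, false} {
		got, err := roundTrip(enable)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(got)
	}
}
```

An end-to-end test would point `setDNSProxy` at a port-forwarded daemon address instead of the in-process server used here.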

Signed-off-by: zhoujinyu <2319109590@qq.com>
Signed-off-by: zhoujinyu <2319109590@qq.com>
@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot
Collaborator

Welcome @FAUST-BENCHOU! It looks like this is your first PR to kmesh-net/kmesh 🎉

@gemini-code-assist
Summary of Changes

Hello @FAUST-BENCHOU, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the control and flexibility of Kmesh's DNS proxy feature. It provides both a new command-line interface for dynamic runtime management and a dedicated startup parameter for initial configuration. These changes streamline the process of enabling or disabling the DNS proxy, making it more robust and easier to integrate into existing deployment workflows.

Highlights

  • New kmeshctl dnsproxy Command: Introduced a new kmeshctl dnsproxy command-line tool that allows users to enable or disable Kmesh's DNS proxy for specific daemon pods or across all Kmesh daemons in a cluster.
  • DNS Proxy Startup Parameter: Added a new startup parameter, --enable-dns-proxy, to the Kmesh daemon. This allows the DNS proxy to be configured at the daemon's launch, providing a declarative way to manage its state.
  • Dynamic Control via Status Server: The kmeshctl dnsproxy command interacts with the Kmesh daemon's status server via a new /dnsproxy HTTP endpoint. This enables runtime control over the DNS proxy without requiring a daemon restart.
  • Configuration Updates: Updated Helm charts (daemonset.yaml) and direct YAML deployments (kmesh.yaml) to utilize the new --enable-dns-proxy flag and removed the deprecated KMESH_ENABLE_DNS_PROXY environment variable for managing the DNS proxy state.
  • Integration Tests: Added comprehensive integration tests to verify the functionality of the DNS proxy, covering its control via the status server API, the kmeshctl command, and the new startup parameter.
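
To make the startup-parameter highlight concrete, here is a minimal sketch of a boolean `--enable-dns-proxy` flag feeding an `EnableDnsProxy` config field. It uses the standard library `flag` package as a stand-in; the summary says the PR registers a persistent flag in `daemon/options/bpf.go`, so the real registration will differ. The `bpfConfig` type and `parseArgs` helper below are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// bpfConfig is a hypothetical stand-in for the daemon's BpfConfig;
// only the field named in the PR summary is modeled.
type bpfConfig struct {
	EnableDnsProxy bool
}

// parseArgs parses daemon-style arguments into a bpfConfig.
// The flag defaults to false, so the proxy stays off unless requested.
func parseArgs(args []string) (bpfConfig, error) {
	var cfg bpfConfig
	fs := flag.NewFlagSet("kmesh-daemon", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	fs.BoolVar(&cfg.EnableDnsProxy, "enable-dns-proxy", false, "enable the DNS proxy at startup")
	if err := fs.Parse(args); err != nil {
		return bpfConfig{}, err
	}
	return cfg, nil
}

func main() {
	for _, args := range [][]string{
		{},
		{"--enable-dns-proxy"},
		{"--enable-dns-proxy=false"},
	} {
		cfg, err := parseArgs(args)
		fmt.Printf("args=%v enabled=%v err=%v\n", args, cfg.EnableDnsProxy, err)
	}
}
```

A flag of this shape lets a DaemonSet manifest express the desired state in the container `args`, which is what the Helm and YAML changes listed above rely on.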

Changelog
  • ctl/common/common.go
    • Imported the new dnsproxy package.
    • Added the dnsproxy command to the root kmeshctl command.
  • ctl/dnsproxy/dnsproxy.go
    • New file implementing the kmeshctl dnsproxy command.
    • Provides functionality to enable or disable the DNS proxy for individual Kmesh daemon pods or all daemons in a cluster.
    • Communicates with the Kmesh daemon's status server via HTTP POST requests.
  • daemon/manager/manager.go
    • Modified the NewServer constructor call to pass the full controller object, allowing the status server to access controller methods for DNS proxy management.
  • daemon/options/bpf.go
    • Added EnableDnsProxy boolean field to BpfConfig.
    • Registered a new persistent flag --enable-dns-proxy for daemon startup configuration.
  • deploy/charts/kmesh-helm/templates/daemonset.yaml
    • Updated the args for the kmesh daemon container to conditionally include --enable-dns-proxy based on Helm values.
    • Removed the KMESH_ENABLE_DNS_PROXY environment variable.
  • deploy/yaml/kmesh.yaml
    • Updated the args for the kmesh daemon container to include --enable-dns-proxy.
    • Removed the KMESH_ENABLE_DNS_PROXY environment variable.
  • docs/ctl/kmeshctl.md
    • Added a reference to the new kmeshctl_dnsproxy.md documentation file.
  • docs/ctl/kmeshctl_dnsproxy.md
    • New file providing detailed documentation and examples for the kmeshctl dnsproxy command.
  • pkg/controller/controller.go
    • Added a sync.Mutex (dnsProxyMu) to protect concurrent access to the dnsServer.
    • Modified the Start method to initialize the DNS proxy based on the new bpfConfig.EnableDnsProxy flag, with workload.EnableDNSProxy as a fallback for backward compatibility.
    • Refactored setupDNSProxy to conditionally start the DNS server based on the DNS proxy trigger state.
    • Introduced StartDnsProxy() and StopDnsProxy() methods for dynamic runtime control of the DNS proxy.
    • Updated updateDnsLookupTable to use the dnsProxyMu mutex.
  • pkg/controller/telemetry/metric.go
    • Added an EnableDnsProxy atomic boolean field to MetricController to track the DNS proxy's enabled state.
  • pkg/controller/workload/workload_controller.go
    • Modified the Run method to pass the current DNS proxy trigger state to Processor.PrepareDNSProxy.
    • Added SetDnsProxyTrigger and GetDnsProxyTrigger methods to manage the DNS proxy state within the WorkloadController via the MetricController.
  • pkg/controller/workload/workload_processor.go
    • Updated PrepareDNSProxy to accept an enabled boolean parameter, making its behavior explicit rather than relying on a global variable.
  • pkg/status/status_server.go
    • Added patternDnsproxy constant for the new HTTP endpoint.
    • Modified NewServer to accept and store the *controller.Controller instance, allowing it to call DNS proxy control methods.
    • Registered dnsproxyHandler to handle requests at the /dnsproxy endpoint.
    • Implemented dnsproxyHandler to parse 'enable' query parameters and call ctrl.StartDnsProxy() or ctrl.StopDnsProxy() accordingly.
  • test/e2e/dnsproxy_test.go
    • New file containing integration tests for the DNS proxy functionality.
    • Includes TestDnsproxyAPI to test the HTTP API for enabling/disabling the proxy.
    • Includes TestDnsproxyKmeshctl to test the kmeshctl dnsproxy command.
    • Includes TestDnsproxyStartupParameter to verify the --enable-dns-proxy flag at startup.
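
The `pkg/status/status_server.go` entries above describe a handler that reads the `enable` query parameter and calls `StartDnsProxy()` or `StopDnsProxy()`. A minimal sketch of that shape follows. The `dnsProxyController` interface, the `fakeController` type, the method check, and the use of `strconv.ParseBool` are all assumptions made for illustration; the PR's handler takes a `*controller.Controller` and may validate input differently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// dnsProxyController is a hypothetical interface capturing the two
// controller methods named in the changelog.
type dnsProxyController interface {
	StartDnsProxy() error
	StopDnsProxy() error
}

// dnsproxyHandler parses ?enable=<bool> and toggles the proxy accordingly.
func dnsproxyHandler(ctrl dnsProxyController) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		enable, err := strconv.ParseBool(r.URL.Query().Get("enable"))
		if err != nil {
			http.Error(w, "invalid or missing 'enable' parameter", http.StatusBadRequest)
			return
		}
		if enable {
			err = ctrl.StartDnsProxy()
		} else {
			err = ctrl.StopDnsProxy()
		}
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// fakeController records the last requested state instead of touching BPF maps.
type fakeController struct{ running bool }

func (f *fakeController) StartDnsProxy() error { f.running = true; return nil }
func (f *fakeController) StopDnsProxy() error  { f.running = false; return nil }

// call drives the handler in-process and returns the status code and final state.
func call(method, query string) (int, bool) {
	ctrl := &fakeController{}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/dnsproxy?"+query, nil)
	dnsproxyHandler(ctrl)(rec, req)
	return rec.Code, ctrl.running
}

func main() {
	code, running := call(http.MethodPost, "enable=true")
	fmt.Println(code, running)
	code, running = call(http.MethodPost, "enable=maybe")
	fmt.Println(code, running)
}
```

Depending on a small interface rather than the concrete controller would also make the handler straightforward to unit test.
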
Activity
  • The pull request introduces a new feature to control the DNS proxy in Kmesh.
  • It includes new kmeshctl commands and API endpoints for managing the DNS proxy.
  • Comprehensive integration tests have been added to ensure the new functionality works as expected across different control mechanisms (API, CLI, startup parameter).
A proxy for DNS, Resolves names with swift finesse, No more lost requests.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a startup parameter --enable-dns-proxy and a kmeshctl dnsproxy command to control the DNS proxy feature. The changes are well-structured, covering the daemon, CLI tool, deployment configurations, and tests. My review focuses on improving the robustness and error handling of the new functionality. I've identified a critical issue in the kmeshctl command that could cause it to fail when targeting multiple pods, and some high-severity issues related to ignored errors in cleanup paths that could lead to a broken state. The suggested changes will make the new command more reliable and the error handling in the controller more explicit.

Comment on lines +86 to +99
	if podName != "" {
		SetDnsproxyPerKmeshDaemon(client, podName, enableStr)
		return
	}

	// Apply to all kmesh daemons
	podList, err := client.PodsForSelector(context.TODO(), utils.KmeshNamespace, utils.KmeshLabel)
	if err != nil {
		log.Errorf("failed to get kmesh podList: %v", err)
		os.Exit(1)
	}
	for _, pod := range podList.Items {
		SetDnsproxyPerKmeshDaemon(client, pod.GetName(), enableStr)
	}

critical

After modifying SetDnsproxyPerKmeshDaemon to return an error, ControlDnsproxy should be updated to handle these errors. When a single pod is targeted, an error should cause the command to exit with a non-zero status. When all pods are targeted, an error for one pod should be logged, but the command should continue to process the remaining pods.

	if podName != "" {
		if err := SetDnsproxyPerKmeshDaemon(client, podName, enableStr); err != nil {
			log.Errorf("failed to set dnsproxy for pod %s: %v", podName, err)
			os.Exit(1)
		}
		return
	}

	// Apply to all kmesh daemons
	podList, err := client.PodsForSelector(context.TODO(), utils.KmeshNamespace, utils.KmeshLabel)
	if err != nil {
		log.Errorf("failed to get kmesh podList: %v", err)
		os.Exit(1)
	}
	for _, pod := range podList.Items {
		if err := SetDnsproxyPerKmeshDaemon(client, pod.GetName(), enableStr); err != nil {
			log.Errorf("failed to set dnsproxy for pod %s: %v", pod.GetName(), err)
		}
	}
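
One possible refinement of the loop above, offered as a sketch rather than as part of the suggested change: collect the per-pod failures with `errors.Join` (Go 1.20+) so that every daemon is still attempted while the command can report a non-zero exit status afterwards. The `applyToAll` and `runDemo` helpers and the pod names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// applyToAll calls apply for every pod, never stopping early, and returns
// all failures joined into one error (nil when every call succeeded).
func applyToAll(pods []string, apply func(pod string) error) error {
	var errs []error
	for _, pod := range pods {
		if err := apply(pod); err != nil {
			errs = append(errs, fmt.Errorf("pod %s: %w", pod, err))
		}
	}
	return errors.Join(errs...)
}

// runDemo simulates three daemons where the one named by failing is
// unreachable. It returns how many pods were attempted and the joined error.
func runDemo(failing string) (int, error) {
	attempted := 0
	err := applyToAll([]string{"kmesh-a", "kmesh-b", "kmesh-c"}, func(pod string) error {
		attempted++
		if pod == failing {
			return errors.New("port forward failed")
		}
		return nil
	})
	return attempted, err
}

func main() {
	attempted, err := runDemo("kmesh-b")
	fmt.Println("attempted:", attempted)
	if err != nil {
		fmt.Println("failures:", err)
		// A CLI would typically exit non-zero at this point.
	}
}
```

With this shape, a partially failed rollout is still visible to scripts through the exit status, not only in the log output.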

Comment on lines +102 to +147
func SetDnsproxyPerKmeshDaemon(cli kube.CLIClient, podName, info string) {
	var status string
	if info == "enable" {
		status = "true"
	} else {
		status = "false"
	}

	fw, err := utils.CreateKmeshPortForwarder(cli, podName)
	if err != nil {
		log.Errorf("failed to create port forwarder for Kmesh daemon pod %s: %v", podName, err)
		os.Exit(1)
	}
	if err := fw.Start(); err != nil {
		log.Errorf("failed to start port forwarder for Kmesh daemon pod %s: %v", podName, err)
		os.Exit(1)
	}
	defer fw.Close()

	url := fmt.Sprintf("http://%s%s?enable=%s", fw.Address(), patternDnsproxy, status)

	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		log.Errorf("Error creating request: %v", err)
		return
	}

	req.Header.Set("Content-Type", "application/json")
	httpClient := &http.Client{}
	resp, err := httpClient.Do(req)
	if err != nil {
		log.Errorf("failed to make HTTP request: %v", err)
		return
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		log.Errorf("Error: received status code %d", resp.StatusCode)
		bodyBytes, readErr := io.ReadAll(resp.Body)
		if readErr != nil {
			log.Errorf("Error reading response body: %v", readErr)
			return
		}
		log.Errorf("response: %s", string(bodyBytes))
	}
}

critical

The function SetDnsproxyPerKmeshDaemon calls os.Exit(1) on failure. When kmeshctl dnsproxy enable is run without a pod name, it iterates over all kmesh daemons. If an error occurs for one daemon (e.g., it's not reachable), the entire command will exit, preventing it from configuring the other daemons.

The function should be refactored to return an error instead of exiting. This allows the caller (ControlDnsproxy) to handle the error gracefully, such as logging it and continuing to the next pod in the loop.

func SetDnsproxyPerKmeshDaemon(cli kube.CLIClient, podName, info string) error {
	var status string
	if info == "enable" {
		status = "true"
	} else {
		status = "false"
	}

	fw, err := utils.CreateKmeshPortForwarder(cli, podName)
	if err != nil {
		return fmt.Errorf("failed to create port forwarder for Kmesh daemon pod %s: %v", podName, err)
	}
	if err := fw.Start(); err != nil {
		return fmt.Errorf("failed to start port forwarder for Kmesh daemon pod %s: %v", podName, err)
	}
	defer fw.Close()

	url := fmt.Sprintf("http://%s%s?enable=%s", fw.Address(), patternDnsproxy, status)

	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		return fmt.Errorf("Error creating request: %v", err)
	}

	req.Header.Set("Content-Type", "application/json")
	httpClient := &http.Client{}
	resp, err := httpClient.Do(req)
	if err != nil {
		return fmt.Errorf("failed to make HTTP request: %v", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		bodyBytes, readErr := io.ReadAll(resp.Body)
		if readErr != nil {
			return fmt.Errorf("error reading response body for status %d: %v", resp.StatusCode, readErr)
		}
		return fmt.Errorf("received status code %d, response: %s", resp.StatusCode, string(bodyBytes))
	}
	return nil
}

Comment on lines +272 to +276
	if err := c.setupDNSProxy(); err != nil {
		c.client.WorkloadController.SetDnsProxyTrigger(false)
		_ = c.client.WorkloadController.Processor.PrepareDNSProxy(false)
		return err
	}

high

In this error handling path, the error from PrepareDNSProxy(false) is ignored. If cleaning up the BPF map fails, it might leave a stale entry, which could lead to unexpected behavior (e.g., DNS requests still being redirected). The error should be logged to aid in debugging.

	if err := c.setupDNSProxy(); err != nil {
		c.client.WorkloadController.SetDnsProxyTrigger(false)
		if cleanupErr := c.client.WorkloadController.Processor.PrepareDNSProxy(false); cleanupErr != nil {
			log.Warnf("Failed to cleanup DNS proxy on setup error: %v", cleanupErr)
		}
		return err
	}

Comment on lines +281 to 297
func (c *Controller) StopDnsProxy() error {
	if c.client == nil || c.client.WorkloadController == nil {
		return fmt.Errorf("dns proxy not supported in this mode")
	}
	c.dnsProxyMu.Lock()
	defer c.dnsProxyMu.Unlock()
	if c.dnsServer == nil {
		c.client.WorkloadController.SetDnsProxyTrigger(false)
		_ = c.client.WorkloadController.Processor.PrepareDNSProxy(false)
		return nil
	}
	c.client.WorkloadController.SetDnsProxyTrigger(false)
	_ = c.client.WorkloadController.Processor.PrepareDNSProxy(false)
	c.dnsServer.Close()
	c.dnsServer = nil
	return nil
}

high

The error from c.client.WorkloadController.Processor.PrepareDNSProxy(false) is ignored. If cleaning up the BPF map fails, it might leave a stale entry, which could lead to unexpected behavior (e.g., DNS requests still being redirected). The error should be logged. This happens in two places in this function.

func (c *Controller) StopDnsProxy() error {
	if c.client == nil || c.client.WorkloadController == nil {
		return fmt.Errorf("dns proxy not supported in this mode")
	}
	c.dnsProxyMu.Lock()
	defer c.dnsProxyMu.Unlock()
	if c.dnsServer == nil {
		c.client.WorkloadController.SetDnsProxyTrigger(false)
		if err := c.client.WorkloadController.Processor.PrepareDNSProxy(false); err != nil {
			log.Warnf("Failed to cleanup DNS proxy state: %v", err)
		}
		return nil
	}
	c.client.WorkloadController.SetDnsProxyTrigger(false)
	if err := c.client.WorkloadController.Processor.PrepareDNSProxy(false); err != nil {
		log.Warnf("Failed to cleanup DNS proxy state: %v", err)
	}
	c.dnsServer.Close()
	c.dnsServer = nil
	return nil
}
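
The two branches in the suggestion above perform the same cleanup, so they could be folded into one path. The sketch below shows that shape on a toy controller and, as a variant of the suggestion, returns the cleanup error to the caller instead of only logging it. Every type here (`controller`, `fakeDNSServer`, the `cleanup` callback standing in for `Processor.PrepareDNSProxy(false)`) is hypothetical, and whether the real `StopDnsProxy` should return or merely log the error is a design decision for the PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeDNSServer counts Close calls so double-closing would be visible.
type fakeDNSServer struct{ closes int }

func (s *fakeDNSServer) Close() { s.closes++ }

// controller is a toy model of the state guarded by dnsProxyMu in the PR.
type controller struct {
	mu      sync.Mutex
	server  *fakeDNSServer
	enabled bool
	cleanup func() error // stands in for Processor.PrepareDNSProxy(false)
}

// StopDnsProxy clears the trigger, runs cleanup once, and closes the server
// if one is running. Calling it again when already stopped is harmless.
func (c *controller) StopDnsProxy() error {
	c.mu.Lock()
	defer c.mu.Unlock()

	c.enabled = false
	err := c.cleanup()
	if c.server != nil {
		c.server.Close()
		c.server = nil
	}
	if err != nil {
		return fmt.Errorf("clean up DNS proxy state: %w", err)
	}
	return nil
}

// stopDemo stops the proxy twice and returns the number of Close calls
// together with any cleanup errors from both attempts.
func stopDemo(failCleanup bool) (int, error) {
	srv := &fakeDNSServer{}
	c := &controller{server: srv, enabled: true, cleanup: func() error {
		if failCleanup {
			return errors.New("bpf map update failed")
		}
		return nil
	}}
	first := c.StopDnsProxy()
	second := c.StopDnsProxy()
	return srv.closes, errors.Join(first, second)
}

func main() {
	closes, err := stopDemo(false)
	fmt.Println(closes, err)
	closes, err = stopDemo(true)
	fmt.Println(closes, err != nil)
}
```

If the error were returned like this, an HTTP handler in front of it could answer with a 5xx status so that `kmeshctl` users see the failed cleanup directly.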

@codecov
codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 0% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.29%. Comparing base (3018401) to head (f89d640).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/controller/controller.go | 0.00% | 60 Missing ⚠️ |
| pkg/status/status_server.go | 0.00% | 28 Missing ⚠️ |
| pkg/controller/workload/workload_controller.go | 0.00% | 5 Missing ⚠️ |
| pkg/controller/workload/workload_processor.go | 0.00% | 2 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

| Files with missing lines | Coverage Δ |
|---|---|
| pkg/controller/telemetry/metric.go | 61.29% <ø> (ø) |
| pkg/controller/workload/workload_processor.go | 58.20% <0.00%> (+0.44%) ⬆️ |
| pkg/controller/workload/workload_controller.go | 29.62% <0.00%> (-1.14%) ⬇️ |
| pkg/status/status_server.go | 33.33% <0.00%> (-2.49%) ⬇️ |
| pkg/controller/controller.go | 0.00% <0.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@hzxuzhonghu hzxuzhonghu left a comment

This is a great feature, but I am concerned that changing this on the fly could break traffic.
