refactor(server): unify URL configs when scheme is missing#2944

Merged
imbajin merged 11 commits into apache:master from bitflicker64:fix-url-normalization
Feb 3, 2026

bitflicker64 (Contributor) commented Jan 23, 2026

Purpose of the PR

  • close [Improvement] unify endpoint URL format #2942
  • Some server URL configs currently require an explicit scheme (http:// / https://). If a user sets something like 127.0.0.1:8080, it can fail to parse or behave inconsistently compared to other modules that already default the scheme.

This PR fixes that by normalizing only the relevant server URL options: if the scheme is missing, we auto-prefix a sensible default, while leaving explicitly provided schemes untouched.

Also covered the review corner case: server.k8s_url should default to https:// (so we don’t accidentally downgrade it to http://).

Main Changes

HugeConfig.get(...) now applies URL normalization for a small allowlist of keys:

  • restserver.url → default http://
  • gremlinserver.url → default http://
  • server.urls_to_pd → default http://
  • server.k8s_url → default https://

Normalization only triggers when:

  • the config key is in the allowlist, and
  • the value is a String, and
  • the value has no scheme

If the value already starts with http:// or https://, it’s left as-is.
Non-string config values are untouched.
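
The rules above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name UrlAllowlistSketch is hypothetical, the map simply mirrors the four keys listed above, and treating every value as a single URL (and leaving an empty string alone) is an assumption.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: class name and structure are illustrative, not HugeGraph's actual code.
final class UrlAllowlistSketch {

    // Allowlisted config key -> scheme to prepend when none is present (keys from the PR description).
    static final Map<String, String> DEFAULT_SCHEMES = Map.of(
            "restserver.url", "http://",
            "gremlinserver.url", "http://",
            "server.urls_to_pd", "http://",
            "server.k8s_url", "https://");

    static Object normalizeUrlOptionIfNeeded(String key, Object value) {
        String scheme = DEFAULT_SCHEMES.get(key);
        if (scheme == null || !(value instanceof String)) {
            return value; // key not allowlisted, or non-String value: untouched
        }
        String s = (String) value;
        String lower = s.toLowerCase(Locale.ROOT);
        if (s.isEmpty() || lower.startsWith("http://") || lower.startsWith("https://")) {
            return s; // empty (assumption: left alone) or scheme already present
        }
        return scheme + s;
    }

    public static void main(String[] args) {
        // http://127.0.0.1:8080
        System.out.println(normalizeUrlOptionIfNeeded("restserver.url", "127.0.0.1:8080"));
        // https://k8s.example.com:6443 (the host is a placeholder)
        System.out.println(normalizeUrlOptionIfNeeded("server.k8s_url", "k8s.example.com:6443"));
        // https://127.0.0.1:8443 (explicit scheme preserved)
        System.out.println(normalizeUrlOptionIfNeeded("restserver.url", "https://127.0.0.1:8443"));
        // 127.0.0.1:8080 (key not in the allowlist)
        System.out.println(normalizeUrlOptionIfNeeded("some.other.key", "127.0.0.1:8080"));
    }
}
```

Keeping the scheme per key, rather than a single global default, is what lets server.k8s_url default to https:// while the other keys default to http://.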

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    Tests are included and you can verify the behavior like this:
  • Added unit tests in HugeConfigTest covering:

    • missing scheme → default prefix is added
    • existing scheme → preserved as-is
    • server.k8s_url missing scheme → becomes https://... (no accidental downgrade)

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. tests Add or improve test cases labels Jan 23, 2026
@imbajin imbajin requested a review from Copilot January 24, 2026 15:13
@imbajin imbajin changed the title Normalize server URL config values when scheme is missing refactor(server): unify URL configs when scheme is missing Jan 24, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses issue #2942 by normalizing server URL configuration values when the scheme (http:// or https://) is missing. The normalization allows users to configure URLs without explicitly specifying the scheme, with sensible defaults automatically applied.

Changes:

  • Modified HugeConfig.get() to apply URL normalization for specific config keys
  • Added URL normalization logic that defaults to http:// for most server URLs and https:// for Kubernetes URLs
  • Added comprehensive unit tests to verify normalization behavior

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
hugegraph-commons/hugegraph-common/src/main/java/org/apache/hugegraph/config/HugeConfig.java | Implements URL normalization logic in the config retrieval method with allowlist-based scheme defaulting
hugegraph-commons/hugegraph-common/src/test/java/org/apache/hugegraph/unit/config/HugeConfigTest.java | Adds test cases for URL normalization and reorganizes imports in alphabetical order

Comment on lines 88 to 97
  public <T, R> R get(TypedOption<T, R> option) {
      Object value = this.getProperty(option.name());
      if (value == null) {
-         return option.defaultValue();
+         value = option.defaultValue();
      }

      // Normalize URL options if needed (add scheme like http://)
      value = normalizeUrlOptionIfNeeded(option.name(), value);

      return (R) value;
Copilot AI commented Jan 24, 2026

The URL normalization happens at retrieval time in the get() method rather than at storage time.

This means:

  1. the original unnormalized value remains in the underlying configuration storage and will be returned by direct access methods like getProperty() if they exist
  2. if the configuration is saved using save(), the unnormalized values will be written to the file, potentially causing confusion.
    Consider documenting this behavior or normalizing at storage time instead to maintain consistency.
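
The difference between read-time and storage-time normalization can be illustrated with a small stand-in. The sketch below is hypothetical (ReadTimeNormalizationSketch is not the real HugeConfig, and it backs the config with java.util.Properties purely for illustration); it only shows why a raw accessor and a normalizing accessor can disagree.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch of read-time normalization; not the actual HugeConfig class.
final class ReadTimeNormalizationSketch {

    private final Properties store = new Properties();

    void setProperty(String key, String value) {
        store.setProperty(key, value); // stored exactly as written
    }

    // Direct access bypasses normalization and returns the raw stored value.
    String getProperty(String key) {
        return store.getProperty(key);
    }

    // Typed access normalizes on the way out; the stored value is not modified.
    String get(String key) {
        String raw = store.getProperty(key);
        if (raw == null) {
            return null;
        }
        String lower = raw.toLowerCase(Locale.ROOT);
        boolean hasScheme = lower.startsWith("http://") || lower.startsWith("https://");
        return hasScheme ? raw : "http://" + raw;
    }

    public static void main(String[] args) {
        ReadTimeNormalizationSketch config = new ReadTimeNormalizationSketch();
        config.setProperty("restserver.url", "127.0.0.1:8080");
        System.out.println(config.get("restserver.url"));         // http://127.0.0.1:8080
        System.out.println(config.getProperty("restserver.url")); // 127.0.0.1:8080
    }
}
```

Anything that serializes the underlying store would write the second form, which is the inconsistency the comment points out.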

String lower = s.toLowerCase();
if (lower.startsWith("http://") || lower.startsWith("https://")) {
return s;
}
Member

⚠️ Important: Missing Edge Case Handling

The current implementation doesn't handle URLs with credentials or userinfo:

// These would incorrectly get prefixed:
"user:password@127.0.0.1:8080" → "http://user:password@127.0.0.1:8080"
"admin@localhost:8080" → "http://admin@localhost:8080"

While valid according to RFC 3986 (scheme://[userinfo@]host[:port]), detecting these requires checking for @ before any /:

Suggested change
}
// Keep original string if scheme already exists
String lower = s.toLowerCase();
if (lower.startsWith("http://") || lower.startsWith("https://")) {
return s;
}
// Don't add scheme if userinfo is present without scheme
// (e.g., "user:pass@host:port" - likely malformed or needs manual fixing)
int slashPos = s.indexOf('/');
int atPos = s.indexOf('@');
if (atPos != -1 && (slashPos == -1 || atPos < slashPos)) {
// Has userinfo component - preserve as-is to avoid masking config errors
return s;
}
return scheme + s;
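
To see how the suggested check behaves on a few inputs, here is a self-contained version of it. The method name prefixSchemeIfMissing is taken from the discussion, but its signature and the wrapper class UserinfoCheckSketch are assumptions made for this sketch.

```java
import java.util.Locale;

// Hypothetical wrapper around the suggested check; the signature is assumed.
final class UserinfoCheckSketch {

    static String prefixSchemeIfMissing(String s, String scheme) {
        // Keep the original string if a scheme already exists.
        String lower = s.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return s;
        }
        // An '@' before the first '/' (or with no '/' at all) indicates userinfo.
        int slashPos = s.indexOf('/');
        int atPos = s.indexOf('@');
        if (atPos != -1 && (slashPos == -1 || atPos < slashPos)) {
            return s; // preserved as-is so the configuration error stays visible
        }
        return scheme + s;
    }

    public static void main(String[] args) {
        // user:password@127.0.0.1:8080 (unchanged)
        System.out.println(prefixSchemeIfMissing("user:password@127.0.0.1:8080", "http://"));
        // http://127.0.0.1:8080/path@x ('@' appears only in the path, so the scheme is added)
        System.out.println(prefixSchemeIfMissing("127.0.0.1:8080/path@x", "http://"));
        // http://127.0.0.1:8080
        System.out.println(prefixSchemeIfMissing("127.0.0.1:8080", "http://"));
    }
}
```

Because the check compares the position of '@' with the first '/', an '@' that appears only in the path does not block prefixing.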

Contributor Author

The prefixSchemeIfMissing() method currently only normalizes URLs; it doesn't try to validate things like userinfo (user@host). That seems reasonable, since putting credentials in URLs is discouraged and @ can also appear in valid paths. Adding special checks here would blur the line between normalization and validation, so it's probably better to handle stricter validation at the client/usage layer if needed. I'm happy to add a separate check or open an additional PR if you think stricter handling belongs here. Please let me know your preference.

bitflicker64 (Contributor Author) commented

Thanks for the feedback. I'll go through the comments and update the implementation to address the issues you mentioned. I'll also keep these points in mind to avoid similar mistakes in future changes.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 26, 2026
@imbajin imbajin self-requested a review January 26, 2026 06:57
imbajin (Member) left a comment

Apologies for hitting approve by mistake :)

@dosubot dosubot bot removed the lgtm This PR has been approved by a maintainer label Jan 26, 2026
bitflicker64 (Contributor Author) commented

It's fine, I'm still working on this.

bitflicker64 (Contributor Author) commented

I noticed that my tests currently use a duplicated UrlOptions class instead of the actual ServerOptions. Because of this, if config keys change in ServerOptions, the tests will not fail, since they rely on hardcoded duplicates.

// Test duplicate
public static class UrlOptions extends OptionHolder {
    public static final ConfigOption<String> restUrl =
        new ConfigOption<>("restserver.url", ...).withUrlNormalization("http://");
}

// Production (ServerOptions.java)
public static final ConfigOption<String> GREMLIN_SERVER_URL =
    new ConfigOption<>("gremlinserver.url", ...).withUrlNormalization("http://");

Would you prefer that I:

  • refactor the tests to use ServerOptions directly, or
  • keep the duplicated UrlOptions (more isolated, but requires manual syncing)?

Let me know your preference and I’ll refactor accordingly.

imbajin (Member) commented Jan 29, 2026

@bitflicker64 Apologies for the delay. I prefer plan A (refactor the tests to use ServerOptions directly, following the DRY rule).

bitflicker64 (Contributor Author) commented

Hi, I tried refactoring HugeConfigTest to use ServerOptions directly as suggested. However, this test is in the hugegraph-common module, while ServerOptions is in hugegraph-server. When I add the dependency, it creates a circular dependency and the build fails with this error:

The projects in the reactor contain a cyclic reference:
org.apache.hugegraph:hugegraph-api
 -> org.apache.hugegraph:hugegraph-core
 -> org.apache.hugegraph:hugegraph-common
 -> org.apache.hugegraph:hugegraph-api

Because of this, ServerOptions cannot be used in hugegraph-common tests without breaking the module structure.

codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 1.87%. Comparing base (99baf2b) to head (358d037).
⚠️ Report is 2 commits behind head on master.

Files with missing lines | Patch % | Lines
...ava/org/apache/hugegraph/config/ServerOptions.java | 0.00% | 4 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (99baf2b) and HEAD (358d037). Click for more details.

HEAD has 1 upload less than BASE
Flag | BASE (99baf2b) | HEAD (358d037)
 | 3 | 2
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #2944       +/-   ##
============================================
- Coverage     35.63%   1.87%   -33.76%     
+ Complexity      333      86      -247     
============================================
  Files           801     788       -13     
  Lines         67523   65289     -2234     
  Branches       8778    8354      -424     
============================================
- Hits          24061    1224    -22837     
- Misses        40901   63971    +23070     
+ Partials       2561      94     -2467     

bitflicker64 (Contributor Author) commented

@imbajin Thanks for the discussion and guidance. I’ve moved the ServerOptions related tests into ServerOptionsTest and confirmed that all tests and CI are passing now.

imbajin previously approved these changes Jan 31, 2026
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 31, 2026
@imbajin imbajin requested a review from Copilot January 31, 2026 12:31
imbajin (Member) left a comment

We should also update the default values in the config files (for example, https://github.com/apache/incubator-hugegraph/blob/9babe493919c01f012a56e7b5fb4d8b9faf64cf5/hugegraph-server/hugegraph-dist/src/assembly/static/conf/rest-server.properties#L3C2-L3C2) to remove the now-unnecessary URL scheme prefix.

The same update is also needed in https://github.com/apache/incubator-hugegraph-doc (the website pages), thanks.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

fromDefault = true;
}

if (!fromDefault) {
Member

⚠️ Code Quality: Inconsistent behavior between default and explicit values

Lines 94-96 show that normalization applies only to explicitly set values (fromDefault = false), not to default values from option.defaultValue().

This creates inconsistent behavior:

  • Default value http://127.0.0.1:8080 → used as-is
  • User sets 127.0.0.1:8080 → normalized to http://127.0.0.1:8080

Question: Is this intentional? If defaults already have schemes, why do we need .withUrlNormalization() on the options?

Contributor Author

Yes, this is intentional.
The default values already have http://, so they are already correct. There is nothing to fix, so normalization is skipped. User values may not have the scheme (for example: 127.0.0.1:8080), so only those need normalization.
.withUrlNormalization() is just metadata. It tells the system: “this option is a URL, and use this scheme when fixing user input.”

So:
Default → already correct → no change
User value → may be incomplete → normalize
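
The default-versus-user distinction described above might look roughly like this. It is a sketch under stated assumptions: the class DefaultVsUserValueSketch, the constant, and the use of a plain Map for user-supplied settings are all hypothetical; only the fromDefault flag and the example values come from the discussion.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of skipping normalization for default values.
final class DefaultVsUserValueSketch {

    // Default taken from the example in the review comment; it already has a scheme.
    static final String DEFAULT_REST_URL = "http://127.0.0.1:8080";

    static String resolveRestUrl(Map<String, String> userConfig) {
        String value = userConfig.get("restserver.url");
        boolean fromDefault = false;
        if (value == null) {
            value = DEFAULT_REST_URL;
            fromDefault = true;
        }
        if (!fromDefault) {
            // Only user-provided values may be missing the scheme.
            value = prefixIfMissing(value, "http://");
        }
        return value;
    }

    private static String prefixIfMissing(String s, String scheme) {
        String lower = s.toLowerCase(Locale.ROOT);
        boolean hasScheme = lower.startsWith("http://") || lower.startsWith("https://");
        return hasScheme ? s : scheme + s;
    }

    public static void main(String[] args) {
        // Default used as-is: http://127.0.0.1:8080
        System.out.println(resolveRestUrl(Map.of()));
        // User value normalized: http://0.0.0.0:8080
        System.out.println(resolveRestUrl(Map.of("restserver.url", "0.0.0.0:8080")));
    }
}
```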

imbajin and others added 2 commits January 31, 2026 22:15
- Add URL normalization support for config options
- Automatically prefix missing schemes (http://, https://)
- Log warnings when auto-correcting user-provided values
- Add comprehensive test coverage for normalization logic
- Update config files to demonstrate the feature

Changes:
- ConfigOption: Add withUrlNormalization() builder method
- ServerOptions: Apply normalization to REST, Gremlin, K8s URLs
- HugeConfig: Implement lazy cache and normalization logic
- Add ServerOptionsTest with 5 test cases
- Simplify URLs in main and Docker config
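
One way to picture the combination of a withUrlNormalization() builder method and a lazy cache is sketched below. Only the method name withUrlNormalization comes from the PR; the classes LazyUrlConfigSketch and Option, the cache layout, and the placeholder default URL are assumptions, and the warning logging mentioned in the commit message is omitted.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; the real ConfigOption and HugeConfig classes are not reproduced here.
final class LazyUrlConfigSketch {

    // Stand-in for an option that carries URL-normalization metadata.
    static final class Option {
        final String name;
        final String defaultValue;
        String urlScheme; // null means "not a URL option"

        Option(String name, String defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }

        // Builder-style: records the default scheme and returns the option itself.
        Option withUrlNormalization(String scheme) {
            this.urlScheme = scheme;
            return this;
        }
    }

    private final Map<String, String> raw;
    private final Map<String, String> cache = new HashMap<>(); // not thread-safe
    int normalizeCalls = 0; // exposed only so the caching is observable

    LazyUrlConfigSketch(Map<String, String> raw) {
        this.raw = raw;
    }

    String get(Option option) {
        String value = raw.get(option.name);
        if (value == null) {
            return option.defaultValue; // defaults are returned as-is
        }
        if (option.urlScheme == null) {
            return value;
        }
        // Normalize on first read only; later reads hit the cache.
        return cache.computeIfAbsent(option.name, k -> normalize(value, option.urlScheme));
    }

    private String normalize(String value, String scheme) {
        normalizeCalls++;
        String lower = value.toLowerCase(Locale.ROOT);
        boolean hasScheme = lower.startsWith("http://") || lower.startsWith("https://");
        return hasScheme ? value : scheme + value;
    }

    public static void main(String[] args) {
        // The default URL here is a placeholder, not the project's real default.
        Option k8s = new Option("server.k8s_url", "https://127.0.0.1:8888")
                .withUrlNormalization("https://");
        LazyUrlConfigSketch config =
                new LazyUrlConfigSketch(Map.of("server.k8s_url", "10.0.0.1:6443"));
        System.out.println(config.get(k8s)); // https://10.0.0.1:6443
        System.out.println(config.get(k8s)); // same value, served from the cache
        System.out.println(config.normalizeCalls); // 1
    }
}
```

Attaching the scheme to the option, instead of keeping a list of key names inside the config class, keeps the config class unaware of which keys are URLs.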
bitflicker64 (Contributor Author) commented

All review feedback has been implemented. There's one failing test, KoutApiTest.testGet (graph traversal), that I can't trace back to my URL changes; I would really appreciate your help debugging it. After this, I'll also update https://github.com/apache/incubator-hugegraph-doc.

bitflicker64 (Contributor Author) commented

Done! I've updated the documentation repo to remove the http:// prefix from all configuration examples.
Created PR: apache/incubator-hugegraph-doc#449

@imbajin imbajin merged commit 6ffdd9c into apache:master Feb 3, 2026
15 of 16 checks passed
bitflicker64 (Contributor Author) commented

@imbajin hey I'm interested in contributing more to the project and was wondering if I could get an invitation to the Slack workspace?

imbajin (Member) commented Feb 7, 2026

Thank you for your interest! It’s great to see you’re looking to get more involved with the project.

Regarding the Slack and Discord channels, they are actually public and you are welcome to join anytime without a specific invitation. However, please note that we haven't officially started active operations or management for those channels yet. If you're interested, we’d be happy to grant you the necessary permissions to help lead or moderate the community there!

Currently, our primary communication still happens through GitHub Issues/Discussions and Email.

We are always open to any suggestions or feedback you might have, whether you'd like to share them publicly or privately. Looking forward to your further contributions!

bitflicker64 (Contributor Author) commented

Hi, I tried joining via the Slack link, but it says I don’t have an account. I also couldn’t find the Discord.
Could you please share active invite links?
My email: himanshuverma151006@gmail.com
Thanks

imbajin (Member) commented Feb 9, 2026

It seems I don't have permission to invite you to the ASF Slack :)

When I tried to add your account, it said:

Your administrator has prohibited the use of Slack Connect in this workspace, so only ASF members and guests can join this channel.

Discord: https://discord.gg/uQqtb9U5vu (indeed not active)

BTW, our docker-compose images have some issues with the latest version; you could submit a new issue and test/fix it.

bitflicker64 (Contributor Author) commented

Thanks for checking the Slack invite; I've joined the Discord. I’ll look into the Docker issue and update accordingly.

bitflicker64 (Contributor Author) commented

I am running this setup on macOS where host networking is not available and the PD container shows three listening ports. While setting up a local development environment using bridge networking, could you clarify whether all PD ports need to be exposed or only the ones required for normal HugeGraph operation? Additionally, is bridge networking the recommended approach on macOS instead of host mode, considering potential port conflicts and platform limitations?

Labels

lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files. tests Add or improve test cases

Development

Successfully merging this pull request may close these issues.

[Improvement] unify endpoint URL format

2 participants