Federation (ExternalCatalog ICEBERG_REST): rest.client.proxy.* properties not being applied #3465

@yj-lee0503

Is your feature request related to a problem? Please describe.

Yes. In controlled egress environments, federation outbound HTTPS calls do not route through the configured HTTP CONNECT proxy even when rest.client.proxy.hostname and rest.client.proxy.port are set on the ExternalCatalog.

The Problem:

  • Organizations with controlled egress policies (common in enterprise and regulated environments) cannot use Polaris federation features
  • Catalog properties containing rest.client.proxy.* configuration appear to have no effect
  • No proxy traffic is observed in proxy logs during federation API calls
  • Direct internet access is blocked by network policy, causing federation to fail

Evidence:
Testing in a Kubernetes (EKS) environment with a Squid proxy confirms:

  1. ✅ External catalog created with rest.client.proxy.hostname and rest.client.proxy.port properties
  2. ✅ Federation API calls return HTTP 500 or time out
  3. ⛔ Zero proxy traffic from Polaris pod IPs in Squid access logs
  4. ✅ Explicit curl with --proxy flag from Polaris pod works and appears in proxy logs

Testing Evidence:

# Federation call (no proxy traffic observed)
curl -X GET "${CATALOG_BASE}/<catalog name>/namespaces" \
  -H "Authorization: Bearer ${TOKEN}"
# Result: Times out or returns 500, no proxy logs

# Direct curl with explicit proxy (works, proxy traffic observed)
kubectl exec polaris-pod -- curl -I \
  --proxy http://proxy.svc.cluster.local:3128 \
  https://remote-catalog.example.com
# Result: Success, appears in Squid logs

Initial hypothesis:
Polaris may not be passing ExternalCatalog properties through to the Iceberg REST client's HTTPClient.Builder, or federation may be using a different HTTP client path. Iceberg 1.10.0+ includes proxy support via these properties:

  • rest.client.proxy.hostname
  • rest.client.proxy.port
  • rest.client.proxy.username (optional)
  • rest.client.proxy.password (optional)

If there’s an existing supported way to configure federation egress/proxy settings, I’d appreciate a pointer. Happy to test.

Describe the solution you'd like

Ensure ExternalCatalog properties are passed into the Iceberg REST client used for federation.

Specific request:

  1. Verify that the properties map in the ExternalCatalog payload is wired into Iceberg's HTTPClient.Builder
  2. Document the correct configuration surface for outbound federation HTTP settings
  3. Add a regression test that verifies proxy usage (via a mock proxy or traffic inspection)
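
As a rough illustration of the mock-proxy idea in item 3, the sketch below starts a one-shot listener on the loopback interface, points an HTTP client at it as a proxy, and captures the first request line the "proxy" receives. It uses the JDK's java.net.http.HttpClient purely as a stand-in; it does not exercise Iceberg's HTTPClient or the Polaris federation path, and the class and method names (ProxyUsageCheck, captureFirstProxyRequestLine) are hypothetical. A real regression test would drive a federation call instead of the stand-in client.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Polaris or Iceberg.
class ProxyUsageCheck {

  /** Runs a one-shot mock proxy and returns the first request line it receives. */
  static String captureFirstProxyRequestLine(URI target) throws Exception {
    CompletableFuture<String> firstLine = new CompletableFuture<>();
    try (ServerSocket proxy = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      Thread acceptor = new Thread(() -> {
        try (Socket s = proxy.accept()) {
          BufferedReader in = new BufferedReader(
              new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
          firstLine.complete(in.readLine());
          // Drain the remaining request headers up to the blank line.
          String line;
          while ((line = in.readLine()) != null && !line.isEmpty()) {
            // ignore header
          }
          // Refuse the tunnel; the test only cares that the proxy was contacted.
          OutputStream out = s.getOutputStream();
          out.write("HTTP/1.1 502 Bad Gateway\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
              .getBytes(StandardCharsets.US_ASCII));
          out.flush();
        } catch (IOException e) {
          firstLine.completeExceptionally(e);
        }
      });
      acceptor.setDaemon(true);
      acceptor.start();

      // Stand-in client configured to use the mock proxy for every request.
      HttpClient client = HttpClient.newBuilder()
          .proxy(ProxySelector.of(
              new InetSocketAddress(InetAddress.getLoopbackAddress(), proxy.getLocalPort())))
          .connectTimeout(Duration.ofSeconds(5))
          .build();
      try {
        client.send(
            HttpRequest.newBuilder(target).timeout(Duration.ofSeconds(5)).GET().build(),
            HttpResponse.BodyHandlers.discarding());
      } catch (IOException expected) {
        // The mock refuses the tunnel, so the request itself is expected to fail.
      }
      return firstLine.get(5, TimeUnit.SECONDS);
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(captureFirstProxyRequestLine(
        URI.create("https://remote-catalog.example.com/api/catalog/v1/config")));
  }
}
```

For an https target, the captured line is expected to be a CONNECT request naming the remote host and port, which is the same signal the Squid access log would show. If no line arrives before the timeout, the client bypassed the proxy.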

Expected Behavior:
When creating an external catalog with proxy configuration:

{
  "type": "EXTERNAL",
  "name": "my-external-catalog",
  "properties": {
    "rest.client.proxy.hostname": "proxy.example.com",
    "rest.client.proxy.port": "3128",
    "rest.client.proxy.username": "optional-user",
    "rest.client.proxy.password": "optional-pass",
    "rest.client.connection-timeout-ms": "30000",
    "rest.client.socket-timeout-ms": "120000"
  },
  "connectionConfigInfo": {
    "connectionType": "ICEBERG_REST",
    "uri": "https://remote-catalog.example.com/api/catalog/v1",
    "remoteCatalogName": "remote-catalog",
    "authenticationParameters": { ... }
  }
}

Federation calls should:

  • Route HTTPS requests through the configured proxy
  • Appear in proxy logs as CONNECT remote-catalog.example.com:443
  • Support proxy authentication if credentials provided

Benefits:

  • Enables federation in enterprise environments with controlled egress
  • Maintains compatibility with existing network security policies
  • Allows traffic inspection, logging, and audit compliance
  • Leverages Iceberg's existing proxy support (already in 1.10.0+)

Describe alternatives you've considered

All current workarounds are suboptimal:

  1. Deploy on NAT-enabled nodes - Works but reduces network security posture by allowing direct internet access
  2. Use AWS PrivateLink/VPC endpoints - Only available for specific cloud providers, service tiers, and federation targets
  3. iptables transparent proxy - Complex, brittle, requires NET_ADMIN capability, limited by DNS round-robin (only captures IPs resolved at pod init time)
  4. Standard JVM proxy properties / environment variables - Already tested, do not take effect for federation calls

None of these alternatives provide a clean, production-ready solution.

Enhancement consideration:
As a fallback, Polaris could also respect standard proxy settings (JVM system properties such as -Dhttps.proxyHost, or environment variables such as HTTP_PROXY) for federation calls, but that should be treated as an enhancement rather than the core issue.
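
To make the proposed precedence concrete, here is a minimal sketch, assuming the catalog-level rest.client.proxy.* properties win and the JVM https.proxyHost / https.proxyPort system properties are consulted only when the catalog does not specify a proxy. The class FederationProxyResolver and its method are hypothetical and do not exist in Polaris; environment variables such as HTTP_PROXY are left out for brevity.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper illustrating the fallback order; not Polaris code.
class FederationProxyResolver {
  static final String PROXY_HOSTNAME = "rest.client.proxy.hostname";
  static final String PROXY_PORT = "rest.client.proxy.port";

  /** Returns the proxy to use for federation calls, if any is configured. */
  static Optional<InetSocketAddress> resolveProxy(Map<String, String> catalogProperties) {
    String host = catalogProperties.get(PROXY_HOSTNAME);
    String port = catalogProperties.get(PROXY_PORT);

    if (isBlank(host) || isBlank(port)) {
      // Fallback: standard JVM properties; https.proxyPort defaults to 443.
      host = System.getProperty("https.proxyHost");
      port = System.getProperty("https.proxyPort", "443");
    }
    if (isBlank(host) || isBlank(port)) {
      return Optional.empty();
    }
    // Unresolved on purpose: the proxy name is resolved when connecting.
    // Integer.parseInt throws NumberFormatException for a non-numeric port.
    return Optional.of(
        InetSocketAddress.createUnresolved(host.trim(), Integer.parseInt(port.trim())));
  }

  private static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
  }
}
```

Keeping the catalog properties first means a per-catalog proxy can differ from a process-wide default, which matters when different federation targets sit behind different egress paths.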

Additional context

Environment:

  • Polaris Version: 1.2.0-incubating
  • Iceberg Version: 1.10.1 (includes proxy support via iceberg-bom)
  • Deployment: Kubernetes (EKS)
  • Network: Controlled egress via Squid proxy (no direct internet)

Configuration Tested:
External catalog creation with properties:

{
  "rest.client.proxy.hostname": "proxy.svc.cluster.local",
  "rest.client.proxy.port": "3128",
  "rest.client.connection-timeout-ms": "30000",
  "rest.client.socket-timeout-ms": "120000"
}

Also tested (no effect):

  • Environment variables: HTTPS_PROXY, HTTP_PROXY, https_proxy, http_proxy, NO_PROXY
  • JVM properties: -Dhttps.proxyHost, -Dhttps.proxyPort, -Dhttp.proxyHost, -Dhttp.proxyPort
  • Quarkus REST client config: QUARKUS_REST_CLIENT_PROXY_ADDRESS

Source Code Observation:
File: org/apache/polaris/service/catalog/iceberg/IcebergRESTExternalCatalogFactory.java

RESTCatalog federatedCatalog = new RESTCatalog(
    context,
    (config) ->
        HTTPClient.builder(config)
            .uri(config.get(org.apache.iceberg.CatalogProperties.URI))
            .build());

The config map is passed to HTTPClient.builder(), but it is unclear whether that map includes the ExternalCatalog properties map or only a specific set of keys.
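
For illustration only, the behavior being asked for could look like the sketch below: the rest.client.* entries from the ExternalCatalog properties are copied into the config map alongside the remote URI before the map reaches the client builder. The class FederationClientConfig and the method buildClientConfig are hypothetical names; whether Polaris currently does anything equivalent is exactly the open question. The "uri" key is the value of Iceberg's CatalogProperties.URI.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; shows the intended merge, not the actual Polaris wiring.
class FederationClientConfig {
  static final String URI_KEY = "uri"; // value of org.apache.iceberg.CatalogProperties.URI
  static final String REST_CLIENT_PREFIX = "rest.client.";

  /** Builds the config map that would be handed to the REST client builder. */
  static Map<String, String> buildClientConfig(
      Map<String, String> catalogProperties, String remoteUri) {
    Map<String, String> config = new HashMap<>();
    // Forward only client-level settings (proxy, timeouts) from the catalog.
    catalogProperties.forEach((key, value) -> {
      if (key.startsWith(REST_CLIENT_PREFIX)) {
        config.put(key, value);
      }
    });
    // The URI from connectionConfigInfo always takes precedence.
    config.put(URI_KEY, remoteUri);
    return Collections.unmodifiableMap(config);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("rest.client.proxy.hostname", "proxy.svc.cluster.local");
    props.put("rest.client.proxy.port", "3128");
    System.out.println(
        buildClientConfig(props, "https://remote-catalog.example.com/api/catalog/v1"));
  }
}
```

Logging the keys of the resulting map at catalog initialization would be one way to settle the question during debugging, taking care to redact rest.client.proxy.password.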

Acceptance Criteria:

  • External catalog properties map is verified to be passed to Iceberg REST client
  • Proxy configuration documented in Polaris federation docs
  • Regression test added to verify proxy usage
  • Federation with proxy configuration works in isolated networks

Impact Assessment:

  • Impact: High - Blocks federation adoption in enterprise environments with controlled egress
