diff --git a/PRDs/20251217-allowlist-resolver/PRD.md b/PRDs/20251217-allowlist-resolver/PRD.md new file mode 100644 index 0000000000..46085b8999 --- /dev/null +++ b/PRDs/20251217-allowlist-resolver/PRD.md @@ -0,0 +1,1312 @@ +# Resource Access Policy PRD + +**Issue:** [#183 - Add new allowlist-only resolver for loading models, instances, and dynamic model generation](https://github.com/metaschema-framework/metaschema-java/issues/183) + +**Goal:** Provide a policy-based URI access control system using glob patterns with programmatic IP-based SSRF protection, graduated enforcement modes, mandatory URI normalization, and defense-in-depth file system protections. + +**Architecture:** Implement a `ResourceAccessPolicy` in the `core` module combining glob pattern matching with IP-based network security. Integrate at loader level (`IModuleLoader`, `IBoundLoader`) with configurable enforcement modes. Default: DISABLED (fully backwards compatible); opt-in to AUDIT or ENFORCE. + +**Tech Stack:** Java 11, existing Metaschema core interfaces, Metaschema-based configuration model, SLF4J for audit logging, IP address library for CIDR block matching. + +--- + +## Problem Statement + +As a developer of Metaschema-based tooling deploying services, I need a resolver subsystem that: +1. Restricts access to an allowlist of local filesystem directories +2. Restricts access to an allowlist of remote HTTP services +3. Prevents SSRF attacks to internal services (localhost, cloud metadata endpoints) +4. Prevents local file inclusion attacks (directory traversal, sensitive system files) +5. Can be deployed gradually (audit first, enforce later) without breaking existing workflows +6. 
Works identically in CLI and library-based deployments + +### Security Threats Addressed + +| Threat | Attack Vector | Mitigation | +|--------|--------------|------------| +| Local File Inclusion | `../../../etc/passwd` in imports | Mandatory path normalization + symlink resolution + pattern-based access control | +| URL Encoding Bypass | `file:///etc/p%61sswd` | Mandatory URI percent-decoding before matching | +| SSRF to Internal Services | `http://localhost:8080/admin` | IP-based SSRF checking via `NetworkSecurityChecker` | +| IP Encoding Bypass | `http://2130706433/`, `http://0x7f000001/` | Programmatic IP resolution, not string patterns | +| IPv6 SSRF | `http://[::ffff:127.0.0.1]/` | `InetAddress`-based classification of all IP forms | +| Cloud Metadata Access | `http://169.254.169.254/` | CIDR block checking for link-local ranges | +| HTTP Redirect Bypass | `302` redirect to `http://169.254.169.254/` | Re-check policy after every redirect | +| XXE Attacks | XML entity resolution to arbitrary URLs | Route entity resolution through policy | +| Scheme Injection | `file://`, `ftp://`, `gopher://` | Scheme-level allow/deny with glob patterns | +| JAR Scheme SSRF | `jar:http://evil.com/mal.jar!/path` | Recursive policy check on JAR inner URI | +| Symlink Traversal | Symlink from allowed → denied path | Symlink resolution before policy check (default) | +| Config Privilege Escalation | Project-local config weakens admin policy | Ratchet-based configuration layering | +| ReDoS via Patterns | Crafted glob patterns in config files | Non-backtracking regex, pattern complexity limits | + +--- + +## Design Decisions + +### 1. 
Glob Pattern Model + +Patterns use glob syntax with `!` negation for deny rules: + +- **Allow patterns** (no prefix) define what resources are accessible +- **Deny patterns** (`!` prefix) create exceptions that block previously allowed resources +- Patterns are evaluated **in order**, last match wins +- Patterns are organized **by URI scheme** (file, https, http, jar) + +**Important: `!` means DENY.** Unlike `.gitignore`, where `!` means "re-include" (stop ignoring), in this system `!` means "deny access." This is the opposite meaning. Documentation must make this explicit. + +The pattern syntax itself (glob wildcards `*`, `**`, `?`) follows `.gitignore` conventions, but the behavioral model is different. Documentation should describe this as "glob pattern matching with last-match-wins evaluation," not as ".gitignore-style." + +**Directory equivalence rule:** A pattern ending in `/**` also matches the directory itself (without trailing slash or children). For example: +- `/workspace/**` matches `/workspace`, `/workspace/`, and `/workspace/project/schema.xml` +- `pages.nist.gov/**` matches `pages.nist.gov`, `pages.nist.gov/`, and `pages.nist.gov/schemas/foo.xml` + +This prevents a common misconfiguration where allowing a directory subtree via `path/**` unexpectedly denies access to the directory path itself. The implementation compiles `path/**` as matching `path`, `path/`, and anything below `path/`. + +### 2. 
Enforcement Modes & Zero-Config Behavior + +Three graduated enforcement levels for safe rollout: + +| Mode | Behavior | Use Case | +|------|----------|----------| +| `DISABLED` | No policy checking; all URIs allowed | Default — legacy behavior, maximum compatibility | +| `AUDIT` | Check policy, **log violations**, but **allow** all requests | Migration period, discovering needed rules | +| `ENFORCE` | Check policy, **block** violations with exception | Production hardened | + +**Zero-config default: `DISABLED`.** When a library user upgrades to a version containing this feature without changing any code, behavior is unchanged. No new log entries, no blocking. Security requires explicit opt-in. This avoids: +- Surprising existing users with new WARN log noise after an upgrade +- Triggering production monitoring alerts unexpectedly +- Breaking existing workflows + +**Explicit opt-in is required** via one of: +- **API:** `loader.setResourceAccessPolicy(ResourceAccessPolicy.bundledDefaults())` +- **Config file:** Place a `resource-access-policy.yaml` in a search path +- **CLI flag:** `--resource-policy-mode=audit` or `--resource-policy-mode=enforce` + +**Factory methods for common scenarios:** + +```java +// Restrictive defaults in AUDIT mode (recommended starting point) +ResourceAccessPolicy.bundledDefaults() + +// Permissive for local development (allows localhost, http) +ResourceAccessPolicy.development() + +// Explicit no-op (same as not setting a policy) +ResourceAccessPolicy.disabled() +``` + +The mode is configurable via: +- **API:** `ResourceAccessPolicy.builder().mode(PolicyMode.AUDIT)` +- **Config file:** `mode: audit` in the policy configuration +- **CLI flag:** `--resource-policy-mode=enforce` + +### 3. URI Security Processing (Mandatory) + +All URIs undergo mandatory security processing **before** pattern matching. This is a non-negotiable requirement, not an implementation detail. + +**Processing pipeline for every URI:** + +```text +Raw URI + │ + ├─ 1. 
Percent-decode URI components (exactly once) + │ file:///workspace/p%61th → file:///workspace/path + │ + ├─ 2. Normalize scheme to lowercase + │ FILE:///path → file:///path + │ + ├─ 3. For file: scheme: + │ a. Resolve path via Path.of(path).normalize() + │ /workspace/../etc/passwd → /etc/passwd + │ b. Reject paths still containing ".." after normalization + │ c. If symlink policy is FOLLOW (default): + │ Resolve via Path.toRealPath() to canonical path + │ d. Apply case folding per CaseSensitivity mode + │ + ├─ 4. For http/https schemes: + │ a. Normalize hostname to lowercase (RFC 3986) + │ b. Strip default ports (80 for http, 443 for https) + │ c. Pass to NetworkSecurityChecker for IP-based SSRF check + │ + ├─ 5. For jar: scheme: + │ a. Parse inner URI (before !) and recursively check policy + │ b. Parse internal path (after !) for scheme pattern matching + │ + └─ 6. For URIs without a scheme (relative URIs): + Resolve to absolute URI before policy checking. + Deny if resolution is not possible. +``` + +**Symlink resolution policy:** + +| Mode | Behavior | Default | +|------|----------|---------| +| `FOLLOW` | Resolve symlinks via `Path.toRealPath()` before checking | Yes (default) | +| `NOFOLLOW` | Check the path as-is without symlink resolution | No | + +Symlink resolution is enabled by default because a symlink from an allowed directory to a sensitive path is a common bypass vector. When `FOLLOW` is active, the **canonical (real) path** is checked against the policy, not the symlink path. + +**Case sensitivity mode:** + +| Mode | Behavior | Use Case | +|------|----------|----------| +| `SYSTEM_DEFAULT` | Auto-detect from OS: case-insensitive on Windows, case-sensitive elsewhere | Default | +| `CASE_SENSITIVE` | Always case-sensitive matching | Unix-only deployments | +| `CASE_INSENSITIVE` | Always case-insensitive matching | Windows, testing | + +Case sensitivity applies to file path matching in both `FileProtections` and file scheme patterns. 
For network schemes, hostnames are always case-folded to lowercase per RFC 3986. + +**Configurable via API:** + +```java +ResourceAccessPolicy.builder() + .symlinkPolicy(SymlinkPolicy.FOLLOW) // default + .caseSensitivity(CaseSensitivity.SYSTEM_DEFAULT) // default + .build(); +``` + +### 4. Network Security (IP-Based SSRF Protection) + +**Glob patterns alone cannot protect against SSRF** because IP addresses have multiple representations that bypass string matching: + +| Representation | Example | Resolves To | +|---------------|---------|-------------| +| Standard | `127.0.0.1` | 127.0.0.1 | +| Decimal | `2130706433` | 127.0.0.1 | +| Hexadecimal | `0x7f000001` | 127.0.0.1 | +| Octal | `0177.0.0.1` | 127.0.0.1 | +| Shorthand | `127.1` | 127.0.0.1 | +| IPv4-mapped IPv6 | `::ffff:127.0.0.1` | 127.0.0.1 | +| IPv6 expanded | `0:0:0:0:0:0:0:1` | ::1 | + +The system uses a `NetworkSecurityChecker` that programmatically resolves hostnames to IP addresses and checks them against CIDR blocks using an IP address library. + +**Blocked CIDR ranges (checked programmatically, not via glob patterns):** + +| CIDR Block | Description | +|-----------|-------------| +| `127.0.0.0/8` | IPv4 loopback | +| `::1/128` | IPv6 loopback | +| `10.0.0.0/8` | Private (Class A) | +| `172.16.0.0/12` | Private (Class B) | +| `192.168.0.0/16` | Private (Class C) | +| `169.254.0.0/16` | Link-local (includes cloud metadata 169.254.169.254) | +| `fe80::/10` | IPv6 link-local | +| `fc00::/7` | IPv6 unique local address (ULA) | +| `::ffff:0:0/96` | IPv4-mapped IPv6 (checked after mapping to IPv4) | +| `0.0.0.0/8` | Unspecified / "this" network | +| `100.64.0.0/10` | Shared address space (CGNAT) | + +**Implementation:** Uses an IP address library (e.g., `com.github.seancfoley:ipaddress`) for CIDR matching. The `InetAddress` from `java.net` resolves all IP encoding variants. The CIDR library handles range comparisons. 
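A minimal sketch of the address classification described above, using only `java.net.InetAddress` plus a hand-written prefix comparison in place of the IP address library. The class name `SsrfCheckSketch` and its methods are hypothetical and are not part of the proposed API. Only a subset of the blocked CIDR ranges is covered, and the input is assumed to be an IP literal so that no DNS lookup happens.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only; not the NetworkSecurityChecker proposed in this PRD. */
final class SsrfCheckSketch {

  private SsrfCheckSketch() {}

  /** Returns true if the first prefixBits bits of addr equal those of network. */
  static boolean inCidr(byte[] addr, byte[] network, int prefixBits) {
    if (addr.length != network.length) {
      return false; // IPv4 address compared with an IPv6 block, or vice versa
    }
    int fullBytes = prefixBits / 8;
    for (int i = 0; i < fullBytes; i++) {
      if (addr[i] != network[i]) {
        return false;
      }
    }
    int rem = prefixBits % 8;
    if (rem == 0) {
      return true;
    }
    int mask = (0xFF << (8 - rem)) & 0xFF;
    return (addr[fullBytes] & mask) == (network[fullBytes] & mask);
  }

  /** Classifies an already-resolved address against some of the blocked ranges. */
  static boolean isBlocked(InetAddress address) {
    if (address.isLoopbackAddress()        // 127.0.0.0/8, ::1
        || address.isAnyLocalAddress()     // 0.0.0.0, ::
        || address.isLinkLocalAddress()    // 169.254.0.0/16, fe80::/10
        || address.isSiteLocalAddress()) { // 10/8, 172.16/12, 192.168/16
      return true;
    }
    byte[] raw = address.getAddress();
    byte[] ula = new byte[16];
    ula[0] = (byte) 0xFC; // fc00::/7
    return inCidr(raw, new byte[] {0, 0, 0, 0}, 8)        // 0.0.0.0/8
        || inCidr(raw, new byte[] {100, 64, 0, 0}, 10)    // 100.64.0.0/10 (CGNAT)
        || inCidr(raw, ula, 7);
  }

  /** Assumption: the argument is an IP literal; a host name would trigger DNS. */
  static boolean isBlockedLiteral(String ipLiteral) {
    try {
      return isBlocked(InetAddress.getByName(ipLiteral));
    } catch (UnknownHostException e) {
      return true; // fail closed when the address cannot be parsed
    }
  }
}
```

For example, `SsrfCheckSketch.isBlockedLiteral("::ffff:127.0.0.1")` reports the address as blocked, because `InetAddress` hands back the mapped IPv4 loopback address. How hexadecimal and octal spellings are parsed is not exercised here and would need to be verified separately.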
+ +**HTTP redirect re-checking:** After any HTTP redirect (3xx), the new URI must be re-checked against the policy before following the redirect. This prevents: +- Policy checks `https://allowed-host.com/` → allowed +- HTTP client follows 302 to `http://169.254.169.254/latest/meta-data/` +- Cloud metadata exfiltrated + +Re-check is documented as a requirement for HTTP client integration. The policy engine provides the `checkAccess()` method; the loader must call it again after receiving a redirect. + +### 5. Scheme-Based Pattern Organization + +Patterns are grouped by URI scheme for clarity: + +```yaml +resource-access-policy: + mode: audit + schemes: + - scheme: https + patterns: + - "pages.nist.gov/**" + - "raw.githubusercontent.com/metaschema-framework/**" + - "!*.internal/**" + - scheme: http + enabled: false + - scheme: file + patterns: + - "/workspace/**" + - "/data/schemas/**" + - scheme: jar + patterns: + - "**" +``` + +**Scheme semantics:** + +| Configuration | Behavior | +|--------------|----------| +| `enabled: false` | Deny all URIs for this scheme | +| `enabled: true` + patterns present | Match against patterns (last match wins) | +| `enabled: true` + no patterns | Use `default-scheme-policy` (typically deny) | + +**Important change:** `enabled: true` with no patterns uses `default-scheme-policy` (default: deny), NOT "allow all." This prevents a common misconfiguration where an empty scheme section silently allows everything. + +**Port handling for host-based schemes (http, https):** + +Ports are stripped before pattern matching. Default ports (80 for http, 443 for https) are always stripped. Non-default ports are also stripped so that patterns match against `host/path` only. Port restrictions can be added as a future enhancement if needed. + +Example: `https://localhost:8443/api` → match target is `localhost/api`. 
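The port handling and last-match-wins rules above can be sketched as follows. This is an illustration under stated assumptions, not the proposed `GlobMatcher` or `SchemePatternSet`: the class `SchemeMatchSketch` and its methods are hypothetical, an unmatched target is denied (mirroring a `default-scheme-policy` of deny), and the glob-to-regex translation is deliberately naive and not hardened against ReDoS.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

/** Illustrative only; names are hypothetical. */
final class SchemeMatchSketch {

  private SchemeMatchSketch() {}

  /** Builds the host/path match target; the port is never included. */
  static String matchTarget(URI uri) {
    String host = uri.getHost() == null ? "" : uri.getHost().toLowerCase(Locale.ROOT);
    String path = uri.getPath() == null ? "" : uri.getPath();
    return host + path;
  }

  /** Naive glob translation: ** crosses '/', * and ? do not. */
  static Pattern compile(String glob) {
    boolean subtree = glob.endsWith("/**");
    String body = subtree ? glob.substring(0, glob.length() - 3) : glob;
    StringBuilder regex = new StringBuilder();
    for (int i = 0; i < body.length(); i++) {
      char c = body.charAt(i);
      if (c == '*' && i + 1 < body.length() && body.charAt(i + 1) == '*') {
        regex.append(".*");
        i++; // consume the second '*'
      } else if (c == '*') {
        regex.append("[^/]*");
      } else if (c == '?') {
        regex.append("[^/]");
      } else {
        regex.append(Pattern.quote(String.valueOf(c)));
      }
    }
    if (subtree) {
      regex.append("(?:/.*)?"); // directory equivalence: also match the directory itself
    }
    return Pattern.compile(regex.toString());
  }

  /** Evaluates patterns in order; the last matching pattern decides. */
  static boolean isAllowed(List<String> patterns, String target) {
    boolean allowed = false; // assumption: deny when nothing matches
    for (String pattern : patterns) {
      boolean deny = pattern.startsWith("!");
      String glob = deny ? pattern.substring(1) : pattern;
      if (compile(glob).matcher(target).matches()) {
        allowed = !deny;
      }
    }
    return allowed;
  }
}
```

With the patterns `"**"` followed by `"!*.internal/**"`, a target such as `build.internal/x` matches both, and the later deny pattern decides the outcome.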
+ +**Scheme name validation:** + +Scheme names are validated against a known set at config load time: `http`, `https`, `file`, `jar`, `ftp`, `data`. Unrecognized scheme names generate a WARNING log. This catches typos like `htps` that would silently create dead config entries. + +**Relative URI handling:** + +URIs without a scheme must be resolved to absolute URIs before policy checking. If resolution is not possible, the URI is denied. + +### 6. File System Protections (Defense-in-Depth) + +The `file` scheme ships with a **default allow-list of safe path patterns**. File protections are checked **before** user-defined scheme patterns — a path must be allowed by file protections before scheme patterns are evaluated. + +**Model:** Allow-list. Only paths matching an allow pattern are permitted. Everything else is denied. + +**Default allow patterns (shipped with the library):** + +All platforms: +- `/**` — current working directory subtree +- `/**` — user's home directory subtree +- `!/.*/**` — except ALL dot-directories in home (blanket exclusion) +- `!**/Library/Keychains/**` — except macOS keychains +- `!**/Library/Application Support/com.apple.TCC/**` — except macOS privacy DB +- `!**/AppData/**` — except Windows AppData + +Notes: +- `` and `` are resolved to absolute paths at policy creation time +- The blanket `!/.*/**` pattern excludes all dot-directories: `.ssh`, `.aws`, `.gnupg`, `.kube`, `.docker`, `.azure`, `.netrc`, `.config`, `.local`, `.bash_history`, `.password-store`, `.vault-token`, etc. 
This is more secure than enumerating individual directories +- If CWD is `/` (root) or `C:\`, a WARNING is logged: "CWD is the filesystem root; FileProtections allow the entire filesystem" + +**What the defaults block (by omission):** + +| Blocked Path | Reason | +|-------------|--------| +| `/etc/**`, `/proc/**`, `/sys/**`, `/dev/**` | System directories (outside CWD/home) | +| `/root/**` | Root home (unless CWD is there) | +| `C:/Windows/**` | Windows system directory | +| `~/.ssh/**`, `~/.aws/**`, `~/.gnupg/**` | Sensitive dot-directories (blanket exclusion) | +| `~/.kube/**`, `~/.docker/**`, `~/.azure/**` | Cloud/container credentials | +| `~/.config/gcloud/**`, `~/.config/gh/**` | Service credentials | +| `~/.netrc`, `~/.npmrc`, `~/.pypirc` | Network tokens | +| `~/.*_history` | Shell history | + +**Case sensitivity and symlink modes:** + +FileProtections respect the policy-level `CaseSensitivity` and `SymlinkPolicy` settings. On Windows with `SYSTEM_DEFAULT`, paths are compared case-insensitively. + +**Conflict detection at build time:** + +When the builder constructs a policy, it checks for conflicts between scheme patterns and FileProtections. If a file scheme allow pattern (e.g., `/opt/data/**`) would be blocked by FileProtections (because `/opt/data/` is outside CWD and home), the builder throws `IllegalStateException`: + +```text +Conflict: file scheme pattern '/opt/data/**' will never match because FileProtections +does not allow '/opt/data/'. 
Add it to FileProtections via: + .fileProtections(FileProtections.builder().includeDefaults().allow("/opt/data/**").build()) +``` + +**API:** + +```java +// Default: CWD + home minus dot-dirs +ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file") + .allow("/workspace/**") + .build(); + +// Inspect defaults +List defaults = FileProtections.defaultAllowPatterns(); + +// Customize: extend defaults +ResourceAccessPolicy custom = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .fileProtections(FileProtections.builder() + .includeDefaults() + .allow("/opt/metaschema/**") + .build()) + .forScheme("file") + .allow("/workspace/**") + .build(); + +// Customize: narrow defaults +ResourceAccessPolicy tighter = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .fileProtections(FileProtections.builder() + .includeDefaults() + .remove("/**") // remove home dir access + .build()) + .forScheme("file") + .allow("/workspace/**") + .build(); + +// Fully custom (no defaults) +ResourceAccessPolicy fullyCustom = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .fileProtections(FileProtections.builder() + .allow("/opt/app/**") + .build()) + .forScheme("file") + .allow("/opt/app/schemas/**") + .build(); + +// Disable file protections (NOT RECOMMENDED — security warning logged) +ResourceAccessPolicy noProtections = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .fileProtections(FileProtections.disabled()) + .forScheme("file") + .allow("/workspace/**") + .build(); +``` + +`FileProtections.disabled()` (renamed from `none()`) disables all file system protections. When called: +- Logs a `WARN`: "FileProtections disabled — file scheme relies solely on scheme patterns for security" +- The method's Javadoc includes a security warning about the implications + +**Evaluation order for `file:` URIs:** + +```text +1. Apply URI security processing (normalize, decode, resolve symlinks, case-fold) +2. 
Check FileProtections allow-list (is path in a safe area?) +3. If denied by FileProtections → apply mode behavior (log/block), stop +4. If allowed by FileProtections → check scheme patterns (last match wins) +5. If no scheme pattern matches → use default-scheme-policy +``` + +### 7. JAR Scheme Recursive Checking + +The `jar:` URI format is `jar:<inner-uri>!/<internal-path>`. Both components must be checked: + +1. **Inner URI** (JAR location): Parsed and recursively checked against the policy using the inner URI's scheme. This prevents SSRF through `jar:http://evil.com/mal.jar!/schema.xsd`. +2. **Internal path** (after `!`): Checked against `jar` scheme patterns. + +If a `jar:` URI has no `!` separator, it is treated as a malformed JAR URI and denied. + +### 8. Default Bundled Policy + +The library ships with a **restrictive default policy in AUDIT mode**: + +```yaml +resource-access-policy: + mode: audit + default-scheme-policy: deny + schemes: + - scheme: https + patterns: + - "**" + - scheme: http + enabled: false + - scheme: file + patterns: + - "**" + - scheme: jar + patterns: + - "**" +``` + +**Note:** Private IP blocking and cloud metadata protection are handled by the `NetworkSecurityChecker` (Design Decision 4), not by glob patterns. The glob patterns in the default policy focus on scheme-level allow/deny and path restrictions. This separation ensures that IP encoding bypasses cannot circumvent network security. + +**When loaded via `ResourceAccessPolicy.bundledDefaults()`:** +- Mode is AUDIT (log violations, allow all requests) +- `NetworkSecurityChecker` is enabled with all default CIDR blocks +- `FileProtections` are enabled with default allow patterns +- HTTP scheme is disabled entirely +- HTTPS, file, and jar schemes allow all paths (network security and FileProtections provide the restrictions) + +### 9. 
Library API (Policy on Loader) + +Policy is set on the loader via a new method: + +```java +IModuleLoader loader = ...; + +// Recommended: use bundled defaults (AUDIT mode) +loader.setResourceAccessPolicy(ResourceAccessPolicy.bundledDefaults()); + +// Override mode +loader.setResourceAccessPolicy( + ResourceAccessPolicy.bundledDefaults().withMode(PolicyMode.ENFORCE)); + +// Development mode (allows localhost, HTTP) +loader.setResourceAccessPolicy(ResourceAccessPolicy.development()); + +// Modify an existing policy +loader.setResourceAccessPolicy( + ResourceAccessPolicy.bundledDefaults() + .toBuilder() + .forScheme("https") + .allow("my-internal-host.com/**") + .build()); + +// Custom policy +ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https") + .allow("pages.nist.gov/**") + .allow("raw.githubusercontent.com/metaschema-framework/**") + .forScheme("file") + .allow("/workspace/schemas/**") + .forScheme("jar") + .allowAll() + .denyUnlistedSchemes() + .build(); +loader.setResourceAccessPolicy(policy); +``` + +**Key API points:** +- `IUriResolver` interface remains unchanged for backwards compatibility +- `setResourceAccessPolicy(null)` disables policy checking (equivalent to DISABLED) +- `ResourceAccessPolicy` is **immutable** (final class, all-final fields). `withMode()` and `toBuilder()` create new instances. +- Loaders should use `volatile` or `AtomicReference` for the policy field for thread safety + +**`ResourceAccessPolicy.development()` factory:** + +Returns a permissive policy for local development: + +```java +// Equivalent to: +ResourceAccessPolicy.builder() + .mode(PolicyMode.AUDIT) + .forScheme("https").allowAll() + .forScheme("http").allow("localhost/**") + .forScheme("file").allowAll() + .forScheme("jar").allowAll() + .denyUnlistedSchemes() + .networkSecurity(NetworkSecurityConfig.builder() + .allowLoopback(true) // allow localhost for dev + .build()) + .build(); +``` + +### 10. 
Diagnostic API & Error Messages + +#### Diagnostic API + +Users need a way to test policies without trial-and-error. The `explain()` method returns a structured `PolicyDecision` without throwing: + +```java +PolicyDecision decision = policy.explain(URI.create("https://10.0.0.1/api")); +decision.isAllowed(); // false +decision.getLayer(); // "network-security" +decision.getDenialReason(); // "IP 10.0.0.1 is in private range 10.0.0.0/8" +decision.getRemediation(); // "Add to NetworkSecurityConfig: .allowCidr(\"10.0.0.1/32\")" +decision.getEvaluationTrace(); // ordered list of evaluation steps + +// Human-readable summary of all rules +String summary = policy.describeEffectiveRules(); +``` + +**`PolicyDecision` fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `allowed` | `boolean` | Whether the URI would be allowed | +| `layer` | `String` | Which layer denied/allowed: "disabled", "file-protections", "network-security", "scheme-patterns", "default-scheme-policy" | +| `denialReason` | `String` | Human-readable reason for denial (null if allowed) | +| `matchingPattern` | `String` | The specific pattern that matched (null if N/A) | +| `configSource` | `String` | Where the matching rule came from (e.g., "bundled defaults", file path) | +| `remediation` | `String` | What to add to allow this URI | +| `evaluationTrace` | `List` | Ordered list of all evaluation steps | + +#### Error Messages + +Error messages must be actionable. Format for `AccessViolationException`: + +```text +Resource access policy violation: 'file:///etc/passwd' was denied. + Normalized URI: /etc/passwd + Denied by: file-protections (path not in allowed areas: , ) + Source: bundled defaults + To allow: FileProtections.builder().includeDefaults().allow("/etc/passwd").build() + Or run with --resource-policy-mode=audit to log without blocking. 
+``` + +Format for AUDIT mode log messages: + +```text +WARN [resource-access-policy] URI 'https://10.0.0.5/api/schema.json' would be denied + in ENFORCE mode. Denied by: network-security (IP 10.0.0.5 in private range 10.0.0.0/8). + To allow: add to NetworkSecurityConfig: .allowCidr("10.0.0.5/32") +``` + +**Logging conventions:** +- Logger name: `dev.metaschema.core.model.policy` +- All audit messages prefixed with `[resource-access-policy]` for grep-ability +- AUDIT violations: WARN level +- ENFORCE violations: ERROR level (before throwing) +- Allowed requests: DEBUG level (optional) +- Policy initialization: INFO level (which config files loaded) + +### 11. Builder API Design + +The builder uses nested builders for type-safe state transitions: + +```text +ResourceAccessPolicy.builder() → ResourceAccessPolicyBuilder + .mode(PolicyMode) → self + .symlinkPolicy(SymlinkPolicy) → self + .caseSensitivity(CaseSensitivity) → self + .fileProtections(FileProtections) → self + .networkSecurity(NetworkSecurityConfig) → self + .forScheme("https") → SchemeConfigBuilder + .allow("pattern") → self + .deny("!pattern") → self + .allowAll() → self + .denyAll() → self + .forScheme("file") → new SchemeConfigBuilder (finalizes previous) + .denyUnlistedSchemes() → ResourceAccessPolicyBuilder (finalizes scheme) + .build() → ResourceAccessPolicy + .denyUnlistedSchemes() → self (renamed from defaultDeny()) + .build() → ResourceAccessPolicy +``` + +**Key behaviors:** +- Calling `.forScheme()` twice for the same scheme **appends** patterns (does not replace) +- Calling `.allow()` or `.deny()` without a preceding `.forScheme()` throws `IllegalStateException` +- `.build()` validates the policy and runs conflict detection +- The built `ResourceAccessPolicy` is **immutable** — all internal collections are unmodifiable copies + +**`toBuilder()` method:** + +Creates a new builder pre-populated from an existing policy: + +```java +ResourceAccessPolicy modified = existingPolicy.toBuilder() + 
.forScheme("https") + .allow("additional-host.com/**") + .build(); +``` + +### 12. Configuration Layering & Ratcheting + +Configurations loaded from multiple locations, merged with precedence: + +| Priority | Location | Platform | Purpose | +|----------|----------|----------|---------| +| 1 (lowest) | Bundled in JAR | All | Restrictive defaults in audit mode | +| 2 | `/config/` | All | Distribution-specific overrides | +| 3 | `/etc/metaschema/` | Unix | System-wide administrator settings | +| 3 | `%ProgramData%\metaschema\` | Windows | System-wide administrator settings | +| 4 | `~/.metaschema/` | All | User-specific preferences | +| 5 | `./.metaschema/` | All | Project-specific overrides | +| 6 (highest) | CLI `--resource-policy` | All | CLI argument override | + +#### Ratchet Principle (Security) + +Higher-precedence configs **can only tighten policy, never loosen it:** + +- **Mode ratcheting:** Restriction order: DISABLED < AUDIT < ENFORCE. A higher-precedence layer's mode must be >= the lower-precedence layer's mode. If a project-local config sets `mode: disabled` but the system config sets `mode: enforce`, the effective mode is `enforce`. A WARNING is logged when a layer attempts to weaken the mode. +- **`locked` flag:** Any config layer can mark settings as `locked: true`, preventing higher-precedence layers from changing them. Example: system admin sets `mode: enforce, locked: true` — project-level configs cannot change the mode. 
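The mode-ratcheting rule above can be sketched as a fold over the configuration layers, ordered from lowest to highest precedence. The class `ModeRatchetSketch` and its nested `Mode` enum are hypothetical stand-ins for `PolicyMode`. The `locked` flag is left out, and a layer that sets no mode is represented as `null`.

```java
import java.util.List;

/** Illustrative only; Mode mirrors the DISABLED < AUDIT < ENFORCE ordering. */
final class ModeRatchetSketch {

  /** Declared from least to most restrictive, so compareTo reflects restriction. */
  enum Mode { DISABLED, AUDIT, ENFORCE }

  private ModeRatchetSketch() {}

  /** Returns the most restrictive mode requested by any layer. */
  static Mode effectiveMode(List<Mode> layersLowestPrecedenceFirst) {
    Mode effective = Mode.DISABLED; // zero-config default
    for (Mode requested : layersLowestPrecedenceFirst) {
      if (requested == null) {
        continue; // this layer does not set a mode
      }
      if (requested.compareTo(effective) > 0) {
        effective = requested; // tightening is always accepted
      }
      // Otherwise the layer tried to weaken (or repeat) the mode; a real
      // implementation would log a WARNING on an attempted weakening.
    }
    return effective;
  }
}
```

For instance, a bundled `AUDIT` layer, a system `ENFORCE` layer, and a project-local `DISABLED` layer yield an effective mode of `ENFORCE`.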
+ +```yaml +# System-level config (/etc/metaschema/resource-access-policy.yaml) +resource-access-policy: + mode: enforce + locked: true # cannot be weakened by project-level configs +``` + +#### Merge Semantics + +| Setting | Merge Behavior | +|---------|----------------| +| `mode` | Ratchet: most restrictive wins | +| `default-scheme-policy` | Higher-precedence wins (subject to ratchet) | +| Scheme configs (default) | Higher-precedence **replaces** entire scheme config | +| Scheme configs (`inherit: true`) | Higher-precedence **appends** patterns to lower-precedence | + +**Additive merge via `inherit`:** + +```yaml +# Project-level: add patterns to bundled defaults instead of replacing +resource-access-policy: + schemes: + - scheme: https + inherit: true # append to lower-layer patterns + patterns: + - "my-internal-host.com/**" # additional allow +``` + +Without `inherit: true`, the project-level `https` section would replace the bundled defaults entirely, losing any deny patterns for private IPs (though those are now handled by `NetworkSecurityChecker`, this still matters for scheme-level patterns). + +#### YAML Configuration Footgun + +YAML's `!` character is the tag prefix. Unquoted deny patterns will cause silent parsing failures: + +```yaml +# WRONG — YAML interprets ! as a tag +patterns: + - !**/.ssh/** + +# CORRECT — must be quoted +patterns: + - "!**/.ssh/**" +``` + +Config loading should validate that pattern strings do not contain YAML artifacts and log a clear error if parsing produces unexpected types. + +#### Pattern Complexity Limits + +To prevent ReDoS attacks via crafted glob patterns in user-controlled config files: +- Maximum patterns per scheme: 100 +- Maximum pattern length: 500 characters +- Glob-to-regex compilation uses possessive quantifiers or atomic groups to prevent catastrophic backtracking + +### 13. 
All in Core Module + +The entire policy engine lives in `core`: +- Policy model, pattern matching, enforcement +- URI normalization and security processing +- Network security checker +- Metaschema configuration model and loading +- Default bundled policy + +CLI-specific concerns (CLI flags, environment variables) are handled in `cli-processor`/`metaschema-cli` but delegate to the core API. + +### 14. Integration Points + +All resolution paths check the policy: + +| Component | Current Behavior | Change Required | +|-----------|-----------------|-----------------| +| `DefaultBoundLoader` | Uses `IUriResolver` | Add policy check before resolution | +| `AbstractModuleLoader` | Raw `URI.resolve()` for imports | Add policy check; re-check after redirect | +| `BindingConstraintLoader` | Raw `URI.resolve()` for imports | Add policy check | +| `DefaultXmlDeserializer` | Custom `XMLResolver` for entities | Route through policy; re-check after redirect | +| `DefaultJsonDeserializer` | Reads from provided `Reader` | None — uses loader with policy | +| `DefaultYamlDeserializer` | Reads from provided `Reader` | None — uses loader with policy | + +**Relative URI resolution:** Components that resolve relative URIs (e.g., `../schemas/foo.xml`) must resolve them to absolute URIs before calling `checkAccess()`. 
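As a sketch of the relative URI requirement above, the helper below resolves a reference against an optional base and reports failure when no absolute URI can be produced, so that the caller can deny the request. The class `RelativeUriSketch` and the method `toAbsolute` are hypothetical names; only `java.net.URI` behavior is relied upon.

```java
import java.net.URI;
import java.util.Optional;

/** Illustrative only; not part of the proposed API. */
final class RelativeUriSketch {

  private RelativeUriSketch() {}

  /**
   * Resolves reference against base (which may be null) and returns the
   * result only if it is absolute, i.e. has a scheme. An empty result means
   * the caller should deny access instead of calling checkAccess().
   */
  static Optional<URI> toAbsolute(URI base, String reference) {
    URI ref = URI.create(reference);
    URI resolved = (base == null) ? ref : base.resolve(ref);
    resolved = resolved.normalize(); // collapse "." and ".." segments where possible
    return resolved.isAbsolute() ? Optional.of(resolved) : Optional.empty();
  }
}
```

A loader importing `../schemas/foo.xml` from `file:///workspace/modules/main.xml` would pass the resolved URI, whose path is `/workspace/schemas/foo.xml`, to the policy check. With no base available the result is empty.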
+ +--- + +## Architecture + +### Component Diagram + +```text +┌──────────────────────────────────────────────────────────────────────┐ +│ ResourceAccessPolicy │ +├──────────────────────────────────────────────────────────────────────┤ +│ ┌─────────────┐ ┌────────────────────┐ ┌──────────────────────┐ │ +│ │ PolicyMode │ │ UriNormalizer │ │ NetworkSecurityChkr │ │ +│ │ ─────────── │ │ ────────────── │ │ ──────────────────── │ │ +│ │ DISABLED │ │ percent-decode │ │ CIDR block matching │ │ +│ │ AUDIT │ │ path normalize │ │ IP resolution │ │ +│ │ ENFORCE │ │ symlink resolve │ │ loopback check │ │ +│ │ │ │ case folding │ │ site-local check │ │ +│ └─────────────┘ └────────────────────┘ │ link-local check │ │ +│ └──────────────────────┘ │ +│ ┌──────────────────────────┐ ┌──────────────────────────────────┐ │ +│ │ FileProtections │ │ SchemePatterns │ │ +│ │ ────────────────── │ │ ──────────────────────────── │ │ +│ │ allow: /** │ │ file: │ │ +│ │ allow: /** │ │ allow: /workspace/** │ │ +│ │ deny: /.*/** │ │ https: │ │ +│ │ (checked FIRST) │ │ allow: pages.nist.gov/** │ │ +│ │ │ │ http: │ │ +│ │ │ │ (disabled) │ │ +│ │ │ │ jar: │ │ +│ │ │ │ allow: ** │ │ +│ └──────────────────────────┘ └──────────────────────────────────┘ │ +│ ┌────────────────────────────────────────────────────────────────┐ │ +│ │ Audit Logger (SLF4J) │ │ +│ │ Logger: dev.metaschema.core.model.policy │ │ +│ │ Prefix: [resource-access-policy] │ │ +│ │ AUDIT: WARN for violations, allow request │ │ +│ │ ENFORCE: ERROR for violations, throw AccessViolationException│ │ +│ │ All: DEBUG for allowed requests, INFO for policy init │ │ +│ └────────────────────────────────────────────────────────────────┘ │ +└──────────────────────────────────────────────────────────────────────┘ +``` + +### Class Hierarchy + +```text +dev.metaschema.core.model.policy/ +├── PolicyMode.java # Enum: DISABLED, AUDIT, ENFORCE +├── SymlinkPolicy.java # Enum: FOLLOW, NOFOLLOW +├── CaseSensitivity.java # Enum: SYSTEM_DEFAULT, 
CASE_SENSITIVE, CASE_INSENSITIVE +├── AccessViolationException.java # Exception for ENFORCE mode +├── IResourceAccessPolicy.java # Interface for policy checking +├── ResourceAccessPolicy.java # Main policy implementation (immutable) +├── ResourceAccessPolicyBuilder.java # Fluent builder with nested SchemeConfigBuilder +├── SchemePatternSet.java # Glob patterns for one scheme +├── GlobMatcher.java # Glob pattern → regex compilation +├── FileProtections.java # Adjustable file system allow-list +├── UriNormalizer.java # URI security processing pipeline +├── NetworkSecurityChecker.java # IP-based SSRF protection +├── NetworkSecurityConfig.java # Configuration for network security +├── PolicyDecision.java # Diagnostic result from explain() +├── EvaluationStep.java # Single step in evaluation trace +└── package-info.java +``` + +### Integration Flow + +```text +Loader receives URI + │ + ▼ +┌──────────────────────────────────────┐ +│ 1. If DISABLED → return immediately │ +│ │ +│ 2. UriNormalizer.normalize(uri) │ +│ ├─ percent-decode │ +│ ├─ lowercase scheme │ +│ ├─ file: normalize path + symlinks│ +│ └─ http/https: lowercase host │ +│ │ +│ 3. If http/https: │ +│ NetworkSecurityChecker.check() │ +│ ├─ resolve hostname → InetAddress│ +│ ├─ check against CIDR blocks │ +│ └─ if private/reserved → deny │ +│ │ +│ 4. If file: │ +│ FileProtections.isAllowed(path) │ +│ └─ if not in safe area → deny │ +│ │ +│ 5. If jar: │ +│ ├─ parse inner URI │ +│ ├─ recursively check inner URI │ +│ └─ check internal path patterns │ +│ │ +│ 6. SchemePatternSet.isAllowed() │ +│ └─ last match wins │ +│ │ +│ 7. If no match → default-scheme- │ +│ policy │ +│ │ +│ 8. 
Apply mode behavior │ +│ ├─ AUDIT: log WARN, allow │ +│ └─ ENFORCE: log ERROR, throw │ +└──────────────────────────────────────┘ +``` + +--- + +## API Design + +### Programmatic Configuration (Fluent API) + +```java +// Restrictive server mode +ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .symlinkPolicy(SymlinkPolicy.FOLLOW) + .caseSensitivity(CaseSensitivity.SYSTEM_DEFAULT) + .forScheme("https") + .allow("pages.nist.gov/**") + .allow("raw.githubusercontent.com/metaschema-framework/**") + .deny("*.internal/**") + .forScheme("http") + .denyAll() + .forScheme("file") + .allow("/data/schemas/**") + .forScheme("jar") + .allowAll() + .denyUnlistedSchemes() + .build(); + +// Development mode (one-liner) +ResourceAccessPolicy devPolicy = ResourceAccessPolicy.development(); + +// Bundled defaults (one-liner) +ResourceAccessPolicy defaults = ResourceAccessPolicy.bundledDefaults(); + +// Modify existing policy +ResourceAccessPolicy modified = defaults.toBuilder() + .mode(PolicyMode.ENFORCE) + .forScheme("https") + .allow("my-internal-host.com/**") + .build(); + +// Override just the mode +ResourceAccessPolicy enforced = defaults.withMode(PolicyMode.ENFORCE); + +// Diagnostic check (does NOT throw) +PolicyDecision decision = policy.explain(URI.create("https://10.0.0.1/api")); + +// Effective rules summary +String rules = policy.describeEffectiveRules(); +``` + +### Metaschema-Based Configuration Model + +The policy configuration uses a Metaschema-defined model, enabling: +- **Type-safe configuration** via generated Java classes +- **Multi-format support** — XML, JSON, or YAML +- **Schema validation** — configs validated against the Metaschema model + +**Metaschema Module Definition** (`resource-access-policy-config_metaschema.yaml`): + +Note: The root assembly is named `resource-access-policy-config` (not `resource-access-policy`) to avoid naming collision with the hand-written `ResourceAccessPolicy` class. 
+ +```yaml +metaschema: + schema-name: Resource Access Policy Configuration + schema-version: 1.0.0 + short-name: resource-access-policy-config + namespace: http://csrc.nist.gov/ns/metaschema/resource-access-policy/1.0 + json-base-uri: http://csrc.nist.gov/ns/metaschema/resource-access-policy/1.0 + + definitions: + - define-assembly: + name: resource-access-policy-config + formal-name: Resource Access Policy Configuration + description: >- + Configuration controlling which URIs can be accessed during resource + loading. Uses glob patterns grouped by URI scheme with IP-based + network security. + root-name: resource-access-policy + flags: + - define-flag: + name: mode + as-type: token + formal-name: Enforcement Mode + description: >- + How policy violations are handled. + constraint: + allowed-values: + - enum: + value: disabled + description: No policy checking + - enum: + value: audit + description: Log violations but allow requests + - enum: + value: enforce + description: Block violating requests + - define-flag: + name: default-scheme-policy + as-type: token + formal-name: Default Scheme Policy + description: >- + Policy for schemes not explicitly configured. + constraint: + allowed-values: + - enum: + value: allow + description: Allow unlisted schemes + - enum: + value: deny + description: Deny unlisted schemes + - define-flag: + name: locked + as-type: boolean + formal-name: Locked + description: >- + When true, higher-precedence configuration layers cannot + weaken this policy (ratchet enforcement). + model: + - assembly: + ref: scheme-config + max-occurs: unbounded + group-as: + name: schemes + in-json: ARRAY + + - define-assembly: + name: scheme-config + formal-name: Scheme Configuration + description: >- + Configuration for a specific URI scheme, containing glob patterns + that control access. + flags: + - define-flag: + name: scheme + as-type: token + required: yes + formal-name: URI Scheme + description: The URI scheme (e.g., https, http, file, jar). 
+ - define-flag: + name: enabled + as-type: boolean + formal-name: Enabled + description: >- + Whether this scheme is enabled. When false, all URIs with this + scheme are denied. + - define-flag: + name: inherit + as-type: boolean + formal-name: Inherit + description: >- + When true, patterns are appended to lower-precedence layer + patterns instead of replacing them. + model: + - field: + ref: pattern + max-occurs: unbounded + group-as: + name: patterns + in-json: ARRAY + + - define-field: + name: pattern + as-type: string + formal-name: Access Pattern + description: >- + A glob pattern controlling access. Patterns without a prefix are + allow patterns. Patterns starting with ! are deny patterns + (exceptions). Patterns are evaluated in order; last match wins. + IMPORTANT: In YAML, patterns starting with ! must be quoted. +``` + +### Example Configuration Files + +**YAML format** (`resource-access-policy.yaml`): + +```yaml +# Patterns starting with ! MUST be quoted in YAML +resource-access-policy: + mode: audit + default-scheme-policy: deny + schemes: + - scheme: https + patterns: + - "pages.nist.gov/**" + - "raw.githubusercontent.com/metaschema-framework/**" + - "!*.internal/**" # quoted! YAML ! 
is a tag prefix + - scheme: http + enabled: false + - scheme: file + patterns: + - "/data/schemas/**" + - "/workspace/**" + - scheme: jar + patterns: + - "**" +``` + +### Loading Configuration + +```java +// Using databind to load configuration +IBindingContext bindingContext = IBindingContext.instance(); +IBoundLoader loader = bindingContext.newBoundLoader(); + +// From file (auto-detects format) +ResourceAccessPolicyConfig config = loader.load( + ResourceAccessPolicyConfig.class, + Path.of("resource-access-policy.yaml")); + +// Create policy from config +ResourceAccessPolicy policy = ResourceAccessPolicy.fromConfiguration(config); + +// Set on module loader +IModuleLoader moduleLoader = ...; +moduleLoader.setResourceAccessPolicy(policy); +``` + +--- + +## Glob Pattern Matching + +### Syntax + +| Pattern | Matches | Example | +|---------|---------|---------| +| `**` | Everything (any characters including `/`) | All URIs for the scheme | +| `*` | Any characters except `/` (single segment) | One directory level | +| `?` | Single character except `/` | One character | +| `*.nist.gov/**` | Subdomain wildcard | `pages.nist.gov/schemas/foo.xml` | +| `/workspace/**` | Directory tree | `/workspace/project/schema.xml` | +| `/workspace/*` | Single level | `/workspace/schema.xml` (not deeper) | +| `!pattern` | **Deny** (block previously allowed) | Negates a previous allow | + +### Pattern Evaluation + +For a given URI: +1. Apply URI security processing (normalize, decode, resolve symlinks) +2. Extract the scheme (lowercased) +3. Find the matching `SchemePatternSet` +4. If scheme is `enabled: false`, result is **deny** +5. If no patterns defined and `enabled: true`, use `default-scheme-policy` +6. Evaluate patterns in order; **last matching pattern wins** +7. 
If no pattern matches, use `default-scheme-policy` (default: deny) + +### What Patterns Match Against + +After URI normalization, patterns match against the scheme-specific part: + +| Scheme | Pattern matches against | Example URI → match target | +|--------|------------------------|---------------------------| +| `file` | Normalized path | `file:///workspace/foo.xml` → `/workspace/foo.xml` | +| `https` | Host + path (no port) | `https://nist.gov:443/x.xml` → `nist.gov/x.xml` | +| `http` | Host + path (no port) | `http://localhost:8080/api` → `localhost/api` | +| `jar` | Path within JAR | `jar:file:///lib.jar!/schema/x.xsd` → `/schema/x.xsd` | + +### Important: `!` Means DENY + +Unlike `.gitignore` where `!` means "re-include" (stop ignoring a file), in this system `!` means **deny access**. This is the opposite semantic: + +| System | `!` Meaning | Example | +|--------|------------|---------| +| `.gitignore` | "Do NOT ignore this file" (re-include) | `!important.log` keeps the file tracked | +| Resource Access Policy | "DENY access to this resource" (block) | `"!**/.ssh/**"` blocks SSH key access | + +--- + +## Success Criteria + +From Issue #183: +- [ ] All website and readme documentation affected by the changes have been updated +- [ ] A Pull Request is submitted that fully addresses the goals +- [ ] The CI-CD build process runs without any reported errors + +### Functional + +- [ ] Module loading checks policy for imports +- [ ] Document loading checks policy +- [ ] Constraint loading checks policy for imports +- [ ] XML entity resolution checks policy +- [ ] Glob pattern matching works correctly for all schemes +- [ ] `!` deny patterns create proper exceptions +- [ ] DISABLED mode allows all URIs without logging +- [ ] AUDIT mode logs violations but allows all URIs +- [ ] ENFORCE mode blocks violations with `AccessViolationException` +- [ ] Bundled defaults are restrictive +- [ ] Configuration loading works from YAML, JSON, and XML +- [ ] Configuration 
layering merges correctly with ratcheting +- [ ] `development()` factory allows localhost and HTTP + +### Security + +- [ ] Path traversal attacks caught via mandatory normalization +- [ ] URL encoding bypasses caught via mandatory percent-decoding +- [ ] Symlinks resolved before policy check (default mode) +- [ ] SSRF to localhost caught via IP-based checking (all encodings) +- [ ] SSRF to private IP ranges caught via CIDR block matching +- [ ] Cloud metadata endpoints caught (169.254.169.254) +- [ ] IPv4-mapped IPv6 addresses caught +- [ ] JAR scheme inner URIs recursively checked +- [ ] HTTP redirect URIs re-checked against policy +- [ ] Config layering cannot weaken policy (ratchet) +- [ ] Sensitive system paths denied by FileProtections +- [ ] Case-insensitive matching works on Windows +- [ ] ReDoS prevented via non-backtracking regex + +### Backwards Compatibility + +- [ ] Zero-config default (DISABLED) does not break any existing workflows +- [ ] Existing code without policy configuration works unchanged +- [ ] Library users can opt-in without changing their URI resolvers +- [ ] CLI users can override mode via flags + +### Non-Functional + +- [ ] Actionable error messages with layer, source, and remediation +- [ ] `explain()` provides evaluation trace for debugging +- [ ] `describeEffectiveRules()` provides human-readable policy summary +- [ ] Clear log messages identifying policy violations +- [ ] Minimal performance overhead for URI resolution +- [ ] 80%+ test coverage for policy code + +--- + +## Testing Strategy + +### Unit Tests + +- `GlobMatcher` — Pattern matching for all glob syntax variants, case sensitivity +- `SchemePatternSet` — Pattern evaluation with `!` negation, ordering, empty patterns +- `ResourceAccessPolicy` — Policy checking across modes, factory methods, toBuilder +- `PolicyMode` — Mode behavior (disabled/audit/enforce) +- `UriNormalizer` — Path normalization, percent-decoding, symlink resolution +- `NetworkSecurityChecker` — IP-based 
CIDR block matching (see below) +- `FileProtections` — Allow-list, defaults, builder, conflict detection, case sensitivity +- `PolicyDecision` — Diagnostic results from explain() +- Configuration loading and validation + +### IP Range Boundary Tests + +Explicit boundary value tests for every private CIDR block, using the IP library: + +```java +@ParameterizedTest +@CsvSource({ + // 127.0.0.0/8 (loopback) + "126.255.255.255, true", // just below range — allowed + "127.0.0.0, false", // start of range — blocked + "127.0.0.1, false", // standard loopback — blocked + "127.255.255.255, false", // end of range — blocked + "128.0.0.0, true", // just above range — allowed + + // 10.0.0.0/8 (private Class A) + "9.255.255.255, true", + "10.0.0.0, false", + "10.255.255.255, false", + "11.0.0.0, true", + + // 172.16.0.0/12 (private Class B) + "172.15.255.255, true", + "172.16.0.0, false", + "172.31.255.255, false", + "172.32.0.0, true", + + // 192.168.0.0/16 (private Class C) + "192.167.255.255, true", + "192.168.0.0, false", + "192.168.255.255, false", + "192.169.0.0, true", + + // 169.254.0.0/16 (link-local, includes cloud metadata) + "169.253.255.255, true", + "169.254.0.0, false", + "169.254.169.254, false", // cloud metadata + "169.254.255.255, false", + "169.255.0.0, true", + + // 100.64.0.0/10 (CGNAT) + "100.63.255.255, true", + "100.64.0.0, false", + "100.127.255.255, false", + "100.128.0.0, true", + + // 0.0.0.0/8 (unspecified) + "0.0.0.0, false", + "0.255.255.255, false", + "1.0.0.0, true", +}) +void testIpv4CidrBoundaries(String ip, boolean allowed) { ... } + +@ParameterizedTest +@CsvSource({ + "::1, false", // IPv6 loopback + "::2, true", // not loopback + "fe80::1, false", // link-local + "fe7f::1, true", // not link-local + "fc00::1, false", // ULA + "fbff::1, true", // not ULA + "::ffff:127.0.0.1, false", // IPv4-mapped loopback + "::ffff:8.8.8.8, true", // IPv4-mapped public +}) +void testIpv6CidrBoundaries(String ip, boolean allowed) { ... 
} + +@ParameterizedTest +@CsvSource({ + "2130706433, false", // decimal 127.0.0.1 + "0x7f000001, false", // hex 127.0.0.1 + "0177.0.0.1, false", // octal 127.0.0.1 + "127.1, false", // shorthand 127.0.0.1 +}) +void testAlternateIpEncodings(String host, boolean allowed) { ... } +``` + +### Security Tests + +- Path traversal: `../../../etc/passwd`, `..%2f..%2f`, double-encoding +- URL encoding bypass: `%61` for `a`, `%2f` for `/` +- Symlink traversal: symlink from allowed to denied directory +- IP encoding bypass: decimal, octal, hex, shorthand, IPv4-mapped IPv6 +- DNS rebinding documentation (test that re-check API exists) +- HTTP redirect re-checking +- JAR inner URI SSRF: `jar:http://evil.com/mal.jar!/path` +- Case sensitivity: Windows paths, scheme names +- `!` pattern bypass attempts +- ReDoS resistance: patterns with deep nesting +- Config ratchet: lower-precedence config attempts to weaken + +### Integration Tests + +- Module loading with policy in each mode +- Document loading with policy +- Constraint loading with policy +- XML entity resolution with policy +- Configuration layering from multiple sources +- Ratchet enforcement across config layers + +--- + +## Risks and Mitigations + +| Risk | Impact | Mitigation | +|------|--------|------------| +| Breaking existing applications | High | DISABLED as zero-config default; explicit opt-in required | +| Performance overhead | Medium | Efficient glob matching; pattern compilation; IP address caching | +| Incomplete SSRF protection | High | IP-based checking via library, not just string patterns | +| Path traversal bypass | High | Mandatory normalization + symlink resolution before matching | +| Configuration complexity | Medium | Factory methods for common scenarios; diagnostic API | +| FileProtections confusion | Medium | Conflict detection at build time; actionable error messages | +| Platform-specific path issues | Medium | CaseSensitivity mode; test on Windows/Linux/Mac | +| ReDoS via crafted patterns | 
Medium | Non-backtracking regex; pattern complexity limits | +| Config privilege escalation | High | Ratchet principle; locked flag | + +--- + +## Migration Path + +### Phase 1: Opt-In (AUDIT mode) + +1. Add `loader.setResourceAccessPolicy(ResourceAccessPolicy.bundledDefaults())` to your code +2. Deploy — AUDIT mode logs violations but allows all requests +3. Monitor logs for `[resource-access-policy]` WARN entries +4. Use `policy.explain(uri)` to understand specific decisions +5. Adjust policy patterns to match actual access needs +6. Run in AUDIT mode until no new warnings appear for at least 2 weeks + +### Phase 2: Enforcement + +1. Switch to ENFORCE: `ResourceAccessPolicy.bundledDefaults().withMode(PolicyMode.ENFORCE)` +2. Or in config: `mode: enforce` +3. Monitor for `AccessViolationException` in error tracking +4. Use `--resource-policy-mode=audit` as emergency rollback + +### Phase 3: Customization + +1. Create project-specific `.metaschema/resource-access-policy.yaml` +2. Override bundled defaults for organization-specific needs +3. Use per-loader policies for fine-grained control +4. Deploy organization-wide policies via `/etc/metaschema/` + +--- + +## CLI Integration + +### Global Flags + +These flags apply to all commands that load resources (e.g., `validate`, `validate-content`, `convert`, `generate-schema`): + +| Flag | Description | +|------|-------------| +| `--resource-policy-mode=` | Override enforcement mode (disabled/audit/enforce) | +| `--resource-policy=` | Load custom policy configuration file | + +### `resource-policy` Command + +A new top-level command with subcommands for policy diagnostics. Follows the same parent/subcommand pattern as the existing `metapath` command (`AbstractParentCommand` with `AbstractTerminalCommand` subcommands). 
+ +| Subcommand | Description | +|------------|-------------| +| `resource-policy dump` | Print effective merged policy as YAML and exit | +| `resource-policy check ` | Check a single URI against the policy and print evaluation trace | + +**Usage examples:** + +```bash +# Dump the effective policy (bundled defaults + any config files) +metaschema-cli resource-policy dump + +# Dump with a custom config overlay +metaschema-cli resource-policy dump --resource-policy=my-policy.yaml + +# Check whether a specific URI would be allowed +metaschema-cli resource-policy check https://example.com/schema.xsd + +# Check with enforce mode override +metaschema-cli resource-policy check --resource-policy-mode=enforce file:///etc/passwd +``` + +### Environment Variables + +| Variable | Description | +|----------|-------------| +| `METASCHEMA_RESOURCE_POLICY_MODE` | Override enforcement mode | + +--- + +## Out of Scope + +- Authentication/authorization for HTTP resources (use existing HTTP client config) +- Rate limiting or request throttling +- Content inspection (only URI-based filtering) +- Certificate validation (use JVM truststore config) +- Real-time file watching for config changes (explicit reload only) +- DNS rebinding protection at the HTTP client level (documented as integration requirement) +- Port-based restrictions (may be added in a future version) diff --git a/PRDs/20251217-allowlist-resolver/implementation-plan.md b/PRDs/20251217-allowlist-resolver/implementation-plan.md new file mode 100644 index 0000000000..8c12628fff --- /dev/null +++ b/PRDs/20251217-allowlist-resolver/implementation-plan.md @@ -0,0 +1,1657 @@ +# Resource Access Policy - Implementation Plan + +**Goal:** Implement policy-based URI access control with glob patterns, IP-based SSRF protection, mandatory URI normalization, graduated enforcement modes, and bundled defaults. + +**Architecture:** All policy engine code in `core` module. CLI integration in `cli-processor`/`metaschema-cli`. 
+ +**Tech Stack:** Java 11, JUnit 5, SLF4J, IP address library (`com.github.seancfoley:ipaddress`), Metaschema databind for configuration model. + +--- + +## PR Breakdown + +| PR | Scope | Estimated Files | Key Deliverables | +|----|-------|-----------------|------------------| +| PR1 | Policy engine core | ~25 files | Enums, `GlobMatcher`, `UriNormalizer`, `NetworkSecurityChecker`, `SchemePatternSet`, `FileProtections`, `ResourceAccessPolicy`, builder, diagnostics | +| PR2 | Configuration model | ~10 files | Metaschema module, config loading, bundled defaults, layering with ratchet | +| PR3 | Loader integration | ~10 files | `IModuleLoader`/`IBoundLoader` integration, XML entity policy, redirect re-check | +| PR4 | CLI integration + docs | ~8 files | CLI flags, `resource-policy` command with `dump`/`check` subcommands, env vars, documentation | + +--- + +## PR1: Policy Engine Core + +**Goal:** Implement the glob-based pattern matching engine, URI normalization, IP-based network security, enforcement modes, diagnostics, and the `ResourceAccessPolicy` builder. + +**Module:** `core` + +**Package:** `dev.metaschema.core.model.policy` + +**New dependency:** Add `com.github.seancfoley:ipaddress` to `core/pom.xml` for CIDR block matching. 
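
As a rough illustration of what "CIDR block matching" means here, the sketch below checks whether an IP literal falls inside a block by comparing the leading prefix bits. It deliberately uses only the JDK rather than the `ipaddress` library the plan adds, and the class name `CidrSketch` and its `contains` method are hypothetical, not part of the planned API.

```java
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical illustration only; not a class from this plan. */
final class CidrSketch {

  private CidrSketch() {
    // static helper, no instances
  }

  /**
   * Returns true when the IP literal lies inside the CIDR block.
   * Both arguments are expected to be IP literals, which the JDK parses
   * without a DNS lookup. Prefix-length validation is omitted for brevity.
   */
  static boolean contains(String cidr, String ipLiteral) {
    int slash = cidr.indexOf('/');
    int prefixLength = Integer.parseInt(cidr.substring(slash + 1));
    byte[] network = parse(cidr.substring(0, slash));
    byte[] candidate = parse(ipLiteral);

    // An IPv4 block never contains an IPv6 address, and vice versa.
    if (network.length != candidate.length) {
      return false;
    }
    // Keep only the leading prefixLength bits of each address and compare.
    int hostBits = network.length * 8 - prefixLength;
    BigInteger networkPrefix = new BigInteger(1, network).shiftRight(hostBits);
    BigInteger candidatePrefix = new BigInteger(1, candidate).shiftRight(hostBits);
    return networkPrefix.equals(candidatePrefix);
  }

  private static byte[] parse(String ipLiteral) {
    try {
      return InetAddress.getByName(ipLiteral).getAddress();
    } catch (UnknownHostException ex) {
      throw new IllegalArgumentException("Not an IP literal: " + ipLiteral, ex);
    }
  }

  public static void main(String[] args) {
    // Cloud metadata address inside the link-local block
    System.out.println(contains("169.254.0.0/16", "169.254.169.254"));
    // First address above the 10.0.0.0/8 block
    System.out.println(contains("10.0.0.0/8", "11.0.0.0"));
  }
}
```

The sample values mirror the boundary rows in the PRD's IP range tests. A production checker would still need the hostname resolution and alternate-encoding handling described in Task 1.6, which this sketch does not attempt.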
+ +### Task 1.1: Create PolicyMode Enum + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/PolicyMode.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/PolicyModeTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.CsvSource; + +class PolicyModeTest { + + @Test + void testDefaultModeIsDisabled() { + assertEquals(PolicyMode.DISABLED, PolicyMode.defaultMode()); + } + + @ParameterizedTest + @CsvSource({ + "DISABLED, false, false", + "AUDIT, true, false", + "ENFORCE, true, true" + }) + void testModeCharacteristics(PolicyMode mode, boolean checks, boolean blocks) { + assertEquals(checks, mode.isCheckEnabled()); + assertEquals(blocks, mode.isBlockEnabled()); + } + + @ParameterizedTest + @CsvSource({ + "disabled, DISABLED", + "audit, AUDIT", + "enforce, ENFORCE", + "DISABLED, DISABLED", + "Audit, AUDIT" + }) + void testFromString(String input, PolicyMode expected) { + assertEquals(expected, PolicyMode.fromString(input)); + } + + @Test + void testRestrictionOrdering() { + assertTrue(PolicyMode.DISABLED.ordinal() < PolicyMode.AUDIT.ordinal()); + assertTrue(PolicyMode.AUDIT.ordinal() < PolicyMode.ENFORCE.ordinal()); + } + + @ParameterizedTest + @CsvSource({ + "DISABLED, AUDIT, AUDIT", + "AUDIT, ENFORCE, ENFORCE", + "ENFORCE, DISABLED, ENFORCE", + "AUDIT, DISABLED, AUDIT", + "ENFORCE, AUDIT, ENFORCE" + }) + void testMostRestrictive(PolicyMode a, PolicyMode b, PolicyMode expected) { + assertEquals(expected, PolicyMode.mostRestrictive(a, b)); + } +} +``` + +**Implementation:** + +```java +package dev.metaschema.core.model.policy; + +import java.util.Locale; + +import edu.umd.cs.findbugs.annotations.NonNull; + +/** + * Enforcement mode for resource access policies. + * + *

<p>Modes are ordered by restriction level: DISABLED &lt; AUDIT &lt; ENFORCE. + * The {@link #mostRestrictive(PolicyMode, PolicyMode)} method supports the + * ratchet principle where configuration layers can only tighten policy. + */ +public enum PolicyMode { + /** No policy checking; all URIs allowed silently. */ + DISABLED(false, false), + /** Check policy and log violations, but allow all requests. */ + AUDIT(true, false), + /** Check policy and block violations with an exception. */ + ENFORCE(true, true); + + private final boolean checkEnabled; + private final boolean blockEnabled; + + PolicyMode(boolean checkEnabled, boolean blockEnabled) { + this.checkEnabled = checkEnabled; + this.blockEnabled = blockEnabled; + } + + /** Whether this mode performs policy checks. */ + public boolean isCheckEnabled() { return checkEnabled; } + + /** Whether this mode blocks violating requests. */ + public boolean isBlockEnabled() { return blockEnabled; } + + /** Returns the default enforcement mode ({@link #DISABLED}). */ + @NonNull + public static PolicyMode defaultMode() { return DISABLED; } + + /** Parses a mode from a string value (case-insensitive). */ + @NonNull + public static PolicyMode fromString(@NonNull String value) { + return valueOf(value.toUpperCase(Locale.ROOT)); + } + + /** Returns the more restrictive of two modes (ratchet principle). */ + @NonNull + public static PolicyMode mostRestrictive(@NonNull PolicyMode a, @NonNull PolicyMode b) { + return a.ordinal() >= b.ordinal() ? 
a : b; + } +} +``` + +--- + +### Task 1.2: Create SymlinkPolicy and CaseSensitivity Enums + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/SymlinkPolicy.java` +- Create: `core/src/main/java/dev/metaschema/core/model/policy/CaseSensitivity.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/CaseSensitivityTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; + +class CaseSensitivityTest { + + @Test + void testSystemDefaultDetectsOs() { + boolean isWindows = System.getProperty("os.name") + .toLowerCase(java.util.Locale.ROOT).contains("win"); + CaseSensitivity systemDefault = CaseSensitivity.SYSTEM_DEFAULT; + + assertEquals(!isWindows, systemDefault.isCaseSensitive()); + } + + @Test + void testExplicitModes() { + assertTrue(CaseSensitivity.CASE_SENSITIVE.isCaseSensitive()); + assertFalse(CaseSensitivity.CASE_INSENSITIVE.isCaseSensitive()); + } +} +``` + +**Implementation:** + +```java +/** Policy for resolving symbolic links during file path checking. */ +public enum SymlinkPolicy { + /** Resolve symlinks via {@code Path.toRealPath()} before checking (default). */ + FOLLOW, + /** Check the path as-is without symlink resolution. */ + NOFOLLOW +} + +/** Case sensitivity mode for file path matching. */ +public enum CaseSensitivity { + /** Auto-detect from OS: case-insensitive on Windows, case-sensitive elsewhere. */ + SYSTEM_DEFAULT, + /** Always case-sensitive matching. */ + CASE_SENSITIVE, + /** Always case-insensitive matching. */ + CASE_INSENSITIVE; + + /** Whether this mode uses case-sensitive matching. 
*/ + public boolean isCaseSensitive() { + // Classic switch statement: switch expressions require Java 14+, but the target is Java 11 + switch (this) { + case CASE_SENSITIVE: + return true; + case CASE_INSENSITIVE: + return false; + case SYSTEM_DEFAULT: + default: + return !System.getProperty("os.name") + .toLowerCase(java.util.Locale.ROOT).contains("win"); + } + } +} +``` + +--- + +### Task 1.3: Create AccessViolationException + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/AccessViolationException.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/AccessViolationExceptionTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; + +import java.net.URI; + +class AccessViolationExceptionTest { + + @Test + void testExceptionContainsStructuredFields() { + URI uri = URI.create("file:///etc/passwd"); + AccessViolationException ex = new AccessViolationException( + uri, "file-protections", "path not in allowed areas", + "bundled defaults", + "FileProtections.builder().includeDefaults().allow(\"/etc/passwd\").build()"); + + assertEquals(uri, ex.getUri()); + assertEquals("file-protections", ex.getLayer()); + assertEquals("path not in allowed areas", ex.getDenialReason()); + assertEquals("bundled defaults", ex.getConfigSource()); + assertNotNull(ex.getRemediation()); + assertTrue(ex.getMessage().contains(uri.toString())); + assertTrue(ex.getMessage().contains("file-protections")); + } + + @Test + void testExtendsSecurityException() { + AccessViolationException ex = new AccessViolationException( + URI.create("http://localhost"), "network-security", + "loopback denied", "bundled defaults", null); + assertInstanceOf(SecurityException.class, ex); + } +} +``` + +**Implementation:** Exception with fields for `uri`, `layer`, `denialReason`, `configSource`, `remediation`. Message format matches PRD: + +```text +Resource access policy violation: '' was denied. 
+ Denied by: () + Source: + To allow: +``` + +--- + +### Task 1.4: Create GlobMatcher + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/GlobMatcher.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/GlobMatcherTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.CsvSource; + +class GlobMatcherTest { + + @ParameterizedTest + @CsvSource({ + "'**', 'anything/at/all', true", + "'**', '', true", + "'*.nist.gov/**', 'pages.nist.gov/schemas/foo.xml', true", + "'*.nist.gov/**', 'evil.com/nist.gov/attack', false", + "'/workspace/**', '/workspace/project/schema.xml', true", + "'/workspace/*', '/workspace/schema.xml', true", + "'/workspace/*', '/workspace/sub/schema.xml', false", + "'example.com/path/**', 'example.com/path', true", + "'example.com/path/**', 'example.com/path/', true", + "'example.com/path/**', 'example.com/path/to/resource', true", + "'example.com/path/**', 'example.com/other/resource', false", + "'**/.ssh/**', '/home/user/.ssh/id_rsa', true", + "'**/.ssh/**', '/home/user/projects/ssh-keys', false", + "'localhost/**', 'localhost/api', true", + }) + void testPatternMatching(String pattern, String target, boolean expected) { + GlobMatcher matcher = GlobMatcher.compile(pattern, true); + assertEquals(expected, matcher.matches(target), + () -> String.format("Pattern '%s' vs '%s'", pattern, target)); + } + + @Test + void testCaseInsensitiveMatching() { + GlobMatcher matcher = GlobMatcher.compile("/Workspace/**", false); + assertTrue(matcher.matches("/workspace/file.xml")); + assertTrue(matcher.matches("/WORKSPACE/file.xml")); + } + + @Test + void testCaseSensitiveMatching() { + GlobMatcher matcher = GlobMatcher.compile("/workspace/**", true); + assertTrue(matcher.matches("/workspace/file.xml")); + 
assertFalse(matcher.matches("/Workspace/file.xml")); + } + + @Test + void testNullSafety() { + GlobMatcher matcher = GlobMatcher.compile("**", true); + assertThrows(NullPointerException.class, () -> matcher.matches(null)); + } + + @Test + void testDirectoryEquivalence() { + // path/** must also match path itself (without trailing slash or children) + GlobMatcher matcher = GlobMatcher.compile("/workspace/**", true); + assertTrue(matcher.matches("/workspace"), "directory itself without trailing slash"); + assertTrue(matcher.matches("/workspace/"), "directory with trailing slash"); + assertTrue(matcher.matches("/workspace/project/schema.xml"), "child path"); + assertFalse(matcher.matches("/workspaceX"), "must not match prefix that is not the directory"); + assertFalse(matcher.matches("/workspac"), "must not match shorter prefix"); + + // Same for host-style patterns + GlobMatcher hostMatcher = GlobMatcher.compile("pages.nist.gov/**", true); + assertTrue(hostMatcher.matches("pages.nist.gov"), "host directory itself"); + assertTrue(hostMatcher.matches("pages.nist.gov/"), "host directory with trailing slash"); + assertTrue(hostMatcher.matches("pages.nist.gov/schemas/foo.xml"), "host child path"); + } + + @Test + void testEmptyPattern() { + GlobMatcher matcher = GlobMatcher.compile("", true); + assertTrue(matcher.matches("")); + assertFalse(matcher.matches("anything")); + } + + @Test + void testReDoSResistance() { + // Crafted pattern that would cause catastrophic backtracking with naive regex + String malicious = "**/**/**/**/**/**/**/**/**/**"; + GlobMatcher matcher = GlobMatcher.compile(malicious, true); + String longPath = "a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z"; + // Should complete in reasonable time (< 1 second), not hang + assertTimeout(java.time.Duration.ofSeconds(1), + () -> matcher.matches(longPath)); + } +} +``` + +**Implementation:** Compile glob patterns to `java.util.regex.Pattern`: +- `*` → `[^/]*+` (possessive quantifier to prevent 
backtracking; safe only when `/` or the end of the pattern follows, because a possessive wildcard never gives characters back: use greedy `[^/]*` elsewhere, e.g. in `*.nist.gov/**`) +- `**` → `.*+` (possessive quantifier) only as the final token; `.*` when more pattern text follows, e.g. in `**/.ssh/**` +- `?` → `[^/]` +- Escape regex special characters +- Accept `caseSensitive` parameter for `Pattern.CASE_INSENSITIVE` flag +- Use possessive quantifiers or atomic groups to prevent ReDoS wherever they cannot change the match result; bound the remaining greedy wildcards with pattern complexity limits +- Validate pattern length (max 500 chars) +- **Directory equivalence:** When a pattern ends with `/**`, compile it to also match the directory prefix itself. For pattern `P/**`, the compiled regex matches `P`, `P/`, and `P/` followed by any further path. Implementation: detect trailing `/**`, strip it to get prefix `P`, compile as `P(/.*+)?` (optional `/` followed by anything). This ensures `path/**` ≡ `path` in allow lists. + +--- + +### Task 1.5: Create UriNormalizer + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/UriNormalizer.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/UriNormalizerTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.CsvSource; + +import java.net.URI; + +class UriNormalizerTest { + + // --- Path normalization --- + + @ParameterizedTest + @CsvSource({ + "file:///workspace/../etc/passwd, /etc/passwd", + "file:///workspace/./schema.xml, /workspace/schema.xml", + "file:///workspace/project/../../etc/passwd, /etc/passwd", + }) + void testPathTraversalNormalization(String rawUri, String expectedPath) { + URI uri = URI.create(rawUri); + String normalized = UriNormalizer.normalizeFilePath(uri, SymlinkPolicy.NOFOLLOW); + assertEquals(expectedPath, normalized); + } + + @Test + void testRejectPathWithDotsAfterNormalization() { + // Edge case: a path that still contains ".." 
after normalization + // (should not happen with Path.normalize() but tested for defense-in-depth) + URI uri = URI.create("file:///workspace/../etc/passwd"); + String normalized = UriNormalizer.normalizeFilePath(uri, SymlinkPolicy.NOFOLLOW); + assertFalse(normalized.contains(".."), "Normalized path must not contain '..'"); + } + + // --- Percent-decoding --- + + @ParameterizedTest + @CsvSource({ + "file:///etc/p%61sswd, /etc/passwd", + "file:///workspace%2F..%2F..%2Fetc%2Fpasswd, /etc/passwd", + }) + void testPercentDecoding(String rawUri, String expectedPath) { + URI uri = URI.create(rawUri); + String normalized = UriNormalizer.normalizeFilePath(uri, SymlinkPolicy.NOFOLLOW); + assertEquals(expectedPath, normalized); + } + + // --- Scheme normalization --- + + @Test + void testSchemeNormalization() { + assertEquals("file", UriNormalizer.normalizeScheme(URI.create("FILE:///path"))); + assertEquals("https", UriNormalizer.normalizeScheme(URI.create("HTTPS://host/path"))); + } + + // --- Host normalization (http/https) --- + + @ParameterizedTest + @CsvSource({ + "https://EXAMPLE.COM/path, example.com/path", + "https://Example.Com:443/path, example.com/path", + "https://example.com:8443/path, example.com/path", + "http://LOCALHOST:8080/api, localhost/api", + "http://localhost:80/api, localhost/api", + }) + void testHostNormalization(String rawUri, String expectedTarget) { + URI uri = URI.create(rawUri); + String target = UriNormalizer.normalizeNetworkTarget(uri); + assertEquals(expectedTarget, target); + } + + // --- JAR scheme parsing --- + + @Test + void testJarSchemeInnerUri() { + URI jarUri = URI.create("jar:http://evil.com/mal.jar!/schema/x.xsd"); + URI innerUri = UriNormalizer.extractJarInnerUri(jarUri); + assertEquals(URI.create("http://evil.com/mal.jar"), innerUri); + } + + @Test + void testJarSchemeInternalPath() { + URI jarUri = URI.create("jar:file:///lib.jar!/schema/x.xsd"); + String internalPath = UriNormalizer.extractJarInternalPath(jarUri); + 
assertEquals("/schema/x.xsd", internalPath); + } + + @Test + void testMalformedJarUri() { + URI jarUri = URI.create("jar:file:///lib.jar"); + assertThrows(IllegalArgumentException.class, + () -> UriNormalizer.extractJarInternalPath(jarUri)); + } +} +``` + +**Implementation:** Static utility class with methods: +- `normalizeScheme(URI)` → lowercase scheme string +- `normalizeFilePath(URI, SymlinkPolicy)` → decode, normalize path, optionally resolve symlinks +- `normalizeNetworkTarget(URI)` → lowercase host, strip default ports, return `host/path` +- `extractJarInnerUri(URI)` → parse inner URI before `!` +- `extractJarInternalPath(URI)` → parse path after `!` +- Reject paths containing `..` after normalization (defense-in-depth) + +--- + +### Task 1.6: Create NetworkSecurityChecker + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/NetworkSecurityChecker.java` +- Create: `core/src/main/java/dev/metaschema/core/model/policy/NetworkSecurityConfig.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/NetworkSecurityCheckerTest.java` + +**Test first — IP CIDR boundary tests:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.CsvSource; + +class NetworkSecurityCheckerTest { + + private final NetworkSecurityChecker checker + = new NetworkSecurityChecker(NetworkSecurityConfig.defaults()); + + // --- IPv4 CIDR boundary tests --- + + @ParameterizedTest + @CsvSource({ + // 127.0.0.0/8 (loopback) + "126.255.255.255, true", + "127.0.0.0, false", + "127.0.0.1, false", + "127.255.255.255, false", + "128.0.0.0, true", + + // 10.0.0.0/8 (private Class A) + "9.255.255.255, true", + "10.0.0.0, false", + "10.128.0.1, false", + "10.255.255.255, false", + "11.0.0.0, true", + + // 172.16.0.0/12 (private Class B) + "172.15.255.255, true", + "172.16.0.0, false", + 
"172.20.0.1, false", + "172.31.255.255, false", + "172.32.0.0, true", + + // 192.168.0.0/16 (private Class C) + "192.167.255.255, true", + "192.168.0.0, false", + "192.168.1.100, false", + "192.168.255.255, false", + "192.169.0.0, true", + + // 169.254.0.0/16 (link-local / cloud metadata) + "169.253.255.255, true", + "169.254.0.0, false", + "169.254.169.254, false", + "169.254.255.255, false", + "169.255.0.0, true", + + // 100.64.0.0/10 (CGNAT / shared address space) + "100.63.255.255, true", + "100.64.0.0, false", + "100.100.0.1, false", + "100.127.255.255, false", + "100.128.0.0, true", + + // 0.0.0.0/8 (unspecified) + "0.0.0.0, false", + "0.255.255.255, false", + "1.0.0.0, true", + + // Public IPs (should be allowed) + "8.8.8.8, true", + "1.1.1.1, true", + "93.184.216.34, true", + }) + void testIpv4CidrBoundaries(String ip, boolean allowed) { + assertEquals(allowed, checker.isAllowed(ip), + () -> "IP " + ip + " should be " + (allowed ? "allowed" : "blocked")); + } + + // --- IPv6 CIDR boundary tests --- + + @ParameterizedTest + @CsvSource({ + // ::1/128 (IPv6 loopback) + "::1, false", + "::2, true", + + // fe80::/10 (IPv6 link-local) + "fe80::1, false", + "fe80::ffff, false", + "febf::1, false", + "fec0::1, true", + + // fc00::/7 (IPv6 ULA) + "fc00::1, false", + "fd00::1, false", + "fdff::ffff, false", + "fe00::1, true", + + // ::ffff:0:0/96 (IPv4-mapped IPv6 — checked after mapping) + "::ffff:127.0.0.1, false", + "::ffff:10.0.0.1, false", + "::ffff:192.168.1.1, false", + "::ffff:8.8.8.8, true", + "::ffff:1.1.1.1, true", + + // Public IPv6 (should be allowed) + "2001:4860:4860::8888, true", + }) + void testIpv6CidrBoundaries(String ip, boolean allowed) { + assertEquals(allowed, checker.isAllowed(ip), + () -> "IP " + ip + " should be " + (allowed ? 
"allowed" : "blocked")); + } + + // --- Alternate IP encoding tests --- + + @ParameterizedTest + @CsvSource({ + "2130706433, false", // decimal 127.0.0.1 + "0x7f000001, false", // hex 127.0.0.1 + "0177.0.0.1, false", // octal 127.0.0.1 + "127.1, false", // shorthand 127.0.0.1 + }) + void testAlternateIpEncodings(String host, boolean allowed) { + assertEquals(allowed, checker.isAllowed(host), + () -> "Host " + host + " should be " + (allowed ? "allowed" : "blocked")); + } + + // --- Hostname resolution --- + + @Test + void testLocalhostResolution() { + assertFalse(checker.isAllowed("localhost")); + } + + // --- Custom config --- + + @Test + void testAllowLoopback() { + NetworkSecurityChecker devChecker = new NetworkSecurityChecker( + NetworkSecurityConfig.builder() + .allowLoopback(true) + .build()); + + assertTrue(devChecker.isAllowed("127.0.0.1")); + assertTrue(devChecker.isAllowed("localhost")); + assertFalse(devChecker.isAllowed("10.0.0.1")); // still blocked + } + + @Test + void testAllowSpecificCidr() { + NetworkSecurityChecker customChecker = new NetworkSecurityChecker( + NetworkSecurityConfig.builder() + .allowCidr("10.0.0.0/24") + .build()); + + assertTrue(customChecker.isAllowed("10.0.0.1")); + assertFalse(customChecker.isAllowed("10.0.1.1")); + assertFalse(customChecker.isAllowed("192.168.1.1")); + } + + // --- Denial reason --- + + @Test + void testDenialReasonIncludesCidr() { + String reason = checker.getDenialReason("10.0.0.1"); + assertNotNull(reason); + assertTrue(reason.contains("10.0.0.0/8")); + } +} +``` + +**Implementation:** +- `NetworkSecurityChecker` accepts a `NetworkSecurityConfig` +- Uses `com.github.seancfoley:ipaddress` library for CIDR matching +- `isAllowed(String hostOrIp)` — resolves hostname to `InetAddress`, checks against blocked CIDR ranges +- `getDenialReason(String hostOrIp)` — returns human-readable reason with the matching CIDR block +- `NetworkSecurityConfig` — builder with `allowLoopback(boolean)`, `allowCidr(String)`, 
`defaults()` factory + +--- + +### Task 1.7: Create SchemePatternSet + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/SchemePatternSet.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/SchemePatternSetTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; + +class SchemePatternSetTest { + + @Test + void testDisabledSchemeDeniesAll() { + SchemePatternSet set = SchemePatternSet.disabled("http"); + assertFalse(set.isAllowed("example.com/api")); + assertEquals("scheme 'http' is disabled", set.getDenialReason("example.com/api")); + } + + @Test + void testNoPatternsUsesDefaultPolicy() { + // enabled + no patterns should NOT allow all — uses default deny + SchemePatternSet set = SchemePatternSet.enabled("https"); + assertFalse(set.isAllowed("example.com/anything")); + } + + @Test + void testAllowPattern() { + SchemePatternSet set = SchemePatternSet.builder("file") + .allow("/workspace/**") + .build(); + + assertTrue(set.isAllowed("/workspace/project/schema.xml")); + assertFalse(set.isAllowed("/etc/passwd")); + } + + @Test + void testDenyPatternOverridesAllow() { + SchemePatternSet set = SchemePatternSet.builder("file") + .allow("**") + .deny("**/.ssh/**") + .build(); + + assertTrue(set.isAllowed("/workspace/schema.xml")); + assertFalse(set.isAllowed("/home/user/.ssh/id_rsa")); + } + + @Test + void testLastMatchWins() { + SchemePatternSet set = SchemePatternSet.builder("file") + .allow("**") + .deny("/etc/**") + .allow("/etc/motd") + .build(); + + assertTrue(set.isAllowed("/workspace/file.xml")); + assertFalse(set.isAllowed("/etc/passwd")); + assertTrue(set.isAllowed("/etc/motd")); + } + + @Test + void testNoMatchDenies() { + SchemePatternSet set = SchemePatternSet.builder("https") + .allow("nist.gov/**") + .build(); + + assertTrue(set.isAllowed("nist.gov/schemas/x.xml")); + 
assertFalse(set.isAllowed("evil.com/attack")); + } + + @Test + void testAllowAll() { + SchemePatternSet set = SchemePatternSet.builder("jar") + .allowAll() + .build(); + assertTrue(set.isAllowed("/any/path")); + } + + @Test + void testDenyAll() { + SchemePatternSet set = SchemePatternSet.builder("http") + .denyAll() + .build(); + assertFalse(set.isAllowed("example.com/api")); + } + + @Test + void testCaseInsensitiveMatching() { + SchemePatternSet set = SchemePatternSet.builder("file") + .caseSensitive(false) + .allow("/Workspace/**") + .build(); + + assertTrue(set.isAllowed("/workspace/file.xml")); + assertTrue(set.isAllowed("/WORKSPACE/file.xml")); + } + } + ``` + + **Implementation:** Holds an ordered list of `(GlobMatcher, boolean isAllow)` entries. Evaluates last-match-wins. `enabled: true` with no patterns returns `false` (deny, matching default-scheme-policy behavior). Accepts `caseSensitive` flag passed to `GlobMatcher.compile()`. + + --- + + ### Task 1.8: Create PolicyDecision and EvaluationStep + + **Files:** + - Create: `core/src/main/java/dev/metaschema/core/model/policy/PolicyDecision.java` + - Create: `core/src/main/java/dev/metaschema/core/model/policy/EvaluationStep.java` + - Test: `core/src/test/java/dev/metaschema/core/model/policy/PolicyDecisionTest.java` + + **Implementation:** Simple immutable data classes: + + ```java + /** Diagnostic result from {@link ResourceAccessPolicy#explain(URI)}. */ + public final class PolicyDecision { + private final boolean allowed; + private final String layer; + private final String denialReason; + private final String matchingPattern; + private final String configSource; + private final String remediation; + private final List<EvaluationStep> evaluationTrace; + // constructor, getters, toString + } + + /** Single step in the policy evaluation trace. 
*/ +public final class EvaluationStep { + private final String layer; + private final String description; + private final boolean matched; + private final boolean resultIfMatched; + // constructor, getters +} +``` + +--- + +### Task 1.9: Create IResourceAccessPolicy Interface + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/IResourceAccessPolicy.java` + +**Implementation:** + +```java +package dev.metaschema.core.model.policy; + +import java.net.URI; + +import edu.umd.cs.findbugs.annotations.NonNull; + +/** + * Policy that controls which URIs can be accessed during resource loading. + * + * @see ResourceAccessPolicy + */ +public interface IResourceAccessPolicy { + + /** A policy that allows all access without checking. */ + IResourceAccessPolicy ALLOW_ALL = uri -> { /* no-op */ }; + + /** + * Checks whether the given URI is allowed by this policy. + * + * @param uri + * the URI to check + * @throws AccessViolationException + * if the policy is in ENFORCE mode and the URI is denied + */ + void checkAccess(@NonNull URI uri); +} +``` + +--- + +### Task 1.10: Create FileProtections + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/FileProtections.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/FileProtectionsTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.io.TempDir; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; + +import java.nio.file.Path; + +class FileProtectionsTest { + + @TempDir + Path cwd; + + @ParameterizedTest + @ValueSource(strings = { + "/etc/passwd", + "/proc/self/environ", + "/sys/kernel/debug", + "/dev/null", + "/root/.bashrc", + "/var/run/secrets/kubernetes.io/token", + "C:/Windows/System32/config/SAM", + }) + void testDefaultDeniesPathsOutsideSafeAreas(String 
path) { + FileProtections protections = FileProtections.withDefaults(cwd, + CaseSensitivity.CASE_SENSITIVE); + assertFalse(protections.isAllowed(path), + "Should deny (outside safe areas): " + path); + } + + @Test + void testDefaultAllowsCwdSubtree() { + FileProtections protections = FileProtections.withDefaults(cwd, + CaseSensitivity.CASE_SENSITIVE); + String cwdPath = cwd.resolve("project/schema.xml").toString(); + assertTrue(protections.isAllowed(cwdPath), "Should allow CWD subtree"); + } + + @Test + void testDefaultDeniesDotDirsInHome() { + Path home = Path.of(System.getProperty("user.home")); + FileProtections protections = FileProtections.withDefaults(cwd, + CaseSensitivity.CASE_SENSITIVE); + + String sshKey = home.resolve(".ssh/id_rsa").toString(); + assertFalse(protections.isAllowed(sshKey), + "Should deny ~/.ssh (blanket dot-dir exclusion)"); + + String kubeCfg = home.resolve(".kube/config").toString(); + assertFalse(protections.isAllowed(kubeCfg), + "Should deny ~/.kube (blanket dot-dir exclusion)"); + + String normalFile = home.resolve("projects/schema.xml").toString(); + assertTrue(protections.isAllowed(normalFile), + "Should allow normal files in home"); + } + + @Test + void testBuilderIncludeDefaults() { + FileProtections protections = FileProtections.builder(cwd, + CaseSensitivity.CASE_SENSITIVE) + .includeDefaults() + .allow("/opt/metaschema/**") + .build(); + + String cwdFile = cwd.resolve("schema.xml").toString(); + assertTrue(protections.isAllowed(cwdFile)); + assertTrue(protections.isAllowed("/opt/metaschema/x")); + assertFalse(protections.isAllowed("/etc/passwd")); + } + + @Test + void testBuilderRemoveDefault() { + FileProtections protections = FileProtections.builder(cwd, + CaseSensitivity.CASE_SENSITIVE) + .includeDefaults() + .remove("/**") + .build(); + + Path home = Path.of(System.getProperty("user.home")); + assertFalse(protections.isAllowed(home.resolve("file.txt").toString())); + 
assertTrue(protections.isAllowed(cwd.resolve("file.txt").toString())); + } + + @Test + void testBuilderFullyCustom() { + FileProtections protections = FileProtections.builder(cwd, + CaseSensitivity.CASE_SENSITIVE) + .allow("/opt/app/**") + .build(); + + assertTrue(protections.isAllowed("/opt/app/schema.xml")); + assertFalse(protections.isAllowed("/etc/passwd")); + assertFalse(protections.isAllowed(cwd.resolve("file.txt").toString())); + } + + @Test + void testDisabledAllowsEverythingAndLogsWarning() { + FileProtections protections = FileProtections.disabled(); + assertTrue(protections.isAllowed("/etc/passwd")); + assertTrue(protections.isAllowed("/home/user/.ssh/key")); + } + + @Test + void testDefaultPatternsAreInspectable() { + assertFalse(FileProtections.defaultAllowPatterns().isEmpty()); + } + + @Test + void testCaseInsensitiveOnWindows() { + FileProtections protections = FileProtections.withDefaults(cwd, + CaseSensitivity.CASE_INSENSITIVE); + String cwdUpper = cwd.resolve("Schema.XML").toString().toUpperCase(); + // Upper-cased path should still match the CWD subtree case-insensitively + assertTrue(protections.isAllowed(cwdUpper)); + } + + @Test + void testCwdRootWarning() { + // When CWD is filesystem root, should log a warning + Path root = Path.of("/"); + // This should succeed but log a WARNING + FileProtections protections = FileProtections.withDefaults(root, + CaseSensitivity.CASE_SENSITIVE); + // Root allows everything via /** + assertTrue(protections.isAllowed("/etc/passwd")); + } + } + ``` + + **Implementation:** + - `withDefaults(Path cwd, CaseSensitivity cs)` — default patterns with CWD + home minus dot-dirs + - `disabled()` — allows everything, logs WARN, Javadoc security warning + - `builder(Path cwd, CaseSensitivity cs)` — customizable builder + - `defaultAllowPatterns()` — static inspection method + - `isAllowed(String path)` — check path against patterns + - Warn if CWD is root (`/` or drive root) + + --- + + ### Task 1.11: Create ResourceAccessPolicy 
and Builder + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/ResourceAccessPolicy.java` +- Create: `core/src/main/java/dev/metaschema/core/model/policy/ResourceAccessPolicyBuilder.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/ResourceAccessPolicyTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; + +import java.net.URI; + +class ResourceAccessPolicyTest { + + @Test + void testDisabledModeAllowsEverything() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.DISABLED) + .forScheme("file").denyAll() + .build(); + + assertDoesNotThrow(() -> policy.checkAccess(URI.create("file:///etc/passwd"))); + } + + @Test + void testAuditModeLogsButAllows() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.AUDIT) + .forScheme("http").denyAll() + .denyUnlistedSchemes() + .build(); + + assertDoesNotThrow(() -> policy.checkAccess(URI.create("http://localhost/admin"))); + } + + @Test + void testEnforceModeBlocks() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("http").denyAll() + .denyUnlistedSchemes() + .build(); + + assertThrows(AccessViolationException.class, + () -> policy.checkAccess(URI.create("http://localhost/admin"))); + } + + @Test + void testEnforceModeAllowsMatching() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https").allow("nist.gov/**") + .denyUnlistedSchemes() + .build(); + + assertDoesNotThrow(() -> policy.checkAccess( + URI.create("https://nist.gov/schemas/x.xml"))); + } + + @Test + void testDenyPatternExceptions() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file") + .allow("**") + .deny("**/.ssh/**") + .fileProtections(FileProtections.disabled()) + 
.denyUnlistedSchemes() + .build(); + + assertDoesNotThrow(() -> policy.checkAccess( + URI.create("file:///workspace/schema.xml"))); + assertThrows(AccessViolationException.class, + () -> policy.checkAccess( + URI.create("file:///home/user/.ssh/id_rsa"))); + } + + @Test + void testDenyUnlistedSchemesBlocksUnknown() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https").allowAll() + .denyUnlistedSchemes() + .build(); + + assertThrows(AccessViolationException.class, + () -> policy.checkAccess(URI.create("ftp://evil.com/file"))); + } + + @Test + void testWithModeCreatesNewPolicy() { + ResourceAccessPolicy audit = ResourceAccessPolicy.builder() + .mode(PolicyMode.AUDIT) + .forScheme("http").denyAll() + .denyUnlistedSchemes() + .build(); + + assertDoesNotThrow(() -> audit.checkAccess( + URI.create("http://localhost/admin"))); + + ResourceAccessPolicy enforced = audit.withMode(PolicyMode.ENFORCE); + assertThrows(AccessViolationException.class, + () -> enforced.checkAccess(URI.create("http://localhost/admin"))); + } + + @Test + void testToBuilder() { + ResourceAccessPolicy original = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https").allow("nist.gov/**") + .denyUnlistedSchemes() + .build(); + + ResourceAccessPolicy modified = original.toBuilder() + .forScheme("https").allow("github.com/**") + .build(); + + assertDoesNotThrow(() -> modified.checkAccess( + URI.create("https://nist.gov/x.xml"))); + assertDoesNotThrow(() -> modified.checkAccess( + URI.create("https://github.com/x.xml"))); + } + + @Test + void testExplainReturnsDecision() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https").allow("nist.gov/**") + .denyUnlistedSchemes() + .build(); + + PolicyDecision allowed = policy.explain(URI.create("https://nist.gov/x.xml")); + assertTrue(allowed.isAllowed()); + + PolicyDecision denied = 
policy.explain(URI.create("https://evil.com/x.xml")); + assertFalse(denied.isAllowed()); + assertNotNull(denied.getDenialReason()); + assertNotNull(denied.getLayer()); + assertNotNull(denied.getRemediation()); + assertFalse(denied.getEvaluationTrace().isEmpty()); + } + + @Test + void testDescribeEffectiveRules() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("https").allow("nist.gov/**") + .forScheme("file").allow("/workspace/**") + .denyUnlistedSchemes() + .build(); + + String description = policy.describeEffectiveRules(); + assertNotNull(description); + assertTrue(description.contains("ENFORCE")); + assertTrue(description.contains("https")); + assertTrue(description.contains("file")); + } + + @Test + void testBundledDefaultsFactory() { + ResourceAccessPolicy defaults = ResourceAccessPolicy.bundledDefaults(); + assertNotNull(defaults); + // Bundled defaults are AUDIT mode — should not throw + assertDoesNotThrow(() -> defaults.checkAccess( + URI.create("https://example.com/test"))); + } + + @Test + void testDevelopmentFactory() { + ResourceAccessPolicy dev = ResourceAccessPolicy.development(); + assertNotNull(dev); + // Dev mode should allow localhost + assertDoesNotThrow(() -> dev.checkAccess( + URI.create("http://localhost/api"))); + } + + @Test + void testDisabledFactory() { + ResourceAccessPolicy disabled = ResourceAccessPolicy.disabled(); + assertDoesNotThrow(() -> disabled.checkAccess( + URI.create("file:///etc/passwd"))); + } + + @Test + void testJarSchemeRecursiveCheck() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file").allow("/lib/**") + .forScheme("jar").allowAll() + .fileProtections(FileProtections.disabled()) + .denyUnlistedSchemes() + .build(); + + // jar: with file: inner URI pointing to allowed path + assertDoesNotThrow(() -> policy.checkAccess( + URI.create("jar:file:///lib/app.jar!/schema/x.xsd"))); + + // jar: with http: inner 
URI — http not configured, default deny + assertThrows(AccessViolationException.class, + () -> policy.checkAccess( + URI.create("jar:http://evil.com/mal.jar!/schema/x.xsd"))); + } + + @Test + void testPathNormalizationPreventsTraversal() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file").allow("/workspace/**") + .fileProtections(FileProtections.disabled()) + .denyUnlistedSchemes() + .build(); + + // Path traversal should be caught after normalization + assertThrows(AccessViolationException.class, + () -> policy.checkAccess( + URI.create("file:///workspace/../etc/passwd"))); + } + + @Test + void testSchemeNormalization() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file").allow("/workspace/**") + .fileProtections(FileProtections.disabled()) + .denyUnlistedSchemes() + .build(); + + // Uppercase scheme should still match + assertDoesNotThrow(() -> policy.checkAccess( + URI.create("FILE:///workspace/schema.xml"))); + } + + @Test + void testFileProtectionsConflictDetection() { + // /opt/data/ is outside CWD and home — should throw at build time + assertThrows(IllegalStateException.class, + () -> ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .forScheme("file") + .allow("/opt/data/**") + .denyUnlistedSchemes() + .build()); + } + + @Test + void testEnabledNoPatternsUsesDeny() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.ENFORCE) + .denyUnlistedSchemes() + .build(); + + // No schemes configured — should use default-scheme-policy (deny) + assertThrows(AccessViolationException.class, + () -> policy.checkAccess(URI.create("https://example.com"))); + } + + @Test + void testImmutability() { + ResourceAccessPolicy policy = ResourceAccessPolicy.builder() + .mode(PolicyMode.AUDIT) + .forScheme("https").allow("nist.gov/**") + .build(); + + ResourceAccessPolicy withEnforce = 
policy.withMode(PolicyMode.ENFORCE); + + // Original should not be affected + assertDoesNotThrow(() -> policy.checkAccess( + URI.create("https://evil.com"))); + // New instance should enforce + assertThrows(AccessViolationException.class, + () -> withEnforce.checkAccess(URI.create("https://evil.com"))); + } +} +``` + +**Implementation:** +- `ResourceAccessPolicy` is a **final, immutable** class +- All internal collections are unmodifiable copies +- `checkAccess(URI)` — full evaluation pipeline (normalize → network check → file protections → scheme patterns → mode behavior) +- `explain(URI)` — same pipeline but returns `PolicyDecision` instead of throwing +- `withMode(PolicyMode)` — returns new instance with different mode +- `toBuilder()` — returns pre-populated builder +- Factory methods: `bundledDefaults()`, `development()`, `disabled()` +- `describeEffectiveRules()` — returns human-readable summary +- `ResourceAccessPolicyBuilder` uses nested `SchemeConfigBuilder` pattern +- `.forScheme()` twice for same scheme appends patterns +- `.build()` runs conflict detection (file scheme patterns vs FileProtections) + +--- + +### Task 1.12: Add package-info.java + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/package-info.java` + +--- + +### Task 1.13: Verify PR1 Build + +```bash +mvn -pl core clean install +mvn -pl core checkstyle:check +``` + +--- + +## PR2: Configuration Model and Bundled Defaults + +**Goal:** Define the Metaschema configuration module, implement config loading with ratcheting, and ship bundled restrictive defaults. + +### Task 2.1: Create Metaschema Configuration Module + +**Files:** +- Create: `core/src/main/metaschema/resource-access-policy-config_metaschema.yaml` + +The Metaschema module definition for the resource access policy configuration model. See PRD for full module definition. Root assembly is `resource-access-policy-config` to avoid naming collision with the hand-written `ResourceAccessPolicy` class. 
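
The ratcheting named in the PR2 goal (a later configuration layer may tighten the enforcement mode but never loosen it) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: the `PolicyMode` enum here is a stand-in, the two-argument `mostRestrictive` signature is assumed (the plan only names the method), and `ModeRatchet` is a hypothetical helper.

```java
import java.util.List;

/** Illustrative stand-in for the PolicyMode enum named in this plan. */
enum PolicyMode {
  // Declared from least to most restrictive so ordinal() reflects strictness.
  DISABLED, AUDIT, ENFORCE;

  /** Returns whichever of the two modes is stricter (assumed signature). */
  static PolicyMode mostRestrictive(PolicyMode a, PolicyMode b) {
    return a.ordinal() >= b.ordinal() ? a : b;
  }
}

/** Hypothetical helper: folds configuration layers without loosening the mode. */
final class ModeRatchet {
  private ModeRatchet() {
  }

  /** Layers are applied in order, for example bundled, then admin, then project. */
  static PolicyMode effectiveMode(List<PolicyMode> layers) {
    PolicyMode result = PolicyMode.DISABLED; // the documented default mode
    for (PolicyMode layer : layers) {
      // A layer can only raise the strictness reached so far.
      result = PolicyMode.mostRestrictive(result, layer);
    }
    return result;
  }
}
```

With layers `AUDIT` then `DISABLED`, the effective mode stays `AUDIT`, because the second layer's attempt to loosen the mode is ignored.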
+ +--- + +### Task 2.2: Configure Maven Code Generation + +**Files:** +- Modify: `core/pom.xml` (add dependency on `ipaddress` library, verify metaschema-maven-plugin config) + +Verify generated binding classes compile and contain expected fields: +- `ResourceAccessPolicyConfig` (root assembly) +- `SchemeConfig` (scheme configuration) +- `Pattern` (access pattern field) + +--- + +### Task 2.3: Create Bundled Default Policy + +**Files:** +- Create: `core/src/main/resources/dev/metaschema/core/model/policy/default-resource-access-policy.yaml` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/BundledDefaultsTest.java` + +**Test first:** + +```java +package dev.metaschema.core.model.policy; + +import static org.junit.jupiter.api.Assertions.*; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; + +import java.net.URI; + +class BundledDefaultsTest { + + private final ResourceAccessPolicy defaults = ResourceAccessPolicy.bundledDefaults(); + + @Test + void testDefaultModeIsAudit() { + // AUDIT mode: should log but not throw + assertDoesNotThrow(() -> defaults.checkAccess( + URI.create("http://localhost/admin"))); + } + + @ParameterizedTest + @ValueSource(strings = { + "https://pages.nist.gov/schemas/x.xml", + "https://example.com/api", + "jar:file:///lib.jar!/schema/x.xsd", + }) + void testDefaultAllowedInEnforceMode(String uriString) { + ResourceAccessPolicy enforced = defaults.withMode(PolicyMode.ENFORCE); + assertDoesNotThrow(() -> enforced.checkAccess(URI.create(uriString))); + } + + @ParameterizedTest + @ValueSource(strings = { + "http://example.com/api", + "ftp://evil.com/file", + }) + void testDefaultDeniedSchemesInEnforceMode(String uriString) { + ResourceAccessPolicy enforced = defaults.withMode(PolicyMode.ENFORCE); + assertThrows(AccessViolationException.class, + () -> enforced.checkAccess(URI.create(uriString))); + } + + @ParameterizedTest + 
@ValueSource(strings = { + "https://localhost/admin", + "https://127.0.0.1/secret", + "https://169.254.169.254/meta", + "https://10.0.0.1/internal", + "https://192.168.1.1/router", + }) + void testDefaultDeniedNetworkInEnforceMode(String uriString) { + ResourceAccessPolicy enforced = defaults.withMode(PolicyMode.ENFORCE); + assertThrows(AccessViolationException.class, + () -> enforced.checkAccess(URI.create(uriString))); + } +} +``` + +--- + +### Task 2.4: Implement Configuration Loading with Ratcheting + +**Files:** +- Create: `core/src/main/java/dev/metaschema/core/model/policy/ResourceAccessPolicyLoader.java` +- Test: `core/src/test/java/dev/metaschema/core/model/policy/ResourceAccessPolicyLoaderTest.java` + +**Test first:** Verify: +- Loading from YAML, JSON, and XML config files +- Configuration layering with merge semantics +- Ratchet enforcement (can only tighten, never loosen mode) +- `locked: true` prevents overrides +- `inherit: true` appends patterns instead of replacing +- Scheme name validation (warn on unrecognized schemes like "htps") +- YAML `!` pattern validation (detect unquoted `!` patterns) + +--- + +### Task 2.5: Verify PR2 Build + +```bash +mvn -pl core clean install +mvn -pl core checkstyle:check +``` + +--- + +## PR3: Loader Integration + +**Goal:** Integrate policy checking into all resource loading paths. + +### Task 3.1: Add Policy Support to Loader Interfaces + +**Files:** +- Modify: `core/src/main/java/dev/metaschema/core/model/IModuleLoader.java` + +Add method: + +```java +/** + * Sets the resource access policy for this loader. + *
<p>
+ * When set, all URIs resolved by this loader are checked against the policy + * before loading. Use {@link ResourceAccessPolicy#bundledDefaults()} for + * recommended defaults. + * + * @param policy + * the policy to enforce, or {@code null} to disable policy checking + */ +void setResourceAccessPolicy(@Nullable IResourceAccessPolicy policy); +``` + +--- + +### Task 3.2: Integrate Policy in AbstractModuleLoader + +**Files:** +- Modify: `core/src/main/java/dev/metaschema/core/model/AbstractModuleLoader.java` +- Test: `core/src/test/java/dev/metaschema/core/model/AbstractModuleLoaderPolicyTest.java` + +**Test first:** Verify module import URIs are checked against policy. Verify relative URIs are resolved to absolute before checking. + +**Implementation:** Add `volatile IResourceAccessPolicy` field. Check before URI resolution. Resolve relative URIs to absolute before calling `checkAccess()`. + +--- + +### Task 3.3: Integrate Policy in DefaultBoundLoader + +**Files:** +- Modify: `databind/src/main/java/dev/metaschema/databind/io/DefaultBoundLoader.java` +- Test: `databind/src/test/java/dev/metaschema/databind/io/DefaultBoundLoaderPolicyTest.java` + +**Test first:** Verify document loading URIs are checked against policy. + +--- + +### Task 3.4: Integrate Policy in BindingConstraintLoader + +**Files:** +- Modify: `databind/src/main/java/dev/metaschema/databind/model/metaschema/BindingConstraintLoader.java` +- Test: `databind/src/test/java/dev/metaschema/databind/model/metaschema/BindingConstraintLoaderPolicyTest.java` + +**Test first:** Verify constraint import URIs are checked against policy. + +--- + +### Task 3.5: Integrate Policy in DefaultXmlDeserializer + +**Files:** +- Modify: `databind/src/main/java/dev/metaschema/databind/io/xml/DefaultXmlDeserializer.java` +- Test: `databind/src/test/java/dev/metaschema/databind/io/xml/DefaultXmlDeserializerPolicyTest.java` + +**Test first:** Verify XML entity resolution URIs are checked against policy. 
Document HTTP redirect re-checking requirement. + +--- + +### Task 3.6: Verify PR3 Build + +```bash +mvn clean install -PCI -Prelease +``` + +--- + +## PR4: CLI Integration and Documentation + +**Goal:** Add CLI flags for policy control, diagnostic commands, and documentation. + +### Task 4.1: Add Global CLI Flags + +**Files:** +- Modify: `metaschema-cli/src/main/java/dev/metaschema/cli/commands/MetaschemaCommands.java` (shared options) +- Modify: Resource-loading commands (validate, validate-content, convert, generate-schema) to accept policy flags + +Add global flags available on all resource-loading commands: +- `--resource-policy-mode=` — Override enforcement mode +- `--resource-policy=` — Load custom policy configuration file + +### Task 4.2: Add Environment Variable Support + +Support `METASCHEMA_RESOURCE_POLICY_MODE` environment variable for mode override. + +### Task 4.3: Create ResourcePolicyCommand (Parent Command) + +**Files:** +- Create: `metaschema-cli/src/main/java/dev/metaschema/cli/commands/resourcepolicy/ResourcePolicyCommand.java` +- Modify: `metaschema-cli/src/main/java/dev/metaschema/cli/commands/MetaschemaCommands.java` (register command) + +Create a new `AbstractParentCommand` with `dump` and `check` subcommands. Follows the same pattern as `MetapathCommand`. Register in `MetaschemaCommands.COMMANDS`. + +### Task 4.4: Implement `resource-policy dump` Subcommand + +**Files:** +- Create: `metaschema-cli/src/main/java/dev/metaschema/cli/commands/resourcepolicy/DumpSubcommand.java` + +`AbstractTerminalCommand` that prints the effective merged policy (after all config layers) as YAML to stdout. Uses `policy.describeEffectiveRules()`. Accepts `--resource-policy` and `--resource-policy-mode` flags. 
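
The mode overrides from Tasks 4.1 and 4.2 feed into the effective policy that `dump` prints, so a rough sketch of how they might combine may help. The sketch rests on two assumptions this PRD does not state: the CLI flag takes precedence over the environment variable, and an override is ratcheted against the configured mode in the same way configuration layers are. The `PolicyMode` enum here is a local stand-in, and the two-argument `mostRestrictive` signature and the `resolve` and `parse` helpers are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;

public class PolicyModeResolution {
  /** Local stand-in for the planned enum, declared from least to most restrictive. */
  public enum PolicyMode {
    DISABLED, AUDIT, ENFORCE;

    /** Returns the stricter of the two modes (signature is an assumption). */
    public static PolicyMode mostRestrictive(PolicyMode a, PolicyMode b) {
      return a.compareTo(b) >= 0 ? a : b;
    }
  }

  /**
   * Parses a mode name case-insensitively. Null or blank input yields empty;
   * an unknown name throws IllegalArgumentException from valueOf.
   */
  public static Optional<PolicyMode> parse(String raw) {
    if (raw == null || raw.isBlank()) {
      return Optional.empty();
    }
    return Optional.of(PolicyMode.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
  }

  /**
   * Assumed precedence: CLI flag, then environment variable, then configured mode.
   * Assumed ratchet: the result is never less strict than the configured mode.
   */
  public static PolicyMode resolve(String flagValue, String envValue, PolicyMode configured) {
    PolicyMode requested = parse(flagValue)
        .or(() -> parse(envValue))
        .orElse(configured);
    return PolicyMode.mostRestrictive(requested, configured);
  }

  public static void main(String[] args) {
    String env = System.getenv("METASCHEMA_RESOURCE_POLICY_MODE");
    String flag = args.length > 0 ? args[0] : null;
    System.out.println(resolve(flag, env, PolicyMode.AUDIT));
  }
}
```

If overrides are meant to be able to loosen the mode (for example, to switch enforcement off during local debugging), the final `mostRestrictive` call would be dropped. The PRD should state which behavior is intended.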
+
+### Task 4.5: Implement `resource-policy check` Subcommand
+
+**Files:**
+- Create: `metaschema-cli/src/main/java/dev/metaschema/cli/commands/resourcepolicy/CheckSubcommand.java`
+
+`AbstractTerminalCommand` that takes a URI as a positional argument, runs it through the policy, and prints the `PolicyDecision` evaluation trace. Uses `policy.explain(URI)`. Accepts `--resource-policy` and `--resource-policy-mode` flags.
+
+### Task 4.6: Documentation
+
+**Files:**
+- Update: Website documentation with resource access policy guide
+- Update: CLI help text
+- Include: Migration guide (AUDIT → ENFORCE transition steps)
+- Include: YAML `!` quoting warning
+- Include: Explicit note that `!` means DENY (contrast with `.gitignore`)
+
+### Task 4.7: Final Verification
+
+```bash
+mvn clean install -PCI -Prelease
+```
+
+---
+
+## Completion Checklist
+
+**Phase 1: Policy Engine Core (PR1)**
+- [ ] `PolicyMode` enum with DISABLED/AUDIT/ENFORCE and `mostRestrictive()`
+- [ ] `SymlinkPolicy` enum with FOLLOW/NOFOLLOW
+- [ ] `CaseSensitivity` enum with SYSTEM_DEFAULT/CASE_SENSITIVE/CASE_INSENSITIVE
+- [ ] `AccessViolationException` with structured fields (layer, reason, source, remediation)
+- [ ] `GlobMatcher` with case sensitivity, possessive quantifiers, pattern length limit
+- [ ] `UriNormalizer` with path normalization, percent-decoding, symlink resolution, scheme/host normalization, JAR parsing
+- [ ] `NetworkSecurityChecker` with CIDR block matching via IP library, alternate encoding support
+- [ ] `NetworkSecurityConfig` with builder and `allowLoopback()`, `allowCidr()`
+- [ ] `SchemePatternSet` with ordered pattern evaluation, case sensitivity, updated empty-patterns semantics
+- [ ] `PolicyDecision` and `EvaluationStep` for diagnostics
+- [ ] `IResourceAccessPolicy` interface
+- [ ] `ResourceAccessPolicy` (immutable) with builder, factory methods, `toBuilder()`, `explain()`, `describeEffectiveRules()`
+- [ ] `FileProtections` with `disabled()` (renamed from `none()`), blanket dot-dir exclusion, CWD root warning, conflict detection
+- [ ] `package-info.java`
+- [ ] IP boundary value tests for all private CIDR blocks
+- [ ] Alternate IP encoding tests (decimal, hex, octal, shorthand, IPv4-mapped IPv6)
+- [ ] Path traversal normalization tests
+- [ ] Symlink traversal tests
+- [ ] Case sensitivity tests
+- [ ] ReDoS resistance tests
+- [ ] JAR recursive checking tests
+- [ ] All tests passing
+
+**Phase 2: Configuration Model (PR2)**
+- [ ] Metaschema module (`resource-access-policy-config_metaschema.yaml`)
+- [ ] Maven code generation verified (no naming collision)
+- [ ] IP address library dependency added (`com.github.seancfoley:ipaddress`)
+- [ ] Bundled default policy (restrictive, AUDIT mode)
+- [ ] `ResourceAccessPolicyLoader` with ratchet enforcement, `locked` flag, `inherit` merge
+- [ ] Scheme name validation (warn on unrecognized)
+- [ ] YAML `!` pattern validation
+- [ ] Pattern complexity limits (count, length)
+- [ ] Configuration layering tests
+- [ ] All tests passing
+
+**Phase 3: Loader Integration (PR3)**
+- [ ] `IModuleLoader.setResourceAccessPolicy()` method
+- [ ] `AbstractModuleLoader` policy integration (with relative URI resolution)
+- [ ] `DefaultBoundLoader` policy integration
+- [ ] `BindingConstraintLoader` policy integration
+- [ ] `DefaultXmlDeserializer` policy integration
+- [ ] HTTP redirect re-checking documented as integration requirement
+- [ ] Integration tests for each loader type
+- [ ] All tests passing
+
+**Phase 4: CLI Integration (PR4)**
+- [ ] `--resource-policy-mode` CLI flag
+- [ ] `--resource-policy` CLI flag
+- [ ] `ResourcePolicyCommand` parent command (extends `AbstractParentCommand`)
+- [ ] `resource-policy dump` subcommand
+- [ ] `resource-policy check ` subcommand
+- [ ] `METASCHEMA_RESOURCE_POLICY_MODE` env var
+- [ ] Documentation (migration guide, YAML warnings, `!` semantics)
+- [ ] Full CI build passing
+
+**Final Verification:**
+```bash
+mvn clean install -PCI -Prelease
+```
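
The Phase 3 checklist records HTTP redirect re-checking as an integration requirement without showing what it involves. The following is a minimal, transport-agnostic sketch of the idea, not the planned implementation. The class name, the `checkAccess` consumer (standing in for the policy's `checkAccess(URI)`), and the `redirectTarget` hook are all hypothetical.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;

public class RedirectRecheckSketch {
  /**
   * Walks a redirect chain, running the policy check on the first URI and on
   * every redirect target before that target would be fetched.
   *
   * @param start the initially requested URI
   * @param checkAccess hypothetical stand-in for the policy check; throws to deny
   * @param redirectTarget hypothetical transport hook returning the Location value
   *          when the response for a URI is a redirect, or empty otherwise
   * @param maxHops maximum number of redirects to follow
   * @return the final URI that passed the check and did not redirect
   */
  public static URI resolveWithRecheck(URI start, Consumer<URI> checkAccess,
      Function<URI, Optional<String>> redirectTarget, int maxHops) {
    URI current = start;
    for (int hop = 0; hop <= maxHops; hop++) {
      checkAccess.accept(current); // checked on every hop, not only the first
      Optional<String> location = redirectTarget.apply(current);
      if (location.isEmpty()) {
        return current;
      }
      // A Location value may be relative; resolve it against the URI that issued it.
      current = current.resolve(location.get());
    }
    throw new IllegalStateException("Too many redirects starting from " + start);
  }

  public static void main(String[] args) {
    // Simulated server behavior: an allowed URL redirects to a metadata endpoint.
    Map<String, String> redirects = Map.of(
        "https://example.com/module.xml", "http://169.254.169.254/latest/meta-data/");
    Consumer<URI> denyMetadataHost = uri -> {
      if ("169.254.169.254".equals(uri.getHost())) {
        throw new SecurityException("denied: " + uri);
      }
    };
    try {
      resolveWithRecheck(URI.create("https://example.com/module.xml"), denyMetadataHost,
          uri -> Optional.ofNullable(redirects.get(uri.toString())), 5);
    } catch (SecurityException ex) {
      System.out.println(ex.getMessage());
    }
  }
}
```

With `java.net.HttpURLConnection`, automatic redirect handling would need to be switched off via `setInstanceFollowRedirects(false)` so that each hop is visible to a loop like this one.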