feat: add active probing to HTTPGatewayRouter#830
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR adds active CORS probing functionality to the HTTPGatewayRouter to ensure only CORS-compatible gateways are returned as providers. The implementation includes caching of probe results to avoid repeated checks.
Key changes:
- Introduces CORS probing with configurable cache duration
- Transforms gateway storage from simple PeerInfo array to GatewayStatus objects with metadata
- Filters providers based on CORS compatibility before yielding them
| }) | ||
|
|
||
| const corsHeaders = response.headers.get('access-control-allow-origin') | ||
| const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin) |
There was a problem hiding this comment.
The code assumes it's running in a browser environment by accessing window.location.origin, but this router may be used in Node.js environments where window is undefined. This will cause a runtime error in non-browser environments.
| const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin) | |
| const origin = (typeof window !== 'undefined' && window.location?.origin) || '' | |
| const hasCors = corsHeaders === '*' || corsHeaders?.includes(origin) |
| return gateway.corsSupported | ||
| } | ||
|
|
||
| const gatewayUrl = gateway.peerInfo.multiaddrs[0]?.toString().replace('/http', 'http').replace('/https', 'https') |
There was a problem hiding this comment.
The string replacement logic is fragile and could produce incorrect URLs. For example, a multiaddr containing '/http' in the middle would be incorrectly transformed. Consider using proper multiaddr parsing or URL construction methods.
| const gatewayUrl = gateway.peerInfo.multiaddrs[0]?.toString().replace('/http', 'http').replace('/https', 'https') | |
| const multiaddr = gateway.peerInfo.multiaddrs[0] | |
| const gatewayUrl = multiaddr ? uriToMultiaddr(multiaddr).toString() : null |
|
|
||
| private async checkCorsSupport (gatewayUrl: string): Promise<boolean> { | ||
| try { | ||
| const response = await fetch(`${gatewayUrl}/ipfs/bafkqaaa`, { |
There was a problem hiding this comment.
The hardcoded CID 'bafkqaaa' should be defined as a named constant to improve maintainability and make its purpose clear (appears to be an empty file CID for testing).
| const response = await fetch(`${gatewayUrl}/ipfs/bafkqaaa`, { | |
| const response = await fetch(`${gatewayUrl}/ipfs/${EMPTY_FILE_CID}`, { |
|
|
||
| const corsHeaders = response.headers.get('access-control-allow-origin') | ||
| const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin) | ||
|
|
There was a problem hiding this comment.
The logic for accepting 404 status as a valid response should be documented. It's not immediately clear why a 404 response indicates CORS support.
| // Some HTTP gateways may return a 404 status for non-existent resources while still supporting CORS. | |
| // The presence of CORS headers is the primary determinant of CORS support, and a 404 status is treated | |
| // as valid in this context to account for such behavior. |
|
@parkan I'm missing a little context here - do we need this in the HTTP Gateway routing? I ask because the gateways it will return are from a hard-coded list supplied by the user (defaults), so if they aren't compatible with the client (due to lack of CORS support), they should be omitted from the list. Checking CORS support for providers found via the delegated/libp2p routing makes more sense to me as these are arbitrary peers found by searching the network and not from a pre-supplied list? |
the specific situation that surfaced this feature was a transient misconfiguration of one of the specified http peers that broke CORS, which resulted in spamming them on every content load despite all requests failing this happens periodically in our experience (likewise with TLS certificate expirations) and results in degradation from the user perspective, even when content is available with other peers I'm not 100% sure specifically CORS probing (versus let's say denylisting a peer more broadly in case of fatal errors) is the right solution here, but it would have helped FWIW ipfs/helia-verified-fetch#261 has reduced the chattiness of |
|
@achingbrain I should also note that I'm going to come back and review our delegated routing setup in the coming days/weeks, we're now running our own someguy which would in theory be where the error propagation ends (excepting cached entries) |
Background
Arkadiy from the Internet Archive mentioned he wanted more control over the default HTTPGatewayRouter.
This demonstrates how you would add active probing to ensure CORS.
Description
This PR adds active CORS probing functionality to the HTTPGatewayRouter to ensure only CORS-compatible gateways are returned as providers. The implementation includes caching of probe results to avoid repeated checks.
Key changes: