-
Notifications
You must be signed in to change notification settings - Fork 0
Made email attachment URL optional #94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe email attachment extraction now parses URLs from content-disposition safely, returning None instead of raising errors when parsing fails. Attachments are filtered to include only HTTP/HTTPS URLs with non-localhost hostnames. URL parsing and hostname validation are added, and docstrings are updated accordingly. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant E as EmailParser
participant CD as _parse_url_from_content_disposition
participant U as urlparse
participant F as AttachmentFilter
E->>CD: Extract URL from Content-Disposition
CD->>U: Parse URL
alt Valid URL
U-->>CD: url (scheme, host, ...)
CD-->>E: URL
E->>F: Validate scheme/hostname
alt http/https and non-localhost
F-->>E: Keep attachment
else Invalid host or scheme
F-->>E: Drop attachment
end
else Missing/invalid header
CD-->>E: None
E->>F: No URL -> Drop attachment
end
note over E: Returns only validated attachments
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agentuity/io/email.py (1)
90-121: Add None check for _url before use.The
data()method usesself._urlat line 105 without checking if it's None. While theattachmentsproperty filters out None URLs, if anIncomingEmailAttachmentis instantiated directly anddata()is called, it will fail with a non-descriptive error (TypeError: unsupported operand type(s)when httpx tries to use None as a URL).Add a guard at the beginning of the method:
async def data(self): """ Return a Data object that streams the attachment data asynchronously. """ + if self._url is None: + raise ValueError("Attachment URL is not available") tracer = trace.get_tracer("email")
🧹 Nitpick comments (2)
agentuity/io/email.py (2)
298-299: Consider more specific exception handling.Catching all exceptions with a bare
except Exceptionmay hide legitimate parsing or validation errors that should be logged or handled differently. Consider being more specific about which exceptions to catch.- except Exception: - continue + except (ValueError, TypeError, AttributeError) as e: + # Log or handle specific parsing errors if needed + continue
387-387: Optional: Remove explicit return None.Python functions implicitly return
Nonewhen execution completes without a return statement. The explicitreturn Noneis unnecessary but harmless.response = await client.post(url, content=email_body, headers=headers) response.raise_for_status() - return None
| hostname_lower = hostname.lower() | ||
| if hostname_lower in ('localhost', '127.0.0.1', '::1'): | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete localhost/loopback address filtering.
The hostname filtering only checks for exact matches of 'localhost', '127.0.0.1', and '::1'. This misses:
- Other 127.x.x.x loopback addresses (e.g., 127.0.0.2, 127.1.1.1)
- Alternative IPv6 loopback representations (e.g.,
0:0:0:0:0:0:0:1) - Private network ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) that could enable SSRF
Consider using a more comprehensive check:
- hostname_lower = hostname.lower()
- if hostname_lower in ('localhost', '127.0.0.1', '::1'):
- continue
+ import ipaddress
+
+ hostname_lower = hostname.lower()
+ if hostname_lower == 'localhost':
+ continue
+
+ # Check if hostname is an IP address
+ try:
+ ip = ipaddress.ip_address(hostname)
+ if ip.is_loopback or ip.is_private or ip.is_link_local:
+ continue
+ except ValueError:
+ # Not an IP address, proceed with domain name
+ pass📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| hostname_lower = hostname.lower() | |
| if hostname_lower in ('localhost', '127.0.0.1', '::1'): | |
| continue | |
| import ipaddress | |
| hostname_lower = hostname.lower() | |
| if hostname_lower == 'localhost': | |
| continue | |
| # Check if hostname is an IP address | |
| try: | |
| ip = ipaddress.ip_address(hostname) | |
| if ip.is_loopback or ip.is_private or ip.is_link_local: | |
| continue | |
| except ValueError: | |
| # Not an IP address, proceed with domain name | |
| pass |
Summary by CodeRabbit