Made email attachment URL optional #94

Huijiro · 2025-10-15T20:45:42Z

Summary by CodeRabbit

New Features
- Filters email attachments to only include valid HTTP/HTTPS links with non-localhost hostnames.
Bug Fixes
- Prevents crashes when content-disposition is missing or unparseable by handling it gracefully.
- Drops invalid or malformed attachment URLs to avoid processing errors.
Documentation
- Updates descriptions to clarify content-disposition parsing and attachment handling behavior.

coderabbitai · 2025-10-15T20:45:51Z

Walkthrough

The email attachment extraction now parses URLs from content-disposition safely, returning None instead of raising errors when parsing fails. Attachments are filtered to include only HTTP/HTTPS URLs with non-localhost hostnames. URL parsing and hostname validation are added, and docstrings are updated accordingly.

Changes

Cohort / File(s): Email attachment parsing and filtering (`agentuity/io/email.py`)
Summary of Changes:
- Replace ValueError with None return in `_parse_url_from_content_disposition` on missing/unparseable headers
- Import and use `urlparse` for robust URL handling
- Filter attachments to only HTTP/HTTPS with non-localhost hostnames; drop invalid/missing URLs
- Update docstrings to reflect new parsing and filtering behavior

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant E as EmailParser
    participant CD as _parse_url_from_content_disposition
    participant U as urlparse
    participant F as AttachmentFilter

    E->>CD: Extract URL from Content-Disposition
    CD->>U: Parse URL
    alt Valid URL
        U-->>CD: url (scheme, host, ...)
        CD-->>E: URL
        E->>F: Validate scheme/hostname
        alt http/https and non-localhost
            F-->>E: Keep attachment
        else Invalid host or scheme
            F-->>E: Drop attachment
        end
    else Missing/invalid header
        CD-->>E: None
        E->>F: No URL -> Drop attachment
    end

    note over E: Returns only validated attachments
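
The flow in the diagram can be sketched as a small standalone module. This is a hedged illustration, not the code from `agentuity/io/email.py`: the function names, the assumption that the URL travels as a `url` parameter of the Content-Disposition header, and the sample header values are all hypothetical.

```python
from email.message import Message
from typing import Optional
from urllib.parse import urlparse

# Exact-match list described in the walkthrough (assumed, not copied from the PR).
LOCAL_HOSTNAMES = {"localhost", "127.0.0.1", "::1"}


def parse_url_from_content_disposition(header_value: Optional[str]) -> Optional[str]:
    """Return the `url` parameter of a Content-Disposition value, or None.

    Assumption: the attachment URL is carried as a `url=` parameter.
    """
    if not header_value:
        return None
    msg = Message()
    msg["Content-Disposition"] = header_value
    url = msg.get_param("url", header="Content-Disposition")
    # get_param may return None (missing) or a tuple (RFC 2231 encoded value).
    return url if isinstance(url, str) and url else None


def is_allowed_attachment_url(url: Optional[str]) -> bool:
    """Keep only http/https URLs whose hostname is not a local name."""
    if not url:
        return False
    try:
        parsed = urlparse(url)
        hostname = parsed.hostname
    except ValueError:
        # urlparse raises ValueError for malformed input such as a broken IPv6 host.
        return False
    if parsed.scheme not in ("http", "https") or not hostname:
        return False
    return hostname.lower() not in LOCAL_HOSTNAMES


def filter_attachment_headers(headers):
    """Return the validated URLs for a list of Content-Disposition values."""
    urls = (parse_url_from_content_disposition(h) for h in headers)
    return [u for u in urls if is_allowed_attachment_url(u)]
```

Returning None from the parser and letting one predicate decide what to keep means a missing header and a rejected URL follow the same "drop" path, which matches the two `Drop attachment` branches in the diagram.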

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I sniffed the mail with whiskers keen,
Trimmed the links, kept only clean—
No localhost in my burrow door,
Just http(s) I can explore.
With gentle hops through headers’ maze,
I drop the duds and keep the blaze. 🐇📬

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title “Made email attachment URL optional” clearly states the primary change of allowing attachments without a URL to avoid raising an error, matching the update that _parse_url_from_content_disposition now returns None. It is concise and specific enough for a reviewer to understand the main behavior change without needing implementation details.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Comment `@coderabbitai help` to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

agentuity/io/email.py (1)
90-121: Add None check for _url before use.

The data() method uses self._url at line 105 without checking whether it is None. The attachments property filters out None URLs, but if an IncomingEmailAttachment is instantiated directly and data() is called, the call will fail with a non-descriptive error once httpx is handed None as the URL.

Add a guard at the beginning of the method:
 async def data(self):
     """
     Return a Data object that streams the attachment data asynchronously.
     """
+    if self._url is None:
+        raise ValueError("Attachment URL is not available")
     tracer = trace.get_tracer("email")
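
To show the effect of the guard in isolation, here is a minimal sketch using a stand-in class. `AttachmentSketch` is hypothetical and only mimics the shape of the real attachment class; the real `data()` streams the body over HTTP, which is replaced here by returning the URL.

```python
import asyncio
from typing import Optional


class AttachmentSketch:
    """Hypothetical stand-in for an attachment whose URL may be missing."""

    def __init__(self, filename: str, url: Optional[str]):
        self._filename = filename
        self._url = url

    async def data(self) -> str:
        # Fail early with a descriptive message instead of passing None
        # down to the HTTP client.
        if self._url is None:
            raise ValueError("Attachment URL is not available")
        # The real method would open a stream here; the sketch just
        # returns the URL it would have fetched.
        return self._url
```

With the guard, `asyncio.run(AttachmentSketch("b.pdf", None).data())` raises `ValueError` with a message that names the actual problem.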

🧹 Nitpick comments (2)

agentuity/io/email.py (2)
298-299: Consider more specific exception handling.

Catching all exceptions with a broad except Exception may hide legitimate parsing or validation errors that should be logged or handled differently. Consider being more specific about which exceptions to catch.
-            except Exception:
-                continue
+            except (ValueError, TypeError, AttributeError) as e:
+                # Log or handle specific parsing errors if needed
+                continue
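
A self-contained sketch of the narrower handler with logging might look like the following. The function name, logger name, and the simplified validity check are assumptions for illustration; the loop in `email.py` does more than this.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger("email_sketch")  # hypothetical logger name


def collect_valid_urls(candidates):
    """Return candidates that parse as http/https URLs with a hostname."""
    valid = []
    for candidate in candidates:
        try:
            parsed = urlparse(candidate)
            if parsed.scheme in ("http", "https") and parsed.hostname:
                valid.append(candidate)
        except (ValueError, TypeError, AttributeError) as exc:
            # ValueError: malformed URL (e.g. an unterminated IPv6 host).
            # TypeError / AttributeError: the candidate was not a string.
            logger.debug("Skipping attachment URL %r: %s", candidate, exc)
            continue
    return valid
```

Anything outside the three listed exception types would now propagate instead of being silently swallowed, which is the trade-off the nitpick is asking for.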
387-387: Optional: Remove explicit return None.

Python functions implicitly return None when execution completes without a return statement. The explicit return None is unnecessary but harmless.
                response = await client.post(url, content=email_body, headers=headers)
                response.raise_for_status()
-                return None

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7aafb06 and 50aeb89.

📒 Files selected for processing (1)

agentuity/io/email.py (4 hunks)

coderabbitai · 2025-10-15T20:49:12Z

agentuity/io/email.py

+                hostname_lower = hostname.lower()
+                if hostname_lower in ('localhost', '127.0.0.1', '::1'):
+                    continue


⚠️ Potential issue | 🟠 Major

Incomplete localhost/loopback address filtering.

The hostname filtering only checks for exact matches of 'localhost', '127.0.0.1', and '::1'. This misses:

- Other 127.x.x.x loopback addresses (e.g., 127.0.0.2, 127.1.1.1)
- Alternative IPv6 loopback representations (e.g., 0:0:0:0:0:0:0:1)
- Private network ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) that could enable SSRF

Consider using a more comprehensive check:

-                hostname_lower = hostname.lower()
-                if hostname_lower in ('localhost', '127.0.0.1', '::1'):
-                    continue
+                import ipaddress
+
+                hostname_lower = hostname.lower()
+                if hostname_lower == 'localhost':
+                    continue
+
+                # Check if hostname is an IP address
+                try:
+                    ip = ipaddress.ip_address(hostname)
+                    if ip.is_loopback or ip.is_private or ip.is_link_local:
+                        continue
+                except ValueError:
+                    # Not an IP address, proceed with domain name
+                    pass
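
The suggested check can also be pulled out into a standalone helper, which makes it easy to test against the cases listed above. This is a sketch under assumptions: `is_internal_host` is a hypothetical name, and it only inspects literal hostnames. It does not resolve DNS, so a public domain name that resolves to a private address would still pass.

```python
import ipaddress


def is_internal_host(hostname: str) -> bool:
    """Return True for 'localhost' and for loopback, private, or link-local IPs."""
    host = hostname.lower()
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; treat it as an ordinary domain name.
        return False
    return ip.is_loopback or ip.is_private or ip.is_link_local
```

In the attachment filter this would replace the tuple membership test, for example `if is_internal_host(hostname): continue`. Keeping the import at module level also avoids re-running the `import` statement on every loop iteration.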


Made email URL optional

50aeb89

coderabbitai bot reviewed Oct 15, 2025

Huijiro closed this Oct 22, 2025
