@x9xhack
Contributor

@x9xhack x9xhack commented Jun 3, 2025

Description

This PR resolves three issues related to URL handling:


🛠 1. Preserve trailing slash in path components

When splitting URL paths, trailing slashes were previously dropped, which erased the distinction between endpoints such as /admin and /admin/. These can represent different resources (e.g. route vs. directory), so the trailing slash is now preserved on the last component if it existed in the original path.

Commit:

fix(tree): preserve trailing slash in last path component

Some endpoints like `/admin` and `/admin/` can represent different resources
(e.g. a page vs. a directory). Stripping the trailing slash causes ambiguity
in routing or visualization. This change ensures the trailing slash is kept
on the last component when present in the original path.
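
As an illustration of the behaviour described in this commit message, here is a minimal Python sketch. It is not the project's code: the function name `split_path` and the choice to drop empty segments are assumptions made for the example.

```python
from typing import List


def split_path(path: str) -> List[str]:
    """Split a URL path into components, keeping a trailing slash
    on the last component when the original path ended with one.

    Hypothetical helper for illustration only.
    """
    # A bare "/" has no last component to attach a slash to.
    had_trailing_slash = len(path) > 1 and path.endswith("/")

    # Drop empty segments produced by leading, trailing, or doubled slashes.
    parts = [segment for segment in path.split("/") if segment]

    if parts and had_trailing_slash:
        parts[-1] += "/"
    return parts


print(split_path("/admin"))    # ['admin']
print(split_path("/admin/"))   # ['admin/']
print(split_path("/api/v1/"))  # ['api', 'v1/']
```

With this shape, `/admin` and `/admin/` stay distinguishable after splitting, at the cost of the last component carrying a character that the other components never have.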



🔧 2. Normalize base URL and word joining

Incorrect handling of slashes between a base URL and a word (e.g. /admin + login) could result in malformed URLs like /adminlogin or /admin//login. This change trims the trailing slash from the base and the leading slash from the word before joining them with a single slash.

Commit:

fix(injector): ensure exactly one slash between base URL and word

Trim trailing slash from base URL and leading slash from word to prevent
double slashes or missing slashes in constructed URLs. This ensures
URLs like `/admin` and `login` combine correctly to `/admin/login`.
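
The joining rule in this commit message could look roughly like the Python sketch below. The name `join_url` is hypothetical, and stripping every repeated slash at the boundary (rather than only one) is an assumption of the example, not something the PR states.

```python
def join_url(base: str, word: str) -> str:
    """Join a base URL and a word with exactly one slash between them.

    Hypothetical helper for illustration only.
    """
    # rstrip/lstrip remove all slashes at the boundary, so "/admin//"
    # and "//login" are also reduced to a single separator.
    return base.rstrip("/") + "/" + word.lstrip("/")


for base, word in [("/admin", "login"), ("/admin/", "login"),
                   ("/admin", "/login"), ("/admin/", "/login")]:
    print(base, "+", word, "->", join_url(base, word))
# Every combination prints /admin/login
```

Only the boundary is touched: a trailing slash on the word itself (for example `login/`) is left as-is.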

🚫 3. Avoid duplicate handling of the same URL in recursive mode

In recursive mode, the same URL could be inserted and handled multiple times. This PR adds a check for existing entries before insertion, so that each URL is processed, and its handler called, only once.

Commit:

fix(recursive): avoid duplicate URL handling by checking existing entries before insert
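
A rough Python sketch of the "check before insert" idea follows. The `crawl`, `expand`, and `handle` names, the breadth-first queue, and the `seen` set are all assumptions made for illustration; the project's actual data structures may differ.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(start: str,
          expand: Callable[[str], Iterable[str]],
          handle: Callable[[str], None]) -> None:
    """Visit URLs breadth-first, handling each distinct URL once.

    Hypothetical sketch: `expand` returns the child URLs discovered
    under a URL, `handle` is the per-URL callback.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        handle(url)
        for child in expand(url):
            # Check existing entries before inserting, so a URL reachable
            # through several parents is queued and handled only once.
            if child in seen:
                continue
            seen.add(child)
            queue.append(child)


# "/c" is reachable from both "/a" and "/b" but is handled once.
graph = {"/": ["/a", "/b"], "/a": ["/c"], "/b": ["/c"], "/c": []}
handled = []
crawl("/", lambda u: graph.get(u, []), handled.append)
print(handled)  # ['/', '/a', '/b', '/c']
```

Note that the check compares URLs as exact strings, so `/admin` and `/admin/` count as different entries, which matches the distinction this PR tries to preserve.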



✅ Result

  • /admin + login → /admin/login

  • /admin/ + login → /admin/login

  • /admin + /login → /admin/login

  • /admin/ + /login → /admin/login

  • /admin/ path will now be stored as ["admin/"]

  • /admin path will now be stored as ["admin"]

This preserves semantic meaning, improves correctness when mapping URL paths and constructing tasks, and prevents unnecessary re-processing.

🧪 Test Coverage

Tested manually with paths:

  • /install/
  • /install
  • /api/
  • /api/v1/
  • /api/v1

Confirmed correct insertion, dispatch, and uniqueness under recursion.

@x9xhack x9xhack changed the title from "(WIP) fix(urls): preserve trailing slashes and normalize URL joining" to "fix(urls): preserve trailing slashes and normalize URL joining" Jun 3, 2025
@x9xhack x9xhack changed the title from "fix(urls): preserve trailing slashes and normalize URL joining" to "( WIP ) fix(urls): preserve trailing slashes and normalize URL joining" Jun 3, 2025
@cestef
Owner

cestef commented Jun 4, 2025

The approach we were taking up until now indeed seemed a bit unstable, hmu whenever this PR is ready!

@x9xhack
Contributor Author

x9xhack commented Jun 5, 2025

You're absolutely right — I’ve paused the PR and marked it as WIP to rethink the approach. This PR tackles three different issues, but I assume the one that felt unstable to you is the "Preserve trailing slash" part. I also don’t like how manual and brittle the slash handling logic has become — it’s messy and easy to get wrong.

The underlying motivation is that:

  • /something
  • /something/

…are not the same, yet the previous version would normalize both to just something, losing that distinction.

I’ll take another pass to find a more robust, less fragile way to handle this without all the slash juggling. Let me know if you were referring to something else as unstable — really appreciate the review and feedback!

@cestef
Owner

cestef commented Jun 6, 2025

Yep we're clearly on the same page, this is pretty much what I meant! lmk if you need any help with this 😄

@x9xhack
Contributor Author

x9xhack commented Jun 7, 2025

Hey! Yes, I do have something to discuss, but I’m a bit tied up at the moment. I’ll get back to you at the first chance I get. Thanks for the offer! 😊

@x9xhack
Contributor Author

x9xhack commented Jun 20, 2025

Hi @cestef, it's been a while since I last had a chance to look at this. To keep things fresh and clear, I’ll close this PR for now and revisit the approach with a clean slate.

  • During this time, even though briefly, I explored a few other tools, and it became clear how tricky it is to sanitize or normalize user-provided URLs. Especially in fuzzing mode, the user should have full control—some wordlists start with a /, and forcing any normalization can break intended behavior. Honestly, the same applies to recursion mode as well.

So yeah, in the end, messing with slashes just isn’t a good idea. :)

@x9xhack x9xhack closed this Jun 20, 2025