@Crozzers (Contributor) commented Dec 18, 2025

This PR fixes #668.

The problem was with the following input using the code-friendly extra:

a_b **x***y* c_d

Code friendly piggy-backs off the GFM IAB (italic and bold) processor and disables _ and __ for em and strong. It works by detecting valid em/strong written with the _ or __ syntax and hashing it to protect it from being converted. Anything else it leaves alone.

The GFM IAB processor works by matching delimiter runs (runs of * or _ characters) and incrementing an index to keep track of how much of the input it has processed. For the above input it would follow roughly this process:

  1. Match the first _. It's an opening run, so save it for later.
  2. Match the **. It's an opening run, so save it for later.
  3. Match the ***. It's an opening AND closing run, so process it now.
    • Code friendly looks at it and decides to leave it alone, as it's not relevant to the code friendly extra.
    • index is set to after the *** as this span has been "processed".
  4. Match the *. Process it now and increment index.
  5. Match the final _. It's a closing run, so process it now.
    • Look for a valid opening run. The matching opener, the first _, is before index, which usually means nested em has happened, so re-run the loop.

Since the text hasn't actually changed, this loop runs forever.

To fix this, I've added an extra condition to the loop that runs when nested em is detected. We hash the input text and only continue the loop as long as the text is altered. If the text remains unaltered after a full iteration, we exit.
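
For anyone reading along, here is a minimal sketch of that guard in isolation. It is not the code from this PR: `text_digest` is a stand-in for `_hash_text`, and `run_until_stable`, `process_once` and `needs_rerun` are hypothetical names used only for illustration.

```python
import hashlib


def text_digest(text: str) -> str:
    # Stand-in for the library's _hash_text helper (assumption: any
    # stable digest of the text is good enough for a change check).
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_until_stable(text, process_once, needs_rerun):
    """Re-run process_once while a re-run is requested AND the text changes.

    process_once and needs_rerun are hypothetical callables standing in
    for one pass of the processor and its "nested em detected" signal.
    """
    iterations = 0
    while True:
        before = text_digest(text)
        text = process_once(text)
        iterations += 1
        if not needs_rerun(text):
            break  # nothing asked for another pass
        if text_digest(text) == before:
            break  # a full pass changed nothing, so another pass cannot help
    return text, iterations


# A pass that never changes the text but always asks for a re-run:
# without the digest check this would loop forever.
stuck_text, stuck_iterations = run_until_stable(
    "a_b **x***y* c_d", lambda t: t, lambda t: True
)

# A pass that makes progress for a while and then stops changing the text.
shrunk_text, shrunk_iterations = run_until_stable(
    "aaaa", lambda t: t.replace("aa", "a", 1), lambda t: True
)
```

In the first call the loop exits after a single pass with the text untouched. In the second it keeps going only while each pass actually alters the text.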

I've used _hash_text here, although maybe in the future a faster hash could be used if performance becomes an issue. #619 springs to mind, or a quick Google search suggested the FNV hash. Something for a future PR.
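
For reference, a rough sketch of 32-bit FNV-1a in pure Python. The function name `fnv1a_32` is made up for this example and nothing like it exists in the library today. Whether it would actually beat the current helper would need measuring, since a pure-Python byte loop is not guaranteed to be faster than a C-backed digest.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(text: str) -> int:
    """Return the 32-bit FNV-1a hash of text, encoded as UTF-8."""
    value = FNV_OFFSET_BASIS_32
    for byte in text.encode("utf-8"):
        value ^= byte                                # FNV-1a: XOR first...
        value = (value * FNV_PRIME_32) & 0xFFFFFFFF  # ...then multiply, keep 32 bits
    return value


# Change check in the style described above: compare digests before and after.
before = fnv1a_32("a_b **x***y* c_d")
after = fnv1a_32("a_b **x***y* c_d")
text_unchanged = before == after
```

One caveat with a 32-bit hash is that collisions are more likely than with a cryptographic digest, so two different texts could in rare cases look "unchanged" and end the loop early.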

@Crozzers (Contributor, Author)

Funnily enough I actually implemented this same hash check in a WIP version of #665. I added it for safety after I added the while nesting loop but I think I removed it because I couldn't think of a scenario where it would be needed. 🤦

@nicholasserra nicholasserra merged commit 2cd30fb into trentm:master Dec 21, 2025
15 checks passed
@nicholasserra (Collaborator)

Thanks!

Development

Successfully merging this pull request may close these issues.

Regression: Hang with code-friendly