Fixed HTML Validation Issues with Non-Transparent Elements in References #30
mercykolapo wants to merge 5 commits into SamWilsn:master
src/docc/plugins/html/__init__.py
There is the Visitor class that abstracts visiting children. You should use that instead of rolling your own, unless there's a reason not to?
Thank you for the feedback. I'll fix this. I am just getting familiar with the repo.
I have implemented the abstract visitor class. I didn't know it was already in existence.
src/docc/plugins/html/__init__.py
I think it's okay to import this at the top of the file.
src/docc/plugins/html/__init__.py
Definitely. Done!
Thank you for the feedback.
        new_node.append(insertion_point)
    else:
        new_node.append(original_child)
return new_node
In the following example, where do anchors end up? Not saying there's anything wrong with what you have; I just want to check my intuition.
<table>
<a href="foo">
<tr>
<td>hello</td>
<td>world</td>
</tr>
</a>
</table>
Thank you for the feedback. The previous logic created a nested anchor tag, and I have fixed this. The logic now handles cases where an anchor tag is already present. I will push my changes in a moment.
Oh, my bad! I was unclear. I guess I was asking about:
<table>
<Reference ...>
<tr>
<td>hello</td>
<td>world</td>
</tr>
</Reference>
</table>
The function finds the first text node ("hello") and wraps it in an anchor.
It then wraps the entire <tr> in another anchor to preserve structure.
This creates nested anchors: an outer anchor around the row and an inner anchor around the text.
The HTML is valid but creates nested clickable areas.
The result looks like this:
<table>
<a href="...">
<tr>
<td><a href="...">hello</a></td>
<td>world</td>
</tr>
</a>
</table>
I will push a fix for this, because nested anchor tags were not accounted for when they are not present in the original HTML.
I have pushed a fix for this. Thank you again for the feedback!
Hm, we'll have to figure out what to do with this. I guess we'll need some plumbing for pytest and GitHub Actions?
I haven't forgotten about this one. I'm playing around with it locally.
Issue Link: #27 (comment)
Problem
The code was creating invalid HTML by putting table elements (<tr>, <td>, etc.) inside <a> tags, which violates the HTML content model.
Solution
Added smart detection and handling for elements that can't be wrapped in anchor tags.
What Changed
• New function: _contains_non_transparent_elements() - detects problematic HTML elements
• New function: _handle_non_transparent_reference() - handles these cases properly
• Updated: references_reference() - now checks for problems before creating links
• Added: List of HTML elements that can't be inside <a> tags
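As a rough illustration of the detection step, here is a minimal sketch using the standard library's xml.etree.ElementTree rather than docc's own node types. The tag set and the function name contains_non_transparent are assumptions made for this example; they are not the list or the helper added in the PR.

```python
import xml.etree.ElementTree as ET

# Assumption: an illustrative subset of tags that must not sit inside <a>.
# The list actually used in the PR may differ.
NON_TRANSPARENT = frozenset(
    {"table", "thead", "tbody", "tfoot", "tr", "td", "th", "form"}
)


def contains_non_transparent(element: ET.Element) -> bool:
    """Return True if `element` or any descendant has a blocked tag."""
    # Element.iter() yields the element itself, then every descendant.
    return any(node.tag in NON_TRANSPARENT for node in element.iter())
```

A caller would run this check on the rendered children of a reference before deciding whether a plain wrapping anchor is safe.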
How It Works
1. Check: Before creating a link, check if the content has problematic elements
2. Fix: Try to move the link inside the content (e.g., <a><td>text</td></a> becomes <td><a>text</a></td>)
3. Fallback: If that isn't possible, render the content without a link and emit a warning
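The three steps above can be sketched roughly as follows. This uses xml.etree.ElementTree instead of docc's node classes, and link_reference and BLOCKED are hypothetical names chosen for the example, so treat it as an approximation of the idea rather than the PR's implementation. It also ignores anchors already present in the content, which is the nested-anchor case discussed in the review.

```python
import warnings
import xml.etree.ElementTree as ET

# Assumption: illustrative subset of tags that cannot be wrapped in <a>.
BLOCKED = frozenset(
    {"table", "thead", "tbody", "tfoot", "tr", "td", "th", "form"}
)


def link_reference(content: ET.Element, href: str) -> ET.Element:
    """Link `content` to `href` without wrapping blocked tags in <a>."""
    # 1. Check: if nothing is blocked, a plain wrapping anchor is fine.
    if not any(node.tag in BLOCKED for node in content.iter()):
        anchor = ET.Element("a", {"href": href})
        anchor.append(content)
        return anchor

    # 2. Fix: move the link inside, around the first non-blank text.
    # Simplification: this does not check that the chosen node is a
    # valid parent for <a>, nor whether an anchor already exists.
    for node in content.iter():
        if node.text and node.text.strip():
            anchor = ET.Element("a", {"href": href})
            anchor.text = node.text
            node.text = None
            node.insert(0, anchor)
            return content

    # 3. Fallback: no text to link, so render unlinked and warn.
    warnings.warn(f"could not create a link to {href!r}")
    return content
```

For the table row from the review thread, this places the anchor inside the first cell and leaves the row itself unwrapped.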
• HTML is now valid
• No breaking changes
• Clear warnings when links can't be created
• Works with existing code
Files Changed
• src/docc/plugins/html/__init__.py - Added new functions and updated reference handling
Testing
• Added detection for table elements, forms, and other problematic HTML
• Handles edge cases gracefully
• Maintains backward compatibility