Conversation
| @@ -114,8 +116,6 @@ GEM | |||
| sawyer (0.9.2) | |||
| addressable (>= 2.3.5) | |||
| faraday (>= 0.17.3, < 3) | |||
| selma (0.4.11) | |||
| rb_sys (~> 0.9) | |||
There was a problem hiding this comment.
This is weird, but a unclear Bundler behaviour.
Same as a2ef188 that I did to post in the Bundler Slack.
URLs at end of sentences (e.g. "see https://example.com.") had the trailing period included, which would cause fetch failures. Now strip common punctuation (.,;:!?) from the end of matched URLs.
LogfmtLogger already handles proper quoting via String#dump, so calling .inspect on error messages causes double-escaping.
Negative lookbehind (?<!...) requires ES2018+ browsers: Chrome 62+, Firefox 78+, Safari 16.4+, Node 8.10+
The comment described standalone usage but the partial is included via haml :'_link_titles' from edit.haml.
Complete the standard log level set (debug, info, warn, error).
There was a problem hiding this comment.
Pull request overview
Adds an opt-in “fetch link titles” feature in the page editor to convert bare URLs into Markdown links using the target page’s <title>, backed by a new server endpoint and HTTPX-based title fetcher.
Changes:
- Add
/link-titlePOST endpoint plusLinkTitleFetcherservice (HTTPX streaming + redirects + timeout). - Add editor UI (“Hämta titlar”) + JS URL extraction/rewrite + CSS styling.
- Add unit/integration/JS tests and wire JS tests into
rake test; introduce structuredLogfmtLoggerand silence logs in integration tests.
Reviewed changes
Copilot reviewed 16 out of 20 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
views/page/new.haml |
Adds link-title UI near the content textarea (and renames textarea id). |
views/page/edit.haml |
Adds link-title UI near the content textarea (and renames textarea id). |
views/page/_link_titles.haml |
New partial containing the UI + fetch/replace JS logic. |
public/javascripts/link_titles.js |
URL extraction helper (exported for Node tests). |
public/stylesheets/default.css |
Styles for the inline “Hämta titlar” control and status text. |
lib/controllers/page_controller.rb |
Adds /link-title endpoint that fetches titles and returns JSON. |
lib/services/link_title_fetcher.rb |
New HTTPX-based title fetcher with streaming and byte limit. |
lib/services/logfmt_logger.rb |
New structured logfmt logger. |
lib/app.rb |
Adds App.log / App.log= and requires the new logger. |
test/unit/link_title_fetcher_test.rb |
Unit tests for title parsing/edge cases using a mock HTTP client. |
test/integration/app_link_titles_test.rb |
Integration tests for /link-title. |
test/javascript/link_titles_test.js |
Node tests for URL extraction. |
test/unit/logfmt_logger_test.rb |
Unit tests for logfmt formatting/escaping behavior. |
test/integration_test_helper.rb |
Silences app logging during integration tests. |
Rakefile |
Adds test:javascript and includes it in the default test task. |
Gemfile / Gemfile.lock |
Adds httpx and updates lockfile platforms/deps. |
vendor/cache/http-2-1.1.1.gem |
Vendors new dependency required by HTTPX. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| -# Partial for "Hämta titlar" link and JavaScript | ||
| %span.link-titles-inline | ||
| %a#link-titles-btn{ onclick: "fetchLinkTitles()" } Hämta titlar |
There was a problem hiding this comment.
The UI uses an <a> without an href plus an onclick handler. Anchors without href are not keyboard-accessible by default and are problematic for assistive tech. Use a <button type="button"> (or add href="#", role="button", keyboard handlers, and event.preventDefault()), and avoid nesting an interactive control inside a <label> if possible.
| %a#link-titles-btn{ onclick: "fetchLinkTitles()" } Hämta titlar | |
| %button#link-titles-btn{ type: "button", onclick: "fetchLinkTitles()" } Hämta titlar |
There was a problem hiding this comment.
I like the flow now, pressing tab goes from title input to content text area. Fetching the titles is something of a niche feature, not always used. I think the current solution is fine.
| desc "Run JavaScript tests (requires Node.js)" | ||
| task :javascript do | ||
| if system('which node > /dev/null 2>&1') | ||
| sh 'node --test test/javascript/*_test.js' |
There was a problem hiding this comment.
The Rake task only checks that node exists, but then runs node --test, which requires newer Node versions. If an older Node is installed, this task will fail. Consider feature-detecting --test support (or checking node --version) and skipping with a clear message when unsupported.
| sh 'node --test test/javascript/*_test.js' | |
| if system('node --test -e "" > /dev/null 2>&1') | |
| sh 'node --test test/javascript/*_test.js' | |
| else | |
| puts "Skipping JavaScript tests (installed Node.js does not support `--test` flag)" | |
| end |
There was a problem hiding this comment.
I think we are totally fine, Gemini says
The
--testflag was first introduced to Node.js as an experimental feature in April 2022 with the release of Node.js v18.0.0.
public/javascripts/link_titles.js
Outdated
| @@ -0,0 +1,30 @@ | |||
| // Extract bare URLs from content (URLs not already in markdown link format) | |||
| // Note: Uses negative lookbehind (?<!...) which requires ES2018+ | |||
| // Supported in Chrome 62+, Firefox 78+, Safari 16.4+, Node 8.10+ | |||
There was a problem hiding this comment.
The comment claims negative lookbehind is supported in “Node 8.10+”, but this file uses ES2018 regex lookbehind (and the test harness uses node:test / node --test, which generally implies Node 18+). Update the compatibility note to reflect the actual minimum supported Node/browser versions, or avoid lookbehind if older runtimes must be supported.
| // Supported in Chrome 62+, Firefox 78+, Safari 16.4+, Node 8.10+ | |
| // Supported in Chrome 62+, Firefox 78+, Safari 16.4+, Node 18+ |
|
|
||
| post "/link-title", url: "https://example.com" | ||
|
|
||
| assert_equal 302, last_response.status |
There was a problem hiding this comment.
This endpoint returns 401 when not logged in (see post '/link-title' in the controller), but the test asserts a 302 redirect. Update the expectation to match the endpoint behavior (or change the endpoint to redirect, but that’s atypical for a JSON/XHR endpoint).
| assert_equal 302, last_response.status | |
| assert_equal 401, last_response.status |
There was a problem hiding this comment.
If this is wrong, why aren't the test failing? 🤔
| def test_link_title_returns_json | ||
| ensure_logged_in_user(@user) | ||
|
|
||
| post "/link-title", url: "https://nonexistent.invalid" | ||
|
|
There was a problem hiding this comment.
This test posts to /link-title with a real hostname (nonexistent.invalid), which causes the controller to make a real outbound HTTP request via HTTPX. That makes the test suite slower and potentially flaky/blocked in CI environments without egress/DNS. Prefer stubbing the fetcher/HTTP client (e.g., inject a mock LinkTitleFetcher/HTTPX in the app for tests) and assert on the stubbed response instead of doing real network I/O.
| url = params[:url].to_s | ||
| halt 400, "Missing URL" if url.empty? | ||
|
|
||
| content_type :json | ||
|
|
There was a problem hiding this comment.
/link-title accepts an arbitrary url and fetches it server-side without validation. This is an SSRF vector (e.g., http://localhost, cloud metadata IPs, private RFC1918 ranges) and redirects can pivot to internal hosts. Add strict validation (http/https only) and block private/loopback/link-local IPs (including after DNS resolution and after redirects) before fetching.
views/page/_link_titles.haml
Outdated
| var content = textarea.value; | ||
| var escapedUrl = url.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); | ||
| // Negative lookbehind requires ES2018+ (Chrome 62+, Firefox 78+, Safari 16.4+) | ||
| var pattern = new RegExp('(?<!\\]\\()' + escapedUrl, 'g'); | ||
| textarea.value = content.replace(pattern, '[' + data.title + '](' + url + ')'); |
There was a problem hiding this comment.
The fetched <title> is inserted into markdown as [${title}](${url}) without escaping. Titles containing ], (, ), backslashes, or newlines can break the markdown and potentially enable unexpected rendering depending on markdown settings. Escape markdown-special characters in data.title before inserting.
1. Fix Node version in browser compatibility comment (Node 18+) 2. Use mock fetcher in integration tests to avoid network I/O 3. Add SSRF protection via UrlValidator - blocks private IPs, loopback, link-local, and validates after DNS resolution 4. Escape markdown-special characters in fetched titles to prevent injection (brackets, backslashes, newlines)
The BaseController before filter already handles this with a redirect.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 22 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @http || HTTPX.plugin(:stream) | ||
| .plugin(:follow_redirects) | ||
| .with(timeout: { operation_timeout: TIMEOUT }, follow_redirects: { max_redirects: MAX_REDIRECTS }) | ||
| end |
There was a problem hiding this comment.
LinkTitleFetcher enables :follow_redirects, but /link-title only validates the initial URL via UrlValidator.safe?. A user can supply a public URL that 3xx-redirects to a private/loopback/link-local host and the fetcher will follow it, reintroducing an SSRF path despite the validator. Consider disabling automatic redirects and manually following them while re-validating each Location (and/or validating the final effective URL) before performing the next request.
| def safe? | ||
| uri = URI.parse(@url) | ||
| return false unless %w[http https].include?(uri.scheme) | ||
| return false if BLOCKED_HOSTS.include?(uri.host&.downcase) | ||
| return false if private_ip?(uri.host) | ||
|
|
||
| # Resolve DNS and check resolved IPs | ||
| ips = resolve_ips(uri.host) | ||
| return false if ips.empty? | ||
| return false if ips.any? { |ip| private_ip?(ip) } |
There was a problem hiding this comment.
UrlValidator.safe? assumes URI.parse(@url).host is present. Some syntactically-valid URIs (e.g. http:///path) have scheme but no host, which will cause Resolv.getaddresses(nil) to raise a TypeError that is not rescued here, turning an invalid input into a 500. Add an explicit return false unless uri.host (and/or rescue TypeError in resolve_ips / safe?).
| var response = await fetch('/link-title', { | ||
| method: 'POST', | ||
| body: formData | ||
| }); | ||
|
|
||
| if (response.ok) { | ||
| var data = await response.json(); | ||
| if (data.error) { |
There was a problem hiding this comment.
The client treats any response.ok as JSON and immediately calls response.json(). If the session has expired, BaseController#prevent_unauthorized_modifications will redirect back, and fetch() will follow the 302 and return a 200 HTML page — response.ok will be true, but response.json() will throw and the user will just see a console error. Consider either (a) sending an Accept: application/json (or X-Requested-With) header and returning a non-redirect 401/403 for XHR/JSON requests, or (b) verifying Content-Type is JSON before parsing and showing a visible status error when it isn't.

Uses https://gitlab.com/os85/httpx (https://github.com/HoneyryderChuck/httpx) and the
:streamplugin (https://gitlab.com/os85/httpx/-/wikis/Stream)Close #560