Feat/add clerk auth and premium ux #31

mrmps · 2025-12-07T10:20:26Z

Summary by CodeRabbit

New Features
- Added automatic right-to-left (RTL) language support with intelligent text direction detection for articles and AI responses.
Bug Fixes
- Improved URL validation and normalization to correctly handle malformed protocol formats.


vercel · 2025-12-07T10:20:33Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
smry	Error			Dec 7, 2025 10:20am

coderabbitai · 2025-12-07T10:20:35Z

Walkthrough

Pull request introduces RTL (Right-to-Left) language support across the application by adding language detection utilities, propagating lang and dir metadata through API routes and components, updating stylesheets for RTL layout, and migrating the build system from Node/pnpm to Bun. Additionally, URL validation is enhanced to handle malformed protocols, and a marketing feature is removed.

Changes

Cohort / File(s)	Summary
Build System Migration to Bun `Dockerfile`, `package.json`, `tsconfig.json`	Replaces Node.js/pnpm with Bun in Dockerfile (base image, install, build command). Adds bun-types to dependencies and updates TypeScript compiler options to include Bun type definitions.
RTL Language Support Utilities `lib/rtl.ts`, `lib/rtl.test.ts`	Introduces RTL detection module with functions to identify RTL languages, analyze text direction via Unicode ranges, and generate direction attributes. Includes comprehensive test coverage for RTL/LTR detection across multiple languages and mixed-content scenarios.
RTL-Aware Styling `app/globals.css`	Adds base-layer CSS rules for RTL support, including direction-based text alignment, prose styling adjustments, blockquote/list/code block adaptations, and forced LTR for code within RTL contexts.
Article API Language & Direction Metadata `app/api/article/route.ts`, `app/api/jina/route.ts`	Propagates lang and dir fields through article fetch/cache/render paths. Computes htmlLang from HTML/parsed data, derives textDir via getTextDirection utility, and includes language/direction in API responses and cached article objects.
Component RTL/Language Props `components/ai/response.tsx`, `components/article/content.tsx`, `components/features/summary-form.tsx`	Adds dir and lang props to Response component; propagates language/direction attributes to rendered article header and content elements; passes RTL metadata to AI response rendering.
Type Schema Updates `types/api.ts`	Constrains dir field in ArticleSchema to enum values ('rtl' \| 'ltr') with 'ltr' default, replacing unrestricted string type.
URL Validation Enhancement `lib/validation/url.ts`, `lib/validation/url.test.ts`	Introduces cleanProtocol helper to sanitize malformed/duplicate protocols; updates normalizeUrl to collapse and fix protocol issues before validation; adds comprehensive test suite covering edge cases and real-world malformed URLs.
Marketing UI Cleanup `components/marketing/ad-spot.tsx`	Removes Zap icon import and corresponding "Fair rotation" FeatureCard from AdDrawerContent.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

Areas requiring attention:
- RTL detection algorithm correctness in lib/rtl.ts, particularly Unicode range classification and character-counting logic in detectTextDirection
- Schema/type consistency across all touched files (verify lang/dir fields propagate correctly through article, cached, and API response types)
- Component prop propagation chain for RTL attributes, to ensure fallbacks are correct and no attribute is dropped
- URL protocol cleaning loop in lib/validation/url.ts for edge cases (e.g., deeply nested or malformed protocols)
- Dockerfile build correctness and feature parity with pnpm setup (cache mounting, lockfile handling)

Possibly related PRs

feat: enhance summary robustness and refresh UI #27: Modifies the same article API schemas and caching logic in app/api/article/route.ts and app/api/jina/route.ts; adds complementary htmlContent propagation alongside the main PR's lang/dir handling.
Normalize URL handling and introduce marketing ad spots #29: Modifies components/marketing/ad-spot.tsx concurrently; involves related AdSpot component changes.
Integrate Clerk auth and premium UX across app #30: Also affects components/marketing/ad-spot.tsx and implements Clerk/premium gating refactoring alongside the main PR's feature removal.

Poem

🐰 Hop, hop, with texts that twist and turn,
RTL rules and directions we learn!
Bun builds swift, from left to right,
Metadata flows—each article bright! 🌍✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The PR title 'Feat/add clerk auth and premium ux' does not match the actual changes, which primarily implement RTL/LTR language support, Bun migration, and URL validation improvements with no Clerk authentication or premium UX changes present.	Update the title to accurately reflect the main changes, such as 'Add RTL/LTR language support and Bun migration' or similar.
Docstring Coverage	⚠️ Warning	Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


greptile-apps · 2025-12-07T10:22:53Z

Greptile Overview

Greptile Summary

This PR adds comprehensive RTL (Right-to-Left) language support and migrates the project from Node.js/pnpm to Bun runtime.

Key Changes

RTL Language Support: Implemented automatic detection and proper rendering of RTL languages (Arabic, Hebrew, Persian, etc.) by extracting language metadata from HTML, analyzing Unicode character ranges, and applying appropriate CSS styling
URL Normalization Enhancement: Fixed handling of malformed URL protocols including duplicate protocols (https://https://) and single-slash malformations (https:/example.com)
Bun Migration: Migrated build system from Node.js/pnpm to Bun for improved performance, including Docker configuration and test framework updates
Type Safety: Added proper TypeScript types for dir and lang fields throughout the API schema and component props

Implementation Quality

The RTL implementation is well-architected with proper separation of concerns. The lib/rtl.ts utility provides language detection via ISO 639-1 codes and Unicode range analysis, with a smart fallback that analyzes text content when language metadata is unavailable. Test coverage is comprehensive with real-world Arabic, Hebrew, and Persian text samples.
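The detection flow described above (language-code lookup first, Unicode range analysis as the fallback) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `lib/rtl.ts`: the language list, the Unicode ranges, the 10,000-character sample, and the "more RTL letters than LTR letters" rule are all assumptions made for the example.

```typescript
type TextDirection = "rtl" | "ltr";

// Assumed subset of RTL language codes; the real set in lib/rtl.ts may differ.
const RTL_LANGUAGES = new Set<string>(["ar", "he", "fa", "ur", "yi", "ps", "sd", "ug", "dv"]);

// Assumed BMP ranges for RTL scripts.
const RTL_RANGES: ReadonlyArray<[number, number]> = [
  [0x0590, 0x05ff], // Hebrew
  [0x0600, 0x06ff], // Arabic
  [0x0700, 0x074f], // Syriac
  [0x0750, 0x077f], // Arabic Supplement
  [0x0780, 0x07bf], // Thaana
  [0xfb1d, 0xfdff], // Hebrew and Arabic presentation forms
  [0xfe70, 0xfefc], // Arabic presentation forms B
];

// Accepts "ar", "ar-SA" and "ar_SA" by comparing only the primary subtag.
function isRTLLanguage(lang: string | null | undefined): boolean {
  if (!lang) return false;
  const primary = lang.toLowerCase().split(/[-_]/)[0] ?? "";
  return RTL_LANGUAGES.has(primary);
}

function isRTLChar(code: number): boolean {
  return RTL_RANGES.some(([lo, hi]) => code >= lo && code <= hi);
}

// Counts RTL letters against cased (Latin, Greek, Cyrillic, ...) letters in a sample.
function detectTextDirection(text: string | null | undefined): TextDirection {
  if (!text) return "ltr";
  const sample = text.slice(0, 10000);
  let rtl = 0;
  let ltr = 0;
  for (let i = 0; i < sample.length; i++) {
    const ch = sample.charAt(i);
    if (isRTLChar(sample.charCodeAt(i))) rtl++;
    else if (ch.toLowerCase() !== ch.toUpperCase()) ltr++;
  }
  return rtl > ltr ? "rtl" : "ltr";
}

// An explicit RTL language code wins; otherwise the content decides.
function getTextDirection(lang: string | null | undefined, text: string | null | undefined): TextDirection {
  return isRTLLanguage(lang) ? "rtl" : detectTextDirection(text);
}
```

Treating "cased" characters as LTR letters is a shortcut chosen to keep the sketch dependency-free; it ignores uncased LTR scripts such as CJK, which still fall through to the `ltr` default.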

The URL normalization fix addresses a production issue with iterative protocol cleaning logic that handles edge cases like multiple duplicate protocols.
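As an illustration of what "iterative protocol cleaning" could look like, here is a small sketch. The function name `cleanProtocol` comes from the PR, but the body below is an assumption written for this example: it strips leading schemes for as long as another scheme follows, then repairs the slash count and casing of the one that remains.

```typescript
// Hypothetical sketch; the PR's real cleanProtocol may use different rules.
function cleanProtocol(input: string): string {
  let url = input.trim();
  // Matches a leading scheme only when another scheme follows immediately.
  const duplicate = /^https?:\/+(?=https?:\/)/i;
  while (duplicate.test(url)) {
    url = url.replace(duplicate, "");
  }
  // Normalize "https:/x" or "HTTPS:///x" to "https://x".
  return url.replace(/^(https?):\/+/i, (_match, scheme: string) => `${scheme.toLowerCase()}://`);
}
```

In this sketch the innermost scheme wins when duplicates disagree (`http://https://...` becomes `https://...`); whether the PR makes the same choice is not stated in the review.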

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The changes are well-tested, follow established patterns, and add new functionality without breaking existing features. The RTL detection logic is robust with proper fallbacks, URL normalization fixes real bugs, and the Bun migration is a straightforward runtime swap with no logic changes. All new code has corresponding tests.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
Dockerfile	5/5	Migrated from Node.js/pnpm to Bun runtime with updated build commands
lib/rtl.ts	5/5	New RTL language detection utilities with Unicode range analysis and language code lookup
lib/validation/url.ts	5/5	Added cleanProtocol function to fix duplicate and malformed URL protocols
app/api/article/route.ts	4/5	Integrated RTL detection, extracts language from HTML attributes, adds dir/lang to cached articles
app/api/jina/route.ts	4/5	Added RTL text direction detection for Jina-sourced articles with fallback handling
components/article/content.tsx	5/5	Applied dir and lang attributes to article header and content containers
types/api.ts	5/5	Updated ArticleSchema to include dir field with 'rtl' | 'ltr' enum

Sequence Diagram

sequenceDiagram
    participant User
    participant Frontend
    participant ArticleAPI as /api/article
    participant JinaAPI as /api/jina
    participant RTLLib as lib/rtl
    participant URLValidation as lib/validation/url
    participant Redis
    participant Diffbot
    participant JinaReader as Jina Reader

    User->>Frontend: Enter URL
    Frontend->>URLValidation: normalizeUrl(url)
    URLValidation->>URLValidation: cleanProtocol()
    URLValidation-->>Frontend: Normalized URL

    alt Article Fetch (smry-fast/slow/wayback)
        Frontend->>ArticleAPI: GET /api/article?url=...&source=...
        ArticleAPI->>Redis: Check cache
        
        alt Cache Hit
            Redis-->>ArticleAPI: Cached article
        else Cache Miss
            ArticleAPI->>Diffbot: Fetch article
            Diffbot-->>ArticleAPI: HTML content
            ArticleAPI->>ArticleAPI: Parse with Readability
            ArticleAPI->>ArticleAPI: Extract lang attribute
        end
        
        ArticleAPI->>RTLLib: getTextDirection(lang, textContent)
        RTLLib->>RTLLib: isRTLLanguage(lang)
        alt Language is RTL
            RTLLib-->>ArticleAPI: 'rtl'
        else Analyze content
            RTLLib->>RTLLib: detectTextDirection(textContent)
            RTLLib-->>ArticleAPI: 'rtl' or 'ltr'
        end
        
        ArticleAPI->>Redis: Cache article with dir/lang
        ArticleAPI-->>Frontend: Article with dir/lang
    else Jina Fetch
        Frontend->>JinaAPI: GET /api/jina?url=...
        JinaAPI->>Redis: Check cache
        
        alt Cache Hit
            Redis-->>JinaAPI: Cached article
        else Cache Miss
            JinaAPI->>JinaReader: Fetch from Jina
            JinaReader-->>JinaAPI: Article content
        end
        
        JinaAPI->>RTLLib: getTextDirection(null, textContent)
        RTLLib->>RTLLib: detectTextDirection(textContent)
        RTLLib-->>JinaAPI: 'rtl' or 'ltr'
        
        JinaAPI->>Redis: Cache article with dir
        JinaAPI-->>Frontend: Article with dir
    end

    Frontend->>Frontend: Render with dir/lang attributes
    Frontend->>User: Display article (RTL or LTR)

greptile-apps

15 files reviewed, no comments


coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (7)

tsconfig.json (1)

2-4: Global Bun typings are fine, but consider scoping to tests if needed

Adding "types": ["bun-types"] ensures Bun globals and bun:test are typed across the project. If you ever run into conflicts with Node/Next types, consider moving Bun typings into a dedicated test/build tsconfig instead of the main one.

Dockerfile (1)

1-11: Bun-based build looks good; consider pinning image version and/or multi-stage build

The Bun migration (cached bun install + bun run build + bun run start) is clear and should work fine. For better reproducibility and smaller images, consider:

Pinning oven/bun to a specific version instead of latest.

Optionally using a multi-stage build (build in one stage, run in a slimmer runtime stage) if image size becomes a concern.

app/globals.css (1)

179-221: RTL base styles are well-scoped and align with expected behavior

The new [dir="rtl"] rules (text alignment, blockquotes, lists, and forcing code/pre back to LTR) are appropriately scoped in the base layer and should play nicely with Tailwind utilities. This is a solid foundation for RTL rendering.

lib/rtl.test.ts (1)

1-108: Strong RTL test coverage; minor describe/test mismatch for Persian case

The test suite thoroughly exercises isRTLLanguage, detectTextDirection, and getTextDirection for RTL/LTR, mixed content, and null/empty inputs—nice coverage.

One small nit: inside describe('getTextDirection'), the "handles Persian text" test calls detectTextDirection directly. For clarity, consider either:

Moving that test into the detectTextDirection describe block, or

Changing it to assert getTextDirection('fa', persianText) so it matches the surrounding describe name.

Functionally everything is correct; this is just a structure/readability tweak.

lib/validation/url.test.ts (1)

4-195: URL tests are thorough; consider adding coverage for uppercase HTTP(S) schemes

This suite does a great job exercising normalization (including malformed and duplicate protocols), validation, and the Zod schema across many realistic cases.

To backstop the case-insensitivity fix suggested in lib/validation/url.ts, it would be useful to add a couple of expectations like:

it("treats uppercase protocols as valid", () => {
  expect(normalizeUrl("HTTPS://example.com")).toBe("https://example.com");
  expect(normalizeUrl("HTTP://example.com")).toBe("http://example.com");
  expect(isValidUrl("HTTPS://example.com")).toBe(true);
});

That will ensure regressions around scheme casing are caught by tests.

Also applies to: 197-246

lib/rtl.ts (1)

72-74: Sampling strategy note.

Sampling only the first 10,000 characters may not be representative for articles where RTL content appears predominantly in the body rather than the lead. Consider whether this is acceptable for your use case, or if random/distributed sampling would be more robust.
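One way the distributed sampling mentioned above might look is sketched below. The helper name `sampleDistributed`, the window count, and the budget are all hypothetical; nothing like this exists in the PR as far as the review shows.

```typescript
// Hypothetical helper: take `windows` evenly spaced, non-overlapping slices
// whose combined length is at most `budget` characters.
function sampleDistributed(text: string, budget = 10000, windows = 5): string {
  if (text.length <= budget || windows < 2) return text.slice(0, budget);
  const size = Math.floor(budget / windows);
  // Spread window starts from the beginning to the end of the text.
  const stride = (text.length - size) / (windows - 1);
  const parts: string[] = [];
  for (let i = 0; i < windows; i++) {
    const start = Math.floor(i * stride);
    parts.push(text.slice(start, start + size));
  }
  return parts.join("");
}
```

The result could then be passed to the direction detector in place of the leading 10,000 characters, at the same cost per article.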

app/api/jina/route.ts (1)

173-183: Redundant dir assignment and inconsistent lang handling.

On lines 177-180:

...articleWithDir already includes dir: articleDir, then line 179 sets dir: articleDir again (redundant).

Line 180 hardcodes lang: "" which overwrites the lang from articleWithDir.

Consider cleaning up:
          article: {
            ...articleWithDir,
            byline: article.byline || "",
-           dir: articleDir,
-           lang: "",
+           lang: articleWithDir.lang || "",
            publishedTime: article.publishedTime || null,
            htmlContent: article.content,
          },

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d68c712 and d6ade63.

⛔ Files ignored due to path filters (2)

bun.lock is excluded by !**/*.lock
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (15)

Dockerfile (1 hunks)
app/api/article/route.ts (11 hunks)
app/api/jina/route.ts (6 hunks)
app/globals.css (1 hunks)
components/ai/response.tsx (4 hunks)
components/article/content.tsx (2 hunks)
components/features/summary-form.tsx (1 hunks)
components/marketing/ad-spot.tsx (1 hunks)
lib/rtl.test.ts (1 hunks)
lib/rtl.ts (1 hunks)
lib/validation/url.test.ts (1 hunks)
lib/validation/url.ts (3 hunks)
package.json (1 hunks)
tsconfig.json (1 hunks)
types/api.ts (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

lib/rtl.test.ts (1)

lib/rtl.ts (3)

isRTLLanguage (45-51)

detectTextDirection (66-104)

getTextDirection (110-121)

components/features/summary-form.tsx (1)

components/ai/response.tsx (1)

Response (377-420)

lib/validation/url.test.ts (1)

lib/validation/url.ts (3)

normalizeUrl (49-69)

isValidUrl (74-81)

NormalizedUrlSchema (87-104)

app/api/jina/route.ts (2)

lib/rtl.ts (1)

getTextDirection (110-121)

types/api.ts (1)

ArticleResponseSchema (49-56)

app/api/article/route.ts (1)

lib/rtl.ts (1)

getTextDirection (110-121)

🪛 ast-grep (0.40.0)

components/article/content.tsx

[warning] 300-300: Usage of dangerouslySetInnerHTML detected. This bypasses React's built-in XSS protection. Always sanitize HTML content using libraries like DOMPurify before injecting it into the DOM to prevent XSS attacks.
Context: dangerouslySetInnerHTML
Note: [CWE-79] Improper Neutralization of Input During Web Page Generation [REFERENCES]
- https://reactjs.org/docs/dom-elements.html#dangerouslysetinnerhtml
- https://cwe.mitre.org/data/definitions/79.html

(react-unsafe-html-injection)

🔇 Additional comments (24)

package.json (1)

88-107: bun-types devDependency wiring looks appropriate

Adding bun-types as a devDependency aligns with using Bun tooling and bun:test while keeping runtime dependencies clean. No issues from a build/tooling perspective.

components/marketing/ad-spot.tsx (1)

3-6: Lucide icon imports are now minimal and consistent with usage

The lucide-react import list matches actual icon usage (no unused icons like Zap), keeping the bundle clean.

lib/rtl.ts (5)

6-22: RTL languages set looks comprehensive.

The set covers major RTL languages. Note that Kurdish (ku) and Hausa (ha) are each written in more than one script: Arabic (RTL) and Latin (LTR). The current approach of treating them as RTL by default is reasonable for content analysis but may produce false positives for Latin-script Kurdish/Hausa content.

45-51: LGTM!

The locale normalization correctly handles both hyphen and underscore separators (e.g., ar-SA, ar_SA), and the null-check is appropriate.

56-60: LGTM!

The implementation is correct and efficient for the small number of RTL ranges.

110-121: LGTM!

The function correctly prioritizes explicit RTL language codes while falling back to content analysis. The behavior of still analyzing content when a non-RTL language code is provided (e.g., en) is reasonable as it allows detection of mixed-content scenarios.

126-136: LGTM!

Clean implementation with proper conditional property inclusion.

components/features/summary-form.tsx (1)

208-211: LGTM!

The RTL props are correctly propagated to the Response component, with appropriate fallback values that align with the article schema defaults.

types/api.ts (1)

29-29: Verify intended behavior with .nullable().optional().default('ltr').

With this chain, .default('ltr') only applies when the field is undefined or missing. If dir: null is explicitly passed, it will remain null (not default to 'ltr').

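If null should also resolve to 'ltr', note that .nullish() is shorthand for .nullable().optional(), so .nullish().default('ltr') would behave the same way and still pass null through; map null to 'ltr' after parsing instead (for example with a .transform()). Otherwise, components consuming this schema should handle null values explicitly (as they currently do with dir || 'ltr' fallbacks).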
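A library-independent way to apply the fallback after parsing is sketched below. `resolveDir` is a hypothetical helper, not something in the PR; it assumes that anything other than the exact string 'rtl' should render as 'ltr'.

```typescript
type TextDirection = "rtl" | "ltr";

// Hypothetical helper: null, undefined, and unexpected values all become "ltr".
function resolveDir(dir: unknown): TextDirection {
  return dir === "rtl" ? "rtl" : "ltr";
}
```

Compared with `dir || 'ltr'`, this also guards against stale cache entries that hold an arbitrary string from before the enum constraint was added.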

components/article/content.tsx (1)

70-74: LGTM!

The header container correctly propagates article direction and language attributes for proper RTL rendering. The fallbacks align with the schema defaults.

app/api/jina/route.ts (4)

22-24: LGTM!

The CachedArticleSchema correctly adds optional lang and dir fields with proper types matching the ArticleSchema constraints.

71-72: LGTM!

The GET path correctly falls back to computing direction when not cached, using both the language code and text content for accurate detection.

197-198: LGTM!

The existing-cache path correctly computes direction as a fallback, consistent with the GET handler.

210-220: LGTM!

The error fallback path correctly computes and includes direction, ensuring RTL support even when caching fails.

components/ai/response.tsx (2)

203-206: LGTM!

The type definitions for RTL support are well-documented and correctly typed. The optional nature is appropriate since language metadata may not always be available.

386-402: LGTM!

The dir and lang props are correctly propagated to the root container div, enabling proper RTL rendering and language accessibility for the markdown content.

app/api/article/route.ts (8)

12-12: LGTM!

Import of getTextDirection aligns with the RTL support pattern used across the codebase.

26-27: LGTM!

Schema addition for lang is appropriately optional and nullable for backward compatibility with existing cached data.

40-41: LGTM!

The dir field is correctly constrained to the enum ['rtl', 'ltr'] and both fields are optional/nullable, ensuring backward compatibility with existing cache entries.

223-250: LGTM!

The language extraction logic properly checks multiple sources (HTML lang, xml:lang, Readability's extracted lang) with appropriate fallbacks. The text direction detection correctly uses both the language code and content analysis via getTextDirection.

332-349: LGTM!

Direction detection and field population follow the same consistent pattern as the smry-fast path. The enhanced logging with lang and dir improves observability.

455-456: LGTM!

The fallback to getTextDirection for missing dir values ensures backward compatibility with cached articles that predate this change.

539-540: LGTM!

Consistent dir/lang population pattern maintained across the fallback response path.

562-563: LGTM!

Consistent dir/lang population in the main success response path.

coderabbitai · 2025-12-07T10:27:32Z

app/api/article/route.ts

+          dir: validatedArticle.dir || getTextDirection(validatedArticle.lang, validatedArticle.textContent),
+          lang: validatedArticle.lang || "",
          content: validatedArticle.content,
          textContent: validatedArticle.textContent,
          length: validatedArticle.length,
          siteName: validatedArticle.siteName,
          publishedTime: validatedArticle.publishedTime || null,
+          image: validatedArticle.image || null,


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for ArticleResponseSchema definition
rg -n "ArticleResponseSchema" --type=ts -A 40

Repository: mrmps/SMRY

Length of output: 29677

🏁 Script executed:

#!/bin/bash
# Find ArticleSchema definition
rg -n "ArticleSchema\s*=" --type=ts -B 2 -A 20 | head -100

Repository: mrmps/SMRY

Length of output: 5544

Response object inconsistency: image field presence varies across paths.

The response at line 617 includes image: validatedArticle.image || null, but similar response constructions at lines 533-549 and 556-572 omit this field. While ArticleSchema defines image as optional, this inconsistency means clients will receive different response structures depending on which code path executes (cache hit, post-fetch validation, or error handler).

Add the image field to the response blocks at lines 533-549 and 556-572 for consistency.

🤖 Prompt for AI Agents

In app/api/article/route.ts around lines 533-549, 556-572 and 610-617, the responses are inconsistent: the block at 610-617 includes image: validatedArticle.image || null but the earlier response blocks (533-549 and 556-572) omit image; add image: validatedArticle.image || null to the response objects in the 533-549 and 556-572 blocks so all code paths return the same shape (use the same null fallback as the existing block) and keep field ordering consistent with the other response properties.
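One way to keep every code path returning the same shape is to build the response in a single helper. The sketch below is an assumption-laden illustration: `toArticleResponse` and the `ValidatedArticle` interface are hypothetical names, the field list is limited to what the diff above shows, and the direction detector is passed in rather than imported.

```typescript
type TextDirection = "rtl" | "ltr";

// Hypothetical shape covering only the fields visible in the diff above.
interface ValidatedArticle {
  content: string;
  textContent: string;
  length: number;
  siteName: string;
  dir?: TextDirection | null;
  lang?: string | null;
  publishedTime?: string | null;
  image?: string | null;
}

// Single place where fallbacks are applied, so cache-hit, post-fetch,
// and error paths cannot drift apart.
function toArticleResponse(
  article: ValidatedArticle,
  detectDir: (lang: string | null, text: string) => TextDirection,
) {
  return {
    dir: article.dir || detectDir(article.lang ?? null, article.textContent),
    lang: article.lang || "",
    content: article.content,
    textContent: article.textContent,
    length: article.length,
    siteName: article.siteName,
    publishedTime: article.publishedTime || null,
    image: article.image || null,
  };
}
```

Each response block would then call the helper instead of spelling out the object literal, which makes an omitted field such as `image` impossible by construction.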

coderabbitai · 2025-12-07T10:27:32Z

app/api/jina/route.ts

+      // Detect text direction for the incoming article
+      const articleDir = getTextDirection(null, article.textContent);
+      const articleWithDir = { ...article, dir: articleDir, lang: article.lang || null };


⚠️ Potential issue | 🟡 Minor

article.lang doesn't exist on the input schema.

The JinaCacheUpdateSchema (defined in types/api.ts, lines 92-104) doesn't include a lang field. The expression article.lang || null on line 166 will always evaluate to null because article.lang is undefined.

Either add lang to JinaCacheUpdateSchema if callers can provide it, or simplify:

- const articleWithDir = { ...article, dir: articleDir, lang: article.lang || null };
+ const articleWithDir = { ...article, dir: articleDir, lang: null };

#!/bin/bash
# Verify JinaCacheUpdateSchema definition
echo "=== JinaCacheUpdateSchema definition ==="
rg -n -A15 'JinaCacheUpdateSchema' types/api.ts

🤖 Prompt for AI Agents

In app/api/jina/route.ts around lines 164 to 166, the code reads article.lang || null but JinaCacheUpdateSchema does not define a lang field so article.lang will always be undefined; either (A) add an optional lang?: string | null to JinaCacheUpdateSchema in types/api.ts (update the type at lines ~92-104 and ensure any callers populate it where appropriate), or (B) simplify the route code to stop referencing article.lang and set lang: null (or omit lang entirely) when constructing articleWithDir; pick A if callers can/should supply language, otherwise pick B and remove the redundant check.

coderabbitai · 2025-12-07T10:27:32Z

components/ai/response.tsx

    );
  },
-  (prevProps, nextProps) => prevProps.children === nextProps.children
+  (prevProps, nextProps) => prevProps.children === nextProps.children && prevProps.dir === nextProps.dir


⚠️ Potential issue | 🟡 Minor

Memoization predicate missing lang comparison.

The lang prop is passed to the div but not included in the memoization check. If lang changes while children and dir remain the same, the component won't re-render, leaving a stale lang attribute on the DOM element.

- (prevProps, nextProps) => prevProps.children === nextProps.children && prevProps.dir === nextProps.dir
+ (prevProps, nextProps) => prevProps.children === nextProps.children && prevProps.dir === nextProps.dir && prevProps.lang === nextProps.lang

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

(prevProps, nextProps) => prevProps.children === nextProps.children && prevProps.dir === nextProps.dir

(prevProps, nextProps) => prevProps.children === nextProps.children && prevProps.dir === nextProps.dir && prevProps.lang === nextProps.lang

🤖 Prompt for AI Agents

In components/ai/response.tsx around line 419, the memoization predicate only compares children and dir but omits lang, so when lang changes the component can stay memoized and the DOM's lang attribute becomes stale; update the predicate to also compare prevProps.lang === nextProps.lang (i.e., return prevProps.children === nextProps.children && prevProps.dir === nextProps.dir && prevProps.lang === nextProps.lang) so the component re-renders when lang changes.
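The effect of the missing comparison can be shown without React by writing the predicate as a standalone function. The names `ResponseProps` and `areResponsePropsEqual` are made up for this sketch; in the component the function would be the second argument to `memo`, where returning `true` means the re-render is skipped.

```typescript
// Hypothetical props shape, limited to what the predicate inspects.
interface ResponseProps {
  children?: unknown;
  dir?: "rtl" | "ltr";
  lang?: string;
}

// Returns true when the memoized component may skip re-rendering.
function areResponsePropsEqual(prev: ResponseProps, next: ResponseProps): boolean {
  return (
    prev.children === next.children &&
    prev.dir === next.dir &&
    prev.lang === next.lang
  );
}
```

With the original two-field predicate, a change from `lang: "ar"` to `lang: "fa"` with identical children and `dir` would be reported as equal, which is the stale-attribute case described above.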

coderabbitai · 2025-12-07T10:27:33Z

components/article/content.tsx

                  <div
                    className="mt-6 wrap-break-word prose dark:prose-invert max-w-none"
+                    dir={data.article.dir || 'ltr'}
+                    lang={data.article.lang || undefined}
                    dangerouslySetInnerHTML={{ __html: data.article.content }}
                  />


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for sanitization utilities and DOMPurify usage
echo "=== Checking for DOMPurify and sanitization imports ==="
rg -n 'DOMPurify|sanitize|purify' --type=ts --type=tsx
echo ""
echo "=== Checking article-related files ==="
fd -e ts -e tsx 'article' | head -20
echo ""
echo "=== Checking for readability processing ==="
rg -n 'readability|Readability' --type=ts --type=tsx

Repository: mrmps/SMRY

Length of output: 293

🏁 Script executed:

#!/bin/bash
# Search without file type filter
echo "=== Checking for sanitization ==="
rg -n 'DOMPurify|sanitize|purify|xss|escapeHtml|strip' --glob='*.ts' --glob='*.tsx'
echo ""
echo "=== Reading use-articles.ts ==="
cat -n lib/hooks/use-articles.ts
echo ""
echo "=== Finding article API routes ==="
find . -type f \( -name '*.ts' -o -name '*.tsx' \) -path '*/routes/*' -o -path '*/api/*' | grep -i article

Repository: mrmps/SMRY

Length of output: 5161

🏁 Script executed:

#!/bin/bash
# Read jina.ts to see escapeHtml usage
echo "=== Reading lib/api/jina.ts ==="
cat -n lib/api/jina.ts
echo ""
echo "=== Reading app/api/article/route.ts ==="
cat -n app/api/article/route.ts

Repository: mrmps/SMRY

Length of output: 32659

🏁 Script executed:

#!/bin/bash
# Check for Content Security Policy or other security headers
echo "=== Checking for CSP or security headers in Next.js config ==="
find . -type f \( -name 'next.config.*' -o -name 'middleware.ts' \) | head -10 | xargs cat 2>/dev/null
echo ""
echo "=== Checking package.json for marked and readability versions ==="
grep -A 2 '"marked"\|"readability"\|"dompurify"' package.json

Repository: mrmps/SMRY

Length of output: 699

Add HTML sanitization before rendering data.article.content with dangerouslySetInnerHTML.

Article content from all sources (Jina.ai, Readability, Diffbot) is currently rendered without sanitization. The marked markdown parser, Mozilla Readability, and Diffbot all output unsanitized HTML, and the Jina fallback's escapeHtml() only encodes entities without removing dangerous tags/attributes. Without DOMPurify or similar, the application is vulnerable to XSS attacks. Integrate a sanitization library (e.g., DOMPurify) in the article processing pipeline before the content reaches this component.

The dir/lang additions are correct and improve accessibility.

🧰 Tools

🪛 ast-grep (0.40.0)

[warning] 300-300: Usage of dangerouslySetInnerHTML detected. This bypasses React's built-in XSS protection. Always sanitize HTML content using libraries like DOMPurify before injecting it into the DOM to prevent XSS attacks.
Context: dangerouslySetInnerHTML
Note: [CWE-79] Improper Neutralization of Input During Web Page Generation [REFERENCES]
- https://reactjs.org/docs/dom-elements.html#dangerouslysetinnerhtml
- https://cwe.mitre.org/data/definitions/79.html

(react-unsafe-html-injection)

🪛 Biome (2.1.2)

[error] 301-301: Avoid passing content using the dangerouslySetInnerHTML prop.

Setting content using code can expose users to cross-site scripting (XSS) attacks

(lint/security/noDangerouslySetInnerHtml)

🤖 Prompt for AI Agents

In components/article/content.tsx around lines 297–302, the component currently injects data.article.content directly via dangerouslySetInnerHTML, which is vulnerable to XSS because content from Jina/Readability/Diffbot is unsanitized; integrate HTML sanitization (e.g., DOMPurify) in the article processing pipeline or just before rendering: install/import a sanitizer, run data.article.content through the sanitizer to remove dangerous tags/attributes and then pass the sanitized HTML to dangerouslySetInnerHTML (or store sanitized content on the article object earlier), ensuring the sanitized string is what the component renders while leaving dir/lang handling unchanged.

coderabbitai · 2025-12-07T10:27:33Z

lib/validation/url.ts

+// Valid protocol: http:// or https://
+const VALID_PROTOCOL_REGEX = /^https?:\/\//;



⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "url.ts" -o -name "url.test.ts" | head -20

Repository: mrmps/SMRY

Length of output: 106

🏁 Script executed:

cat -n lib/validation/url.ts

Repository: mrmps/SMRY

Length of output: 3593

🏁 Script executed:

cat -n lib/validation/url.test.ts

Repository: mrmps/SMRY

Length of output: 10400

🏁 Script executed:

# Check if validator.js documentation or source is available in node_modules
find . -type d -name "validator" 2>/dev/null | head -5

Repository: mrmps/SMRY

Length of output: 36

🏁 Script executed:

# Look for validator package info
fd -t f "package.json" | xargs grep -l "validator" | head -3

Repository: mrmps/SMRY

Length of output: 66

🏁 Script executed:

cat package.json | grep -A 5 "validator"

Repository: mrmps/SMRY

Length of output: 410

Add case-insensitive flag to protocol regex to handle uppercase schemes

The VALID_PROTOCOL_REGEX is case-sensitive while the cleanProtocol function uses case-insensitive regexes, creating an inconsistency. Uppercase schemes like "HTTPS://example.com" will fail the check at line 60, causing normalizeUrl to incorrectly prepend https://, resulting in "https://HTTPS://example.com" that fails validation.

Add the /i flag to match the case-insensitive handling in cleanProtocol:

-const VALID_PROTOCOL_REGEX = /^https?:\/\//;
+const VALID_PROTOCOL_REGEX = /^https?:\/\//i;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Valid protocol: http:// or https://

const VALID_PROTOCOL_REGEX = /^https?:\/\//;

// Valid protocol: http:// or https://

const VALID_PROTOCOL_REGEX = /^https?:\/\//i;

🤖 Prompt for AI Agents

In lib/validation/url.ts around lines 4 to 6, the VALID_PROTOCOL_REGEX is currently case-sensitive and will not match uppercase schemes (e.g., "HTTPS://"), causing normalizeUrl to prepend another protocol; update the regex to be case-insensitive by adding the /i flag (e.g., /^https?:\/\//i) so it matches "http"/"https" in any case and keep the existing comment or adjust it if needed.
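The failure mode is easy to reproduce in isolation. In the sketch below, `ensureProtocol` is a hypothetical stand-in for the "prepend https:// when no valid protocol is present" step of normalizeUrl; only the two regular expressions are taken from the suggestion above.

```typescript
const CASE_SENSITIVE = /^https?:\/\//;
const CASE_INSENSITIVE = /^https?:\/\//i;

// Hypothetical stand-in for the protocol check inside normalizeUrl.
function ensureProtocol(url: string, validProtocol: RegExp): string {
  return validProtocol.test(url) ? url : `https://${url}`;
}

// The case-sensitive check does not recognize the uppercase scheme,
// so a second protocol is prepended.
const broken = ensureProtocol("HTTPS://example.com", CASE_SENSITIVE);

// The case-insensitive check leaves the URL as it is.
const fixed = ensureProtocol("HTTPS://example.com", CASE_INSENSITIVE);
```

Here `broken` ends up as `https://HTTPS://example.com`, matching the result described in the comment, while `fixed` is left as `HTTPS://example.com` for later normalization.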

mrmps added 6 commits December 7, 2025 01:35

removes featurecard

f6a263d

fix

6ab32d3

ads normalize url and tests

124fed6

adds bun tests

1816e9a

docker switch to bun

eeb8763

adds rtl

d6ade63
