
feat: Add JSON file connector for importing JSON data #20

Merged
youhaowei merged 60 commits into main from auto-claude/002-add-json-file-connector
Feb 4, 2026

Conversation


youhaowei (Owner) commented Jan 3, 2026

Summary

  • Add new @dashframe/connector-json package for importing JSON files as data sources
  • Support for nested JSON structures with automatic flattening to tabular format
  • Comprehensive type inference (string, number, boolean, date) from JSON values
  • Integrate JSON connector into the app's connector registry

Changes

New Package: @dashframe/connector-json

  • Flatten utilities (flatten.ts): Convert nested JSON to flat objects with dot-notation keys
    • flattenObject(): Flatten single objects with configurable depth, separator, array handling
    • flattenObjectArray(): Flatten arrays with consistent keys across all objects
    • unflattenObject(): Reverse operation for roundtrip support
  • JSON to DataFrame converter (index.ts): Convert JSON to Arrow-based DataFrame
    • Automatic type inference from JSON values
    • Support for JSON arrays (records) and single objects
    • Proper null handling for missing values
  • JSONConnector class (connector.ts): File-based connector extending FileSourceConnector
    • Validates .json files
    • Configurable array handling options
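
The dot-notation flattening described above can be sketched in a few lines. This is a simplified illustration, not the package's actual `flattenObject()` (which also supports configurable depth, separators, and array handling):

```typescript
// Minimal dot-notation flattening sketch. Nested objects are recursed into;
// primitives and arrays become leaf values keyed by their full path.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function flatten(obj: { [key: string]: Json }, prefix = ""): Record<string, Json> {
  const out: Record<string, Json> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(out, flatten(value, path)); // recurse into nested objects
    } else {
      out[path] = value; // primitives and arrays become leaf values
    }
  }
  return out;
}

console.log(flatten({ user: { address: { city: "Oslo" } }, active: true }));
// logs { 'user.address.city': 'Oslo', active: true }
```

The real `flattenObjectArray()` additionally normalizes keys across all objects in an array so every row ends up with the same columns.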

Integration

  • Register JSON connector in apps/web/lib/connectors/registry.ts
  • Add @dashframe/connector-json as dependency to web app

Tests

  • 87 unit tests covering:
    • Flatten utilities (57 tests)
    • Connector and conversion logic (30 tests)

Test plan

  • All tests passing (bun check passes with 47 tasks)
  • TypeScript types verified
  • ESLint passes (cognitive complexity refactored)
  • Prettier formatting applied

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Local file upload (CSV/JSON) with format detection and improved JSON→chart workflow with suggestions and one-click visualization.
  • Documentation

    • Expanded E2E README and added sample datasets and usage guidance.
  • Tests

    • New E2E pipeline and feature scenarios for local uploads and JSON→chart; unit tests for CSV parser and local file connector.
  • Chores

    • Connector/package reorganization, package renames, CI/e2e and formatting script updates; visualization wiring improvements for more reliable rendering.


Greptile Overview

Greptile Summary

This PR adds comprehensive JSON file support to DashFrame by creating a new @dashframe/json package and refactoring the file connector architecture.

Key changes:

  • New JSON package (packages/json/): Implements robust JSON flattening utilities and JSON-to-DataFrame conversion with automatic type inference (string, number, boolean, date)
  • Unified local file connector (packages/connector-local/): Replaces the CSV-specific connector with a format-agnostic connector that handles both CSV and JSON files
  • CSV utilities extraction (packages/csv/): Separates CSV parsing logic from connector code for better reusability
  • Integration: Updated connector registry to use the new localFileConnector

Architecture improvements:

  • Clean separation between format-specific parsers (@dashframe/csv, @dashframe/json) and connector orchestration (@dashframe/connector-local)
  • Consistent type inference strategy across both CSV and JSON (uses first non-null value)
  • Comprehensive test coverage with 87 new unit tests
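
The first-non-null inference strategy mentioned above can be sketched as follows. This is an illustrative simplification, not the package's actual code (which also detects dates from string values):

```typescript
// Infer a column type from the first non-null value in the column.
// Later values are ignored, matching the "first non-null value" strategy.
type ColumnType = "string" | "number" | "boolean" | "null";

function inferColumnType(values: unknown[]): ColumnType {
  for (const v of values) {
    if (v === null || v === undefined) continue; // skip missing values
    if (typeof v === "number") return "number";
    if (typeof v === "boolean") return "boolean";
    return "string"; // everything else treated as text
  }
  return "null"; // column had no non-null values
}

inferColumnType([null, 42, "later strings are ignored"]); // → "number"
```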

Data handling:

  • JSON arrays are flattened to tabular format with dot-notation keys (e.g., user.address.city)
  • Nested structures are automatically flattened with configurable depth, separator, and array handling
  • Missing keys across objects are filled with null for consistent column schemas
  • Primary key detection using regex pattern /^_?id$/i
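
The quoted primary-key pattern matches `id` and `_id` case-insensitively but rejects names like `user_id`; a quick check:

```typescript
// The /^_?id$/i pattern from the summary: matches "id", "ID", "_id", "_ID",
// but not "user_id" or "idx".
const PK_PATTERN = /^_?id$/i;

const candidates = ["id", "_id", "ID", "user_id", "idx"];
const primaryKeys = candidates.filter((name) => PK_PATTERN.test(name));
console.log(primaryKeys); // logs [ 'id', '_id', 'ID' ]
```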

The implementation follows the established patterns from CSV connector and integrates cleanly with the existing DataFrame architecture.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - well-tested, follows established patterns, and maintains backward compatibility.
  • The implementation demonstrates high quality: 87 comprehensive unit tests, clean architecture with separation of concerns, consistent patterns with existing CSV connector, and thorough validation. The code follows functional programming principles where appropriate, has good error handling, and addresses edge cases (empty arrays, null values, mixed types). Previous review concerns about empty-string column names and mixed array validation have been addressed in the implementation.
  • No files require special attention - all core files have comprehensive test coverage and follow established patterns.

Important Files Changed

  • packages/json/src/flatten.ts: Implements JSON flattening utilities with dot-notation keys. Well-tested with comprehensive edge case handling.
  • packages/json/src/index.ts: Converts JSON to DataFrame with type inference and Arrow IPC serialization. Type inference uses first non-null value.
  • packages/connector-local/src/connector.ts: Unified local file connector supporting CSV and JSON. Good validation for array-of-objects requirement.
  • apps/web/lib/connectors/registry.ts: Updated to use localFileConnector instead of deprecated CSV-only connector. Clean integration.

Sequence Diagram

sequenceDiagram
    participant User
    participant LocalFileConnector
    participant JSONParser
    participant FlattenUtils
    participant ArrowConverter
    participant DuckDB
    participant IndexedDB

    User->>LocalFileConnector: Upload JSON file
    LocalFileConnector->>LocalFileConnector: Validate file size & extension
    LocalFileConnector->>JSONParser: parse(text, tableId)
    JSONParser->>JSONParser: JSON.parse(text)
    JSONParser->>JSONParser: Validate structure (array/object)
    JSONParser->>FlattenUtils: flattenObjectArray(jsonData)
    FlattenUtils->>FlattenUtils: flattenObject() for each item
    FlattenUtils->>FlattenUtils: Normalize keys across objects
    FlattenUtils-->>JSONParser: FlattenedObject[]
    JSONParser->>JSONParser: inferColumnType() from values
    JSONParser->>JSONParser: parseValue() with type coercion
    JSONParser->>ArrowConverter: Build Arrow Table
    ArrowConverter->>ArrowConverter: Create typed vectors (Float64, Bool, Utf8, TimestampMillisecond)
    ArrowConverter->>ArrowConverter: tableToIPC(arrowTable)
    ArrowConverter->>IndexedDB: DataFrame.create(ipcBuffer)
    IndexedDB-->>JSONParser: DataFrame instance
    JSONParser-->>LocalFileConnector: FileParseResult
    LocalFileConnector-->>User: DataFrame with fields & schema


vercel bot commented Jan 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: dashframe | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Feb 4, 2026 6:34am


coderabbitai bot commented Jan 3, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

Introduces a unified LocalFileConnector (CSV+JSON) and new @dashframe/csv and @dashframe/json packages, replaces CSV connector usage with the local connector across app and e2e, adds E2E tests/fixtures, updates Playwright/CI and formatting scripts, and adds sample datasets and docs.

Changes

Cohort / File(s) Summary
Connector removal
packages/connector-csv/src/connector.ts, packages/connector-csv/src/connector.test.ts
Removed legacy CSV connector implementation and its test suite.
Local file connector package
packages/connector-local/package.json, packages/connector-local/src/connector.ts, packages/connector-local/src/connector.test.ts, packages/connector-local/src/index.ts
Added LocalFileConnector handling CSV/JSON, 100MB limit, extension detection, parsing delegation; exported LocalFileConnector and localFileConnector; comprehensive unit tests.
CSV parser package
packages/csv/package.json, packages/csv/src/parser.ts, packages/csv/src/parser.test.ts, packages/csv/tsconfig.json
Introduced @dashframe/csv exposing parseCSV(text): string[][] and its tests; package renamed from @dashframe/connector-csv.
JSON package & utilities
packages/json/*
packages/json/package.json, packages/json/src/flatten.ts, packages/json/src/index.ts, packages/json/src/flatten.test.ts, packages/json/tsconfig.json, packages/json/vitest.config.ts, packages/json/CHANGELOG.md, packages/json/README.md, packages/json/.prettierignore
New @dashframe/json with flatten utilities, jsonToDataFrame, type inference, primary-key detection, tests, docs, and build/test configs.
Registry & app integration
apps/web/lib/connectors/registry.ts, apps/web/lib/local-csv-handler.ts, apps/web/package.json, e2e/web/package.json
Switched app and e2e to localFileConnector; updated imports and dependencies (@dashframe/connector-csv → @dashframe/csv + @dashframe/connector-local).
E2E tests, fixtures & steps
e2e/web/features/*, e2e/web/steps/*.ts, e2e/web/fixtures/users_data.json, e2e/web/README.md
Added local file upload and JSON-to-chart feature files, new step defs (uploads, error cases, table/column assertions), fixture JSON, and updated E2E docs/commands.
Playwright & CI
.github/workflows/ci.yml, e2e/web/playwright.config.ts, .prettierignore, apps/web/.prettierignore
Added CI E2E job, changed format-check flow and root format scripts, Playwright external-server detection with production-build internal server flow, and ignored/generated E2E files.
Visualization / DuckDB connection
packages/visualization/src/VisualizationProvider.tsx, apps/web/components/providers/VisualizationSetup.tsx, packages/visualization/src/renderers/vgplot-renderer.ts, packages/visualization/src/components/Chart.tsx
VisualizationProvider gains connection prop and forwards it; consumers updated to require/passed connection; vgplot renderer adjusts coordinator access and query signature; Chart adds test-id and logging.
Visualization encoding & analyses
apps/web/components/visualizations/*.tsx, apps/web/app/insights/[insightId]/_components/InsightView.tsx
Reworked encoding resolution to use dataTables + resolveEncodingToSql; insight analysis outputs are remapped to UUID aliases; memoization and rendering paths adjusted.
Scripts & monorepo config
package.json, turbo.json
Root scripts changed (format now writes, added format:check), added turbo.json keys ui and concurrency.
Samples
samples/README.md, samples/*.json
Added multiple sample datasets and explanatory README.
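
The LocalFileConnector behavior summarized above (size limit, extension detection, parser delegation) can be sketched as follows. The names here are illustrative rather than the connector's actual API:

```typescript
// Hypothetical sketch of the dispatch step: reject oversized files, then
// route by file extension to the matching format parser.
const MAX_FILE_SIZE = 100 * 1024 * 1024; // the 100MB limit mentioned above

function detectFormat(fileName: string, fileSize: number): "csv" | "json" {
  if (fileSize > MAX_FILE_SIZE) {
    throw new Error("File size exceeds 100MB limit");
  }
  const ext = fileName.toLowerCase().split(".").pop();
  if (ext === "csv") return "csv";
  if (ext === "json") return "json";
  throw new Error(`Unsupported file extension: .${ext}`);
}

detectFormat("users_data.json", 1024); // → "json"
```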

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant App as DashFrame App
    participant Connector as LocalFileConnector
    participant CSV as "@dashframe/csv.parseCSV"
    participant JSON as "@dashframe/json.jsonToDataFrame"
    participant DF as DataFrame builder
    participant Storage as IndexedDB

    User->>App: Upload file (CSV or JSON)
    App->>Connector: parse(file)

    rect rgb(230,245,255)
    Note over Connector: Validate size & extension
    Connector->>Connector: detect extension
    alt csv
        Connector->>CSV: parseCSV(file text)
        CSV-->>Connector: rows[][]
    else json
        Connector->>JSON: jsonToDataFrame(parsed JSON)
        JSON-->>Connector: conversion result
    else unsupported
        Connector-->>App: Error (unsupported)
    end
    end

    rect rgb(235,255,235)
    Note over Connector,DF: Build Arrow IPC & metadata
    Connector->>DF: build Arrow table, infer types, assign PK
    DF->>Storage: persist (IndexedDB)
    Storage-->>DF: ok
    DF-->>App: DataFrame & metadata
    end

    App->>User: Redirect to insight config / show errors

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes


🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding a JSON file connector for importing JSON data, which is the primary objective of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 91.89% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.





coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
packages/connector-json/src/flatten.test.ts (1)

541-553: Consider adding a test for keys containing the separator character.

The custom separator tests validate basic functionality, but there's no coverage for the edge case where an original key already contains the separator character, which could cause ambiguity during unflattening.

For example: { "user_name": "Alice" } with separator _ would flatten to the same key, but unflatten might incorrectly interpret it as nested.

🔎 Suggested additional test case
it("should handle keys that contain the separator character", () => {
  // Note: This tests the limitation - separator in original keys causes ambiguity
  const input = { "user_name": "Alice" };
  const result = unflattenObject(input, "_");
  // Documents current behavior - may or may not be desired
  expect(result).toEqual({ user: { name: "Alice" } });
});
packages/connector-json/src/connector.test.ts (1)

102-106: Consider extracting UUID test constant.

The UUID template literal casting is repeated throughout the file. Extracting it to a constant would reduce verbosity and improve maintainability.

🔎 Suggested improvement
+const TEST_UUID = "test-uuid" as `${string}-${string}-${string}-${string}-${string}`;
+
 describe("parse - file size validation", () => {
   it("should reject files exceeding 100MB", async () => {
     // ...
     await expect(
-      connector.parse(
-        largeFile,
-        "test-uuid" as `${string}-${string}-${string}-${string}-${string}`,
-      ),
+      connector.parse(largeFile, TEST_UUID),
     ).rejects.toThrow("File size exceeds 100MB limit");
   });
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1fd31e5 and 0bc4945.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • apps/web/lib/connectors/registry.ts
  • apps/web/package.json
  • package.json
  • packages/connector-json/.prettierignore
  • packages/connector-json/CHANGELOG.md
  • packages/connector-json/README.md
  • packages/connector-json/package.json
  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/src/connector.ts
  • packages/connector-json/src/flatten.test.ts
  • packages/connector-json/src/flatten.ts
  • packages/connector-json/src/index.ts
  • packages/connector-json/tsconfig.json
  • packages/connector-json/vitest.config.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/package.json

📄 CodeRabbit inference engine (AGENTS.md)

Use lowercase scope format for package names (e.g., @dashframe/*)

Files:

  • apps/web/package.json
  • packages/connector-json/package.json
  • package.json
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Prefer pure functions over classes for data conversion, transformation, and utilities in TypeScript
Use classes and inheritance for entities with behavior when it makes sense for encapsulation and code organization
Use tRPC for APIs to proxy server-side calls and avoid CORS issues with external APIs
Import state management hooks from @dashframe/core, not directly from @dashframe/core-dexie, to keep components backend-agnostic
Dynamically import VegaChart with ssr: false to prevent SSR serialization issues with Vega-Lite Set objects
Use PascalCase for user-facing names (DashFrame), lowercase scope for packages (@dashframe/), and lowercase prefix for storage keys (dashframe:)
Always import UI components from @dashframe/ui package, not from individual component files
All UI elements on pages MUST use components from @dashframe/ui. If a needed component doesn't exist, add it to the UI package first before using it in pages.
Use shadcn/ui components for standard UI patterns (buttons, cards, dialogs, forms), shared components for DashFrame-specific patterns, and feature-specific components for one-off domain-specific UI
Design token enforcement: Use p-4 (compact), p-6 (standard), p-8 (spacious) for spacing; rounded-2xl for main cards, rounded-xl for nested, rounded-full for badges; h-4 w-4 for inline icons, h-5 w-5 for standalone
Use sentence case for text (except acronyms like CSV, API). No UPPERCASE text.
Extract components to @dashframe/ui when pattern appears or will appear in 3+ places, encapsulates meaningful UI logic, has clear semantic purpose, and include JSDoc documentation
Add JSDoc comments for all new functions, types, and components
Use Zustand stores for state persistence following the pattern in apps/web/lib/stores/
Use functional converter pattern (no classes/inheritance) for data conversion functions
Leverage TypeScript strict mode throughout the codebase
Use superjson transformer for tRPC configuration to support Date, Set, Map seri...

Files:

  • apps/web/lib/connectors/registry.ts
  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/vitest.config.ts
  • packages/connector-json/src/connector.ts
  • packages/connector-json/src/index.ts
  • packages/connector-json/src/flatten.test.ts
  • packages/connector-json/src/flatten.ts
**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Write tests for critical logic, especially for data handling. Include test plans for features with specs.

Files:

  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/src/flatten.test.ts
🧠 Learnings (10)
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Applies to **/package.json : Use lowercase scope format for package names (e.g., @dashframe/*)

Applied to files:

  • apps/web/package.json
  • packages/connector-json/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Use superjson transformer for tRPC configuration to support Date, Set, Map serialization. See lib/trpc/init.ts and routers/ for implementation.

Applied to files:

  • packages/connector-json/tsconfig.json
  • packages/connector-json/src/flatten.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Leverage TypeScript strict mode throughout the codebase

Applied to files:

  • packages/connector-json/tsconfig.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Packages export TypeScript source directly with `main: "src/index.ts"`, not compiled JS. Web app uses TypeScript path mappings for hot reload.

Applied to files:

  • packages/connector-json/tsconfig.json
  • packages/connector-json/.prettierignore
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Add JSDoc comments for all new functions, types, and components

Applied to files:

  • packages/connector-json/tsconfig.json
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Applies to **/*.{ts,tsx} : Use TypeScript in strict mode

Applied to files:

  • packages/connector-json/tsconfig.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.test.{ts,tsx} : Write tests for critical logic, especially for data handling. Include test plans for features with specs.

Applied to files:

  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/vitest.config.ts
  • packages/connector-json/src/flatten.test.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Packages export from stub package @dashframe/core-store which is aliased to @dashframe/core-${NEXT_PUBLIC_STORAGE_IMPL} for backend selection

Applied to files:

  • packages/connector-json/package.json
📚 Learning: 2026-01-03T03:26:23.295Z
Learnt from: youhaowei
Repo: youhaowei/DashFrame PR: 17
File: packages/ui/src/primitives/scroll-area.tsx:21-21
Timestamp: 2026-01-03T03:26:23.295Z
Learning: In the DashFrame repository, prettier-plugin-tailwindcss is configured with `tailwindStylesheet: "./packages/ui/src/globals.css"` in .prettierrc to ensure consistent Tailwind class sorting across macOS and Linux environments. This configuration causes automated class reordering throughout the codebase.

Applied to files:

  • packages/connector-json/.prettierignore
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Use Bun + Turborepo for monorepo management with apps/* and packages/* directory structure

Applied to files:

  • packages/connector-json/.prettierignore
🧬 Code graph analysis (3)
packages/connector-json/src/connector.ts (2)
packages/connector-json/src/index.ts (2)
  • jsonToDataFrame (165-301)
  • jsonConnector (318-318)
packages/types/src/index.ts (1)
  • UUID (5-5)
packages/connector-json/src/flatten.test.ts (1)
packages/connector-json/src/flatten.ts (6)
  • flattenObject (165-217)
  • JsonValue (15-21)
  • flattenObjectArray (239-266)
  • FlattenedObject (33-33)
  • extractKeys (275-283)
  • unflattenObject (299-328)
packages/connector-json/src/flatten.ts (1)
packages/connector-json/src/index.ts (8)
  • JsonValue (312-312)
  • JsonPrimitive (313-313)
  • FlattenedObject (314-314)
  • FlattenOptions (315-315)
  • flattenObject (308-308)
  • flattenObjectArray (309-309)
  • extractKeys (310-310)
  • unflattenObject (311-311)
🪛 GitHub Actions: CI
packages/connector-json/src/connector.test.ts

[error] 1-1: Prettier formatting issues detected. Run 'prettier --write' to fix code style in this file.

packages/connector-json/vitest.config.ts

[error] 1-1: Prettier formatting issues detected. Run 'prettier --write' to fix code style in this file.

packages/connector-json/src/connector.ts

[error] 1-1: Prettier formatting issues detected. Run 'prettier --write' to fix code style in this file.

packages/connector-json/src/index.ts

[error] 1-1: Prettier formatting issues detected. Run 'prettier --write' to fix code style in this file.

packages/connector-json/src/flatten.test.ts

[error] 1-1: Prettier formatting issues detected. Run 'prettier --write' to fix code style in this file.

🪛 LanguageTool
packages/connector-json/README.md

[grammar] ~149-~149: Use a hyphen to join words.
Context: ... generates: 1. _rowIndex - System computed field (number, identifier) -...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (28)
packages/connector-json/.prettierignore (1)

1-7: Configuration looks good.

The ignore patterns are appropriate for a TypeScript package in this monorepo. All standard build artifacts, cache, and temporary files are properly excluded from Prettier formatting.

packages/connector-json/CHANGELOG.md (1)

1-5: LGTM!

The changelog entry is clear and appropriate for an initial release.

packages/connector-json/tsconfig.json (1)

1-10: LGTM!

The TypeScript configuration follows standard monorepo patterns with appropriate declaration generation and test file exclusion.

packages/connector-json/package.json (1)

1-29: LGTM!

The package configuration correctly follows coding guidelines:

  • Uses lowercase scope format (@dashframe/connector-json)
  • Exports TypeScript source directly
  • Standard scripts and appropriate dependencies
packages/connector-json/README.md (1)

1-207: LGTM!

The documentation is comprehensive, well-structured, and provides clear examples for both direct conversion and connector pattern usage. The feature overview, type inference details, and API exports are clearly documented.

packages/connector-json/vitest.config.ts (1)

1-19: Fix Prettier formatting issues to unblock CI pipeline.

The CI pipeline has detected formatting issues in this file. Run the following command to fix:

prettier --write packages/connector-json/vitest.config.ts
⛔ Skipped due to learnings
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.test.{ts,tsx} : Write tests for critical logic, especially for data handling. Include test plans for features with specs.
packages/connector-json/src/flatten.ts (6)

1-9: Well-documented module with clear purpose.

The module header clearly explains the purpose and use case for the flattening utilities. This aligns with the coding guidelines requiring JSDoc documentation.


14-33: Clean type definitions with good documentation.

The type hierarchy is well-structured: JsonValue for input, JsonPrimitive for leaf values, and FlattenedObject for output. The separation makes the API contract clear.


66-95: Robust helper functions.

The isPlainObject check correctly handles edge cases by verifying [object Object] via Object.prototype.toString.call(), which is more reliable than typeof alone.


165-217: Solid implementation with good edge case handling.

The function correctly handles:

  • Primitives at root level (wrapped in { value: ... })
  • Empty objects/arrays at root
  • Max depth limiting with JSON stringification
  • Fallback for unexpected types at line 212

The recursive approach with context object is clean and avoids parameter drilling.


239-266: Efficient two-pass normalization approach.

The implementation correctly ensures all objects have consistent keys by:

  1. Flattening all objects and collecting unique keys
  2. Normalizing with sorted keys and null-filling

This is essential for DataFrame conversion where all rows need the same columns.


299-328: Remove unnecessary caveat about "typical cases"—the implementation correctly reconstructs nested structures for all cases.

The implementation detects numeric indices via regex and builds nested structures incrementally. Array reconstruction works regardless of key processing order because each key independently determines whether to create an array or object based on the next path segment, and JavaScript sparse arrays handle out-of-order index assignments.
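
The array-reconstruction behavior described here can be illustrated with a simplified unflatten sketch (not the package's actual implementation):

```typescript
// Simplified unflatten: a numeric next-path-segment creates an array,
// anything else creates an object, so each key independently picks the
// container type regardless of processing order.
function unflatten(flat: Record<string, unknown>, sep = "."): unknown {
  const root: Record<string, unknown> = {};
  for (const [path, value] of Object.entries(flat)) {
    const parts = path.split(sep);
    let node: any = root;
    parts.forEach((part, i) => {
      if (i === parts.length - 1) {
        node[part] = value; // leaf: assign the value
      } else {
        if (node[part] === undefined) {
          node[part] = /^\d+$/.test(parts[i + 1]) ? [] : {};
        }
        node = node[part];
      }
    });
  }
  return root;
}

unflatten({ "tags.0": "a", "tags.1": "b", "user.name": "Alice" });
// → { tags: ["a", "b"], user: { name: "Alice" } }
```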

apps/web/package.json (1)

5-5: Workspace dependency correctly added.

The new @dashframe/connector-json package follows the naming convention (@dashframe/* lowercase scope) and is properly positioned alphabetically in the dependencies list.

apps/web/lib/connectors/registry.ts (2)

22-22: Clean import following existing pattern.

The import statement correctly follows the established pattern for connector imports.


36-40: JSON connector correctly registered in the array.

The jsonConnector is added to allConnectors between csvConnector and notionConnector, which maintains a logical grouping of file-based connectors.

Note that unlike Notion, there's no feature flag controlling JSON connector visibility. This appears intentional since JSON is a common format that should be generally available.

packages/connector-json/src/connector.ts (3)

78-88: First-element validation is a reasonable trade-off.

The validation only checks if the first array element is an object. While this means arrays with mixed types (e.g., [{}, "string", {}]) would pass initial validation, the downstream jsonToDataFrame will handle flattening appropriately.

For very large files, validating every element would add unnecessary overhead. Consider documenting this behavior in the helper text if users might be confused by partial validation.


90-95: Safe type assertion after validation.

The Parameters<typeof jsonToDataFrame>[0] type assertion on line 92 is safe because the preceding validation ensures jsonData is either an array of objects or a single object, which matches JSONData.


1-15: Well-structured connector with appropriate architecture.

The module-level JSDoc and imports are clean. Using a class for the connector is appropriate since it represents an entity with behavior (extending FileSourceConnector), which aligns with the coding guidelines of preferring classes for entities with behavior and proper encapsulation.

The implementation includes thorough validation:

  • File size checking with correct math calculation
  • Proper JSON parsing with error handling
  • Structure validation for both array and object formats
  • Clear, descriptive error messages
packages/connector-json/src/connector.test.ts (2)

289-369: Thorough testing of valid JSON formats with mock verification.

The tests correctly verify that jsonToDataFrame is called with the expected parsed data. This validates the connector's parsing and delegation logic without testing the DataFrame conversion itself.


1-24: Excellent test structure with comprehensive coverage and appropriate mocking.

The test file correctly mocks jsonToDataFrame to avoid IndexedDB dependencies, enabling true unit testing of the connector logic. The mock returns a minimal valid response structure. Test coverage is comprehensive, including static properties, file size validation, empty/invalid JSON handling, structure validation, and valid JSON formats (arrays, objects, nested structures).

Note: The Prettier check failed due to a missing dev dependency (prettier-plugin-organize-imports) in the environment, not a code formatting issue. The code itself is properly formatted. Verify that this dependency is installed in the project's dev environment before running format checks.

packages/connector-json/src/flatten.test.ts (1)

1-23: Address missing Prettier plugin dependency, not code formatting.

The test file is well-structured with comprehensive coverage for the flattening utilities. However, the CI issue is not a code formatting problem—it's a missing prettier-plugin-organize-imports dependency in the project configuration. Running prettier --write would fail for the same reason.

Ensure that prettier-plugin-organize-imports is properly installed and configured in the project's package.json or Prettier config before attempting formatting fixes.

Likely an incorrect or invalid review comment.

packages/connector-json/src/index.ts (7)

1-27: LGTM!

Imports are well-organized, separating types from runtime imports. The Apache Arrow types and utilities are appropriately imported for DataFrame construction.


29-44: LGTM!

Type definitions are well-documented with JSDoc comments and provide appropriate flexibility for JSON input handling. As per coding guidelines, JSDoc documentation is included.


50-67: Date inference heuristic is reasonable but has edge cases.

The combination of Date.parse(), separator check, and length requirement provides good protection against false positives. However, be aware:

  • Partial ISO dates like "2023-01" (length 7) won't be detected as dates.
  • Date.parse() behavior varies slightly across JS runtimes for non-ISO formats.

These are acceptable trade-offs for avoiding aggressive date coercion on ambiguous strings.
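The heuristic described above can be sketched as follows. This is a guess at the shape based on the review text (`looksLikeDate` is an illustrative name, not the actual function):

```typescript
// Hypothetical sketch of the date-detection heuristic: a string is
// treated as a date only if Date.parse succeeds AND it contains a
// date-like separator AND it is at least 8 characters long.
const looksLikeDate = (value: string): boolean => {
  if (value.length < 8) return false; // rejects partial dates like "2023-01"
  if (!/[-/:]/.test(value)) return false; // require a date-like separator
  return !Number.isNaN(Date.parse(value));
};
```

The length and separator checks are what prevent plain numbers and short strings from being coerced into dates.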


73-80: LGTM!

The first-non-null inference strategy is clearly documented and appropriate for JSON data where types are typically consistent within a column.


119-138: LGTM!

The result interface is well-documented and provides comprehensive metadata alongside the DataFrame reference.


303-318: LGTM!

Clean public API surface with organized re-exports for flatten utilities and connector functionality.


165-301: Well-structured conversion function following functional patterns.

The implementation is clean with clear step-by-step processing, good error handling, and comprehensive JSDoc documentation. The logic correctly:

  • Normalizes input to array format
  • Flattens nested structures
  • Infers types from data
  • Constructs Arrow table with explicit types
  • Creates IndexedDB-backed DataFrame

Per coding guidelines, this correctly uses pure functions for data transformation. The DataFrame.create API is called correctly with field IDs as the second parameter and appropriate storage options.
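The first step in the list above (normalizing input to array format) is worth illustrating; this is a minimal sketch of the idea, not the actual code:

```typescript
// Sketch: a single JSON object is wrapped in an array so the rest of
// the pipeline always operates on an array of records.
type JsonRecord = Record<string, unknown>;

const normalizeInput = (input: JsonRecord | JsonRecord[]): JsonRecord[] =>
  Array.isArray(input) ? input : [input];
```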


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @packages/connector-json/README.md:
- Around line 185-200: The README export list incorrectly mentions flattenArray;
update the exports block to reflect the actual public API by replacing
flattenArray with the real exported symbols from flatten.ts: export
flattenObject, flattenObjectArray, extractKeys, and unflattenObject (keep
flattenObject as-is and add flattenObjectArray, extractKeys, unflattenObject,
removing flattenArray), referencing the existing export line that currently
lists flattenObject and flattenArray so reviewers can locate and change it.
🧹 Nitpick comments (3)
packages/connector-json/src/flatten.ts (1)

268-283: Minor redundancy with flattenObjectArray.

extractKeys duplicates logic that's already performed inside flattenObjectArray (which also collects all keys). Consider whether this standalone export is needed, or if the keys could be returned as part of flattenObjectArray's result.
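One way to act on this suggestion is to return the collected keys alongside the flattened rows. The sketch below ignores the nested-flattening step for brevity and uses illustrative names; it only demonstrates the combined-return shape:

```typescript
// Hypothetical: fold key collection into the array-flattening pass so
// callers get the union of keys without a separate extractKeys call.
type Flat = Record<string, unknown>;

const flattenObjectArray = (
  objects: Flat[],
): { rows: Flat[]; keys: string[] } => {
  const keySet = new Set<string>();
  for (const obj of objects) {
    for (const key of Object.keys(obj)) keySet.add(key);
  }
  const keys = [...keySet];
  // Fill missing keys with null so every row has a consistent shape.
  const rows = objects.map((obj) =>
    Object.fromEntries(keys.map((k) => [k, k in obj ? obj[k] : null])),
  );
  return { rows, keys };
};
```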

packages/connector-json/src/index.ts (1)

50-67: Date inference heuristic may have edge cases.

The date detection requires both a valid Date.parse() result AND pattern matching ([-/:] + length ≥ 8). This is reasonable but could still match ambiguous strings. Consider adding an optional dateDetection: boolean flag in JSONConversionOptions if users need to disable auto-detection for problematic datasets.
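The suggested opt-out flag could look like this. All names here (`JSONConversionOptions`, `inferStringType`) are illustrative, not the package's actual API:

```typescript
// Hypothetical: an optional dateDetection flag, defaulting to true,
// that lets users disable date auto-detection for problematic datasets.
type InferredType = "date" | "string";

interface JSONConversionOptions {
  /** When false, date-like strings stay typed as "string". Defaults to true. */
  dateDetection?: boolean;
}

const inferStringType = (
  value: string,
  options: JSONConversionOptions = {},
): InferredType => {
  const detectDates = options.dateDetection ?? true;
  if (
    detectDates &&
    value.length >= 8 &&
    /[-/:]/.test(value) &&
    !Number.isNaN(Date.parse(value))
  ) {
    return "date";
  }
  return "string";
};
```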

packages/connector-json/README.md (1)

145-156: Minor grammar: use hyphen in compound modifier.

"System computed field" should be "System-computed field" when used as a compound adjective.

🔎 Proposed fix
-1. **`_rowIndex`** - System computed field (number, identifier)
+1. **`_rowIndex`** - System-computed field (number, identifier)
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bc4945 and eba33bf.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • apps/web/lib/connectors/registry.ts
  • apps/web/package.json
  • packages/connector-json/.prettierignore
  • packages/connector-json/CHANGELOG.md
  • packages/connector-json/README.md
  • packages/connector-json/package.json
  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/src/connector.ts
  • packages/connector-json/src/flatten.test.ts
  • packages/connector-json/src/flatten.ts
  • packages/connector-json/src/index.ts
  • packages/connector-json/tsconfig.json
  • packages/connector-json/vitest.config.ts
🚧 Files skipped from review as they are similar to previous changes (10)
  • apps/web/lib/connectors/registry.ts
  • packages/connector-json/src/connector.ts
  • packages/connector-json/.prettierignore
  • packages/connector-json/package.json
  • packages/connector-json/vitest.config.ts
  • packages/connector-json/CHANGELOG.md
  • apps/web/package.json
  • packages/connector-json/src/connector.test.ts
  • packages/connector-json/src/flatten.test.ts
  • packages/connector-json/tsconfig.json
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Prefer pure functions over classes for data conversion, transformation, and utilities in TypeScript
Use classes and inheritance for entities with behavior when it makes sense for encapsulation and code organization
Use tRPC for APIs to proxy server-side calls and avoid CORS issues with external APIs
Import state management hooks from @dashframe/core, not directly from @dashframe/core-dexie, to keep components backend-agnostic
Dynamically import VegaChart with ssr: false to prevent SSR serialization issues with Vega-Lite Set objects
Use PascalCase for user-facing names (DashFrame), lowercase scope for packages (@dashframe/), and lowercase prefix for storage keys (dashframe:)
Always import UI components from @dashframe/ui package, not from individual component files
All UI elements on pages MUST use components from @dashframe/ui. If a needed component doesn't exist, add it to the UI package first before using it in pages.
Use shadcn/ui components for standard UI patterns (buttons, cards, dialogs, forms), shared components for DashFrame-specific patterns, and feature-specific components for one-off domain-specific UI
Design token enforcement: Use p-4 (compact), p-6 (standard), p-8 (spacious) for spacing; rounded-2xl for main cards, rounded-xl for nested, rounded-full for badges; h-4 w-4 for inline icons, h-5 w-5 for standalone
Use sentence case for text (except acronyms like CSV, API). No UPPERCASE text.
Extract components to @dashframe/ui when pattern appears or will appear in 3+ places, encapsulates meaningful UI logic, has clear semantic purpose, and include JSDoc documentation
Add JSDoc comments for all new functions, types, and components
Use Zustand stores for state persistence following the pattern in apps/web/lib/stores/
Use functional converter pattern (no classes/inheritance) for data conversion functions
Leverage TypeScript strict mode throughout the codebase
Use superjson transformer for tRPC configuration to support Date, Set, Map seri...

Files:

  • packages/connector-json/src/index.ts
  • packages/connector-json/src/flatten.ts
🧠 Learnings (1)
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Leverage TypeScript strict mode throughout the codebase

Applied to files:

  • packages/connector-json/src/index.ts
🧬 Code graph analysis (1)
packages/connector-json/src/index.ts (2)
packages/connector-json/src/flatten.ts (5)
  • JsonValue (15-21)
  • FlattenOptions (38-58)
  • JsonPrimitive (27-27)
  • flattenObjectArray (239-266)
  • extractKeys (275-283)
packages/types/src/index.ts (1)
  • DataFrame (19-19)
🪛 LanguageTool
packages/connector-json/README.md

[grammar] ~149-~149: Use a hyphen to join words.
Context: ... generates: 1. _rowIndex - System computed field (number, identifier) -...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (11)
packages/connector-json/src/flatten.ts (4)

1-33: Well-structured type definitions with clear JSDoc documentation.

The type hierarchy (JsonValue → JsonPrimitive → FlattenedObject) is clean and the recursive JsonValue type correctly represents valid JSON structures. As per coding guidelines, JSDoc comments are present.


35-95: LGTM on options interface and helper functions.

The FlattenOptions interface is well-documented with sensible defaults. Helper functions isPlainObject, isPrimitive, and buildKey are simple and correct.


165-217: Clean implementation following functional patterns.

The flattenObject function properly handles edge cases (primitives at root, empty arrays/objects) and delegates to specialized helpers. The context-based approach with the inner flatten function keeps the recursion clean. Based on coding guidelines, this appropriately uses pure functions for data transformation.
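The recursive, context-based approach described above can be illustrated with a minimal sketch. The real flattenObject supports more options (depth limits, array handling, custom separators); this only shows the dot-notation recursion:

```typescript
// Minimal illustration of dot-notation flattening with a recursive
// inner helper. Nested objects recurse; primitives and arrays are leaves.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

const flatten = (
  obj: Record<string, Json>,
  prefix = "",
  out: Record<string, Json> = {},
): Record<string, Json> => {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      flatten(value, path, out); // recurse into nested objects
    } else {
      out[path] = value; // primitives and arrays become leaves
    }
  }
  return out;
};
```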


299-328: Verify behavior with sparse array indices and custom separators.

The unflattenObject function works correctly for contiguous indices but may produce sparse arrays for non-contiguous keys (e.g., items.0 and items.5). Also, ensure callers pass the same separator used during flattening—mismatch will produce incorrect results.
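The sparse-array caveat is easy to demonstrate. The sketch below is a minimal re-implementation for illustration, not the actual unflattenObject:

```typescript
// Unflattening non-contiguous indices like "items.0" and "items.5"
// yields holes (undefined slots) in the reconstructed array.
const unflatten = (
  flat: Record<string, unknown>,
  separator = ".",
): Record<string, unknown> => {
  const result: Record<string, unknown> = {};
  for (const [path, value] of Object.entries(flat)) {
    const parts = path.split(separator);
    let node: any = result;
    parts.forEach((part, i) => {
      if (i === parts.length - 1) {
        node[part] = value;
      } else {
        // Create an array when the NEXT segment is numeric, else an object.
        node[part] ??= /^\d+$/.test(parts[i + 1]) ? [] : {};
        node = node[part];
      }
    });
  }
  return result;
};
```

Running this on `{ "items.0": "a", "items.5": "b" }` produces a length-6 array with four undefined slots, which is the behavior the review asks to verify.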

packages/connector-json/src/index.ts (5)

1-27: Clean imports and proper separation of concerns.

Imports are well-organized with types from @dashframe/engine-browser, Arrow primitives from apache-arrow, and local flatten utilities. Based on coding guidelines, the file leverages TypeScript strict mode appropriately.


82-93: Boolean parsing now handles common representations.

The new parseBoolean function correctly addresses the previous review feedback—it now accepts case-insensitive "true", string "1", and numeric 1.
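A sketch of the accepted representations described above (the actual parseBoolean in index.ts may differ in detail):

```typescript
// Boolean parsing as described in the review: accepts boolean values,
// case-insensitive "true", string "1", and numeric 1; everything else
// is treated as false.
type JsonPrimitive = string | number | boolean | null;

const parseBoolean = (raw: JsonPrimitive): boolean => {
  if (typeof raw === "boolean") return raw;
  if (typeof raw === "number") return raw === 1;
  if (typeof raw === "string") {
    const lower = raw.toLowerCase();
    return lower === "true" || lower === "1";
  }
  return false;
};
```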


69-80: Type inference uses only the first non-null value.

inferColumnType takes the type of the first non-null value. If data is heterogeneous (e.g., first value is 30, later values are "N/A"), the column will be typed as number and "N/A" will become null. This is acceptable for typical JSON but worth documenting.
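The caveat is worth a concrete example. Here inferColumnType is a simplified stand-in for the real function, just to show the first-non-null behavior:

```typescript
// Type is inferred from the FIRST non-null value, so later mismatched
// values (e.g. "N/A" in a number column) will coerce to null downstream.
type ColumnType = "string" | "number" | "boolean";

const inferColumnType = (values: unknown[]): ColumnType => {
  const first = values.find((v) => v !== null && v !== undefined);
  if (typeof first === "number") return "number";
  if (typeof first === "boolean") return "boolean";
  return "string";
};
```

With `[null, 30, "N/A"]` the column infers as number; with `["N/A", 30]` it infers as string, illustrating the order sensitivity.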


174-310: Solid implementation of JSON-to-DataFrame conversion.

The function properly:

  • Normalizes input to an array
  • Flattens nested objects
  • Infers column types
  • Creates Arrow vectors with appropriate types
  • Builds the DataFrame with IndexedDB storage

The primary key detection (/^_?id$/i) and fallback to _rowIndex is sensible. Arrow type mapping (Float64, Bool, TimestampMillisecond, Utf8) aligns with the inferred column types.
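The primary-key heuristic can be sketched directly from the regex quoted above (`pickPrimaryKey` is an illustrative name):

```typescript
// A column named "id" or "_id" (case-insensitive) is preferred as the
// primary key; otherwise fall back to the generated "_rowIndex" field.
const PRIMARY_KEY_PATTERN = /^_?id$/i;

const pickPrimaryKey = (columns: string[]): string =>
  columns.find((c) => PRIMARY_KEY_PATTERN.test(c)) ?? "_rowIndex";
```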


316-325: No issues found. All re-exported symbols from ./flatten exist: flattenObject, flattenObjectArray, extractKeys, unflattenObject, FlattenedObject, FlattenOptions, JsonPrimitive, and JsonValue are properly defined and exported in flatten.ts.

packages/connector-json/README.md (2)

1-27: Comprehensive overview with clear feature list.

The documentation provides good context about the package's capabilities including type inference, nested flattening, and Arrow IPC storage.


28-102: Clear usage examples with good progression.

The examples progress from simple usage to nested objects to the connector pattern, with helpful inline comments showing expected outcomes.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @packages/connector-json/README.md:
- Line 149: Update the description for the `_rowIndex` field to hyphenate the
compound adjective: change "System computed field (number, identifier)" to
"System-computed field (number, identifier)" in the README entry for
`_rowIndex`.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eba33bf and 5d654cb.

📒 Files selected for processing (1)
  • packages/connector-json/README.md
🧰 Additional context used
🪛 LanguageTool
packages/connector-json/README.md

[grammar] ~149-~149: Use a hyphen to join words.
Context: ... generates: 1. _rowIndex - System computed field (number, identifier) -...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (1)
packages/connector-json/README.md (1)

185-205: Export list is now correct.

The exports in lines 195–201 now correctly reflect the actual public API from flatten.ts. The previous issue flagged in earlier review comments (incorrect reference to flattenArray) has been resolved—the list now properly includes flattenObject, flattenObjectArray, extractKeys, and unflattenObject.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Fix all issues with AI Agents 🤖
In @e2e/web/steps/visualizations.steps.ts:
- Around line 6-12: The step definition Then("I should see the chart rendered",
...) currently only asserts the "N rows · M columns" text; update it to either
rename the step string to reflect that it verifies chart data (e.g., "chart data
should be visible") or add a DOM assertion that the Chart component is actually
rendered (check for the Chart component/canvas element that VisualizationDisplay
renders when activeTab enables the Chart, or a canvas/svg selector within the
Chart component) so the test verifies visualization rendering rather than only
data-loaded text.

In @packages/connector-local/package.json:
- Around line 12-13: The package.json format scripts are inverted relative to
the root: currently "format" runs the check and "format:write" runs the write;
rename them to match the root convention by changing "format" to run the write
command (prettier --write .) and add/rename the check script to "format:check"
(prettier --check .); update any internal references or CI/tasks that call
"format" or "format:write" to use the new "format" (write) and "format:check"
names so behavior stays consistent.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/lib/local-csv-handler.ts (1)

59-73: Data source type consistently migrated to "local".

The data source type has been properly updated from "csv" to "local" throughout local-csv-handler.ts, and the connector registry correctly registers the localFileConnector with the "local" type. The types are properly defined in stores/types.ts.

However, update the documentation comment in registry.ts (line 72) from the outdated example showing 'csv' to 'local':

Suggested comment update
-     * @param id - Connector ID (e.g., 'csv', 'notion')
+     * @param id - Connector ID (e.g., 'local', 'notion')
🧹 Nitpick comments (7)
samples/README.md (2)

16-30: Add language identifier to fenced code block.

The code block would benefit from a language identifier for proper syntax highlighting and rendering.

🔎 Proposed fix
-```
+```text
 samples/

Based on static analysis hints.


36-62: Add language identifier to fenced code block.

The ASCII diagram should have a language identifier for consistent markdown rendering.

🔎 Proposed fix
-```
+```text
 ┌─────────────┐     ┌──────────────┐     ┌─────────────┐

Based on static analysis hints.

packages/csv/src/parser.test.ts (1)

117-166: Consider adding Unicode/UTF-8 test case.

Edge cases are well covered. For a data import utility, consider adding a test for Unicode content (e.g., non-ASCII characters, emojis) to ensure proper handling of international data.

🔎 Optional: Add Unicode test case
+    it("should handle Unicode content", () => {
+      const csv = "name,city\nAlice,東京\nBob,Zürich";
+      const result = parseCSV(csv);
+
+      expect(result).toEqual([
+        ["name", "city"],
+        ["Alice", "東京"],
+        ["Bob", "Zürich"],
+      ]);
+    });
packages/connector-local/src/connector.ts (1)

139-143: Preserve original error context for debugging.

The catch block discards the original parsing error, which could make debugging malformed JSON files harder. Consider including the original error message.

🔎 Proposed improvement
     try {
       jsonData = JSON.parse(text);
-    } catch {
-      throw new Error("Invalid JSON format: failed to parse file content");
+    } catch (e) {
+      const message = e instanceof Error ? e.message : "unknown error";
+      throw new Error(`Invalid JSON format: ${message}`);
     }
e2e/web/steps/data-sources.steps.ts (1)

106-117: Consider using string literal instead of RegExp constructor.

While the static analysis tool flags a potential ReDoS risk, this is a false positive since rowCount is typed as number from the Gherkin {int} syntax. However, using a RegExp constructor with template literals is unnecessary here.

🔎 Simpler alternative using string includes
 Then(
   "I should see the data table with {int} rows",
   async ({ page }, rowCount: number) => {
-    // Wait for row count indicator (format: "X rows • Y fields")
-    // Use .first() since this text appears multiple times on the page
     await expect(
-      page.getByText(new RegExp(`${rowCount} rows`)).first(),
+      page.getByText(`${rowCount} rows`, { exact: false }).first(),
     ).toBeVisible({
       timeout: 15_000,
     });
   },
 );

This avoids the RegExp constructor while achieving the same substring matching behavior.

packages/json/src/index.ts (2)

86-93: Add JSDoc comment for parseBoolean helper.

Per coding guidelines, all functions should have JSDoc documentation.

📝 Add JSDoc
+/**
+ * Parse a raw value as a boolean, handling common representations.
+ * Accepts: boolean true/false, case-insensitive "true", string/numeric "1".
+ */
 const parseBoolean = (raw: JsonPrimitive): boolean => {

Note: The existing comment on lines 83-85 provides this information but should be formatted as JSDoc.

As per coding guidelines, all new functions should have JSDoc comments.


99-126: Add JSDoc comment for parseValue helper.

Per coding guidelines, all functions should have JSDoc documentation.

📝 Add JSDoc
+/**
+ * Parse a raw value into the appropriate typed value for Arrow.
+ * For date columns, returns Date objects for proper Arrow serialization.
+ * 
+ * @param raw - The raw JSON primitive value
+ * @param type - The target column type
+ * @returns Parsed value suitable for Arrow, or null if parsing fails
+ */
 const parseValue = (raw: JsonPrimitive, type: ColumnType): unknown => {

Note: The existing comment on lines 96-98 provides this information but should be formatted as JSDoc.

As per coding guidelines, all new functions should have JSDoc comments.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0491bd5 and 5736af8.

⛔ Files ignored due to path filters (9)
  • bun.lock is excluded by !**/*.lock
  • samples/app-events.csv is excluded by !**/*.csv
  • samples/expenses.csv is excluded by !**/*.csv
  • samples/orders.csv is excluded by !**/*.csv
  • samples/revenue.csv is excluded by !**/*.csv
  • samples/sample-sales.csv is excluded by !**/*.csv
  • samples/sprints.csv is excluded by !**/*.csv
  • samples/subscriptions.csv is excluded by !**/*.csv
  • samples/users.csv is excluded by !**/*.csv
📒 Files selected for processing (46)
  • .github/workflows/ci.yml
  • .prettierignore
  • apps/web/lib/connectors/registry.ts
  • apps/web/lib/local-csv-handler.ts
  • apps/web/package.json
  • e2e/web/README.md
  • e2e/web/features/data-sources/local_file_upload.feature
  • e2e/web/features/workflows/json_to_chart.feature
  • e2e/web/fixtures/users_data.json
  • e2e/web/package.json
  • e2e/web/playwright.config.ts
  • e2e/web/steps/common.steps.ts
  • e2e/web/steps/data-sources.steps.ts
  • e2e/web/steps/visualizations.steps.ts
  • package.json
  • packages/connector-csv/src/connector.test.ts
  • packages/connector-csv/src/connector.ts
  • packages/connector-local/package.json
  • packages/connector-local/src/connector.test.ts
  • packages/connector-local/src/connector.ts
  • packages/connector-local/src/index.ts
  • packages/connector-local/tsconfig.json
  • packages/csv/.prettierignore
  • packages/csv/CHANGELOG.md
  • packages/csv/README.md
  • packages/csv/package.json
  • packages/csv/src/index.ts
  • packages/csv/src/parser.test.ts
  • packages/csv/src/parser.ts
  • packages/csv/tsconfig.json
  • packages/csv/vitest.config.ts
  • packages/json/.prettierignore
  • packages/json/CHANGELOG.md
  • packages/json/README.md
  • packages/json/package.json
  • packages/json/src/flatten.test.ts
  • packages/json/src/flatten.ts
  • packages/json/src/index.ts
  • packages/json/tsconfig.json
  • packages/json/vitest.config.ts
  • samples/README.md
  • samples/app-users.json
  • samples/departments.json
  • samples/employees.json
  • samples/projects.json
  • turbo.json
💤 Files with no reviewable changes (2)
  • packages/connector-csv/src/connector.ts
  • packages/connector-csv/src/connector.test.ts
✅ Files skipped from review due to trivial changes (6)
  • packages/json/.prettierignore
  • packages/json/README.md
  • samples/employees.json
  • packages/json/CHANGELOG.md
  • packages/json/tsconfig.json
  • packages/csv/tsconfig.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/web/package.json
  • apps/web/lib/connectors/registry.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Prefer pure functions over classes for data conversion, transformation, and utilities in TypeScript
Use classes and inheritance for entities with behavior when it makes sense for encapsulation and code organization
Use tRPC for APIs to proxy server-side calls and avoid CORS issues with external APIs
Import state management hooks from @dashframe/core, not directly from @dashframe/core-dexie, to keep components backend-agnostic
Dynamically import VegaChart with ssr: false to prevent SSR serialization issues with Vega-Lite Set objects
Use PascalCase for user-facing names (DashFrame), lowercase scope for packages (@dashframe/), and lowercase prefix for storage keys (dashframe:)
Always import UI components from @dashframe/ui package, not from individual component files
All UI elements on pages MUST use components from @dashframe/ui. If a needed component doesn't exist, add it to the UI package first before using it in pages.
Use shadcn/ui components for standard UI patterns (buttons, cards, dialogs, forms), shared components for DashFrame-specific patterns, and feature-specific components for one-off domain-specific UI
Design token enforcement: Use p-4 (compact), p-6 (standard), p-8 (spacious) for spacing; rounded-2xl for main cards, rounded-xl for nested, rounded-full for badges; h-4 w-4 for inline icons, h-5 w-5 for standalone
Use sentence case for text (except acronyms like CSV, API). No UPPERCASE text.
Extract components to @dashframe/ui when pattern appears or will appear in 3+ places, encapsulates meaningful UI logic, has clear semantic purpose, and include JSDoc documentation
Add JSDoc comments for all new functions, types, and components
Use Zustand stores for state persistence following the pattern in apps/web/lib/stores/
Use functional converter pattern (no classes/inheritance) for data conversion functions
Leverage TypeScript strict mode throughout the codebase
Use superjson transformer for tRPC configuration to support Date, Set, Map seri...

Files:

  • e2e/web/steps/visualizations.steps.ts
  • packages/connector-local/src/connector.ts
  • packages/csv/src/parser.ts
  • packages/csv/src/parser.test.ts
  • packages/json/src/flatten.test.ts
  • packages/connector-local/src/connector.test.ts
  • packages/csv/src/index.ts
  • e2e/web/playwright.config.ts
  • e2e/web/steps/common.steps.ts
  • packages/json/src/flatten.ts
  • packages/connector-local/src/index.ts
  • packages/json/src/index.ts
  • apps/web/lib/local-csv-handler.ts
  • packages/json/vitest.config.ts
  • e2e/web/steps/data-sources.steps.ts
**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Write tests for critical logic, especially for data handling. Include test plans for features with specs.

Files:

  • packages/csv/src/parser.test.ts
  • packages/json/src/flatten.test.ts
  • packages/connector-local/src/connector.test.ts
**/package.json

📄 CodeRabbit inference engine (AGENTS.md)

Use lowercase scope format for package names (e.g., @dashframe/*)

Files:

  • e2e/web/package.json
  • package.json
  • packages/json/package.json
  • packages/connector-local/package.json
  • packages/csv/package.json
🧠 Learnings (16)
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.test.{ts,tsx} : Write tests for critical logic, especially for data handling. Include test plans for features with specs.

Applied to files:

  • .prettierignore
  • packages/csv/src/parser.test.ts
  • packages/json/src/flatten.test.ts
  • packages/connector-local/src/connector.test.ts
  • e2e/web/README.md
  • packages/json/vitest.config.ts
  • e2e/web/steps/data-sources.steps.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Leverage TypeScript strict mode throughout the codebase

Applied to files:

  • .prettierignore
  • packages/json/src/index.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Packages export TypeScript source directly with `main: "src/index.ts"`, not compiled JS. Web app uses TypeScript path mappings for hot reload.

Applied to files:

  • .prettierignore
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Run 'bun check' (lint + typecheck + format) before committing code

Applied to files:

  • .github/workflows/ci.yml
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Use superjson transformer for tRPC configuration to support Date, Set, Map serialization. See lib/trpc/init.ts and routers/ for implementation.

Applied to files:

  • packages/json/src/flatten.ts
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Applies to **/package.json : Use lowercase scope format for package names (e.g., dashframe/*)

Applied to files:

  • packages/json/package.json
  • packages/csv/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Use PascalCase for user-facing names (DashFrame), lowercase scope for packages (dashframe/*), and lowercase prefix for storage keys (dashframe:*)

Applied to files:

  • packages/json/package.json
  • packages/connector-local/package.json
  • packages/csv/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Packages export from stub package dashframe/core-store which is aliased to dashframe/core-${NEXT_PUBLIC_STORAGE_IMPL} for backend selection

Applied to files:

  • packages/json/package.json
  • packages/connector-local/package.json
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Applies to **/*.{ts,tsx} : Use lowercase prefix 'dashframe:' for storage keys

Applied to files:

  • packages/json/package.json
  • packages/connector-local/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : All UI elements on pages MUST use components from dashframe/ui. If a needed component doesn't exist, add it to the UI package first before using it in pages.

Applied to files:

  • packages/json/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Extract components to dashframe/ui when pattern appears or will appear in 3+ places, encapsulates meaningful UI logic, has clear semantic purpose, and include JSDoc documentation

Applied to files:

  • packages/json/package.json
📚 Learning: 2025-12-30T20:23:01.376Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.376Z
Learning: Applies to **/*.{ts,tsx} : Import state management hooks from dashframe/core, not directly from dashframe/core-dexie, to keep components backend-agnostic

Applied to files:

  • packages/json/package.json
  • packages/connector-local/package.json
  • apps/web/lib/local-csv-handler.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Applies to **/*.{ts,tsx} : Always import UI components from dashframe/ui package, not from individual component files

Applied to files:

  • packages/json/package.json
  • packages/connector-local/package.json
  • packages/json/vitest.config.ts
📚 Learning: 2025-12-30T20:23:14.452Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-30T20:23:14.452Z
Learning: Applies to **/*.{ts,tsx} : Use PascalCase for user-facing component names (e.g., DashFrame)

Applied to files:

  • packages/json/package.json
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: DataFrame as central abstraction: CSV/Notion → DataFrame → Vega-Lite → Chart

Applied to files:

  • apps/web/lib/local-csv-handler.ts
📚 Learning: 2025-12-30T20:23:01.377Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.377Z
Learning: Create new data sources by: 1) Creating package in packages/<source>/ with converter function, 2) Adding tRPC router, 3) Extending stores, 4) Updating DataSourcesPanel.tsx

Applied to files:

  • apps/web/lib/local-csv-handler.ts
🧬 Code graph analysis (3)
packages/csv/src/parser.test.ts (2)
packages/csv/src/index.ts (1)
  • parseCSV (230-230)
packages/csv/src/parser.ts (1)
  • parseCSV (23-87)
packages/json/src/flatten.ts (1)
packages/json/src/index.ts (7)
  • JsonValue (324-324)
  • JsonPrimitive (323-323)
  • FlattenedObject (321-321)
  • FlattenOptions (322-322)
  • flattenObject (318-318)
  • extractKeys (317-317)
  • unflattenObject (320-320)
packages/json/src/index.ts (2)
packages/json/src/flatten.ts (5)
  • JsonValue (15-21)
  • FlattenOptions (38-58)
  • JsonPrimitive (27-27)
  • flattenObjectArray (239-266)
  • extractKeys (275-283)
packages/types/src/index.ts (5)
  • ColumnType (8-8)
  • DataFrame (19-19)
  • SourceSchema (14-14)
  • UUID (5-5)
  • TableColumn (11-11)
🪛 ast-grep (0.40.3)
e2e/web/steps/data-sources.steps.ts

[warning] 111-111: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(`${rowCount} rows`)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🪛 markdownlint-cli2 (0.18.1)
samples/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


36-36: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 150000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: E2E Tests
🔇 Additional comments (54)
packages/json/vitest.config.ts (1)

1-19: The resolve aliases are necessary and appropriately configured.

The source file (index.ts) imports from @dashframe/engine-browser, which requires these aliases for proper resolution during test execution. While the test file itself only imports local modules, the aliases ensure that any imports of the source code during testing can resolve these dependencies correctly. This is a standard and intentional pattern across all vitest configurations in the monorepo.

packages/csv/package.json (1)

2-2: LGTM! Package name follows naming convention.

The package name "@dashframe/csv" correctly uses the lowercase scope format as required by the coding guidelines.

.prettierignore (1)

12-13: LGTM! Appropriate exclusion for generated E2E artifacts.

Excluding the E2E-generated files from Prettier formatting is a standard best practice that prevents unnecessary formatting churn on auto-generated content.

samples/app-users.json (1)

1-146: LGTM! Well-structured sample data.

The sample data is consistent, well-formed, and provides diverse test scenarios across multiple countries, industries, and user segments.

samples/departments.json (1)

1-47: LGTM! Well-structured sample data.

The department sample data is consistent, well-formed, and complements the user data with organizational hierarchy information.

turbo.json (1)

3-4: Configuration is valid.

Both "ui": "stream" and "concurrency": "15" are correct per the turbo schema. The "ui" field defaults to "stream" (also accepts "tui"), and "concurrency" correctly uses a string value representing an integer ≥1 or a percentage.

package.json (1)

23-24: LGTM! Script reorganization improves clarity.

The separation of format (write) and format:check (verify) aligns with typical CI/CD workflows and makes the intent of each script more explicit.

packages/connector-local/src/index.ts (1)

1-8: LGTM! Clean barrel export pattern.

The package entry point follows standard practices with clear documentation and straightforward re-exports.

e2e/web/package.json (2)

10-12: LGTM! Simplified E2E test scripts.

The removal of the E2E_MODE=dev environment variable from test scripts simplifies the testing workflow while maintaining all necessary test modes.


21-23: LGTM! Dependencies align with connector refactor.

The dependency updates correctly reflect the consolidation from separate CSV/JSON connectors to a unified local connector with dedicated parsing utilities.

samples/projects.json (1)

1-67: LGTM! Sample data is well-structured.

The JSON structure is valid and consistent across all entries. The sample data appropriately mirrors the pattern used in other sample files (app-users.json, departments.json, employees.json) and provides good test coverage with varied status values (active, beta, maintenance) and diverse tech stacks.

.github/workflows/ci.yml (2)

32-32: LGTM! Improved format check with helpful diff output.

The conditional format check now shows developers exactly what formatting changes are needed when the check fails, improving the developer experience while still enforcing formatting standards.


37-69: LGTM! Well-configured E2E test pipeline.

The E2E test job is properly structured with:

  • Sequential execution after checks pass (prevents wasted CI time)
  • Frozen lockfile for reproducible builds
  • Chromium-only installation for faster CI runs
  • Proper artifact upload with if: always() to capture results even on test failures
  • Reasonable 7-day retention for test artifacts
apps/web/lib/local-csv-handler.ts (2)

2-2: Import path updated for package restructuring.

The import path change from @dashframe/connector-csv to @dashframe/csv aligns with the PR's refactoring to separate CSV parsing utilities from connector implementations. The function csvToDataFrame is correctly imported from the new package location.


195-209: Data source type consistently migrated in handleFileConnectorResult.

The data source type has been properly updated from "csv" to "local" in this function as well, maintaining consistency with handleLocalCSVUpload. Both functions now correctly reference the unified local file connector.

packages/csv/src/parser.ts (4)

1-28: LGTM! Well-documented parser with clear state management.

The function documentation is thorough with a helpful example, and the ESLint disable comment appropriately justifies why the cognitive complexity warning should be suppressed for this state-machine parser.


33-47: Quote handling logic is correct.

The parser properly handles:

  • Escaped quotes (doubled "") with appropriate loop counter advancement
  • End of quoted fields (single ")
  • Character accumulation within quoted fields

48-75: LGTM! Comprehensive line ending and field separator handling.

The parser correctly handles:

  • Quote transitions and field separators
  • All line ending variants (CRLF, LF, CR)
  • Empty row filtering to skip rows with all empty fields
  • Proper loop counter advancement for multi-character line endings

78-86: Final field/row handling is correct.

The parser properly handles the last field and row, even when the CSV file doesn't end with a newline. The consistent empty-row filtering ensures data quality.
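The behaviors praised above can be sketched as a single-pass state machine. This is a hypothetical reimplementation for illustration, not the package's actual `parseCSV` code; it covers quoted fields, doubled-quote escapes, CRLF/LF/CR line endings, empty-row filtering, and the trailing field without a final newline:

```typescript
// Minimal RFC 4180-style CSV parser sketch (illustrative, not the repo's code).
function parseCSV(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  const pushRow = (): void => {
    row.push(field);
    field = "";
    // Skip rows where every field is empty.
    if (row.some((f) => f !== "")) rows.push(row);
    row = [];
  };

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // Doubled quote "" is an escaped quote; consume both.
          i++;
        } else {
          inQuotes = false; // Single quote ends the quoted field.
        }
      } else {
        field += ch; // Commas and newlines are literal inside quotes.
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // CRLF advances two chars.
      pushRow();
    } else {
      field += ch;
    }
  }
  pushRow(); // Handle the last row when the file has no trailing newline.
  return rows;
}
```

For example, `parseCSV('a,b\n"x,1","he said ""hi"""')` yields `[["a", "b"], ["x,1", 'he said "hi"']]`.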

packages/csv/src/index.ts (1)

227-230: Export surface is correct and complete.

Verification confirms that both csvToDataFrame (line 93) and parseCSV (line 230) are properly exported from the package and actively used throughout the codebase. csvToDataFrame is imported in apps/web/lib/local-csv-handler.ts and packages/connector-local/src/connector.ts, while parseCSV is imported in packages/connector-local/src/connector.ts. The refactoring appropriately separates concerns by keeping the pure utility functions while removing the connector pattern.

packages/csv/src/parser.test.ts (4)

1-13: Well-structured test file with comprehensive JSDoc.

The test file header clearly documents the test coverage areas, and the imports are clean. This aligns well with the coding guidelines requiring JSDoc comments for new functions and tests for critical data handling logic.


15-38: Good coverage of standard CSV parsing scenarios.

The standard parsing tests cover the essential happy paths: simple CSV, empty input, and single row handling.


40-80: Thorough quoted field tests.

Good coverage of RFC 4180 CSV edge cases including commas in quotes, escaped quotes (doubled), newlines within quoted fields, and empty quoted fields.


82-115: Line ending coverage is complete.

Testing LF, CRLF, and CR line endings ensures cross-platform compatibility.

e2e/web/playwright.config.ts (2)

11-16: Clean external server detection logic.

The guard pattern with hasExternalServer ensures the non-null assertion on line 15 is safe. This approach provides flexibility for both CI (auto-build) and local development (use running server).


64-74: Good production build isolation.

Using NEXT_DIST_DIR=.next-e2e keeps the E2E build separate from the development .next directory, preventing conflicts. The 180-second timeout is appropriate for production builds that include asset optimization.

packages/connector-local/src/connector.ts (5)

1-23: Clear documentation and proper module organization.

The JSDoc header clearly explains the architecture and the imports are well-organized. Using a class here is appropriate since it extends FileSourceConnector for interface conformance, aligning with the coding guideline to "use classes and inheritance for entities with behavior."


32-42: Helper functions are well-designed.

getFileExtension handles edge cases (no extension returns empty string), and isSupportedExtension uses a type guard for type safety.


82-107: Clean parse dispatch logic.

The parse method properly validates file size, detects format, and uses a switch statement for type-safe dispatch. TypeScript's exhaustive checking ensures all supported extensions are handled.


168-172: Singleton pattern is appropriate for connector registry.

Exporting a singleton instance is the correct pattern for use in the connector registry, as documented in the example.


153-162: Array validation only checks the first item.

The validation only inspects jsonData[0], so heterogeneous arrays like [{valid: 1}, "string", 42] would pass. However, downstream processing in jsonToDataFrame handles non-objects gracefully by converting primitives to {value: ...} and normalizing columns across rows, so this does not cause functional failures. If stricter type enforcement is desired, validate all items upfront; otherwise, this is acceptable as-is.

Likely an incorrect or invalid review comment.
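The graceful handling described above can be sketched as a small normalization pass. The function name and exact behavior are assumptions mirroring the review's description, not the repo's actual code: non-object items become `{ value: ... }` rows so heterogeneous arrays still map to a tabular shape:

```typescript
// Sketch: coerce each array item to a record; wrap primitives/arrays as { value }.
function normalizeRecords(items: unknown[]): Record<string, unknown>[] {
  return items.map((item) =>
    item !== null && typeof item === "object" && !Array.isArray(item)
      ? (item as Record<string, unknown>)
      : { value: item },
  );
}

// normalizeRecords([{ valid: 1 }, "string", 42])
// → [{ valid: 1 }, { value: "string" }, { value: 42 }]
```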

e2e/web/README.md (2)

1-89: Comprehensive E2E testing documentation.

The README provides clear guidance on test structure, running modes (production vs development), filtering, and the E2E_PORT environment variable. This will help developers onboard quickly.


90-141: Good addition of workflow documentation.

The new sections for Adding New Tests, Test Fixtures, and Debugging provide practical guidance. The step-by-step workflow for adding tests is particularly helpful.

packages/json/src/flatten.ts (5)

1-34: Well-designed type definitions with clear documentation.

The types are well-documented with JSDoc comments. The separation of JsonPrimitive from JsonValue is a good design choice that makes the flattening semantics clear.


60-88: Robust helper functions.

isPlainObject correctly uses Object.prototype.toString.call() to distinguish plain objects from other object types (like Date, Map, etc.). This is a best practice for type checking in JavaScript.


165-217: Clean implementation of flattenObject with good edge case handling.

The function handles primitives at root, empty objects/arrays, and uses a recursive approach with proper depth tracking. The context pattern keeps the implementation clean.


239-266: Good key normalization in flattenObjectArray.

The two-pass approach (flatten all, then normalize keys) ensures consistent column structure across all objects. Filling missing keys with null is the right approach for DataFrame compatibility.
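The two-pass key normalization praised above can be sketched in isolation (a hedged illustration; the real `flattenObjectArray` flattens each item first, which this sketch skips):

```typescript
// Pass 1: collect the union of keys across all rows.
// Pass 2: fill missing keys with null so every row has identical columns.
function normalizeKeys(
  rows: Record<string, unknown>[],
): Record<string, unknown>[] {
  const allKeys = new Set<string>();
  for (const row of rows) {
    for (const key of Object.keys(row)) allKeys.add(key);
  }
  return rows.map((row) => {
    const out: Record<string, unknown> = {};
    for (const key of allKeys) out[key] = key in row ? row[key] : null;
    return out;
  });
}

// normalizeKeys([{ a: 1 }, { b: 2 }])
// → [{ a: 1, b: null }, { a: null, b: 2 }]
```

Filling with null (rather than dropping sparse keys) is what keeps the result directly loadable into a fixed-schema DataFrame.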


299-328: unflattenObject handles array reconstruction correctly.

The regex check /^\d+$/ for numeric indices correctly determines when to create arrays vs objects. The eslint disable for any is acceptable here given the dynamic nature of the reconstruction.

One edge case to note: if flattened data contains conflicting paths (e.g., both a.b: 1 and a.b.c: 2), the later key will overwrite the structure. This is likely acceptable given that flattenObject wouldn't produce such conflicts.

e2e/web/fixtures/users_data.json (1)

1-37: Good test fixture design.

The fixture uses example.com email addresses (RFC 2606 reserved domain), provides varied department values for aggregation testing, and has a mix of numeric and string fields suitable for chart visualization tests.

e2e/web/steps/common.steps.ts (1)

14-22: Appropriate timeout for file processing scenarios.

The 15-second timeout accommodates file upload, parsing (especially for larger JSON files), and navigation. The comment clearly explains the rationale. This is a reasonable default that should handle most cases without causing flaky tests.

e2e/web/features/workflows/json_to_chart.feature (1)

1-14: LGTM!

The feature file is well-structured with clear scenario steps that align with the JSON upload workflow. The use of @core and @workflow tags is appropriate for this critical user path.

e2e/web/features/data-sources/local_file_upload.feature (1)

1-36: LGTM!

This feature file provides comprehensive coverage of the local file upload functionality, including both happy paths (CSV and JSON) and error scenarios (unsupported format, empty CSV, invalid JSON). The scenarios are well-structured with appropriate tags and clear assertions.

packages/json/src/flatten.test.ts (1)

1-605: Excellent test coverage!

This test suite provides comprehensive coverage of the JSON flattening utilities with 57 well-organized test cases. The tests cover:

  • Core functionality (basic and deep flattening)
  • Edge cases (empty objects/arrays, null values, primitives at root)
  • Configuration options (separator, arrayHandling, maxDepth)
  • Complex real-world scenarios
  • Roundtrip operations (flatten → unflatten)

The test organization with clear describe blocks and descriptive test names makes the suite easy to navigate and maintain. This aligns well with the coding guideline to write tests for critical logic, especially for data handling.

Based on coding guidelines for test coverage.

packages/json/package.json (1)

6-7: No changes needed. This is intentional monorepo design—all packages export TypeScript source directly via main and types pointing to src/index.ts. This approach enables hot reload during development with TypeScript path mappings and is consistently applied across the entire package structure.

packages/connector-local/src/connector.test.ts (1)

1-374: LGTM! Comprehensive test coverage for LocalFileConnector.

This test suite provides excellent coverage of the LocalFileConnector functionality:

  • Static properties and configuration validation
  • File size boundary testing (100MB limit with exact boundary case)
  • Extension validation with appropriate error messages
  • CSV and JSON parsing delegation with proper mocking
  • Edge cases including empty files, invalid formats, and primitive values
  • Singleton instance verification

The mocking strategy appropriately isolates the connector logic from format-specific converters, and the test organization is clear and maintainable.

e2e/web/steps/data-sources.steps.ts (4)

4-7: LGTM! Proper ES module path resolution.

Using fileURLToPath and path.dirname correctly handles ES module paths for fixture file resolution.


15-24: LGTM! Clear file upload implementation.

The step correctly waits for the Local Files connector card with exact text matching to avoid ambiguity, then uploads from the fixtures directory.


29-73: LGTM! Good coverage of error scenarios.

These steps effectively test error handling by creating synthetic files inline with various invalid formats (unsupported extension, empty CSV, invalid JSON syntax).


124-135: LGTM! Robust column verification.

The step correctly verifies column presence by checking for the sortable button labels in the data table UI.

packages/json/src/index.ts (5)

1-27: LGTM! Clean imports and dependencies.

The imports are well-organized, bringing in necessary types from engine-browser, Arrow utilities, and local flatten functions.


29-44: LGTM! Clear type definitions.

The JSONData and JSONConversionOptions types provide a clean API surface with appropriate documentation.


50-67: Date detection heuristic is reasonable but not perfect.

The date inference logic (lines 56-62) uses Date.parse() combined with a regex check for separators and minimum length. While this heuristic works for most date formats, it could theoretically match non-date strings that pass Date.parse() validation.

This is acceptable for automatic type inference where perfect accuracy isn't required, and users can adjust field types manually if needed. The dual validation (parse check + pattern check) provides reasonable confidence.
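The dual validation can be sketched like this (an assumed shape, not the package's actual `inferType` internals; the minimum length and separator set are illustrative thresholds):

```typescript
// Treat a string as a date only if Date.parse accepts it AND it looks date-like.
// The pattern check filters out strings Date.parse is permissive about,
// e.g. the bare year "2020".
function looksLikeDate(value: string): boolean {
  const hasSeparator = /[-/:]/.test(value); // date/time separators
  return value.length >= 8 && hasSeparator && !Number.isNaN(Date.parse(value));
}
```

For stricter behavior (as the earlier review suggested), the regex could be tightened to a full ISO 8601 shape such as `/^\d{4}-\d{2}-\d{2}/`.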


174-310: Excellent implementation following functional converter pattern!

The jsonToDataFrame function is well-structured and follows the coding guidelines perfectly:

✅ Pure function approach for data conversion (not a class)
✅ Comprehensive JSDoc documentation with examples
✅ Proper type safety with TypeScript strict mode
✅ Handles edge cases (empty data, nested objects, type inference)
✅ Correct Arrow vector typing by column type
✅ IndexedDB storage integration with proper primary key detection

The step-by-step implementation is clear:

  1. Input normalization (array vs single object)
  2. Flattening nested structures
  3. Type inference from sample values
  4. ID column detection with reasonable heuristic
  5. Row creation with _rowIndex
  6. Schema and field generation
  7. Arrow table conversion with proper types
  8. DataFrame creation with IndexedDB storage

316-325: LGTM! Clean re-exports.

The flatten utilities are properly re-exported, maintaining a cohesive package API.

@youhaowei
Owner Author

Code review

Found 2 issues:

  1. No limit on array expansion creating columns - With arrayHandling: "index" (the default), a JSON array of 10,000 items creates 10,000 columns (items.0, items.1, ..., items.9999). Given the 100MB file size limit, users uploading real-world JSON with moderately-sized arrays could hit memory/performance issues. Consider adding a configurable column limit or warning.

function flattenArray(
  value: JsonValue[],
  prefix: string,
  currentDepth: number,
  ctx: FlattenContext,
  flattenValue: (v: JsonValue, p: string, d: number) => void,
): void {
  if (ctx.opts.arrayHandling === "stringify" || value.length === 0) {
    ctx.result[prefix] = JSON.stringify(value);
    return;
  }
  for (let i = 0; i < value.length; i++) {
    const key = buildKey(prefix, String(i), ctx.opts.separator);
    flattenValue(value[i], key, currentDepth + 1);
  }
}
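The configurable cap suggested in issue 1 could look like the sketch below. `maxColumns` is a hypothetical option (not part of the current API): when index-expansion would exceed the budget, the array falls back to a single stringified column instead of creating one column per item:

```typescript
// Expand an array into prefix.0, prefix.1, ... columns, but cap the expansion.
// maxColumns is a hypothetical safety limit; oversized arrays are stringified.
function expandArray(
  value: unknown[],
  prefix: string,
  result: Record<string, unknown>,
  maxColumns = 1000,
): void {
  if (value.length === 0 || value.length > maxColumns) {
    result[prefix] = JSON.stringify(value); // same fallback as "stringify" mode
    return;
  }
  value.forEach((item, i) => {
    result[`${prefix}.${i}`] = item;
  });
}
```

A user-facing warning when the cap is hit would make the fallback less surprising than silently stringifying.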

  2. Missing format scripts in packages/csv/package.json - The packages/json and packages/connector-local packages both have format and format:write scripts, but packages/csv is missing them. This causes inconsistency when running workspace-level format commands.

"scripts": {
"build": "tsc -p tsconfig.json",
"clean": "rm -rf dist *.tsbuildinfo",
"dev": "tsc -w -p tsconfig.json --preserveWatchOutput",
"lint": "eslint src",
"test": "vitest run",
"test:watch": "vitest",
"typecheck": "tsc --noEmit -p tsconfig.json"
},


Additional observations (lower confidence, for consideration):

  • Missing tests for conversion functions: csvToDataFrame and jsonToDataFrame are mocked in connector tests rather than tested directly. CLAUDE.md says "Write tests for critical logic, especially when it comes to handling data."
  • Vitest version inconsistency: packages/csv uses ^4.0.16 while packages/json and packages/connector-local use ^3.2.4
  • Date type inference: The inferType function uses Date.parse() which is permissive; consider stricter ISO 8601 validation

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

@youhaowei
Owner Author

@coderabbitai re-review

@coderabbitai

coderabbitai bot commented Jan 5, 2026

@youhaowei, I'll review the changes in this PR.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @e2e/web/steps/visualizations.steps.ts:
- Around line 6-15: The test’s SVG check is too broad: replace the generic
page.locator("svg") in the Then("I should see the chart rendered", async ({ page
}) => { ... }) step with a scoped selector that targets the chart container
(e.g., page.locator('[data-testid="visualization-chart"] svg') or
page.locator('.Chart > svg')), and if the Chart component doesn’t already have a
stable hook add a data-testid="visualization-chart" attribute to the Chart
component container so the test reliably selects the chart SVG; keep the same
visibility assertion and timeout.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8d3f54 and ebf93f9.

📒 Files selected for processing (3)
  • e2e/web/steps/visualizations.steps.ts
  • packages/connector-local/package.json
  • packages/json/package.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/json/package.json
  • packages/connector-local/package.json
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Prefer pure functions over classes for data conversion, transformation, and utilities in TypeScript
Use classes and inheritance for entities with behavior when it makes sense for encapsulation and code organization
Use tRPC for APIs to proxy server-side calls and avoid CORS issues with external APIs
Import state management hooks from @dashframe/core, not directly from @dashframe/core-dexie, to keep components backend-agnostic
Dynamically import VegaChart with ssr: false to prevent SSR serialization issues with Vega-Lite Set objects
Use PascalCase for user-facing names (DashFrame), lowercase scope for packages (@dashframe/), and lowercase prefix for storage keys (dashframe:)
Always import UI components from @dashframe/ui package, not from individual component files
All UI elements on pages MUST use components from @dashframe/ui. If a needed component doesn't exist, add it to the UI package first before using it in pages.
Use shadcn/ui components for standard UI patterns (buttons, cards, dialogs, forms), shared components for DashFrame-specific patterns, and feature-specific components for one-off domain-specific UI
Design token enforcement: Use p-4 (compact), p-6 (standard), p-8 (spacious) for spacing; rounded-2xl for main cards, rounded-xl for nested, rounded-full for badges; h-4 w-4 for inline icons, h-5 w-5 for standalone
Use sentence case for text (except acronyms like CSV, API). No UPPERCASE text.
Extract components to @dashframe/ui when pattern appears or will appear in 3+ places, encapsulates meaningful UI logic, has clear semantic purpose, and include JSDoc documentation
Add JSDoc comments for all new functions, types, and components
Use Zustand stores for state persistence following the pattern in apps/web/lib/stores/
Use functional converter pattern (no classes/inheritance) for data conversion functions
Leverage TypeScript strict mode throughout the codebase
Use superjson transformer for tRPC configuration to support Date, Set, Map seri...

Files:

  • e2e/web/steps/visualizations.steps.ts
🧠 Learnings (1)
📚 Learning: 2025-12-30T20:23:01.389Z
Learnt from: CR
Repo: youhaowei/DashFrame PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T20:23:01.389Z
Learning: Applies to **/*.{ts,tsx} : Dynamically import VegaChart with `ssr: false` to prevent SSR serialization issues with Vega-Lite Set objects

Applied to files:

  • e2e/web/steps/visualizations.steps.ts
⏰ Context from checks skipped due to timeout of 150000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: E2E Tests

@greptile-apps

greptile-apps bot commented Jan 6, 2026

Greptile Summary

This PR successfully adds JSON file import capabilities to DashFrame through a well-architected connector system. The implementation extracts CSV parsing into a reusable package, creates comprehensive JSON flattening utilities, and unifies both formats under a single local file connector.

Key architectural improvements:

  • Clean separation: @dashframe/csv (parser utilities), @dashframe/json (flattening + conversion), @dashframe/connector-local (unified orchestration)
  • Nested JSON support with automatic flattening to dot-notation keys (e.g., user.address.city)
  • Proper type inference for JSON values (string/number/boolean/date) with Arrow table creation
  • Enhanced column name remapping with case-insensitive matching and field name fallback for visualization

Testing & validation:

  • 87 unit tests (57 for flattening, 30 for conversion logic)
  • E2E test coverage for JSON-to-chart workflow
  • All tests passing (bun check with 47 tasks)

Code quality observations:

  • Excellent JSDoc documentation throughout
  • Follows functional programming patterns per CLAUDE.md guidelines
  • Proper error handling with clear user-facing messages
  • Removed debug logging in production code

Confidence Score: 5/5

  • This PR is safe to merge with high confidence - well-tested, clean architecture, and no breaking changes
  • Score reflects comprehensive test coverage (87 unit tests + E2E tests), clean architectural separation following monorepo best practices, proper type safety throughout, excellent documentation, and successful CI validation. The code follows all repository guidelines from CLAUDE.md including functional patterns, proper error handling, and component reusability.
  • No files require special attention

Important Files Changed

Filename Overview
packages/json/src/flatten.ts Well-structured flattening utilities with comprehensive JSDoc, proper type guards, and clean separation of concerns
packages/json/src/index.ts Solid JSON-to-DataFrame converter with proper type inference, Arrow table creation, and IndexedDB storage integration
packages/csv/src/parser.ts Extracted CSV parser with state machine logic for handling quoted fields and various line endings
packages/connector-local/src/connector.ts Unified local file connector that cleanly delegates to format-specific parsers based on extension
apps/web/lib/connectors/registry.ts Updated registry to use new localFileConnector replacing previous CSV-only connector
apps/web/app/insights/[insightId]/_components/InsightView.tsx Enhanced column name remapping with case-insensitive matching and field name fallback; removed debug logging

Sequence Diagram

sequenceDiagram
    participant User
    participant LocalConnector as Local File Connector
    participant Parser as Format Parser<br/>(CSV/JSON)
    participant Flatten as JSON Flattener
    participant DataFrame as DataFrame Builder
    participant IndexedDB as IndexedDB Storage
    participant Viz as Visualization Engine

    User->>LocalConnector: Upload file (.csv/.json)
    LocalConnector->>LocalConnector: Validate file size & extension
    
    alt CSV File
        LocalConnector->>Parser: parseCSV(text)
        Parser-->>LocalConnector: 2D string array
        LocalConnector->>DataFrame: csvToDataFrame(data)
    else JSON File
        LocalConnector->>Parser: JSON.parse(text)
        Parser-->>LocalConnector: JSON objects
        LocalConnector->>Flatten: flattenObjectArray(jsonData)
        Flatten->>Flatten: Flatten nested structures<br/>with dot-notation
        Flatten-->>LocalConnector: Flat rows with consistent keys
        LocalConnector->>DataFrame: jsonToDataFrame(flatData)
    end
    
    DataFrame->>DataFrame: Infer column types<br/>(string/number/boolean/date)
    DataFrame->>DataFrame: Create Arrow Table<br/>with typed vectors
    DataFrame->>IndexedDB: Store Arrow IPC buffer
    IndexedDB-->>DataFrame: dataFrameId
    DataFrame-->>LocalConnector: FileParseResult<br/>(fields, schema, rowCount)
    LocalConnector-->>User: Redirect to insight page
    
    User->>Viz: Request chart suggestions
    Viz->>IndexedDB: Query DataFrame analysis
    IndexedDB-->>Viz: Column statistics
    Viz-->>User: Display suggested charts
    
    User->>Viz: Create visualization
    Viz->>IndexedDB: Execute DuckDB query
    IndexedDB-->>Viz: Query results
    Viz-->>User: Render chart (SVG)

@greptile-apps

greptile-apps bot commented Feb 3, 2026

Too many files changed for review. (123 files found, 100 file limit)

Replace Gherkin/BDD test structure with native Playwright specs:

- Remove playwright-bdd, @cucumber/cucumber dependencies
- Delete .feature files and step definitions
- Add lib/test-fixtures.ts with custom Playwright fixtures:
  - homePage() - Navigate and verify home page loaded
  - uploadFile(fileName) - Upload from fixtures directory
  - uploadBuffer(name, content, mimeType) - Upload in-memory content
  - waitForChart() - Wait for chart data and SVG render
- Add new spec files:
  - csv-to-chart.spec.ts - CSV upload workflow
  - json-to-chart.spec.ts - JSON upload workflow
  - chart-editing.spec.ts - Chart type switching
  - dashboard.spec.ts - Dashboard creation
  - error-handling.spec.ts - Invalid file handling
- Update playwright.config.ts to use tests/ directory
- Simplify CLAUDE.md E2E documentation

Benefits:
- Simpler test structure without BDD abstraction layer
- Better TypeScript support and IDE autocomplete
- Reusable fixtures for common test actions
- Easier debugging with native Playwright tooling

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
youhaowei and others added 2 commits February 2, 2026 19:45
Add validation to vgplot-renderer that checks for required X/Y channels
before attempting to render. Previously, charts would crash with
"Cannot destructure property 'column' of 'mark.channelField(...)' as it
is null" when users hadn't selected all required axes.

Changes:
- Add validateEncoding() to check for required channels
- Add renderIncompleteEncoding() to show friendly "Select X axis" message
- Add tests for empty string color (None selection) and missing X axis
- Update CLAUDE.md to use `bun run test` instead of `bun test`

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Dexie database was being instantiated at module load time, which
caused a 500 error on Vercel when Next.js SSR imported the module tree.
IndexedDB doesn't exist on the server, causing the crash.

Fix: Use a Proxy pattern to lazily initialize the database only when
first accessed in a browser environment. This maintains backward
compatibility with all existing code while preventing SSR errors.

Also adds explicit note in CLAUDE.md that the project uses Bun.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
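The Proxy pattern this commit describes can be sketched as follows. `createDatabase` and the `Database` shape are assumptions for illustration (the repo's actual type is a Dexie subclass); the key idea is that construction is deferred until the first property access, which can only happen in an environment that has IndexedDB:

```typescript
type Database = { table(name: string): unknown };

// Detect IndexedDB without assuming a DOM type environment.
const hasIndexedDB = (): boolean =>
  typeof (globalThis as { indexedDB?: unknown }).indexedDB !== "undefined";

// Return a stand-in object; the real database is built lazily on first access,
// so merely importing this module during SSR never touches IndexedDB.
function createLazyDb(createDatabase: () => Database): Database {
  let instance: Database | null = null;
  return new Proxy({} as Database, {
    get(_target, prop) {
      if (!hasIndexedDB()) {
        // SSR/Node has no IndexedDB; fail loudly here. (A later commit in this
        // PR swaps this for no-op tables during SSG instead of throwing.)
        throw new Error("Database accessed outside the browser");
      }
      instance ??= createDatabase(); // construct exactly once, on demand
      return Reflect.get(instance, prop);
    },
  });
}
```

Existing callers keep working unchanged because the Proxy forwards every property access to the real instance once it exists.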
IndexedDB only exists in the browser. During SSG/SSR, pages that use
useLiveQuery would fail because there's no IndexedDB.

Changes:
- Use dynamic import with ssr:false for InsightPageContent
- Improve db proxy to return no-op tables during SSG instead of throwing

This ensures the insight page shell is statically generated while
data fetching happens purely client-side.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ects

Barrel exports (index.ts) cause entire module trees to load. Importing
LoadingView from "./_components" also loaded InsightView which imports
@dashframe/core, causing IndexedDB access during SSR.

Fix: Import LoadingView directly from its file to avoid loading the
entire barrel export chain.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace `use(params)` with `useParams()` to ensure the page component
runs entirely client-side, avoiding any server-side execution that
could trigger IndexedDB access.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The FUNCTION_INVOCATION_FAILED error occurred because:
1. Dynamic routes run as serverless functions on Vercel
2. The function imported modules that access IndexedDB at load time
3. IndexedDB doesn't exist in Node.js → crash

Solution:
- Add `export const dynamic = "force-static"` to make page static
- Split into server component (page.tsx) + client component wrapper
- Server component only passes params, no browser API access
- Client component uses dynamic import with ssr:false for data layer

Route changed from ƒ (Dynamic/serverless) to ○ (Static).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Standardize SSR handling across all dynamic routes by using
dynamic() with ssr:false instead of force-static. This approach
renders a skeleton during SSR and loads IndexedDB-dependent
content client-side only.

Changes:
- Extract page content to _components/ for all dynamic routes
- Remove force-static and server/client component split pattern
- Delete unused InsightPageClient.tsx and LoadingView.tsx

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace `dynamic import + ssr:false` pattern with simpler `force-static`
server component pattern for all dynamic routes:
- dashboards/[dashboardId]
- data-sources/[sourceId]
- insights/[insightId]
- insights/[insightId]/join/[tableId]
- visualizations/[visualizationId]

The force-static approach is cleaner:
- Server component pages pass params to client content components
- No need for dynamic() wrapper or Suspense boundaries
- Next.js pre-renders empty shell at build time
- Client components hydrate and load IndexedDB data on mount

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
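The force-static pattern settled on above can be sketched like this; the file path, component names, and param shape are assumptions, not the actual repo code:

```tsx
// Hypothetical: app/insights/[insightId]/page.tsx
// Server component: pre-rendered as a static shell, no browser API access.
export const dynamic = "force-static";

export default async function Page({
  params,
}: {
  params: Promise<{ insightId: string }>;
}) {
  const { insightId } = await params;
  // Only forwards the param; the client component (assumed to live in
  // ./_components) hydrates and loads IndexedDB data on mount.
  return <InsightPageContent insightId={insightId} />;
}
```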
youhaowei and others added 2 commits February 3, 2026 22:59
- Added ignore rule for E2E build artifacts in ESLint configuration to prevent linting of generated files.
- Modified Next.js configuration to use absolute paths for storage implementation, resolving potential dependency issues.
- Updated build script in package.json to include Webpack in the build process.
- Multi-server setup for local parallel execution (each worker gets own port)
- IndexedDB cleared before each test for clean state
- CI mode: single worker with single server for reliability
- Local mode: up to 6 parallel workers with separate servers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
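A hedged sketch of what the worker/server split might look like in `playwright.config.ts`; the worker count, base port, and fixture-based server startup are assumptions inferred from the commit message:

```typescript
// Hypothetical playwright.config.ts fragment — values are assumptions.
import { defineConfig } from "@playwright/test";

const isCI = !!process.env.CI;

export default defineConfig({
  // CI: single worker with a single server for reliability.
  // Local: up to 6 parallel workers, each with its own server.
  workers: isCI ? 1 : 6,
  // In the multi-server setup, per-worker servers are presumably started
  // from a worker fixture, with baseURL derived from the worker index
  // (e.g. `http://localhost:${basePort + workerIndex}`), so each worker's
  // origin-scoped IndexedDB is isolated.
});
```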
- Clear IndexedDB once per worker instead of per test
- Removes extra page load/reload overhead from homePage
- Worker-scoped auto fixture runs before any tests in worker

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
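A worker-scoped auto fixture along the lines described above might look like this; the fixture name, navigation target, and cleanup logic are assumptions:

```typescript
// Hypothetical sketch of a worker-scoped auto fixture that clears
// IndexedDB once per worker rather than before every test.
import { test as base } from "@playwright/test";

export const test = base.extend<object, { clearDb: void }>({
  clearDb: [
    async ({ browser }, use) => {
      // Runs once before any tests in this worker.
      const page = await browser.newPage();
      await page.goto("/"); // assumes baseURL is configured
      await page.evaluate(async () => {
        const dbs = await indexedDB.databases();
        for (const db of dbs) {
          if (db.name) indexedDB.deleteDatabase(db.name);
        }
      });
      await page.close();
      await use();
    },
    { scope: "worker", auto: true },
  ],
});
```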
- Remove E2E_PORT option that bypassed build and shared IndexedDB
- Simplify config to always use isolated production builds
- Update documentation to reflect new parallel execution model

This ensures consistent test isolation via port-per-worker approach.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@youhaowei youhaowei merged commit c7a96d0 into main Feb 4, 2026
5 checks passed
@youhaowei youhaowei deleted the auto-claude/002-add-json-file-connector branch February 4, 2026 06:39