feat(cli): add sync observability with per-table progress and timing #207
Merged
MatLBS merged 4 commits into getnao:main on Feb 16, 2026
…d slow query warnings

During `nao sync` on large databases, the progress bar appears stuck because it only updates per-schema with no visibility into per-table progress or query timing. This makes it impossible to tell if the sync is working or hung.

Add timing instrumentation and granular progress reporting:
- Connection and schema discovery timing
- Per-schema summary (matched vs total tables, list duration)
- Live table name in progress bar showing current work
- Elapsed time column as a built-in heartbeat
- Slow query warnings (>5s) with accessor name and duration
- Error context showing which accessor failed and how long it ran
- Per-schema completion summary with table count, duration, error count
- Total sync duration in final summary

Fixes getnao#206

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d slow query warnings

During `nao sync` on large databases, the progress bar updates per-schema but appears stuck within a schema because there's no per-table feedback. Users can't tell if the sync is working or hung.

Add timing instrumentation and granular progress reporting:
- Connection and schema discovery timing
- Per-schema summary (matched vs total tables, list duration)
- Live table name in progress bar showing current work
- Elapsed time column as a built-in heartbeat
- Table count column (329/416) showing completed vs total
- Slow query warnings (>5s) with accessor name and duration
- Error context showing which accessor failed and how long it ran
- Per-schema completion summary with table count, duration, error count
- Total sync duration in final summary

Fixes getnao#206

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously `get_schemas()` returned all schemas from the database, and filtering only happened at the table level via `matches_pattern()`. This meant nao would query `list_tables()` for schemas that would ultimately have zero matching tables, wasting time on unnecessary Snowflake calls.

Add `_schema_matches()` to extract the schema prefix from include/exclude patterns (e.g., `ANALYTICS` from `ANALYTICS.*`) and filter schemas before entering the per-table sync loop.

Refs getnao#206

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
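The pre-filtering idea in this commit can be sketched roughly as follows. This is an illustration only: `schema_prefixes()` and `schema_matches()` are hypothetical stand-ins, since the real `_schema_matches()` signature and matching rules are not shown in this PR text. The sketch assumes patterns look like `SCHEMA.TABLE`, that matching is case-insensitive, and that an exclude pattern drops a schema only when it covers the whole schema.

```python
from fnmatch import fnmatch


def schema_prefixes(patterns):
    """Return the schema part of each 'SCHEMA.TABLE' style pattern."""
    return [p.split(".", 1)[0] for p in patterns]


def schema_matches(schema, include=(), exclude=()):
    """Decide whether a schema is worth a list_tables() call.

    Hypothetical stand-in for the PR's _schema_matches(); the matching
    rules here (case-insensitive, glob-style) are assumptions.
    """
    name = schema.upper()
    # Drop a schema only when an exclude pattern covers all of it,
    # e.g. "RAW.*". A partial pattern like "RAW.TMP_*" keeps the schema
    # so table-level filtering can still run.
    for pattern in exclude:
        prefix, _, table = pattern.partition(".")
        if table in ("", "*") and fnmatch(name, prefix.upper()):
            return False
    # With no include patterns, every remaining schema is kept.
    if not include:
        return True
    return any(fnmatch(name, p.upper()) for p in schema_prefixes(include))
```

With `include=["ANALYTICS.*"]`, only the `ANALYTICS` schema passes the filter, so `list_tables()` would not be called for the others.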
The sync observability changes added multiple `console.print` calls (connection timing, schema discovery, per-schema summary, etc.), so tests that expected a single print call or checked only the last call now need to search through all calls to find the error line.

Add `find_print_call_containing()` helper to search all `console.print` calls for the expected error marker, replacing assertions on `call_args` (last call) and `assert_called_once`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
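A helper along these lines would do what the commit describes. The name comes from the commit message, but the signature and body below are assumptions, written against a `unittest.mock.MagicMock` console rather than the project's actual test fixtures.

```python
from unittest.mock import MagicMock


def find_print_call_containing(mock_console, marker):
    """Return the first console.print call whose first positional
    argument contains `marker`, or None if no call matches.

    Sketch only: the PR's real helper may take different arguments.
    """
    for call in mock_console.print.call_args_list:
        if call.args and marker in str(call.args[0]):
            return call
    return None


# Usage: the error line is no longer the only (or last) print call.
console = MagicMock()
console.print("Connected to my-db (1.9s)")
console.print("✗ TRANSFORM.BAD_TABLE preview failed after 3.5s")
console.print("654 tables across 3 datasets in 50m11s")

error_call = find_print_call_containing(console, "✗")
```

Searching `call_args_list` keeps the assertion independent of how many status lines are printed before or after the error.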
Force-pushed from bc209c5 to b0b4f29
Contributor
Hi ealexisaraujo 👋, thank you so much for your PR, I tried it and everything works well
Summary
Improve `nao sync` observability so users always know what the CLI is doing during long-running database syncs, and pre-filter schemas to avoid unnecessary Snowflake queries.

Changes in `provider.py`
- Live table name in progress bar: `SCHEMA → TABLE_NAME`, so users always see which table is being synced
- `MofNCompleteColumn`: progress bar shows `329/416` (completed/total) instead of just a percentage
- `TimeElapsedColumn`: elapsed time ticks in real time as a built-in heartbeat, so users know the sync is alive even during slow queries
- Connection timing: how long `db_config.connect()` takes (e.g., `Connected to my-db (1.9s)`)
- Schema discovery timing (e.g., `Found 3 schemas (297ms)`)
- Per-schema summary: matched vs total tables and `list_tables()` duration (e.g., `▸ TRANSFORM — 416 tables (of 416 total, listed in 1.2s)`)
- Slow query warnings (>5s) with accessor name and duration (e.g., `⏱ TRANSFORM.LARGE_TABLE description took 1m16s`)
- Error context showing which accessor failed and how long it ran (e.g., `✗ TRANSFORM.BAD_TABLE preview failed after 3.5s: Numeric value...`)
- Per-schema completion summary with table count, duration, and error count (e.g., `✓ TRANSFORM — 416 tables synced in 38m0s (1 errors)`)
- Total sync duration in final summary (e.g., `654 tables across 3 datasets in 50m11s`)

Changes in `snowflake.py`
- `_schema_matches()` method extracts schema prefixes from `include`/`exclude` patterns (e.g., `ANALYTICS` from `ANALYTICS.*`) and filters schemas before entering the per-table sync loop
- Avoids `list_tables()` calls to Snowflake for schemas that will have zero matching tables, reducing unnecessary API calls and warehouse cost

Motivation
When running `nao sync` on large databases (e.g., Snowflake with 100B+ row tables), several problems make the CLI difficult to use:
- Users cannot tell whether the sync is running a slow `COUNT(*)` or has crashed
- `get_schemas()` returns all schemas from the database even when `include` patterns limit to a subset, causing unnecessary `list_tables()` calls for non-matching schemas

Why these accessors are slow
The three default templates have very different performance characteristics:
- `columns`: `INFORMATION_SCHEMA.COLUMNS` metadata
- `description`: `SELECT COUNT(*) FROM table`
- `preview`: `SELECT * FROM table LIMIT 10`

Without timing, users had no visibility into which accessor was causing the sync to appear hung.
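For illustration, per-accessor timing with a slow-query warning could look roughly like the sketch below. The 5-second threshold and the message shapes are taken from this PR's description; `format_duration()` and `run_accessor()` are hypothetical names, and the injectable `warn` and `clock` parameters are assumptions added so the sketch can be tested without a database.

```python
import time

SLOW_QUERY_SECONDS = 5.0  # threshold from the PR's ">5s" rule


def format_duration(seconds):
    """Render a duration in the style of the PR's examples:
    297ms, 1.9s, 1m16s."""
    if seconds < 1:
        return f"{round(seconds * 1000)}ms"
    if seconds < 60:
        return f"{seconds:.1f}s"
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"


def run_accessor(table, accessor, query, warn=print, clock=time.monotonic):
    """Run one accessor query, reporting slow or failed runs with timing.

    Hypothetical helper: `query` is any zero-argument callable.
    """
    start = clock()
    try:
        result = query()
    except Exception as exc:
        # Error context: which accessor failed and how long it ran.
        elapsed = format_duration(clock() - start)
        warn(f"✗ {table} {accessor} failed after {elapsed}: {exc}")
        return None
    elapsed = clock() - start
    if elapsed > SLOW_QUERY_SECONDS:
        warn(f"⏱ {table} {accessor} took {format_duration(elapsed)}")
    return result
```

Using a monotonic clock keeps the measured duration unaffected by system clock adjustments during a long sync.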
Example output (before vs after)
Before
(No visibility into which table, which accessor, how long, or whether it's still alive)
After
Files changed
- `cli/nao_core/commands/sync/providers/databases/provider.py`: `MofNCompleteColumn`, `TimeElapsedColumn`, slow query warnings, error context, schema/total summaries
- `cli/nao_core/config/databases/snowflake.py`: `_schema_matches()` for schema-level pre-filtering based on include/exclude patterns

Test plan
- `nao sync -p databases` against a Snowflake database with multiple schemas and hundreds of tables
- `MofNCompleteColumn` shows `N/M` count (e.g., `329/416`)
- `make lint` passes (ty, ruff check, ruff format)
- Empty `include`/`exclude` lists still return all schemas (no regression)

Fixes #206
🤖 Generated with Claude Code