dbtwiz

Python package with a CLI helper tool for dbt in GCP using BigQuery. Although some functions are platform independent, the majority assume that GCP and BigQuery are used.

Installation

pip install git+https://github.com/amedia/dbtwiz

Subcommands

These are the available subcommands for dbtwiz. You can also run dbtwiz --help (or dbtwiz -h) to list the commands in more detail.

  • model - Create, validate, and manage dbt models
    • create - Create new dbt model.
    • fix - Run sqlfmt and sqlfix for staged and/or defined sql files.
    • inspect - Output information about a given model.
    • lint - Run sqlfmt --diff and sqlfluff lint for staged and/or defined sql files.
    • move - Move a model by copying it to a new location with a new name, and/or by updating the references to it in other dbt models.
    • validate - Validate the yml and sql files for a model.
  • source - Create and manage dbt sources
    • create - Create new dbt source
  • build - Build one or more dbt models with interactive selection or exact names.
  • test - Test dbt models with optional date specification.
  • manifest - Update dbt manifests for fast lookup and caching.
  • admin - Production backfilling and administrative tasks
    • backfill - Backfill date-partitioned models in production for a specified date range. Spawns Cloud Run jobs to process multiple dates in parallel with configurable batch sizes.
    • cleandev - Delete all materializations in the dbt development dataset
    • orphaned - List or delete orphaned materializations in the data warehouse
    • partition-expiry - Check for mismatched partition expiry and optionally update it to the correct value.
    • restore - Restore a deleted BigQuery table from a snapshot using time travel.

Configuration

Project config

Depending on the subcommand, the tool looks for certain configuration settings in a pyproject.toml file.

If you run a command that needs a setting that is missing, the tool will warn you, so you don't need to add all of them before they become relevant.

[tool.dbtwiz.project]
# Config for default number of days per batch when running backfill
backfill_default_batch_size = 30
# Config for bucket containing dbt manifest.json at the top level
bucket_state_project = ""           # Project name for bucket
bucket_state_identifier = ""        # Bucket name

# Config for service account used for backfill and cleanup of orphaned models in prod
service_account_project = ""        # Project name for where service account actions are run
service_account_identifier = ""     # Name of service account
service_account_region = ""         # Region for where service account actions are run

# Config for user actions
user_project = ""                   # Project name for where user queries are run

# Config for docker image used for backfill
docker_image_url_dbt = ""           # Url for docker image
docker_image_profiles_path = ""     # Path to profiles dir in docker image
docker_image_manifest_path = ""     # Path to manifest in docker image

# Config for orphaned dbt models cleanup
orphan_cleanup_bq_region = ""       # The region where data is materialized (e.g. region-eu)
orphan_cleanup_projects = [""]      # Which projects to look for orphaned models in (e.g. prod)
orphan_cleanup_skip_projects = [""] # Which projects not to look for orphaned models in (e.g. dev)

User config

The default configuration of dbtwiz will be installed the first time you run it, but you may want to adjust some settings from the get-go to fit your environment.

The config settings are stored in a file config.toml in the dbtwiz folder within your user's app settings directory:

  • ~/.config/dbtwiz/config.toml for GitHub Codespaces or local Linux environments
  • %appdata%/dbtwiz/config.toml for Windows
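
The two locations above can be resolved with a few lines of Python. This is a minimal sketch under stated assumptions, not dbtwiz's actual lookup code: user_config_path is a hypothetical name, and treating every non-Windows platform like Linux is an assumption that goes beyond the two cases listed.

```python
import os
import sys
from pathlib import Path


def user_config_path(platform: str = sys.platform, env=os.environ) -> Path:
    """Return the expected location of the dbtwiz user config file.

    Hypothetical helper for illustration only.
    """
    if platform.startswith("win"):
        # Windows: %appdata%/dbtwiz/config.toml
        base = Path(env["APPDATA"])
    else:
        # Linux and GitHub Codespaces: ~/.config/dbtwiz/config.toml
        # (assumption: other non-Windows platforms are handled the same way)
        base = Path(env["HOME"]) / ".config"
    return base / "dbtwiz" / "config.toml"


print(user_config_path())
```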

If dbtwiz looks up a user config file and it doesn't exist, it will be created with default settings according to your platform. It might look like this:

# When true, check for existing GCP auth token, and ask for
# automatic reauthentication if needed.
auth_check = true

# Command for opening model source files in editor, with empty
# curly braces where the file path should be inserted. If curly
# braces are left out, the file name will be appended at the end.
# Some examples:
# - Visual Studio Code: "code {}"
# - Emacs (with running server): "emacsclient -n {}"
editor_command = "code {}"

# Enable debug logging of some internal dbtwiz operations. You won't
# need this unless you're working on or helping troubleshoot dbtwiz.
log_debug = false

# Command for showing prerendered model info files in the interactive
# fzf-based selector. A sensible default is chosen based on the
# current platform.
sql_formatter = "fmt -s"

# Set to "light" to use a color scheme suitable for a light background,
# or to "dark" for better contrasts against a dark background.
theme = "light"
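
The placeholder rule described for editor_command (insert the file path at the curly braces, or append it when they are left out) could be implemented roughly as follows. This is an illustrative sketch, not dbtwiz's own code: build_editor_command is a hypothetical name, and the quoting uses POSIX shell rules from shlex, which may not suit Windows shells.

```python
import shlex


def build_editor_command(editor_command: str, file_path: str) -> list[str]:
    """Expand an editor_command setting into an argument list.

    Hypothetical helper for illustration only.
    """
    quoted = shlex.quote(file_path)  # keeps paths with spaces as one argument
    if "{}" in editor_command:
        # Insert the path where the empty curly braces are.
        command = editor_command.replace("{}", quoted)
    else:
        # No placeholder: append the path at the end.
        command = f"{editor_command} {quoted}"
    return shlex.split(command)


print(build_editor_command("code {}", "models/my model.sql"))
print(build_editor_command("emacsclient -n {}", "models/orders.sql"))
print(build_editor_command("vim", "models/orders.sql"))
```

The resulting list can be passed to subprocess.run without going through a shell.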

Development

# where you keep locally checked out repos:
git clone git@github.com:amedia/dbtwiz

# inside the virtual environment of your dbt project:
pip install -e <local-path-to-dbtwiz-repository>
