Create Initial Snowflake Notebook (5 sections) #61
This pull request introduces a Snowflake data integration layer for the getML Feature Store, including configuration, infrastructure bootstrapping, and data ingestion utilities. It provides a modular, environment-variable-driven setup for connecting to Snowflake, ensures required infrastructure (warehouse/database) is present, and supplies SQL templates and scripts for automating data preparation and ingestion. Additionally, it adds tools for converting Jaffle Shop CSV data to Parquet format for efficient loading.
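As a rough illustration of the environment-variable-driven setup described above, the following is a minimal sketch, not the PR's actual code. The field names, the `SNOWFLAKE_` prefix, the default values, and the `from_env` and `snowpark_session` helpers are all assumptions made for this example; only the `SnowflakeSettings` name and the use of a context-managed Snowpark session come from the PR summary.

```python
import os
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class SnowflakeSettings:
    """Illustrative stand-in; the real class in the PR may have different fields."""

    account: str
    user: str
    password: str
    warehouse: str = "COMPUTE_WH"  # assumed default, not taken from the PR
    database: str = "DEMO_DB"      # assumed default, not taken from the PR

    @classmethod
    def from_env(cls, env=None, prefix="SNOWFLAKE_"):
        """Build settings from a mapping of environment variables."""
        env = os.environ if env is None else env
        required = ("ACCOUNT", "USER", "PASSWORD")
        missing = [prefix + key for key in required if not env.get(prefix + key)]
        if missing:
            raise KeyError("missing environment variables: " + ", ".join(missing))
        return cls(
            account=env[prefix + "ACCOUNT"],
            user=env[prefix + "USER"],
            password=env[prefix + "PASSWORD"],
            warehouse=env.get(prefix + "WAREHOUSE", cls.warehouse),
            database=env.get(prefix + "DATABASE", cls.database),
        )

    def connection_parameters(self):
        """Return a dict in the shape accepted by Snowpark's Session.builder.configs."""
        return {
            "account": self.account,
            "user": self.user,
            "password": self.password,
            "warehouse": self.warehouse,
            "database": self.database,
        }


@contextmanager
def snowpark_session(settings):
    """Open a Snowpark session and close it when the block exits."""
    # Imported lazily so the settings class can be used without Snowpark installed.
    from snowflake.snowpark import Session

    session = Session.builder.configs(settings.connection_parameters()).create()
    try:
        yield session
    finally:
        session.close()
```

With the variables exported, a caller could write `with snowpark_session(SnowflakeSettings.from_env()) as session:` and run queries inside the block; the session is closed even if the body raises.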
Snowflake Data Integration Layer
Core data integration package:
- data package with modules for Snowflake settings, session management, infrastructure bootstrapping, SQL loading utilities, and top-level imports for streamlined usage. (integration/snowflake/data/__init__.py, integration/snowflake/data/_settings.py, integration/snowflake/data/_snowflake_session.py, integration/snowflake/data/_bootstrap.py, integration/snowflake/data/_sql_loader.py)
- Environment-variable-driven SnowflakeSettings and context-managed Snowpark session creation.
SQL automation and templates:
- SQL templates and scripts for automating data preparation and ingestion. (integration/snowflake/data/sql/...)
Jaffle Shop Data Preparation
- Tools for converting Jaffle Shop CSV data to Parquet format for efficient loading. (integration/jaffle-shop-data/convert_jaffle_csv_to_parquet.py, integration/jaffle-shop-data/GENERATE_JAFFLE_SHOP_PARQUET.md)
Project Configuration
- pyproject.toml with dependencies for data engineering, Snowflake integration, development, and linting, ensuring reproducible environments and code quality. (integration/pyproject.toml)
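The infrastructure bootstrapping step from the summary (ensuring the warehouse and database exist) could look roughly like the sketch below. This is an assumption-laden illustration, not the contents of _bootstrap.py: the function names, the warehouse sizing options, and the identifier check are all hypothetical. It relies only on Snowflake's `CREATE ... IF NOT EXISTS` statements, which make the step safe to run repeatedly.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore, then letters, digits, _ or $.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check_identifier(name):
    """Reject names that are not plain unquoted identifiers before interpolating them."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a valid unquoted identifier: {name!r}")
    return name


def bootstrap_statements(warehouse, database):
    """Return idempotent DDL that creates the warehouse and database if absent."""
    wh = _check_identifier(warehouse)
    db = _check_identifier(database)
    return [
        # Sizing and suspend settings are assumed values for a small demo warehouse.
        f"CREATE WAREHOUSE IF NOT EXISTS {wh} "
        "WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE",
        f"CREATE DATABASE IF NOT EXISTS {db}",
    ]


def ensure_infrastructure(execute, warehouse, database):
    """Run each bootstrap statement through a caller-supplied execute callable."""
    for statement in bootstrap_statements(warehouse, database):
        execute(statement)
```

Passing the executor in as a callable keeps the sketch testable without a live connection; with Snowpark it could be `lambda stmt: session.sql(stmt).collect()`.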