Conversation
Makefile
Outdated
| @echo "Cleanup complete." | ||
|
|
||
| notebook: ## Launch Jupyter Notebook | ||
| ${POETRY} run pip install jupyter |
There was a problem hiding this comment.
Should we move this into a poetry dependency group? Similar to the docs.
I agree, and that's great, but should we also spin up the resources as part of this effort? We could even inject a notebook that imports Spark-connect, etc (which won't be installed from a fresh install? I think this is a dev dependency, we probably want to double check there to avoid scaring newcomers to the project). |
|
Bonus idea: what if |
We could do getting started as a notebook! https://py.iceberg.apache.org/#getting-started-with-pyiceberg |
yea we could do that. the integration test setup gives us 2 different catalogs (rest and hms) |
|
@kevinjqliu I would keep it simple, and go with the preferred catalog; REST :) |
2e6d5a1 to
d69f359
Compare
| test-integration: test-integration-setup test-integration-exec test-integration-cleanup ## Run integration tests | ||
|
|
||
| test-integration-setup: ## Start Docker services for integration tests | ||
| test-integration-setup: install ## Start Docker services for integration tests |
There was a problem hiding this comment.
adding the make install pre-req here because otherwise dev/provision.py will fail
| @@ -0,0 +1,359 @@ | |||
| { | |||
There was a problem hiding this comment.
We can also add a .pre-commit linter for notebooks: https://github.com/nbQA-dev/nbQA
Fokko
left a comment
There was a problem hiding this comment.
Looks great! Should we also add a section to the docs?
| .gitignore | ||
| uv.lock | ||
| mkdocs/* | ||
| notebooks/* |
There was a problem hiding this comment.
We can do this, but then we have to make sure that they are not bundled in the release. The notebooks do contain code.
There was a problem hiding this comment.
yea agreed. i double check the artifacts, the new notebooks/ dir is not included. Similar to how the mkdocs/ dir is not included.
Feels like this can be a potential footgun where a folder is included in the artifact but RAT check is ignored in .rat-excludes. I think we can add a CI check to prevent this. I'll track this as a separate issue
| find /tmp/warehouse/ | ||
| ``` | ||
|
|
||
| ## Try it yourself with Jupyter Notebooks |
There was a problem hiding this comment.
Added to the frontpage, https://py.iceberg.apache.org/

| export PYICEBERG_CATALOG__TEST_CATALOG__SECRET_ACCESS_KEY=password | ||
| ``` | ||
|
|
||
| ## Notebooks for Experimentation |
There was a problem hiding this comment.
Added a new section to https://py.iceberg.apache.org/contributing/
|
Added some docs and linter for notebook |
geruh
left a comment
There was a problem hiding this comment.
LGTM! This is awesome Kevin! Was able to get it up and running locally with no issues
|
Merging this! We can add more improvements as follow up |



Rationale for this change
Add 2 new Make command (
make notebook) to spin up a jupyter notebook;make notebook-infraspins up jupyter notebook along with integration test infrastructure.Pyiceberg Example Notebook
Pyiceberg example notebook (
notebooks/pyiceberg_example.ipynb) is based on the https://py.iceberg.apache.org/#getting-started-with-pyiceberg page and doesn't require additional test infra.Spark Example Notebook
Spark integration example notebook (
notebooks/spark_integration_example.ipynb) is based on https://iceberg.apache.org/docs/nightly/spark-getting-started/ and requires integration test infrastructure (Spark, IRC, S3)With spark connect (#2491) and our testing setup, we can quickly spin up a local env with
make test-integration-execwhich includes:In the jupyter notebook, connect to spark easily
Are these changes tested?
Yes, run both
make notebookandmake notebook-infralocally and run the example notebooksAre there any user-facing changes?