Fix streamlit deploy failure when pages/*.py overlaps with pages/ directory #2780

Open
sfc-gh-moczko wants to merge 2 commits into main from fix/2741-streamlit-deploy-pages-collision

sfc-gh-moczko commented Feb 24, 2026

Summary

Fixes #2741.

snow streamlit deploy fails with NotInDeployRootError when both pages/ (directory) and pages/*.py (glob) appear in artifacts and a pages/ directory exists on disk.

Root cause

_ArtifactPathMap.put() walks directory sources to mark children in _dest_is_dir, but does not register those children in __dest_to_src. When a glob like pages/*.py later resolves to the same files, __dest_to_src.get(dest) returns None instead of the existing source, so the duplicate mapping is silently added. During symlink_or_copy(), the second mapping calls Path.resolve() on the symlink that the first mapping already created. Resolution follows that symlink back to the project source directory, which lies outside the deploy root, so NotInDeployRootError is raised.
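
The symlink behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the CLI's code: the directory layout and the function name `demo_resolve_escapes_deploy_root` are made up for illustration, and it assumes a platform where unprivileged symlink creation is allowed.

```python
import os
import tempfile
from pathlib import Path


def demo_resolve_escapes_deploy_root() -> bool:
    """Return True if resolving an existing symlink lands outside the deploy root."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: a project directory with a deploy root nested inside it.
        project = Path(tmp).resolve() / "project"
        deploy_root = project / "output" / "bundle"
        (project / "pages").mkdir(parents=True)
        (deploy_root / "pages").mkdir(parents=True)

        src = project / "pages" / "one.py"
        src.write_text("print('page one')\n")

        # First mapping: the deploy root receives a symlink to the source file.
        dest = deploy_root / "pages" / "one.py"
        os.symlink(src, dest)

        # Duplicate mapping: resolving the destination follows the symlink
        # back to the project source tree.
        resolved = dest.resolve()
        return deploy_root not in resolved.parents
```

Because `Path.resolve()` follows symlinks, the resolved path sits under the project's `pages/` directory rather than under the deploy root, which is the condition the root-cause analysis attributes the error to.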

Fix

_ArtifactPathMap.put() (bundle_map.py) — Two changes in the put() method:

  1. Directory walk now registers children in __dest_to_src: Each file discovered during os.walk is recorded as __dest_to_src[child_dest] = child_src. If a different source already maps to that destination, TooManyFilesError is raised immediately.
  2. File-to-file branch returns early on same-source duplicates: When current_source == src, the mapping is redundant (already covered by a parent directory walk), so put() returns without adding a duplicate entry.
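
The two changes can be illustrated with a simplified stand-in for the path map. This is a sketch under assumptions, not the code in bundle_map.py: `SketchPathMap`, `_register`, `mappings`, and `demo` are hypothetical names, the exception class is a local stand-in, and the real class tracks more state (for example `_dest_is_dir`) than is shown here.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, Tuple


class TooManyFilesError(Exception):
    """Local stand-in for the CLI's error of the same name."""


class SketchPathMap:
    """Hypothetical, reduced model of a destination-to-source map."""

    def __init__(self) -> None:
        self._dest_to_src: Dict[Path, Path] = {}

    def _register(self, src: Path, dest: Path) -> None:
        current = self._dest_to_src.get(dest)
        if current is None:
            self._dest_to_src[dest] = src
        elif current != src:
            # A different source already claims this destination.
            raise TooManyFilesError(f"{dest} is already mapped from {current}")
        # Same source, same destination: redundant, so nothing is added.

    def put(self, src: Path, dest: Path) -> None:
        if src.is_dir():
            # Register every file found under the directory, not only the directory.
            for root, _dirs, files in os.walk(src):
                for name in files:
                    child_src = Path(root) / name
                    self._register(child_src, dest / child_src.relative_to(src))
        else:
            self._register(src, dest)

    def mappings(self) -> Dict[Path, Path]:
        return dict(self._dest_to_src)


def demo() -> Tuple[int, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pages").mkdir()
        (root / "pages" / "one.py").write_text("")
        (root / "other.py").write_text("")

        path_map = SketchPathMap()
        path_map.put(root / "pages", Path("pages"))
        # Same file reached again, as a glob expansion would do: deduplicated.
        path_map.put(root / "pages" / "one.py", Path("pages/one.py"))
        count = len(path_map.mappings())

        try:
            # Different source aimed at the same destination: rejected.
            path_map.put(root / "other.py", Path("pages/one.py"))
            collided = False
        except TooManyFilesError:
            collided = True
        return count, collided
```

In this sketch `demo()` ends with a single mapping for `pages/one.py` and a collision error for the mismatched source, mirroring the two behaviours the test plan checks.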

convert_streamlit_to_v2_data() (definition_conversion.py) — Defense-in-depth:

Adds _is_path_covered_by_directory() helper. During V1→V2 definition conversion, additional_source_files entries like pages/*.py are skipped if they fall under a directory already included as an artifact (e.g. pages/). This prevents the overlapping mappings from reaching _ArtifactPathMap in the first place.
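
One way such a coverage check could look is sketched below. The PR text does not show the helper's signature or body, so the parameters and the purely lexical comparison are assumptions; the real helper may also consult the filesystem. `filter_additional_files` is a hypothetical wrapper added only to show the filtering step.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def is_path_covered_by_directory(entry: str, directories: Iterable[str]) -> bool:
    """Return True if `entry` (a file path or glob) lies under one of `directories`."""
    entry_path = PurePosixPath(entry)
    for directory in directories:
        # PurePosixPath drops the trailing slash, so "pages/" compares as "pages".
        if PurePosixPath(directory) in entry_path.parents:
            return True
    return False


def filter_additional_files(
    additional_source_files: Iterable[str], directories: Iterable[str]
) -> List[str]:
    """Keep only entries that are not already covered by a directory artifact."""
    dirs = list(directories)
    return [
        entry
        for entry in additional_source_files
        if not is_path_covered_by_directory(entry, dirs)
    ]
```

With `["pages/*.py", "utils/helper.py"]` as the additional files and `["pages/"]` as the directory artifacts, only `utils/helper.py` survives the filter, which matches the behaviour the conversion tests describe.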

Test plan

  • test_bundle_map_deduplicates_directory_and_glob_overlap — directory src/snowpark added first, then explicit src/snowpark/main.py → snowpark/main.py; verifies the duplicate is silently deduplicated and only the directory mapping remains
  • test_bundle_map_disallows_different_source_collision_with_directory_child — directory src/snowpark added first, then app/manifest.yml → snowpark/main.py (different source, same dest); verifies TooManyFilesError is raised
  • test_bundle_map_disallows_collisions_anywhere_in_deployed_hierarchy — previously @pytest.mark.skip; now passes because directory children are tracked in __dest_to_src
  • test_bundle_deduplicates_pages_directory_and_glob — end-to-end StreamlitEntity.bundle() with artifacts: ["streamlit_app.py", "pages/", "pages/*.py"]; verifies the bundle completes and produces the correct output files
  • test_v1_to_v2_streamlit_conversion_deduplicates_pages — V1 streamlit with additional_source_files: [pages/*.py] and a pages/ directory on disk; verifies pages/*.py is filtered from the converted V2 artifacts
  • test_v1_to_v2_streamlit_conversion_keeps_non_overlapping_additional_files — same setup but with utils/helper.py added; verifies non-overlapping files are preserved while pages/*.py is still filtered
  • All 290 existing tests pass with 0 regressions

iter_stage() was reconstructing file paths from Snowflake ls output,
which only contains the unqualified stage name. This caused GET commands
to resolve against the connection default database instead of the
database specified in the FQN. Now preserves the original stage_path FQN
by using root_path() and joining only the relative file path.

Fixes SNOW-3074550

Generated with [Cortex Code](https://docs.snowflake.com/user-guide/snowflake-cortex/cortex-agents)

Co-Authored-By: Cortex Code <noreply@snowflake.com>
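
The path handling described in the commit message above can be sketched as plain string manipulation. This is an illustration only: `qualify_listed_file` is a hypothetical function, the shape of the listed name (`stage_name/relative/path`) is an assumption about the ls output, and the actual change uses root_path() on the stage path object rather than raw strings.

```python
def qualify_listed_file(stage_fqn_root: str, listed_name: str) -> str:
    """Join the relative part of a listed file name onto the fully qualified stage root.

    Assumes `listed_name` starts with the unqualified stage name followed by "/".
    """
    # Drop the unqualified stage name; keep only the path relative to the stage.
    _stage_name, _sep, relative_path = listed_name.partition("/")
    # Re-attach the relative path to the original fully qualified root.
    return f"{stage_fqn_root.rstrip('/')}/{relative_path}"
```

For example, `qualify_listed_file("@my_db.my_schema.my_stage", "my_stage/pages/one.py")` yields `@my_db.my_schema.my_stage/pages/one.py`, so a later GET would keep targeting the database named in the FQN rather than the connection default.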
@sfc-gh-moczko sfc-gh-moczko requested a review from a team as a code owner February 24, 2026 01:14
@sfc-gh-moczko sfc-gh-moczko force-pushed the fix/2741-streamlit-deploy-pages-collision branch from 5ee660a to 26c478a Compare February 24, 2026 05:21
…ectory (#2741)

When both `pages/` and `pages/*.py` appear in artifacts, the bundle map
created duplicate destination mappings that caused `NotInDeployRootError`
during symlink resolution. Fix by tracking directory-walked children in
`__dest_to_src` to detect and deduplicate overlapping mappings, and by
filtering redundant `additional_source_files` during V1-to-V2 definition
conversion.

Generated with [Cortex Code](https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code)

Co-Authored-By: Cortex Code <noreply@snowflake.com>
@sfc-gh-moczko sfc-gh-moczko force-pushed the fix/2741-streamlit-deploy-pages-collision branch from 26c478a to 35039f3 Compare February 24, 2026 06:27

Development

Successfully merging this pull request may close these issues.

SNOW-3015632: Streamlit deploy fails when pages/*.py is in artifacts and pages/ directory exists