FXC-4775: Surface batch download failures when gzip extraction fails #3147
Summary
Update Batch.download() to raise when any background download fails.
Testing
poetry run pytest -q tests/test_web/test_webapi.py -k "batch_download_surfaces_download_errors"
poetry run pytest -q tests/test_web/test_s3utils.py
poetry run pre-commit run --all-files
Greptile Overview
Greptile Summary
This PR fixes a critical bug where Batch.download() would silently succeed even when background downloads failed due to gzip extraction errors.
Key Changes
container.py: Modified the batch download mechanism to call fut.result() on each completed future. This ensures exceptions from background worker threads are propagated to the main thread rather than being silently swallowed.
s3utils.py: Made gzip extraction atomic by extracting to a temporary output file created with tempfile.mkstemp(), then using replace() to move it to the final destination. The temporary output is cleaned up if extraction fails.
test_webapi.py: Added regression test test_batch_download_surfaces_download_errors that verifies exceptions from background downloads are properly surfaced to the caller.
CHANGELOG.md: Added user-facing entry describing the bug fix.
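For readers unfamiliar with why fut.result() matters here, the following is a minimal sketch of the pattern, not the code in container.py. The function name download_all and the zero-argument task callables are hypothetical stand-ins for the batch's per-job download tasks. An exception raised inside a worker is stored on its future and only re-raised when result() is called.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(tasks, max_workers=4):
    """Run zero-argument callables in a thread pool and surface any failure.

    `tasks` is a hypothetical stand-in for the per-job download callables.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        for fut in as_completed(futures):
            # result() re-raises any exception raised in the worker thread.
            # Skipping this call is what lets a failed download go unnoticed.
            results.append(fut.result())
    return results
```

With this shape, the first failed task encountered aborts the loop and its exception reaches the caller. Whether the real implementation stops at the first failure or collects all of them is not stated in this summary.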
Impact
This fix ensures that users are immediately notified when batch downloads fail, preventing silent data corruption or missing files that could lead to confusing downstream errors. The atomic extraction also prevents partially-written files from appearing as successful downloads.
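The atomic extraction described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in s3utils.py: the function name extract_gzip_atomically is hypothetical, and where the PR wraps the failure in a WebError this sketch simply re-raises the original exception.

```python
import gzip
import os
import shutil
import tempfile
from pathlib import Path


def extract_gzip_atomically(gz_path, dest_path):
    """Decompress gz_path so dest_path only ever holds a complete file."""
    dest = Path(dest_path)
    # Create the temp file next to the destination so replace() stays on
    # one filesystem and is therefore an atomic rename.
    fd, tmp_name = tempfile.mkstemp(dir=str(dest.parent), suffix=".tmp")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        with gzip.open(gz_path, "rb") as src, open(tmp, "wb") as out:
            shutil.copyfileobj(src, out)
        # Only a fully extracted file is moved into place.
        tmp.replace(dest)
    except Exception:
        # Remove the partial output, then let the caller see the error.
        tmp.unlink(missing_ok=True)
        raise
```

Because the destination path is written only by the final rename, a corrupt archive leaves no file at the destination, so a later existence check cannot mistake a partial extraction for a successful download.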
Confidence Score: 4/5
Sequence Diagram
sequenceDiagram
    participant User
    participant Batch
    participant ThreadPool
    participant Job
    participant download_gz_file
    participant extract_gzip
    participant FileSystem
    User->>Batch: download(path_dir)
    Batch->>Batch: Create download tasks for each job
    Batch->>ThreadPool: Submit futures for parallel downloads
    loop For each completed future
        ThreadPool-->>Batch: future completes
        Batch->>Batch: fut.result() [NEW: propagates exceptions]
        alt Download succeeds
            Batch->>Batch: Update progress
        else Download fails (e.g., gzip error)
            Job->>download_gz_file: download and extract
            download_gz_file->>FileSystem: Download .gz to temp file
            download_gz_file->>FileSystem: Create temp output file [NEW: atomic]
            download_gz_file->>extract_gzip: Extract to temp output
            alt Extraction fails
                extract_gzip-->>download_gz_file: Exception
                download_gz_file->>FileSystem: Clean up temp output [NEW]
                download_gz_file-->>Job: Raise WebError [NEW]
                Job-->>Batch: RuntimeError
                Batch-->>User: Exception raised [FIXED]
            end
        end
    end
    Batch-->>User: Download complete or error raised