Optimize resource usage #112
base: master
Conversation
Pull Request Overview
This PR optimizes resource usage by modifying storage backends to accept generic binary streams instead of specific BytesIO objects, enabling streaming downloads to disk and incremental file generation to reduce memory consumption. It also implements memory management improvements in the database and dashboard monitoring.
Key changes:
- Modified storage backend interfaces to accept BinaryIO instead of io.BytesIO for more flexible stream handling
- Changed download and file generation logic to use temporary files instead of in-memory buffers
- Implemented log pruning in MemoryDatabase using a bounded deque and reduced dashboard monitoring overhead
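The bounded-deque pruning mentioned in the last bullet can be sketched as follows. This is a minimal illustration of the idea, not the repo's actual MemoryDatabase API; the method names and the cap of 1000 entries are assumptions.

```python
from collections import deque

MAX_LOGS = 1000  # assumed cap; the PR's actual limit may differ

class MemoryDatabase:
    """Illustrative sketch: a deque with maxlen silently discards the
    oldest entries once full, so log memory stays bounded."""

    def __init__(self, max_logs: int = MAX_LOGS):
        self._logs: deque[str] = deque(maxlen=max_logs)

    def add_log(self, line: str) -> None:
        self._logs.append(line)  # oldest entry dropped automatically when full

    def logs(self) -> list[str]:
        return list(self._logs)
```

The appeal of `deque(maxlen=...)` is that pruning is O(1) and implicit: no periodic cleanup task or manual slicing is needed.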
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| core/utils.py | Updated AvroParser to accept both bytes and BinaryIO streams |
| core/storage/*.py | Modified upload methods to accept BinaryIO instead of BytesIO |
| core/storage/abc.py | Changed measure file generation to use temporary files incrementally |
| core/database/memory.py | Implemented log pruning using bounded deque |
| core/dashboard.py | Made update method async and reduced CPU monitoring overhead |
| core/cluster.py | Changed file downloads to use temporary files instead of memory buffers |
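The `BinaryIO` widening described in the table can be sketched like this. The function name and signature below are illustrative assumptions, not the repo's exact interface; the point is that typing against `BinaryIO` lets callers pass either an in-memory buffer or a real (temporary) file.

```python
import io
import tempfile
from typing import BinaryIO

def upload(path: str, stream: BinaryIO, size: int) -> bytes:
    """Hypothetical storage-backend method: works for any binary stream,
    not just io.BytesIO, so callers can stream from disk."""
    return stream.read(size)
```

With this signature, a download path can hand the backend a `tempfile.TemporaryFile()` directly instead of materializing the whole payload in memory first.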
```python
tmp_file.seek(0)
await self.upload_storage(file, tmp_file, size)
self.update_success()
```
Copilot AI (Jul 23, 2025)
The seek(0) call is not within the try block but should be, as it's part of the file processing logic that could fail. If an exception occurs before this line, the file position will be incorrect for the upload_storage call.
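Copilot's point can be sketched with a minimal stand-in for the PR's cluster class. The class, the error hook, and the `finally` cleanup below are illustrative assumptions, not the repo's actual code; what matters is that `seek(0)` sits inside the guarded region, so a failure before or during the upload takes the same cleanup path.

```python
import asyncio
import tempfile

class Node:
    """Illustrative stand-in for the PR's cluster class."""

    def __init__(self):
        self.ok = False

    async def upload_storage(self, file, tmp_file, size):
        # A real backend would stream tmp_file to storage; here we just read.
        assert tmp_file.read(size) == b"data"

    def update_success(self):
        self.ok = True

    async def process(self, file, tmp_file, size):
        try:
            tmp_file.seek(0)  # inside the try, as the review suggests
            await self.upload_storage(file, tmp_file, size)
            self.update_success()
        finally:
            tmp_file.close()  # temp file released on success and failure alike
```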
```python
async for chunk in resp.content.iter_chunked(1024 * 1024 * 4):
    tmp.write(chunk)
tmp.seek(0)
with zstd.open(tmp, 'rb') as zf:
```
Copilot AI (Jul 23, 2025)
The temporary file tmp is not properly closed after use. The context manager only handles the zstd file, but the underlying temporary file should also be closed to free resources.
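A sketch of the suggested fix: let a `with` statement own the temporary file so it is closed even if processing raises. The zstd codec and response iteration are replaced here by plain chunks, since the point is ownership of the resource; the function name is hypothetical.

```python
import tempfile

def consume_response(chunks):
    """Illustrative: write streamed chunks to a temp file, then read back.
    The with-statement guarantees tmp.close() on any exit path."""
    with tempfile.TemporaryFile() as tmp:   # stands in for the PR's tmp
        for chunk in chunks:                # stands in for resp.content.iter_chunked
            tmp.write(chunk)
        tmp.seek(0)
        return tmp.read()
```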
```python
tmp = tempfile.TemporaryFile()
chunk = b"\x00" * (1024 * 1024)
remain = size
while remain > 0:
    w = min(remain, len(chunk))
    tmp.write(chunk[:w])
    remain -= w
tmp.seek(0)
await self.upload(
    path,
    io.BytesIO(b"\x00" * size),
    tmp,
    size
)
tmp.close()
```
Copilot AI (Jul 23, 2025)
The temporary file should be closed in a try-finally block or using a context manager to ensure it's always closed even if an exception occurs during upload.
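The zero-fill path with the suggested fix applied might look like the sketch below. The method name and the simplified three-argument `upload` signature are assumptions for illustration; the zero-filling loop follows the diff above.

```python
import asyncio
import tempfile

async def write_zeros(self, path: str, size: int) -> None:
    """Illustrative: the with-statement guarantees tmp.close() even if
    upload() raises, replacing the bare tmp.close() at the end."""
    with tempfile.TemporaryFile() as tmp:
        chunk = b"\x00" * (1024 * 1024)
        remain = size
        while remain > 0:
            w = min(remain, len(chunk))
            tmp.write(chunk[:w])
            remain -= w
        tmp.seek(0)
        await self.upload(path, tmp, size)  # tmp closed even on failure
```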
Summary
Testing
- python -m py_compile $(git ls-files '*.py')
- pytest -q

https://chatgpt.com/codex/tasks/task_e_6880ab2599f0832b932213ef03537ec5