@nateprewitt
Collaborator

This PR makes the metadata property on AzureBlobFile lazy-loaded for read mode, addressing the performance issue raised in #517. Previously, metadata was fetched synchronously every time a file was opened (~50ms overhead), whether or not it was ever used. Now it's only fetched on first access of the .metadata property. This approach preserves backwards compatibility while removing that overhead for callers that never read the metadata.

The one assumption being made is end-users aren't manually assigning metadata from their code. That seems like an unlikely edge case. If we're concerned about that though, I can add a setter as well. I just don't want to encourage that behavior if we don't consider it supported/intended.
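For reference, a minimal sketch of what that assumption means in plain Python, assuming the property is getter-only. The class names here are hypothetical and are not the adlfs code: a property without a setter rejects assignment with AttributeError, and adding a setter is what would make manual assignment work.

```python
class ReadOnlyMetadata:
    """Hypothetical object exposing metadata through a getter-only property."""

    def __init__(self):
        self._metadata = {"source": "service"}

    @property
    def metadata(self):
        return self._metadata


class SettableMetadata(ReadOnlyMetadata):
    """Same idea, plus the optional setter discussed above."""

    @property
    def metadata(self):
        return self._metadata

    @metadata.setter
    def metadata(self, value):
        # Copy so later changes to the caller's dict do not leak in.
        self._metadata = dict(value)


def try_assign(obj, value):
    """Return True if `obj.metadata = value` succeeds, False if it is rejected."""
    try:
        obj.metadata = value
    except AttributeError:
        # Python raises AttributeError when a property has no setter.
        return False
    return True
```

Without the setter, user code that assigns to metadata fails loudly instead of silently diverging from what the service reports.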

Changes

  • Removes the eager get_blob_metadata call from __init__ for read mode.
  • Adds a metadata @property that fetches and caches the metadata on first access.
  • Includes tests verifying the lazy-loading behavior.
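
The pattern in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged diff: FakeBlobClient and LazyMetadataFile are hypothetical stand-ins, and the get_blob_metadata signature shown is assumed.

```python
class FakeBlobClient:
    """Hypothetical stand-in for the blob service; counts metadata requests."""

    def __init__(self, metadata):
        self._metadata = metadata
        self.calls = 0

    def get_blob_metadata(self):
        # Assumed signature; in real code this would be a network round trip.
        self.calls += 1
        return dict(self._metadata)


class LazyMetadataFile:
    """Illustrative file object that defers the metadata fetch until first use."""

    def __init__(self, client, mode="rb"):
        self.client = client
        self.mode = mode
        # Nothing is fetched when the file is opened.
        self._metadata = None

    @property
    def metadata(self):
        # Fetch once on first access, then serve the cached value.
        if self._metadata is None:
            self._metadata = self.client.get_blob_metadata()
        return self._metadata
```

Opening the file costs no metadata request, the first read of .metadata costs one, and later reads reuse the cached dict. A test along these lines can assert on the call counter before and after access.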

@nateprewitt marked this pull request as ready for review January 14, 2026 23:34

Collaborator

@anjaliratnam-msft left a comment

This looks good to me, I just had one small comment!

Collaborator

@kyleknap left a comment

Looks good! 🚢

@kyleknap merged commit 04ec512 into fsspec:main Jan 17, 2026
8 checks passed
@nateprewitt deleted the lazy_metadata branch January 17, 2026 00:02