Add support for TiffStreamingRawDataset #687

@bhimrazy

Description

🚀 Feature

Notes from @tchaton

We could add TIFF support to StreamingRawDataset using async-tiff (https://developmentseed.org/async-tiff/latest).

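from litdata import StreamingRawDataset
from litdata.raw.types import TIFF  # proposed type, sketched for this feature request
import torch

class TiffStreamingRawDataset(StreamingRawDataset):

    def setup(self, urls):
        # One TIFF descriptor per URL; the remaining arguments are left unspecified
        return [TIFF(url, tile=(512, 512, 3), ....) for url in urls]

    def __getitem__(self, decoded_bytes: bytes):
        # dtype is keyword-only in torch.frombuffer; the result is a flat uint8 tensor
        return torch.frombuffer(decoded_bytes, dtype=torch.uint8)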
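
To make the `tile=(512, 512, 3)` argument more concrete, here is a minimal, standard-library-only sketch of how an image could be split into tile windows and how large each decoded uint8 buffer would be. `TileWindow`, `iter_tile_windows`, and `tile_nbytes` are hypothetical names for illustration, not part of litdata or async-tiff. The sketch assumes edge windows are clipped to the image bounds, which may differ from how a tiled TIFF pads its edge tiles on disk.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class TileWindow:
    """Pixel window of one tile inside a larger image (hypothetical helper)."""
    row_off: int
    col_off: int
    height: int
    width: int


def iter_tile_windows(
    image_hw: Tuple[int, int],
    tile_hw: Tuple[int, int] = (512, 512),
) -> Iterator[TileWindow]:
    """Yield row-major tile windows; edge windows are clipped to the image."""
    image_h, image_w = image_hw
    tile_h, tile_w = tile_hw
    if min(image_h, image_w, tile_h, tile_w) <= 0:
        raise ValueError("image and tile dimensions must be positive")
    for row_off in range(0, image_h, tile_h):
        for col_off in range(0, image_w, tile_w):
            yield TileWindow(
                row_off=row_off,
                col_off=col_off,
                height=min(tile_h, image_h - row_off),
                width=min(tile_w, image_w - col_off),
            )


def tile_nbytes(window: TileWindow, channels: int = 3, bytes_per_sample: int = 1) -> int:
    """Number of bytes in the decoded buffer for one window (uint8 by default)."""
    return window.height * window.width * channels * bytes_per_sample
```

A full 512 x 512 x 3 uint8 tile decodes to 786432 bytes, so the flat tensor returned by `__getitem__` would need a reshape to `(512, 512, 3)` before use as an image.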

example: https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datamodules.py#L89 and https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datasets.py#L42

Motivation

Pitch

Alternatives

Additional context

Labels: enhancement, help wanted
