Feature-gate heavy DataLoader deps (arrow, parquet, hf-hub, foyer) #74

@darinkishore

Problem

DataLoader pulls in heavy transitive dependencies:

  • arrow (56.1.0)
  • parquet (56.1.0)
  • hf-hub (0.4.3)
  • rayon (1.10.0)
  • foyer (0.20.0)
  • tempfile (3.23.0)

These are for data loading (CSV, Parquet, HuggingFace datasets) and caching — tangential to the core module/optimizer/signature architecture. Users who only need the typed prediction pipeline pay the full compile-time and binary-size cost.

Proposal

Feature-gate behind optional features:

[features]
default = ["dataloader"]
dataloader = ["dep:arrow", "dep:parquet", "dep:hf-hub", "dep:rayon"]
cache = ["dep:foyer", "dep:tempfile"]

data/example.rs and data/prediction.rs stay in core (used by optimizers/cache).
data/dataloader.rs, data/serialize.rs, and data/utils.rs move behind the dataloader feature.
utils/cache.rs moves behind the cache feature.
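
A minimal sketch of how the module split could look at the crate root, assuming the features are named as in the snippet above. The `dep:` syntax only works for dependencies declared with `optional = true` under `[dependencies]`, so arrow, parquet, hf-hub, rayon, foyer, and tempfile would each need that flag. The module bodies, `load_lines`, `MemoryCache`, and `enabled_features` are hypothetical stand-ins, not the crate's actual API.

```rust
// Sketch of a crate root (e.g. lib.rs) with feature-gated modules.
// All names below are illustrative placeholders.

/// Always compiled: core data types used by optimizers and the cache.
pub mod example {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Example {
        pub input: String,
    }
}

/// Compiled only when the `dataloader` feature is enabled.
/// The real module is where arrow, parquet, and hf-hub would be used.
#[cfg(feature = "dataloader")]
pub mod dataloader {
    use super::example::Example;

    /// Hypothetical loader: one `Example` per line of text.
    pub fn load_lines(text: &str) -> Vec<Example> {
        text.lines()
            .map(|line| Example { input: line.to_string() })
            .collect()
    }
}

/// Compiled only when the `cache` feature is enabled.
/// The real module is where foyer and tempfile would be used.
#[cfg(feature = "cache")]
pub mod cache {
    use std::collections::HashMap;

    /// Hypothetical in-memory stand-in for the real cache.
    #[derive(Default)]
    pub struct MemoryCache {
        entries: HashMap<String, String>,
    }

    impl MemoryCache {
        pub fn insert(&mut self, key: &str, value: &str) {
            self.entries.insert(key.to_string(), value.to_string());
        }

        pub fn get(&self, key: &str) -> Option<&String> {
            self.entries.get(key)
        }
    }
}

/// Reports which optional features were compiled into this build.
pub fn enabled_features() -> Vec<&'static str> {
    let mut features = Vec::new();
    if cfg!(feature = "dataloader") {
        features.push("dataloader");
    }
    if cfg!(feature = "cache") {
        features.push("cache");
    }
    features
}
```

With this layout, a build with `--no-default-features` compiles only the `example` module and never resolves the gated dependencies, while a build with the default features keeps the loader available.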

Impact

  • Faster compiles for the common case
  • Smaller binary for embedding/deployment scenarios
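  • No behavioral change for the default feature set, provided cache is also listed in default (the snippet above enables only dataloader by default)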
