Mutex lock conflict between datasets and grain libraries with array_record>=0.8 #1023

@chicham

Description

Hello,

I hit a compatibility issue when huggingface/datasets and grain are imported in the same process with array_record>=0.8 installed. It shows up as a mutex lock failure whose symptom depends on import order: importing datasets first hangs, while importing grain first ends in a fatal crash.

Environment

  • Python: 3.12/3.13
  • datasets: 4.0.0
  • grain: 0.2.12
  • array_record: >=0.8

Importing datasets then grain

Script: https://gist.github.com/chicham/cf9e1e9d6485bb848a6d5b288d86fb9d

❯ uv run error_import_datasets_grain.py
Installed 43 packages in 270ms
Datasets version 4.0.0
WARNING:absl:Failed to import TraceAnnotation.
[mutex.cc : 452] RAW: Lock blocking 0x6000020641f8

Importing grain then datasets

Script: https://gist.github.com/chicham/6547c4683fda0bfd718ed15eaa66cf70

❯ uv run error_import_grain_datasets.py
WARNING:absl:Failed to import TraceAnnotation.
Grain version 0.2.12
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
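
Because one import order hangs and the other aborts, it can help to run each order in a child process with a timeout, so the failure does not take down the calling shell or test runner. The following is a minimal sketch, not the content of the gists above: `probe_import_order` is a hypothetical helper, and the module names `datasets` and `grain` passed to it are assumed to be the import names of the two libraries.

```python
import subprocess
import sys


def probe_import_order(modules, timeout=60.0):
    """Import `modules` in order in a fresh interpreter and classify the outcome.

    Returns "ok" on a clean exit, "hang" if the child exceeds `timeout`
    seconds, or "exit <code>" for any non-zero return code (a negative
    code on POSIX means the child was killed by a signal such as SIGABRT).
    """
    code = "\n".join(f"import {name}" for name in modules)
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "hang"
    if proc.returncode == 0:
        return "ok"
    return f"exit {proc.returncode}"
```

Calling `probe_import_order(["datasets", "grain"])` and then `probe_import_order(["grain", "datasets"])` in the affected environment should make the two symptoms distinguishable, assuming the behaviour reported above reproduces.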

Workaround

Pinning array_record<0.8 resolves the issue.
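
Until the conflict is fixed, a project could check the installed version before importing either library. The sketch below is illustrative only: the helper names are hypothetical, the 0.8 threshold comes from this report, and looking up the distribution as `array_record` through `importlib.metadata` is an assumption about how the package is registered.

```python
import re
from importlib import metadata


def parse_release(version):
    """Return the leading numeric components, e.g. '0.8.1rc1' -> (0, 8, 1)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_affected(version):
    """True if `version` falls in the range reported here (array_record>=0.8)."""
    return parse_release(version) >= (0, 8)


def installed_array_record_version():
    """Return the installed array_record version string, or None if absent."""
    try:
        # Assumption: the distribution can be looked up under this name.
        return metadata.version("array_record")
    except metadata.PackageNotFoundError:
        return None
```

A caller could emit a warning when `installed_array_record_version()` is not `None` and `is_affected(...)` returns `True`, before importing datasets or grain.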

Labels: type:bug