fix header json parsing #3

Open
thammegowda wants to merge 1 commit into carsonpo:main from thammegowda:tg/jsonfix

thammegowda commented Oct 18, 2025

Thank you for this library/code! I tried to load a safetensors file but got 0 tensors loaded. I asked an AI coding agent to find and fix the bug. Note: apart from this line of text, the rest was generated by the agent, but I've verified that the fix works.


Safetensors.hpp Bug Fix Summary

Problem

The safetensors.hpp library was loading 0 tensors from safetensors files, while the Python reference implementation correctly loaded 340 tensors from the test file tmp/gemma-3-1b-it/model.safetensors.

Root Cause

The bug was in the SimpleJSONParser class in safetensors.hpp. Specifically:

  1. Improper metadata skipping: When encountering the __metadata__ field in the JSON header, the parser skipped its value with a simple character scan:

    // Skip metadata
    while (json[pos] != ',' && json[pos] != '}')
        pos++;

  2. Why this failed:

    • The scan has no notion of nesting, so it cannot skip a JSON object or array as a single unit
    • On "__metadata__":{"format":"pt"}, it stopped at the first }, which is the end of the metadata object itself, not a delimiter of the enclosing header object
    • This left the parser in an incorrect state, preventing it from parsing subsequent tensor entries
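To make the stopping position concrete, here is a minimal sketch that wraps the quoted loop in a standalone function. The name naiveSkipSketch and the added bounds check are assumptions for illustration; they are not part of safetensors.hpp.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper mirroring the loop quoted above: advance from `pos`
// until the first ',' or '}' and return that index.
// The bounds check is an addition so the sketch is safe on truncated input.
std::size_t naiveSkipSketch(const std::string& json, std::size_t pos) {
    while (pos < json.size() && json[pos] != ',' && json[pos] != '}')
        pos++;
    return pos;
}
```

Called on {"__metadata__":{"format":"pt"},"w":{"dtype":"F32"}} with pos at the start of the metadata value, this returns the index of the } that closes the metadata object. The comma after it and every tensor entry are still unread, and a caller that treats } as the end of the header would presumably stop there.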

Solution

Added a proper skipValue() method that recursively handles:

  • Strings
  • Arrays
  • Objects (nested)
  • Numbers and other primitives

The fix involved:

  1. Adding the skipValue() method to properly skip JSON values of any complexity
  2. Updating parseTensorInfo() to use skipValue() for unknown fields
  3. Updating the main parse() method to use skipValue() for the metadata object
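
The actual skipValue() in this PR is not reproduced here. As a rough sketch of the idea, a value skipper could look like the free function below. The name skipValueSketch and its signature are assumptions, and it tracks nesting with a depth counter instead of recursing into each nested value. It only skips; it does not validate that brackets match by type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical free-function sketch (not the PR's code). Returns the index
// just past the JSON value starting at `pos`. Assumes `pos` points at the
// first character of the value, with no leading whitespace.
std::size_t skipValueSketch(const std::string& json, std::size_t pos) {
    if (pos >= json.size())
        throw std::runtime_error("unexpected end of JSON");

    const char c = json[pos];

    if (c == '"') {
        // String: scan to the closing quote, stepping over escaped characters.
        for (++pos; pos < json.size(); ++pos) {
            if (json[pos] == '\\') { ++pos; continue; }
            if (json[pos] == '"') return pos + 1;
        }
        throw std::runtime_error("unterminated string");
    }

    if (c == '{' || c == '[') {
        // Object or array: count nesting depth. Strings are skipped as a
        // unit so braces or brackets inside them are not counted.
        int depth = 0;
        while (pos < json.size()) {
            const char d = json[pos];
            if (d == '"') { pos = skipValueSketch(json, pos); continue; }
            if (d == '{' || d == '[') {
                ++depth;
            } else if (d == '}' || d == ']') {
                if (--depth == 0) return pos + 1;
            }
            ++pos;
        }
        throw std::runtime_error("unterminated object or array");
    }

    // Number, true, false, or null: runs until a delimiter or whitespace.
    while (pos < json.size() && json[pos] != ',' && json[pos] != '}' &&
           json[pos] != ']' && json[pos] != ' ' && json[pos] != '\n' &&
           json[pos] != '\t' && json[pos] != '\r')
        ++pos;
    return pos;
}
```

On the header from the Root Cause section, calling this at the start of the metadata value returns the index of the comma that follows {"format":"pt"}, so the caller can continue with the next key.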

Changes Made

Files Modified

  • libs/safetensors.cpp/safetensors.hpp - Fixed JSON parser
  • tests/test_serialize.cpp - Enhanced test case with better error reporting

Files Created (for testing/verification)

  • tests/test_safetensors_load.py - Python reference implementation
  • tests/debug_safetensors_header.py - Debug script to inspect safetensors headers
  • tests/test_verify_tensors.py - Verification script for specific tensors

Test Results

Before Fix

Loading model from tmp/gemma-3-1b-it/model.safetensors
Loaded 0 tensors.

After Fix

Loading model from tmp/gemma-3-1b-it/model.safetensors
Loaded 340 tensors.
Tensor name: model.embed_tokens.weight
  Shape: [262144, 1152]
  Dtype: c10::BFloat16
  Device: cpu
...
Exit code: 0

Verification

  • ✓ Python loads 340 tensors
  • ✓ C++ loads 340 tensors
  • ✓ Tensor shapes and dtypes match between implementations
  • ✓ Test passes with exit code 0

How to Test

Python Test

cd /home/tg/work/repos/projectXYZ
python3 tests/test_safetensors_load.py tmp/gemma-3-1b-it/model.safetensors

C++ Test

cd /home/tg/work/repos/projectXYZ
./build-debug/tests/projectXYZ-tests safetensors_load tmp/gemma-3-1b-it/model.safetensors

Both should report loading 340 tensors successfully.

@thammegowda
Author

This also needs a fix for an issue with the endianness of float types: thammegowda@09064c8

I ended up reformatting the code to suit my coding style: main...thammegowda:safetensors.cpp:main
