Description
The specification defines the float type as 32-bit. Python floats are 64-bit, so packing them with the default settings writes 64-bit values to the output file. Other parsers (e.g. mmtf-java) try to load these as 32-bit floats and fail. This is easy to fix by passing use_single_float=True to the msgpack.packb call.
However, mmtf-java also appears to violate the standard: it uses doubles (64-bit floats) for the ncsOperatorList, so even with the above change it still can't parse the output. Given that mmtf-java is used for the RCSB files, we can assume they won't switch to 32-bit floats, since that would break their parsing for even more files.
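To make the mismatch concrete, here is a minimal sketch of how the two float widths look on the wire. It hand-encodes the msgpack float formats with the standard library's `struct` module rather than calling msgpack-python; the helper names (`pack_float32`, `pack_float64`, `float_width`) are hypothetical and exist only for this illustration.

```python
import struct


def pack_float32(value: float) -> bytes:
    """Encode a float in msgpack's float 32 format: marker 0xca + 4 bytes, big-endian."""
    return b"\xca" + struct.pack(">f", value)


def pack_float64(value: float) -> bytes:
    """Encode a float in msgpack's float 64 format: marker 0xcb + 8 bytes, big-endian."""
    return b"\xcb" + struct.pack(">d", value)


def float_width(encoded: bytes) -> int:
    """Return the bit width (32 or 64) of a msgpack-encoded float, based on its marker byte."""
    marker = encoded[0]
    if marker == 0xCA:
        return 32
    if marker == 0xCB:
        return 64
    raise ValueError(f"not a msgpack float marker: {marker:#04x}")
```

A reader that only accepts one marker byte for a given field will reject the other, which is consistent with the failures described above: the same number is 5 bytes as a float 32 and 9 bytes as a float 64.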
Additionally, the msgpack-python implementation does not support selecting doubles for only one field (msgpack/msgpack-python#326). Instead, you have to pack the biological assemblies list separately and then combine the two, as in the snippet below.
Code for packing separately.
# The mmtf standard expects everything as 32bit - hence use_single_float.
# Note the encode_data no longer includes bioAssemblyList.
main = msgpack.packb(self.encode_data(), use_bin_type=True, use_single_float=True)
# Assemblies need to be 64bit for Java compatibility.
assemblies = msgpack.packb(
    {"bioAssemblyList": self.bio_assembly},
    use_bin_type=True,
    use_single_float=False,
)
# In msgpack, a map with more than 15 elements starts with three bytes, e.g. `\xde\x12\x34`:
# the `\xde` marker followed by the map length as a 16-bit big-endian integer (here 0x1234).
# Our `main` map has 30-something elements, hence only the last byte matters.
# Build the new header: the map marker, a `\x00`, and the old length plus one.
# `bytes([n])` yields exactly one byte for any n in 0..255.
new_map_length: bytes = b"\xde\x00" + bytes([main[2] + 1])
# Strip the first three bytes from `main` (the map indicator byte and two bytes for length).
main = main[3:]
# Strip the first byte from `assemblies` (a map with at most 15 elements has a single-byte header).
assemblies = assemblies[1:]
# Finally put it all back together.
new_data = new_map_length + main + assemblies
For reference, I have also raised this issue in the mmtf-java repo: rcsb/mmtf-java#53.
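The splice above relies on `main` having between 16 and 254 entries and `assemblies` having at most 15. A more general version could read and rewrite the map headers instead of hard-coding them. The following is only a sketch under that assumption: the function names (`read_map_header`, `write_map_header`, `merge_packed_maps`) are hypothetical, it uses only `struct`, and it does not check for duplicate keys between the two maps.

```python
import struct
from typing import Tuple


def read_map_header(data: bytes) -> Tuple[int, int]:
    """Return (entry_count, header_length) for a msgpack-encoded map."""
    marker = data[0]
    if 0x80 <= marker <= 0x8F:  # fixmap: count stored in the low nibble
        return marker & 0x0F, 1
    if marker == 0xDE:  # map 16: 16-bit big-endian count
        return struct.unpack(">H", data[1:3])[0], 3
    if marker == 0xDF:  # map 32: 32-bit big-endian count
        return struct.unpack(">I", data[1:5])[0], 5
    raise ValueError(f"not a msgpack map marker: {marker:#04x}")


def write_map_header(count: int) -> bytes:
    """Return the smallest msgpack map header that can hold `count` entries."""
    if count <= 0x0F:
        return bytes([0x80 | count])
    if count <= 0xFFFF:
        return struct.pack(">BH", 0xDE, count)
    return struct.pack(">BI", 0xDF, count)


def merge_packed_maps(first: bytes, second: bytes) -> bytes:
    """Concatenate the entries of two packed maps under one new header."""
    first_count, first_header = read_map_header(first)
    second_count, second_header = read_map_header(second)
    return (
        write_map_header(first_count + second_count)
        + first[first_header:]
        + second[second_header:]
    )
```

With this helper, the last steps of the snippet would reduce to something like `new_data = merge_packed_maps(main, assemblies)`, where `main` is packed with `use_single_float=True` and `assemblies` with `use_single_float=False`.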