
File Hashing

Jim Lake edited this page Jul 9, 2025 · 3 revisions

Details on Maxlo file hashing.

File Hashing

For small files (16MB or smaller, per the chunk size calculator further down this page), calculate the SHA256 of the whole file and provide it to the backend. The backend returns a signed PUT that you use to upload the file. The S3 backend enforces SHA256 correctness.
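A minimal Node.js sketch of the small-file case. The function name `sha256FileSync` is hypothetical, and how the digest should be encoded when sent to the backend (hex, base64, or raw bytes) is not stated on this page, so the sketch returns the raw digest and leaves the encoding to the caller.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: hash a small file in one pass.
// Reading the whole file into memory is acceptable here only because
// small files are capped at 16MB; larger files use the multipart scheme.
// Returns the raw 32-byte SHA256 digest as a Buffer.
function sha256FileSync(path) {
  const data = fs.readFileSync(path);
  return crypto.createHash('sha256').update(data).digest();
}
```

Usage: `sha256FileSync('/path/to/file').toString('hex')` yields the digest as a hex string, if that is the encoding the backend expects.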

Multipart Files

File integrity for multipart uploads builds on S3's SHA256 checksums. Every supported S3 provider verifies a SHA256 for each part. We extend that guarantee to the whole file by hashing the part hashes together.

The Maxlo backend enforces all these requirements before providing signed upload URLs.

Part Requirements

  • Between 3 and 999 file parts.
  • All parts must be the same size, except for the last part, which may be smaller.
  • The first part must be at least 5MB.
  • The sum of the part_size values of all parts must equal the file_size.
  • The SHA256 of the concatenated bytes of the part_hash values must equal the file_hash.
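The requirements above can be pre-checked on the client before asking for signed URLs. The sketch below is illustrative only and is not the backend's implementation: `checkParts` and the shape of the `parts` array are hypothetical, "5MB" is assumed to mean 5 * 1024 * 1024 bytes, and each `part_hash` is assumed to be a raw 32-byte digest.

```javascript
const crypto = require('crypto');

// Assumption: "5MB" on this page means 5 MiB.
const MIN_FIRST_PART = 5 * 1024 * 1024;

// parts: [{ part_size: number, part_hash: Buffer }], in upload order.
// Returns a list of problems; an empty list means every check passed.
function checkParts(parts, file_size, file_hash) {
  const errors = [];
  if (parts.length < 3 || parts.length > 999) {
    errors.push('part count must be between 3 and 999');
  }
  if (parts.length === 0) return errors;

  const first = parts[0].part_size;
  if (first < MIN_FIRST_PART) errors.push('first part is smaller than 5MB');

  const lastIndex = parts.length - 1;
  parts.forEach((p, i) => {
    if (i < lastIndex && p.part_size !== first) {
      errors.push(`part ${i} differs in size from the first part`);
    }
  });
  const last = parts[lastIndex].part_size;
  if (last <= 0 || last > first) errors.push('last part has an invalid size');

  const total = parts.reduce((sum, p) => sum + p.part_size, 0);
  if (total !== file_size) errors.push('part sizes do not sum to file_size');

  // file_hash is the SHA256 over the concatenated part hashes.
  const combined = crypto
    .createHash('sha256')
    .update(Buffer.concat(parts.map((p) => p.part_hash)))
    .digest();
  if (!combined.equals(file_hash)) errors.push('file_hash mismatch');

  return errors;
}
```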

Additionally, you must have an active file in the appropriate volume to create a multipart upload. We do not support uploading expired versions of multipart files at this time.

Multipart Hashing

Calculate the chunk size (see below), then hash each part independently. Record the part_hash and part_size of every part. Concatenate the bytes of all the part_hash values, in part order, into one buffer and hash that buffer. The result is the file_hash. You will need the part_hash and part_size of every part to obtain signed upload parameters.
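The procedure above can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: `hashParts` is a hypothetical name, the part hashes are concatenated as raw 32-byte digests (not hex strings), and the whole file is held in a Buffer for brevity, where real code for large files would read one part at a time.

```javascript
const crypto = require('crypto');

function sha256(buf) {
  return crypto.createHash('sha256').update(buf).digest();
}

// data: Buffer with the file contents; chunkSize: bytes per part.
// Returns the per-part records plus the combined file_hash.
function hashParts(data, chunkSize) {
  const parts = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray clamps the end index, so the last part may be shorter.
    const part = data.subarray(offset, offset + chunkSize);
    parts.push({ part_size: part.length, part_hash: sha256(part) });
  }
  // Hash of the concatenated part hashes, in part order.
  const file_hash = sha256(Buffer.concat(parts.map((p) => p.part_hash)));
  return { parts, file_hash };
}
```

Because the file_hash depends on where the part boundaries fall, the same bytes hashed with a different chunk size produce a different file_hash, which is why the chunk size must be calculated consistently.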

Chunk size calculator

In general, use the following algorithm to calculate the chunk size. The backend records the chunk size for uploaded objects, but for objects stored on the file system it is not always possible to record it. Calculating chunk sizes the same way on all platforms therefore significantly improves storage efficiency, because it prevents two identical files from getting different hashes due to different chunk sizes.

The algorithm roughly described:

  • If the file is 16MB or smaller do not chunk it.
  • Start at 8MB chunk size.
  • If more than 900 chunks are required, multiply the chunk size by 64 and retry.
  • The max chunk size is 5GB.

Given this the chunk sizes are:

  • 8MB
  • 512MB
  • 5GB

Example in JavaScript:

// Chunk size for the first multipart tier: 8MB.
const START_CHUNK_SIZE = 8 * 1024 * 1024;
// Files of this size (16MB) or smaller are not chunked.
const START_MULTIPART_SIZE = START_CHUNK_SIZE * 2;
const MAX_CHUNK_SIZE = 5 * 1024 * 1024 * 1024; // 5GB
const MAX_PARTS = 900;
const CHUNK_MULT = 64;
// Largest file sizes that fit in MAX_PARTS chunks of 8MB and of 512MB.
const SIZE_BREAK1 = START_CHUNK_SIZE * MAX_PARTS;
const SIZE_BREAK2 = START_CHUNK_SIZE * CHUNK_MULT * MAX_PARTS;

exports.calcChunkSize = calcChunkSize;
exports.START_MULTIPART_SIZE = START_MULTIPART_SIZE;

// Returns the chunk size in bytes for a file of `size` bytes.
function calcChunkSize(size) {
  let chunk_size;
  if (size <= START_MULTIPART_SIZE) {
    // Not chunked: the returned size is at least as large as the file.
    chunk_size = START_MULTIPART_SIZE;
  } else if (size <= SIZE_BREAK1) {
    chunk_size = START_CHUNK_SIZE; // 8MB
  } else if (size <= SIZE_BREAK2) {
    chunk_size = START_CHUNK_SIZE * CHUNK_MULT; // 512MB
  } else {
    chunk_size = MAX_CHUNK_SIZE; // 5GB
  }
  return chunk_size;
}
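
Once a chunk size is chosen, the part_size list follows directly from the part requirements: equal-sized parts, with a possibly smaller last part. The helper below is a hypothetical sketch (`partSizes` is not part of the Maxlo code). It takes the chunk size as a parameter so it can be paired with the result of calcChunkSize.

```javascript
// Hypothetical helper: list the part_size of every part for a file of
// fileSize bytes split into chunkSize-byte parts.
function partSizes(fileSize, chunkSize) {
  const fullParts = Math.floor(fileSize / chunkSize);
  const remainder = fileSize % chunkSize;
  const sizes = new Array(fullParts).fill(chunkSize);
  // A non-zero remainder becomes the smaller final part.
  if (remainder > 0) sizes.push(remainder);
  return sizes;
}
```

For example, a 20MB file with an 8MB chunk size yields three parts of 8MB, 8MB, and 4MB, and the sizes sum to the file size.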
