Conversation

@joelochlann
Member

@joelochlann joelochlann commented Jan 29, 2026

Depends on #4608, currently diffed to that branch

how to test

  • run npm run test:integration and see the large image tests now pass
  • deploy to test and look at the logging for images coming in. check:
    • How much do they get resized to?
    • How many attempts does it take?
    • Do we ever run out of attempts?
    • What is the CPU and memory usage of the lambda?

@joelochlann joelochlann requested a review from a team as a code owner January 29, 2026 10:08
// We use sqrt because scaling affects both dimensions
const sizeRatio = MAX_IMAGE_SIZE_BYTES / imageBytes.length;
// Be conservative: target 80% of max size to account for compression variance
let scaleFactor = Math.sqrt(sizeRatio * 0.8);
Member Author


Need to think through if this square root thing is actually right... 🤔

You have a 4x4 image = 16 pixels = (let's pretend one byte per pixel, uncompressed) = 16 bytes
You want to halve the size to 8 bytes
If you simply halve the width & height to 2x2, you'll get 4 pixels = 4 bytes, which is 4 times smaller than 16 bytes and overshoots the target

Whereas if you scale width & height by 1/sqrt(2) you get:
about 2.8 * 2.8 = 7.84 ≈ 8 bytes, i.e. roughly half

OK I get it!
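
Here's the same arithmetic as a tiny TypeScript sketch, purely illustrative (the numbers are the toy 4x4 example above, not anything from the actual diff):

const originalWidth = 4;
const originalHeight = 4;
const originalBytes = originalWidth * originalHeight; // 16 bytes at 1 byte/pixel, uncompressed
const targetBytes = 8;

const sizeRatio = targetBytes / originalBytes; // 0.5

// Naive: scale both dimensions by the size ratio itself -> pixel count drops to a quarter
const naiveBytes = originalWidth * sizeRatio * (originalHeight * sizeRatio); // 4

// With sqrt: pixel count (and so uncompressed bytes) drops by the intended half
const scaleFactor = Math.sqrt(sizeRatio); // ~0.707 = 1/sqrt(2)
const sqrtBytes = originalWidth * scaleFactor * (originalHeight * scaleFactor); // ~8

console.log({ naiveBytes, sqrtBytes });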

@joelochlann joelochlann added the feature Departmental tracking: work on a new feature label Jan 29, 2026
@paperboyo
Contributor

Brute force seems the only way to ensure they will fall below a certain filesize indeed. Sharp is a good choice (based on Vips) if lambda can haz any.

Without either reading about model behaviour with different resolution/compression and other characteristics (e.g. does transparency even matter?) or doing a lot of tests, it's impossible to know what the best compromises are (better larger and more compressed, or the other way around? fine to bake transparency into JPEGs, or is transparency taken seriously? etc.).

One thing I do think would be sensible is to convert everything to sRGB (this, I think).

@joelochlann
Member Author

Thanks @paperboyo, those are great points.

Interestingly, for Cohere v4 (which we may move to soon), "images > 2,458,624 pixels are downsampled to that size; images < 3,136 pixels are upsampled".

This makes me think that I should actually target 2.4 megapixels first, and see what size those typically come out at. If they're usually under 5 MiB, I think that's a reasonable default for downscaling. We can iteratively downscale for edge cases.
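
Sketching what a pixel-budget target could look like, just as a thought experiment: the 2,458,624 constant is the Cohere threshold quoted above, and the helper name and shape are made up for illustration, not anything in the PR:

// Hypothetical helper, not part of the diff: compute a resize factor for a pixel budget.
const MAX_PIXELS = 2_458_624; // Cohere v4 downsampling threshold quoted above

function scaleFactorForPixelBudget(width: number, height: number): number {
  const pixels = width * height;
  if (pixels <= MAX_PIXELS) {
    return 1; // already within budget, no resize needed
  }
  // sqrt again: pixel count scales with the square of the linear scale factor
  return Math.sqrt(MAX_PIXELS / pixels);
}

// e.g. a 3454x2303 (7.95 MP) image comes out at a factor of roughly 0.556
console.log(scaleFactorForPixelBudget(3454, 2303));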

@joelochlann
Member Author

joelochlann commented Jan 30, 2026

@paperboyo forgive my image ignorance, why is converting everything to sRGB a good idea? Would this save image bytes?

@paperboyo
Contributor

why is converting everything to sRGB a good idea

This is but a hunch. Our corpus may stray from sRGB more than most, and sRGB is the most prevalent on the web. The hunch is that the robots weren't trained to make better decisions about what they are looking at thanks to colourspace. It's possible they may just not see as well. Or barf completely on some exotic colourmodels.
Impossible to know for sure without looking into the robots' intestines, ofc.

Would this save image bytes?

Converting would change pixel values, so that pixel differences that look the same translate into similar numerical differences. Without conversion, they may give you smaller numerical differences; the wider the colourspace is than sRGB, the relatively smaller they get (sRGB is the smallest space).
Converting to an (s)RGB colourmodel will also make the pixels comparable; otherwise we will feed the robots CMYK, 1-bit and Lab images. Rarely, but sometimes. Again, unless they convert these themselves (or can work on different colourmodels: unlikely), this should be safer.

I think.
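
If it's useful, a minimal sketch of what that normalisation might look like with sharp, assuming the API is as I remember it: toColourspace("srgb") is the call I have in mind, and sharp's web-format output is sRGB by default anyway, so this mostly matters for CMYK/Lab inputs. Not the PR's code, just an illustration:

import sharp from "sharp";

// Sketch only: normalise to sRGB before re-encoding.
async function toSrgbJpeg(imageBytes: Uint8Array): Promise<Uint8Array> {
  const output = await sharp(imageBytes)
    .toColourspace("srgb") // convert exotic colourspaces/models to sRGB
    .jpeg({ quality: 85 }) // quality value copied from the resize code in this PR
    .toBuffer();
  return new Uint8Array(output);
}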

Contributor

@ellenmuller ellenmuller left a comment


Just some comments - not blockers :) Worked a dream when I tested it!!! 😍

Comment on lines 115 to 120
let result: Uint8Array = imageBytes;
let attempts = 0;
const maxAttempts = 5;

while (result.length > MAX_IMAGE_SIZE_BYTES && attempts < maxAttempts) {
  attempts++;
Contributor


I might prefer this as a for loop, eg:

for (
    let attempts = 1;
    attempts <= maxAttempts && result.length > MAX_IMAGE_SIZE_BYTES;
    attempts++
  )

Kind of personal preference, but it reads nicer imo!

Member Author


I think I prefer the while loop. Or perhaps even a recursive function...
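
For what it's worth, a rough sketch of the recursive shape, reusing names from the diff; resizeAtFactor is a hypothetical helper wrapping the sharp pipeline, and the 0.9 tightening factor is made up for illustration:

// Existing pieces assumed from the diff / hypothetical helper:
declare const MAX_IMAGE_SIZE_BYTES: number;
declare function resizeAtFactor(bytes: Uint8Array, factor: number): Promise<Uint8Array>;

async function downscaleRecursively(
  original: Uint8Array,
  scaleFactor: number,
  attempts = 1,
  maxAttempts = 5,
): Promise<Uint8Array> {
  // One resize + re-encode pass over the original bytes
  const result = await resizeAtFactor(original, scaleFactor);
  if (result.length <= MAX_IMAGE_SIZE_BYTES || attempts >= maxAttempts) {
    return result;
  }
  // Retry from the original bytes with a smaller factor
  return downscaleRecursively(original, scaleFactor * 0.9, attempts + 1, maxAttempts);
}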

Comment on lines 121 to 140
const newWidth = Math.round(originalWidth * scaleFactor);
const newHeight = Math.round(originalHeight * scaleFactor);

console.log(`Attempt ${attempts}: scaling to ${newWidth}x${newHeight} (factor: ${scaleFactor.toFixed(3)})`);

let pipeline = sharp(imageBytes).resize(newWidth, newHeight, {
  fit: "inside",
  withoutEnlargement: true,
});

// Output in the same format
if (mimeType === "image/jpeg") {
  pipeline = pipeline.jpeg({ quality: 85 });
} else if (mimeType === "image/png") {
  pipeline = pipeline.png({ compressionLevel: 9 });
}

result = new Uint8Array(await pipeline.toBuffer());
console.log(`Result size: ${result.length} bytes`);

Contributor


This could probably be split off into its own function?

Member Author


Yeah, I agree that this function is a little long; let me see if I can restructure it a bit.
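
One possible shape for the extraction, as a sketch only: resizeOnce is a made-up name, the encode settings are copied from the diff above, and sharp is imported as in the existing file:

import sharp from "sharp";

// Hypothetical extracted helper: a single resize + re-encode attempt.
async function resizeOnce(
  imageBytes: Uint8Array,
  mimeType: string,
  newWidth: number,
  newHeight: number,
): Promise<Uint8Array> {
  let pipeline = sharp(imageBytes).resize(newWidth, newHeight, {
    fit: "inside",
    withoutEnlargement: true,
  });

  // Output in the same format, with the settings from the diff above
  if (mimeType === "image/jpeg") {
    pipeline = pipeline.jpeg({ quality: 85 });
  } else if (mimeType === "image/png") {
    pipeline = pipeline.png({ compressionLevel: 9 });
  }

  return new Uint8Array(await pipeline.toBuffer());
}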

Base automatically changed from js-integration-test-for-large-images to main January 30, 2026 12:42
@joelochlann
Member Author

OK, I got curious (always the best and worst thing that can happen) and had a look at what was happening in detail, rather than just the test pass/fail.

The file size change is much more aggressive than I would expect, coming out at e.g. 0.14 times the size despite a scale factor of 0.9:

  console.log
    Image size 5242970 bytes exceeds 5242880 limit, downscaling...

      at downscaleImageIfNeeded (src/index.ts:109:11)

  console.log
    Original size 3454x2303, 7.95 MP

      at downscaleImageIfNeeded (src/index.ts:115:11)

  console.log
    Attempt 1: scaling to 3089x2060, 6.36 MP (factor: 0.894)

      at downscaleImageIfNeeded (src/index.ts:136:13)

  console.log
    Result size: 740976 bytes

      at downscaleImageIfNeeded (src/index.ts:153:13)

  console.log
    Downscaled from 5242970 to 740976 bytes

Perhaps this isn't a bad thing if the resulting image still looks OK, but I should (a) understand why this is happening and (b) eyeball the resulting images.
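
A back-of-envelope on the logged numbers, with the caveat that the re-encoding explanation at the end is a guess, not something verified here:

// Numbers taken from the log output above
const originalBytes = 5_242_970;
const resultBytes = 740_976;
const originalMegapixels = 7.95;
const scaledMegapixels = 6.36;

const byteRatio = resultBytes / originalBytes;            // ~0.141
const pixelRatio = scaledMegapixels / originalMegapixels; // ~0.80

// If resizing alone explained the drop, byteRatio would sit close to pixelRatio.
// The leftover factor of ~0.18 presumably comes from re-encoding, e.g. a JPEG
// originally saved at a higher quality being rewritten at quality 85 (a guess).
console.log({ byteRatio, pixelRatio, reEncodeFactor: byteRatio / pixelRatio });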
