@loewenheim
Contributor

@loewenheim loewenheim commented Dec 15, 2025

This adds a new trimming processor specifically for EAP items. We can't simply reuse the existing processor because of the size calculation logic: the existing processor calculates the sizes of values using a special serializer, whereas for EAP items, specifically their attributes, we want to use the attribute size logic defined in `eap::size`.

Attributes are processed in order from smallest to largest total (key + value) size, so small attributes are more likely to be preserved. Attribute keys count toward the size but are never trimmed: if a key is too long, the attribute is simply removed.
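
A minimal sketch of the ordering described above, not the actual Relay code. `Attr`, `trim_attributes`, and the "key length plus value length" size rule are hypothetical stand-ins for the real attribute types and `eap::size`; the truncation step is likewise an assumption about what "trimming" a value means here.

```rust
/// Hypothetical stand-in for an EAP attribute (not the real Relay type).
#[derive(Debug, Clone, PartialEq)]
struct Attr {
    key: String,
    value: String,
}

impl Attr {
    /// Total size in bytes: key plus value (assumed simplification of `eap::size`).
    fn size(&self) -> usize {
        self.key.len() + self.value.len()
    }
}

/// Visits attributes from smallest to largest total size and keeps what fits
/// into `max_bytes`. Values may be truncated; keys never are.
fn trim_attributes(mut attrs: Vec<Attr>, max_bytes: usize) -> Vec<Attr> {
    // Smallest first, so small attributes are the most likely to survive.
    attrs.sort_by_key(Attr::size);

    let mut remaining = max_bytes;
    let mut kept = Vec::new();
    for mut attr in attrs {
        let key_len = attr.key.len();
        if key_len > remaining {
            // The key alone does not fit: remove the whole attribute.
            continue;
        }
        let value_budget = remaining - key_len;
        if attr.value.len() > value_budget {
            // Truncate the value, backing up to a valid UTF-8 boundary.
            let mut end = value_budget;
            while !attr.value.is_char_boundary(end) {
                end -= 1;
            }
            attr.value.truncate(end);
        }
        remaining -= attr.size();
        kept.push(attr);
    }
    kept
}
```

With a 16-byte budget and attributes of 2, 4, and 108 bytes, the two small ones are kept whole and the large one has its value cut down to the bytes that are left; with a 10-byte budget the large one is dropped because its 8-byte key no longer fits.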

ref: #5362. ref: RELAY-177.

@loewenheim loewenheim requested a review from a team as a code owner December 15, 2025 13:36
Member

@Dav1dde Dav1dde left a comment

Seems like a good base for adding more features later, such as special keys and support for conventions.

If you can think of more examples and edge cases to write tests for, now would be a good time to add them; these snapshots will come in very handy for every future change.

Comment on lines +43 to +45

```rust
// Heuristic to avoid trimming a value like `[1, 1, 1, 1, ...]` into `[null, null, null,
// null, ...]`, making it take up more space.
self.remaining_depth(state) == Some(1) && !value.is_empty()
```
Member

Do you have an example/test for this? We might need the null cases to individually attach metadata to each of the nodes; deleting the parent also deletes the child metadata (which may be okay).

Contributor Author

I copied that for parity with the other trimming processor, but that doesn't seem to have a test for it either :/
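
For illustration only, here is a small sketch of the size effect the heuristic guards against. `json_array` and `nulling_grows` are hypothetical helpers, and compact JSON rendering is an assumption; the processor's real size logic is different.

```rust
/// Renders already-serialized items as a compact JSON array (illustration only).
fn json_array(items: &[&str]) -> String {
    format!("[{}]", items.join(","))
}

/// Returns true if replacing every element with `null` makes the array longer.
fn nulling_grows(items: &[&str]) -> bool {
    let nulled = vec!["null"; items.len()];
    json_array(&nulled).len() > json_array(items).len()
}
```

For four single-digit elements, `[1,1,1,1]` is 9 bytes while `[null,null,null,null]` is 21 bytes, so nulling each child would increase the size rather than reduce it.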

@loewenheim
Contributor Author

> Seems like a good base for adding more features later, such as special keys and support for conventions.
>
> If you can think of more examples and edge cases to write tests for, now would be a good time to add them; these snapshots will come in very handy for every future change.

What I'd like to test is "dynamic" max chars/max bytes (similar to dynamic PII, i.e. being able to compute it with a function). Implementing that feature isn't difficult; now that I've done it for PII I think it should be very similar. I just can't think of how to mock it for testing purposes, since this processor's logic is directly tied to Attributes/AttributeValues.
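
One possible shape for "dynamic" limits, sketched under loud assumptions: `trim_with_dynamic_limit` and its closure parameter are hypothetical and operate on plain `(String, String)` pairs rather than on `Attributes`/`AttributeValues`. Nothing here is taken from the PR or from the dynamic PII implementation.

```rust
/// Truncates each value to the byte limit that `max_bytes_for` returns for its
/// key. Keys for which the function returns `None` are left untouched.
fn trim_with_dynamic_limit<F>(attrs: &mut [(String, String)], max_bytes_for: F)
where
    F: Fn(&str) -> Option<usize>,
{
    for (key, value) in attrs.iter_mut() {
        if let Some(limit) = max_bytes_for(key.as_str()) {
            if value.len() > limit {
                // Back up to a valid UTF-8 boundary before truncating.
                let mut end = limit;
                while !value.is_char_boundary(end) {
                    end -= 1;
                }
                value.truncate(end);
            }
        }
    }
}
```

In a test, the limit function can then be an ordinary closure, for example one that returns `Some(5)` for the key `message` and `None` for everything else.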

@Dav1dde
Member

Dav1dde commented Dec 16, 2025

@loewenheim sounds good. I'd like us to split the work: merge this as is and tackle the modifications afterwards. I think these future changes will benefit from a more extensive test suite, so adding more (snapshot) tests now may pay off soon, but obviously that's up to you.

:shipit:

@loewenheim
Contributor Author

As written, the processor currently "overaccepts" numerical/boolean attributes. This means that if, e.g., a `u64` attribute value would cause the size limit to be exceeded, we accept that attribute anyway and only start discarding attributes with the next one (the `test_overaccept_number` test demonstrates this). I believe this is acceptable because:

  1. it can cause us to overaccept by 7 bytes at most, so the effect is bounded
  2. attribute key lengths don't vary that widely; I would expect attribute sizes to be determined mostly by string lengths. As a consequence, it should be very unlikely for a numerical/boolean attribute to be the cutoff in the first place.
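
The overaccept behavior can be modeled roughly as follows. This is a sketch, not the processor itself: `Value`, `accept`, the fixed sizes (1 byte for booleans, 8 bytes for `u64`), and the rule that a key must leave at least one byte of budget for its value are all assumptions chosen to match the 7-byte bound stated above.

```rust
/// Hypothetical attribute value with assumed sizes.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    U64(u64),
    Str(String),
}

impl Value {
    /// Assumed sizes: 1 byte for booleans, 8 for u64, byte length for strings.
    fn size(&self) -> usize {
        match self {
            Value::Bool(_) => 1,
            Value::U64(_) => 8,
            Value::Str(s) => s.len(),
        }
    }
}

/// Returns the kept keys and the total number of bytes accepted.
fn accept(attrs: &[(&str, Value)], max_bytes: usize) -> (Vec<String>, usize) {
    let mut remaining = max_bytes;
    let mut total = 0;
    let mut kept = Vec::new();
    for (key, value) in attrs {
        // Assumption: the key must fit with at least one byte to spare.
        if key.len() >= remaining {
            continue;
        }
        remaining -= key.len();
        let accepted = match value {
            // Strings are cut down to the remaining budget (ASCII assumed).
            Value::Str(s) => s.len().min(remaining),
            // Fixed-size values cannot be cut, so they are accepted whole.
            other => other.size(),
        };
        total += key.len() + accepted;
        remaining = remaining.saturating_sub(accepted);
        kept.push(key.to_string());
    }
    (kept, total)
}
```

In this model the worst case is a `u64` arriving when exactly one byte of budget is left after its key: all 8 bytes are accepted, the limit is exceeded by 7, and every later attribute is discarded.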

@Dav1dde does that make sense?

Member

@jjbayer jjbayer left a comment

I'd like to reverse my opinion on whether this should be a separate processor:

  1. The two trimming processors now share a lot of copy-pasted code.
  2. The difference seems to be mainly in `process_attributes`, which the original trimming processor does not override.

Alternatively, we could deduplicate code by embedding a `trimming::TrimmingProcessor` within the new one, and delegating to it.

@loewenheim
Contributor Author

loewenheim commented Dec 17, 2025

> Alternatively, we could deduplicate code by embedding a `trimming::TrimmingProcessor` within the new one, and delegating to it.

I don't think that would work because of the size calculation logic.

I agree that the duplication is super unfortunate, but I don't see how we can avoid it without making it possible to generalize size calculations.
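
To make "generalizing size calculations" concrete, here is one hypothetical direction, sketched with invented names (`SizeCalculator`, `SerializedSize`, `EapSize`, `Trimmer`). None of these exist in Relay, and both size rules are simplified placeholders rather than the real serializer or `eap::size` logic.

```rust
/// Hypothetical abstraction over "how big is this value?".
trait SizeCalculator {
    fn size_of(&self, value: &str) -> usize;
}

/// Placeholder for a serializer-based estimate: counts the surrounding quotes.
struct SerializedSize;

/// Placeholder for an `eap::size`-style rule: raw byte length only.
struct EapSize;

impl SizeCalculator for SerializedSize {
    fn size_of(&self, value: &str) -> usize {
        value.len() + 2
    }
}

impl SizeCalculator for EapSize {
    fn size_of(&self, value: &str) -> usize {
        value.len()
    }
}

/// Shared trimming logic, generic over the size rule.
struct Trimmer<S> {
    sizes: S,
    max_bytes: usize,
}

impl<S: SizeCalculator> Trimmer<S> {
    /// Counts how many leading values fit into the budget.
    fn count_fitting(&self, values: &[&str]) -> usize {
        let mut used = 0;
        let mut count = 0;
        for value in values {
            used += self.sizes.size_of(value);
            if used > self.max_bytes {
                break;
            }
            count += 1;
        }
        count
    }
}
```

With an 8-byte budget and the values `ab`, `cd`, `ef`, the `EapSize` rule fits all three while the `SerializedSize` rule fits only two, which shows how the same loop can give different results depending on the size rule plugged in.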

@jjbayer
Member

jjbayer commented Dec 17, 2025

> I don't think that would work because of the size calculation logic.
>
> I agree that the duplication is super unfortunate, but I don't see how we can avoid it without making it possible to generalize size calculations.

Alright, not a blocker so feel free to merge.
