
Conversation

@inikep (Collaborator) commented Dec 5, 2025

  1. Refactors the JSON serialization logic within `AuditJsonHandler` to guarantee strict compliance with the JSON standard by eliminating trailing commas and centralizing field-separation control.

The previous approach of appending `", "` after every value handler was inconsistent and required error-prone comma-removal logic in `EndObject()`.

This change adopts a safer, state-driven approach (sketched after this list):

  • Centralized Comma Management: The responsibility for adding the comma separator is moved entirely from the value handlers (`Int`, `String`, etc.) to the `Key()` handler.
  • State-Driven Separation: The new `m_is_first_field` state flag, set in `StartObject()` and checked/updated in `Key()`, ensures a comma is prepended only when necessary (i.e., not before the first field), thereby naturally preventing trailing commas within objects.
  • Inter-Event Separation: Confirmed that the `,\n` separator is correctly appended between top-level audit event objects in the array.
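For illustration, here is a minimal single-level sketch of the new separator logic. The buffer name and output formatting are hypothetical; only `m_is_first_field`, `StartObject()`, and `Key()` mirror the actual change:

```cpp
#include <rapidjson/reader.h>
#include <string>

// Sketch: state-driven field separation for a flat JSON object.
// m_buffer is illustrative; the real handler writes to the UDF result.
struct AuditJsonHandler
    : rapidjson::BaseReaderHandler<rapidjson::UTF8<>, AuditJsonHandler> {
  std::string m_buffer;
  bool m_is_first_field = false;

  bool StartObject() {
    m_buffer += '{';
    m_is_first_field = true;  // the next Key() must not emit a separator
    return true;
  }

  bool Key(const char *str, rapidjson::SizeType len, bool /*copy*/) {
    if (!m_is_first_field) m_buffer += ", ";  // Key() alone owns the comma
    m_is_first_field = false;
    m_buffer.append("\"").append(str, len).append("\": ");
    return true;
  }

  bool String(const char *str, rapidjson::SizeType len, bool /*copy*/) {
    m_buffer.append("\"").append(str, len).append("\"");  // no trailing ", "
    return true;
  }

  bool EndObject(rapidjson::SizeType /*member_count*/) {
    m_buffer += '}';  // nothing to strip: no trailing comma was ever written
    return true;
  }
};
```

Since the value handlers no longer emit separators, `EndObject()` needs no comma-removal logic at all.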
  2. The `audit_log_read()` UDF was failing to respect the `max_array_length` parameter, returning all records instead of the specified limit. This was caused by the read loop not checking the `is_batch_end` flag.

Additionally, attempting to read the remaining records in a subsequent call caused an infinite loop or parsing errors, because a new `rapidjson::Reader` was created for each call, losing the internal state required to resume parsing mid-stream (e.g., handling the comma separator between array elements).

The fix involves (see the sketch after this list):

  • Respecting the `is_batch_end` flag in the `AuditLogReader::read` loop to stop processing when the limit is reached.
  • Storing the `rapidjson::Reader` instance within `AuditLogReaderContext` to preserve parsing state across multiple `audit_log_read()` calls.
  • Adding error checking for `reader->HasParseError()` to prevent infinite loops on malformed data or state mismatches.
  • Updating the `udf_audit_log_read_validate_output` test case to verify correct behavior for `max_array_length`.
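A sketch of the fixed read loop, assuming rapidjson's iterative-parsing API (`IterativeParseInit()`/`IterativeParseNext()`) and a hypothetical `is_batch_end` flag on the handler; the actual members of `AuditLogReaderContext` may differ:

```cpp
#include <rapidjson/reader.h>

#include <memory>

// Hypothetical slice of AuditLogReaderContext: the Reader lives here, not on
// the stack of the read function, so its parse state survives between calls.
struct AuditLogReaderContext {
  std::unique_ptr<rapidjson::Reader> reader;
};

// Returns false on a parse error. `Stream` is any rapidjson input stream;
// `handler.is_batch_end` stands in for the flag that is set once
// max_array_length records have been produced.
template <typename Stream, typename Handler>
bool read_batch(AuditLogReaderContext *ctx, Stream &is, Handler &handler) {
  if (!ctx->reader) {
    ctx->reader = std::make_unique<rapidjson::Reader>();
    ctx->reader->IterativeParseInit();  // once per log, not once per UDF call
  }
  while (!ctx->reader->IterativeParseComplete()) {
    if (!ctx->reader->IterativeParseNext<rapidjson::kParseDefaultFlags>(is, handler))
      break;  // parse failure or handler abort: fall through to error check
    if (handler.is_batch_end)
      return true;  // limit reached; the preserved Reader resumes next call
  }
  // Without this check, malformed input or a state mismatch would loop forever.
  return !ctx->reader->HasParseError();
}
```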

@inikep requested a review from dlenev on December 5, 2025 08:29
@dlenev (Contributor) left a comment

Hello Przemek!

I have a few questions/suggestions about this patch. Please see below.
Otherwise code changes look fine to me.

Do you plan to squash the second commit into the first one before pushing?
If not, then I think it needs to reference some Jira ticket and have its own [8.0] tag in the title.

…tion in audit_log_read()/AuditJsonHandler

@inikep (Collaborator, Author) commented Dec 18, 2025

> Do you plan to squash the second commit into the first one before pushing? If not, then I think it needs to reference some Jira ticket and have its own [8.0] tag in the title.

Manish reported a similar issue, so I decided to create a separate Jira ticket: https://perconadev.atlassian.net/browse/PS-10387

…d pagination issues

@dlenev (Contributor) left a comment

LGTM.

@inikep merged commit 74dd764 into percona:8.0 on Dec 30, 2025
22 of 23 checks passed
@inikep deleted the PS-10347-8.0 branch on December 30, 2025 09:24