PS-10347 [8.0]: Ensure strict JSON compliance and robust field separation in audit_log_read()/AuditJsonHandler #5776
Conversation
dlenev
left a comment
Hello Przemek!
I have a few questions/suggestions about this patch. Please see below.
Otherwise, the code changes look fine to me.
Do you plan to squash the second commit into the first one before pushing?
If not, then I think it needs to reference some JIRA ticket and have its own [8.0] tag in the title.
plugin/audit_log_filter/tests/mtr/t/udf_audit_log_read_validate_output.test
…tion in audit_log_read()/AuditJsonHandler

Refactors the JSON serialization logic within AuditJsonHandler to guarantee strict compliance with the JSON standard by eliminating trailing commas and centralizing field separation control.

The previous approach of appending ", " after every value handler was inconsistent and required error-prone comma removal logic in EndObject. This change adopts a safer, state-driven approach:

- Centralized Comma Management: The responsibility for adding the comma separator is moved entirely from the value handlers (Int, String, etc.) to the Key() handler.
- State-Driven Separation: The new m_is_first_field state flag, set in StartObject() and checked/updated in Key(), ensures a comma is prepended only when necessary (i.e., not for the first field), thereby naturally preventing trailing commas within objects.
- Inter-Event Separation: Confirmed that the ",\n" separator is correctly appended to separate top-level audit event objects in the array.
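For illustration, here is a minimal, self-contained sketch of the state-driven separator idea, not the plugin's actual AuditJsonHandler: the handler and member names (JsonEchoHandler, m_out) and the sample input are hypothetical, and the sketch only handles flat objects with string and unsigned-integer values.

```cpp
// Sketch: comma placement is decided once, in Key(), driven by a
// m_is_first_field flag set in StartObject(). EndObject() never has to
// strip a trailing comma.
#include <iostream>
#include <string>

#include "rapidjson/reader.h"

class JsonEchoHandler
    : public rapidjson::BaseReaderHandler<rapidjson::UTF8<>, JsonEchoHandler> {
 public:
  bool StartObject() {
    m_out += '{';
    m_is_first_field = true;  // the next Key() must not emit a comma
    return true;
  }

  bool Key(const char *str, rapidjson::SizeType len, bool /*copy*/) {
    if (!m_is_first_field) m_out += ',';  // separator only between fields
    m_is_first_field = false;
    m_out += '"';
    m_out.append(str, len);
    m_out += "\":";
    return true;
  }

  bool EndObject(rapidjson::SizeType /*member_count*/) {
    m_out += '}';  // nothing to clean up here
    return true;
  }

  bool String(const char *str, rapidjson::SizeType len, bool /*copy*/) {
    m_out += '"';
    m_out.append(str, len);
    m_out += '"';
    return true;
  }

  bool Uint(unsigned v) {
    m_out += std::to_string(v);
    return true;
  }

  const std::string &result() const { return m_out; }

 private:
  std::string m_out;
  bool m_is_first_field{true};
};

int main() {
  const char json[] = R"({"name":"audit_record","class":"general","id":42})";
  JsonEchoHandler handler;
  rapidjson::Reader reader;
  rapidjson::StringStream ss(json);
  if (!reader.Parse(ss, handler)) return 1;
  // Prints: {"name":"audit_record","class":"general","id":42}
  std::cout << handler.result() << "\n";
  return 0;
}
```

The benefit of this shape is that separator logic lives in exactly one place, so no value handler can forget (or duplicate) a comma and no object can end with one.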
Manish reported a similar issue, so I decided to create a separate JIRA ticket: https://perconadev.atlassian.net/browse/PS-10387
…d pagination issues

The `audit_log_read()` UDF was failing to respect the `max_array_length` parameter, returning all records instead of the specified limit. This was caused by the read loop not checking the `is_batch_end` flag.

Additionally, attempting to read the remaining records in a subsequent call caused an infinite loop or parsing errors. This occurred because a new `rapidjson::Reader` was created for each call, losing the internal state required to resume parsing mid-stream (e.g., handling the comma separator between array elements).

The fix involves:

- Respecting the `is_batch_end` flag in the `AuditLogReader::read` loop to stop processing when the limit is reached.
- Storing the `rapidjson::Reader` instance within `AuditLogReaderContext` to preserve parsing state across multiple `audit_log_read()` calls.
- Adding error checking for `reader->HasParseError()` to prevent infinite loops on malformed data or state mismatches.
- Updating the `udf_audit_log_read_validate_output` test case to verify correct behavior for `max_array_length`.
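A rough sketch of the resumable-reading idea follows. The type and function names (`BatchHandler`, `ReaderContext`, `read_batch`) are hypothetical, not the plugin's `AuditLogReaderContext`/`AuditLogReader::read`, and it assumes a RapidJSON version that provides the token-by-token API (`IterativeParseInit()`/`IterativeParseNext()`/`IterativeParseComplete()`). What it mirrors from the fix is keeping the `rapidjson::Reader` (and stream position) alive between calls, stopping a batch when an `is_batch_end`-style flag is raised, and checking `HasParseError()` so malformed input cannot cause an infinite loop.

```cpp
#include <cstddef>
#include <iostream>

#include "rapidjson/reader.h"

// Counts parsed records and raises is_batch_end once the limit is hit
// (loosely analogous to max_array_length handling).
struct BatchHandler
    : rapidjson::BaseReaderHandler<rapidjson::UTF8<>, BatchHandler> {
  std::size_t max_array_length{2};
  std::size_t records_in_batch{0};
  bool is_batch_end{false};

  bool EndObject(rapidjson::SizeType /*member_count*/) {
    ++records_in_batch;
    is_batch_end = (records_in_batch >= max_array_length);
    return true;
  }
};

// State that must survive between calls: the reader and the stream position.
struct ReaderContext {
  rapidjson::StringStream stream;
  rapidjson::Reader reader;
  explicit ReaderContext(const char *json) : stream(json) {
    reader.IterativeParseInit();
  }
};

// One "call": consume tokens until the batch limit is reached, the document
// ends, or a parse error occurs.
bool read_batch(ReaderContext &ctx, BatchHandler &handler) {
  handler.records_in_batch = 0;
  handler.is_batch_end = false;
  while (!handler.is_batch_end && !ctx.reader.IterativeParseComplete()) {
    ctx.reader.IterativeParseNext<rapidjson::kParseDefaultFlags>(ctx.stream,
                                                                 handler);
    if (ctx.reader.HasParseError()) return false;  // do not loop forever
  }
  return true;
}

int main() {
  ReaderContext ctx(R"([{"id":1},{"id":2},{"id":3}])");
  BatchHandler handler;
  // First call yields two records; the second resumes mid-array because the
  // same reader (and stream) instance is reused.
  while (!ctx.reader.IterativeParseComplete()) {
    if (!read_batch(ctx, handler)) return 1;
    std::cout << "batch with " << handler.records_in_batch << " record(s)\n";
  }
  return 0;
}
```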
dlenev
left a comment
LGTM.