-
Notifications
You must be signed in to change notification settings - Fork 0
fix(curation): Use matches_hash for statement hash due to string issues #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe code modifies statement identifier extraction in two functions: Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Failed to generate code suggestions for PR |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #64 +/- ##
==========================================
- Coverage 57.90% 57.89% -0.02%
==========================================
Files 7 7
Lines 1354 1356 +2
==========================================
+ Hits 784 785 +1
- Misses 570 571 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
R/utils_getSubnetworkFromIndra.R (1)
128-129: Consider extracting common JSON parsing logic.The same JSON parsing pattern appears in both
.filterIndraResponseand.constructEdgesDataFrame. Consider extracting this into a helper function to reduce duplication and ensure consistent error handling.Example helper function to add:
#' Extract matches_hash from INDRA statement JSON #' @param stmt_json_string JSON string from INDRA statement #' @return matches_hash value or NA_character_ on error #' @keywords internal #' @noRd .extractMatchesHashFromStmtJson <- function(stmt_json_string) { tryCatch({ stmt_json <- fromJSON(stmt_json_string) if (is.null(stmt_json$matches_hash)) { return(NA_character_) } as.character(stmt_json$matches_hash) }, error = function(e) { warning(paste("Failed to parse stmt_json:", e$message)) return(NA_character_) }) }Then use it in both locations:
# In .filterIndraResponse stmt_hash <- .extractMatchesHashFromStmtJson(res[[i]]$data$stmt_json) if (is.na(stmt_hash)) next # In .constructEdgesDataFrame stmt_hash = vapply(keys(res), function(x) { .extractMatchesHashFromStmtJson(query(res, x)$data$stmt_json) }, "")Also applies to: 304-307
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
R/utils_getSubnetworkFromIndra.R(2 hunks)
🔇 Additional comments (1)
R/utils_getSubnetworkFromIndra.R (1)
128-129: Commit message confirms the PR addresses string issues with statement hashes, and code changes are systematic and complete.The verification confirms that:
PR intent is validated: The commit message explicitly states "fix(curation): Use matches_hash for statement hash due to string issues," directly addressing the concern about "string issues."
Changes are consistent and complete: Both code locations (lines 129 and 305-306) apply the same pattern—parsing
stmt_jsonand extractingmatches_hashinstead of directly accessing$data$stmt_hash. A search for remaining direct$data$stmt_hashaccesses found none, confirming all references were updated.Implementation is systematic: The change consistently uses
fromJSON(res[[i]]$data$stmt_json)orfromJSON(query(res, x)$data$stmt_json)followed by field extraction ($matches_hash), confirmingstmt_jsonis an expected INDRA response field.Tests exist: Test file at
./tests/testthat/test-getSubnetworkFromIndra.Rvalidates basicgetSubnetworkFromIndrafunctionality using a mocked INDRA response.Limitation: The sandbox environment cannot verify that production INDRA API responses consistently contain
stmt_jsonwith amatches_hashfield, nor can tests be executed to confirm the curation filtering logic works correctly with the new field extraction pattern. Manual testing with actual INDRA API responses is recommended to fully validate the fix.
| stmt_json <- fromJSON(res[[i]]$data$stmt_json) | ||
| stmt_hash <- stmt_json$matches_hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for JSON parsing and field validation.
The fromJSON call lacks error handling and there's no validation that matches_hash exists in the parsed JSON. If the INDRA API returns malformed JSON or if the structure changes, this will cause runtime failures.
Apply this diff to add defensive error handling:
for (i in seq_along(res)) {
- stmt_json <- fromJSON(res[[i]]$data$stmt_json)
- stmt_hash <- stmt_json$matches_hash
+ stmt_json <- tryCatch(
+ fromJSON(res[[i]]$data$stmt_json),
+ error = function(e) {
+ warning(paste("Failed to parse stmt_json for statement", i, ":", e$message))
+ return(NULL)
+ }
+ )
+ if (is.null(stmt_json) || is.null(stmt_json$matches_hash)) {
+ warning(paste("Missing matches_hash for statement", i))
+ next
+ }
+ stmt_hash <- stmt_json$matches_hash
incorrect_count <- .get_incorrect_curation_count(stmt_hash, api_key)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| stmt_json <- fromJSON(res[[i]]$data$stmt_json) | |
| stmt_hash <- stmt_json$matches_hash | |
| stmt_json <- tryCatch( | |
| fromJSON(res[[i]]$data$stmt_json), | |
| error = function(e) { | |
| warning(paste("Failed to parse stmt_json for statement", i, ":", e$message)) | |
| return(NULL) | |
| } | |
| ) | |
| if (is.null(stmt_json) || is.null(stmt_json$matches_hash)) { | |
| warning(paste("Missing matches_hash for statement", i)) | |
| next | |
| } | |
| stmt_hash <- stmt_json$matches_hash |
🤖 Prompt for AI Agents
In R/utils_getSubnetworkFromIndra.R around lines 128-129, the call stmt_json <-
fromJSON(res[[i]]$data$stmt_json) and immediate access to stmt_json$matches_hash
assume valid JSON and presence of matches_hash; add defensive error handling:
wrap fromJSON in tryCatch to catch/parsing errors (log/warn and skip or set
stmt_hash to NA), validate that parsed object is a list and contains the
matches_hash field before accessing it, handle missing or NULL matches_hash
(log/warn and skip or set NA), and ensure subsequent code branches handle
NA/absence safely so malformed or changed INDRA responses do not cause runtime
failures.
| stmt_hash = vapply(keys(res), function(x) { | ||
| as.character(query(res, x)$data$stmt_hash) | ||
| stmt_json <- fromJSON(query(res, x)$data$stmt_json) | ||
| stmt_json$matches_hash | ||
| }, ""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and validation for JSON parsing in vapply.
Similar to the issue in .filterIndraResponse, this fromJSON call lacks error handling. Additionally, since this is inside a vapply expecting character output, any parsing failure will cause a type mismatch error that's harder to debug.
Apply this diff to add defensive error handling:
stmt_hash = vapply(keys(res), function(x) {
- stmt_json <- fromJSON(query(res, x)$data$stmt_json)
- stmt_json$matches_hash
+ tryCatch({
+ stmt_json <- fromJSON(query(res, x)$data$stmt_json)
+ if (is.null(stmt_json$matches_hash)) {
+ return(NA_character_)
+ }
+ stmt_json$matches_hash
+ }, error = function(e) {
+ warning(paste("Failed to parse stmt_json for key", x, ":", e$message))
+ return(NA_character_)
+ })
}, ""),🤖 Prompt for AI Agents
In R/utils_getSubnetworkFromIndra.R around lines 304 to 307, the inline
fromJSON(query(res, x)$data$stmt_json) call inside vapply lacks error handling
and can break the vapply's expected character return; wrap the JSON parsing in a
tryCatch that first verifies query(res, x)$data$stmt_json exists and is
non-empty, attempt fromJSON inside tryCatch, check that the parsed object has a
character matches_hash field, and on any error or missing/invalid value return a
safe default (e.g., "" or NA_character_) so the vapply always returns a
character vector and failures are logged or flagged for debugging.
Summary by CodeRabbit