tonywu1999 (Contributor) commented Nov 5, 2025

Summary by CodeRabbit

  • New Features
    • Network edges are clickable and open associated evidence links in a new browser tab with URL validation and error handling.
    • This behavior is active in both exported HTML files and in-browser previews.
    • Edges now include a statement identifier (statement hash) in their data for easier reference.

coderabbitai bot commented Nov 5, 2025

Walkthrough

Adds a JS helper to safely open http(s) links and an internal R factory createEdgeClickHandler() wired into Cytoscape HTML export/preview to open edge evidenceLinks; updates duplicate-edge collapsing/enrichment logic and adds a stmt_hash column to constructed edges.

Changes

Cohort: HTML visualization (edge click behavior)
File(s): R/visualizeNetworksWithHTML.R
Summary: Adds openLinkInNewTab(url) JS (validation + error handling); adds internal createEdgeClickHandler() returning an edge_click handler that extracts evidenceLink and calls the JS helper; constructs event_handlers in exportCytoscapeToHTML() / previewNetworkInBrowser() and passes it into generateCytoscapeConfig(); includes the JS in exported HTML.

Cohort: Edges data (stmt_hash and collapsing logic)
File(s): R/utils_getSubnetworkFromIndra.R
Summary: Adds stmt_hash column to edges produced by .constructEdgesDataFrame(); modifies .collapseDuplicateEdgesIntoEdgeToMetadataMapping() to enrich edges on insertion via .addAdditionalMetadataToIndraEdge(), set paper_count to 1 (TODO noted), and replace existing mapping only when an incoming edge has greater evidence_count.

Sequence Diagram

sequenceDiagram
    participant User
    participant HTML as "Exported HTML (Cytoscape)"
    participant R as "R createEdgeClickHandler"
    participant JS as "openLinkInNewTab()"

    User->>HTML: Click edge
    HTML->>R: emit edge_click with edge data (includes evidenceLink, stmt_hash)
    R->>R: edge_click handler reads evidenceLink
    R->>JS: call openLinkInNewTab(evidenceLink)
    JS->>JS: validate URL (http/https) and try window.open
    JS-->>User: new tab opened or error logged

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Review focus:
    • JS URL validation and window.open error handling.
    • Wiring of event_handlers into generateCytoscapeConfig() and HTML export.
    • Changes in .collapseDuplicateEdgesIntoEdgeToMetadataMapping() (paper_count, enrichment, replacement condition) and impact of adding stmt_hash.

Possibly related PRs

Suggested labels

Review effort 2/5

Poem

🐰 I hopped along the graphy trail,
I hopped a click and chased a tale.
A link sanitized, a tab set free,
Evidence found for you and me.
Hooray — a rabbit's little glee!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Description check
Status: ⚠️ Warning
Explanation: The PR description is completely empty with no substantive content. All template sections lack details, and the checklist items are unchecked.
Resolution: Fill in Motivation and Context, provide detailed Changes list, describe Testing performed, and verify all checklist items are completed before requesting review.
✅ Passed checks (1 passed)
Check name: Title check
Status: ✅ Passed
Explanation: The title clearly describes the main feature: adding edge clicking functionality to open URLs in regular HTML export.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14c4b19 and 5af1e08.

📒 Files selected for processing (1)
  • R/utils_getSubnetworkFromIndra.R (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pkgdown
  • GitHub Check: test-coverage
  • GitHub Check: build
🔇 Additional comments (1)
R/utils_getSubnetworkFromIndra.R (1)

303-305: LGTM! Statement hash extraction enables edge interaction.

The stmt_hash column addition is correctly implemented and consistent with other column extractions in this function. This metadata supports the edge-clicking feature described in the PR objectives.


github-actions bot commented Nov 5, 2025

Failed to generate code suggestions for PR

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
R/visualizeNetworksWithHTML.R (1)

1087-1090: Optional: Consider whether debug handler is still needed.

This generic edge click handler (which logs to the console) will fire alongside the new link-opening handler from the configuration. Registering multiple handlers on the same event is valid in Cytoscape.js, but if the console logging is only for debugging, consider removing it or making it conditional.
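
One way to make the logger conditional is sketched below. It assumes a Cytoscape.js instance `cy` and the `tap` event; `registerEdgeHandlers`, `onEdgeClick`, and the `debug` flag are hypothetical names chosen for illustration, not identifiers from this PR.

```javascript
// Sketch only: hypothetical helper, not code from the PR.
// Registers the link-opening handler always and the console logger only on request.
function registerEdgeHandlers(cy, onEdgeClick, debug = false) {
  // Primary behaviour: whatever the configured edge click handler does.
  cy.on("tap", "edge", onEdgeClick);

  if (debug) {
    // Cytoscape.js allows several listeners on the same event and selector;
    // this one only logs the clicked edge's id.
    cy.on("tap", "edge", (evt) => {
      console.log("Edge clicked:", evt.target.id());
    });
  }
}
```

With `debug` left at its default, only one listener is attached, so exported HTML files would stay quiet in the console.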

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between e265c55 and 672e33f.

📒 Files selected for processing (1)
  • R/visualizeNetworksWithHTML.R (4 hunks)
🔇 Additional comments (4)
R/visualizeNetworksWithHTML.R (4)

832-848: Well-implemented security for opening external links.

The openLinkInNewTab function includes proper safeguards:

  • Protocol restriction to HTTP(S) only prevents javascript:, data:, and file: URI attacks
  • noopener,noreferrer attributes and explicit win.opener = null prevent tab-nabbing
  • Input validation filters out empty, NA, and non-string values
  • Error handling catches exceptions gracefully
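
The safeguards listed above could look roughly like the following. This is a reconstruction from the review's description, not the PR's actual source: the boolean return value and the treatment of the literal string "NA" (a guess at how an R `NA` might reach the browser) are assumptions.

```javascript
// Sketch only: reconstructed from the review text, not copied from the PR.
function openLinkInNewTab(url) {
  // Input validation: reject non-strings, empty strings, and the string "NA" (assumed).
  if (typeof url !== "string") return false;
  const trimmed = url.trim();
  if (trimmed === "" || trimmed === "NA") return false;

  let parsed;
  try {
    parsed = new URL(trimmed); // throws on malformed or relative URLs
  } catch (err) {
    console.error("Invalid evidence URL:", trimmed);
    return false;
  }

  // Protocol restriction: only http(s), which excludes javascript:, data:, and file:.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    console.error("Blocked non-http(s) URL:", trimmed);
    return false;
  }

  try {
    const win = window.open(parsed.href, "_blank", "noopener,noreferrer");
    // With noopener the return value is usually null, so guard before clearing opener.
    if (win) win.opener = null;
    return true;
  } catch (err) {
    console.error("Failed to open link:", err);
    return false;
  }
}
```

Parsing with the `URL` constructor before checking the scheme avoids hand-written prefix tests, which are easy to bypass with whitespace or mixed case.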

1122-1133: LGTM - Clean event handler implementation.

The edge click handler properly retrieves the evidenceLink from edge data and delegates to the security-validated openLinkInNewTab function. The validation checks here overlap slightly with those in openLinkInNewTab, but this defensive approach is reasonable.
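
As a rough illustration of that delegation, the handler body might resemble the sketch below. The name `handleEdgeClick` and the injected `openLink` parameter are hypothetical; in the PR the handler is generated by the R factory createEdgeClickHandler(), whose exact output is not shown in this thread.

```javascript
// Sketch only: hypothetical handler shape, not the string emitted by the R factory.
function handleEdgeClick(evt, openLink) {
  // In Cytoscape.js, evt.target is the clicked element and data(name) reads one field.
  const link = evt.target.data("evidenceLink");

  // Light pre-check; the opener is still expected to do the full URL validation.
  if (typeof link === "string" && link.trim() !== "" && link !== "NA") {
    return openLink(link);
  }
  return false;
}

// Possible wiring, assuming a Cytoscape.js instance `cy` and a validated opener:
// cy.on("tap", "edge", (evt) => handleEdgeClick(evt, openLinkInNewTab));
```

Passing the opener as an argument keeps the handler testable without a browser window.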


1153-1162: Consistent integration of edge click handlers.

The modifications properly create and pass event handlers to the configuration pipeline. The approach is consistent with the pattern used in previewNetworkInBrowser.


1182-1191: LGTM - Parallel implementation for preview function.

The changes mirror the pattern in exportNetworkToHTML, ensuring both export and preview paths support edge click functionality consistently.

codecov-commenter commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 12.50000% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.90%. Comparing base (e265c55) to head (5af1e08).

File with missing lines: R/visualizeNetworksWithHTML.R
Patch %: 0.00%
Lines: 42 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel      #63      +/-   ##
==========================================
- Coverage   59.69%   57.90%   -1.80%     
==========================================
  Files           7        7              
  Lines        1315     1354      +39     
==========================================
- Hits          785      784       -1     
- Misses        530      570      +40     

coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 02b006a and 14c4b19.

📒 Files selected for processing (1)
  • R/utils_getSubnetworkFromIndra.R (2 hunks)
🔇 Additional comments (1)
R/utils_getSubnetworkFromIndra.R (1)

303-305: The original review comment is based on incorrect assumptions about stmt_hash usage.

The code adds stmt_hash to the edges data.frame (lines 303-305), but this field is never actually used in the visualization layer. Edge clicking to URLs is already implemented using the evidenceLink column, not stmt_hash. The stmt_hash column is only used internally for INDRA curation API filtering (in .get_incorrect_curation_count()). It does not contribute to the "edge clicking" feature mentioned in the review.

Likely an incorrect or invalid review comment.

Comment on lines 256 to 261
edge <- .addAdditionalMetadataToIndraEdge(edge, input)
edge$data$paper_count <- 1 # TODO: fix paper count
if (!key %in% keys(edgeToMetadataMapping) ||
    edge$data$evidence_count > edgeToMetadataMapping[[key]]$data$evidence_count) {
    edgeToMetadataMapping[[key]] <- edge
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the behavioral change from accumulating to replacing duplicate edges.

The logic has changed from accumulating evidence across duplicate edges to keeping only the edge with the highest evidence_count. This is a significant behavioral change that:

  1. Discards potentially valuable metadata from lower-evidence duplicates
  2. Calls .addAdditionalMetadataToIndraEdge() for every edge, even those that will be discarded (performance concern)
  3. Has no tie-breaking logic when evidence_count values are equal

Please clarify:

  • Is this behavioral change intentional for the URL-clicking feature?
  • Should evidence be accumulated across duplicates instead of replaced?
  • Can .addAdditionalMetadataToIndraEdge() be moved after the condition to avoid enriching edges that will be discarded?

Run the following script to check if this function is tested and whether existing tests catch this behavioral change:


🏁 Script executed:

#!/bin/bash
# Description: Search for tests of .collapseDuplicateEdgesIntoEdgeToMetadataMapping and usage patterns

# Find test files
echo "=== Test files for this function ==="
rg -l "collapseDuplicateEdgesIntoEdgeToMetadataMapping" --type=r

# Check for test expectations around evidence_count accumulation
echo -e "\n=== Test expectations for evidence_count ==="
rg -A 5 -B 5 "evidence_count.*collapseDuplicate|collapseDuplicate.*evidence_count" --type=r

# Look for documentation about duplicate edge handling
echo -e "\n=== Documentation about duplicate edges ==="
rg -i "duplicate.*edge|collapse.*edge" --type=r -g "*.Rd" -g "*.md"

Length of output: 160


I need to dig deeper into the codebase to understand the function implementation and its context.

Let me search directly in the codebase for the function implementation and context:


🏁 Script executed:

#!/bin/bash
# Description: Find function definition and surrounding context

echo "=== Finding collapseDuplicateEdgesIntoEdgeToMetadataMapping function definition ==="
rg -n "collapseDuplicateEdgesIntoEdgeToMetadataMapping" -A 30 -B 5 R/utils_getSubnetworkFromIndra.R --type=r

echo -e "\n=== Looking at the structure around line 256 ==="
sed -n '240,280p' R/utils_getSubnetworkFromIndra.R

Length of output: 282


🏁 Script executed:

#!/bin/bash
# Description: Find function definition and surrounding context in cloned repository

echo "=== Finding file ==="
find . -name "utils_getSubnetworkFromIndra.R" -type f

echo -e "\n=== Finding collapseDuplicateEdgesIntoEdgeToMetadataMapping function definition ==="
rg -n "collapseDuplicateEdgesIntoEdgeToMetadataMapping" -A 30 -B 5 --type=r

echo -e "\n=== Looking at the structure around line 256 for better context ==="
rg -n "\.addAdditionalMetadataToIndraEdge" -A 10 -B 10 --type=r | head -100

Length of output: 8449


🏁 Script executed:

#!/bin/bash
# Description: Check git history for this function and evidence handling

echo "=== Check git history for collapseDuplicateEdgesIntoEdgeToMetadataMapping ==="
git log -p --all -S "collapseDuplicateEdgesIntoEdgeToMetadataMapping" -- "R/utils_getSubnetworkFromIndra.R" | head -150

echo -e "\n=== Check git log to see recent changes ==="
git log --oneline -20 -- "R/utils_getSubnetworkFromIndra.R"

echo -e "\n=== Search for comments about evidence accumulation ==="
rg -i "accumulat|aggregate|combine.*evidence" -A 2 -B 2 --type=r

Length of output: 2937


🏁 Script executed:

#!/bin/bash
# Description: Check the actual diff of the commit that modified evidence_count logic

echo "=== Show commit 14c4b19 diff ==="
git show 14c4b19 -- "R/utils_getSubnetworkFromIndra.R" | head -200

echo -e "\n=== Show the commit message and metadata ==="
git show --no-patch --format=fuller 14c4b19

echo -e "\n=== Check what changed in collapseDuplicateEdgesIntoEdgeToMetadataMapping specifically ==="
git show 14c4b19:R/utils_getSubnetworkFromIndra.R 2>/dev/null | sed -n '240,270p'

Length of output: 2974


Behavioral change from evidence accumulation to replacement requires fixes and documentation.

The git history confirms this change was intentional (commit 14c4b19: "modify which statement is captured based on max evidence count"), but there are critical issues:

  1. Evidence accumulation removed: Previously duplicates had evidence_count and paper_count accumulated (+=); now only the edge with highest evidence_count is kept. This loses accumulated evidence semantics unless each INDRA edge already represents combined evidence.

  2. Metadata enrichment inefficiency: .addAdditionalMetadataToIndraEdge() is now called unconditionally for all edges (line 256), but only edges passing the evidence_count check are kept (line 260). This performs unnecessary work on discarded edges.

  3. Broken paper_count: Previously accumulated actual paper count; now hardcoded to 1 (line 257) with a TODO comment, indicating incomplete implementation.

  4. Missing tie-breaker: When evidence_count values are equal, the first edge encountered wins by default because the comparison is strict (>). Consider using >= (so that the last edge encountered wins instead) or adding a secondary sort criterion.

Recommended fixes:

  • Move .addAdditionalMetadataToIndraEdge() inside the conditional to avoid enriching discarded edges
  • Document why evidence accumulation was removed and confirm INDRA edges already represent combined evidence
  • Fix paper_count to properly track unique papers or clarify its intended meaning
  • Add tie-breaking logic for equal evidence_count values
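
The control flow these fixes point toward can be sketched as follows. The package code is R; JavaScript is used here only to show the logic. The sketch assumes evidence_count is already present before enrichment, that duplicates share a key built from source, target, and interaction (a hypothetical key layout), and that stmt_hash is an acceptable tie-breaker (also an assumption). `collapseDuplicateEdges` and `enrich` are illustrative names.

```javascript
// Sketch only: illustrates keep-max-evidence with a deterministic tie-breaker.
// Not the package's R implementation; key fields and tie-breaker are assumptions.
function collapseDuplicateEdges(edges, enrich) {
  const best = new Map();

  for (const edge of edges) {
    const d = edge.data;
    const key = `${d.source}|${d.target}|${d.interaction}`; // hypothetical key layout
    const current = best.get(key);

    const wins =
      current === undefined ||
      d.evidence_count > current.data.evidence_count ||
      // Tie-breaker: the lexicographically smaller stmt_hash wins, so the
      // result does not depend on the order in which edges arrive.
      (d.evidence_count === current.data.evidence_count &&
        String(d.stmt_hash) < String(current.data.stmt_hash));

    if (wins) best.set(key, edge);
  }

  // Enrich only the surviving edges, once each, after the winners are known.
  return new Map([...best].map(([key, edge]) => [key, enrich(edge)]));
}
```

Deferring enrichment until after the loop goes one step further than moving it inside the conditional: an edge that wins early but is later replaced is never enriched at all.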
🤖 Prompt for AI Agents
In R/utils_getSubnetworkFromIndra.R around lines 256-261, the change replaced
evidence accumulation with a replace-on-max-evidence strategy and introduced
three regressions: unnecessary unconditional metadata enrichment, hardcoded
paper_count = 1, and no tie-breaker for equal evidence_count; to fix, move the
.addAdditionalMetadataToIndraEdge(edge, input) call inside the conditional so
enrichment only occurs for the kept/updated edge, restore accumulation semantics
for evidence_count and paper_count (e.g., add existing edge$data$evidence_count
and merge unique paper identifiers to compute paper_count) unless you explicitly
document that INDRA edges are already aggregated, and implement a deterministic
tie-breaker when evidence_count is equal (for example prefer larger paper_count
or use timestamp/source priority) and add a short comment documenting the
behavioral change and rationale.

- edge <- .addAdditionalMetadataToIndraEdge(edge, input)
- edge$data$paper_count <- 1
+ edge <- .addAdditionalMetadataToIndraEdge(edge, input)
+ edge$data$paper_count <- 1 # TODO: fix paper count

⚠️ Potential issue | 🟠 Major

TODO indicates incomplete paper_count implementation.

Setting paper_count to a hardcoded value of 1 with a TODO comment suggests this feature is incomplete. This could lead to incorrect paper count metrics in the network visualization and downstream analysis.

Verify whether paper_count should be:

  • Extracted from edge$data similar to evidence_count
  • Calculated from source_counts JSON
  • Accumulated across duplicate edges

Additionally, consider whether this PR should be merged with an incomplete implementation, or if the paper_count feature should be properly implemented or removed before merge.

🤖 Prompt for AI Agents
In R/utils_getSubnetworkFromIndra.R around line 257, the code sets
edge$data$paper_count <- 1 which is a placeholder; replace this with a real
computation: if edge$data already contains a paper_count field use that,
otherwise parse edge$data$source_counts (JSON) to extract and count unique paper
identifiers (or sum counts per source if the JSON stores counts), and when
collapsing/merging duplicate edges accumulate paper_count across duplicates;
remove the TODO and ensure paper_count is an integer stored in edge$data.

tonywu1999 merged commit b42471f into devel Nov 5, 2025 (4 checks passed)
tonywu1999 deleted the link-clicking branch November 5, 2025 17:12
3 participants