Add graph replay dump tensor tool#72

Open
wlc952 wants to merge 11 commits into sgl-project:main from wlc952:main

@wlc952 wlc952 commented Jan 30, 2026

This module provides tools for dumping tensors during the CUDA Graph replay phase. This is difficult because Python code does not execute during graph replay.

Core principle:

  • Capture phase: Pre-allocate GPU buffers and record copy operations into the graph
  • Replay phase: Graph executes recorded copy operations automatically
  • Post-replay: The Python layer saves the buffer contents to files
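
The three phases above can be illustrated with a toy, CPU-only model. This is a sketch, not the PR's implementation: plain Python lists stand in for GPU tensors, and `ToyGraph` (a hypothetical class) stands in for a CUDA graph by storing recorded copy operations and replaying them without re-running the code that registered them. The standalone `capture_tensor` function here is also hypothetical and only mimics the idea behind `gdebug.capture_tensor`.

```python
class ToyGraph:
    """Hypothetical stand-in for a CUDA graph: it stores recorded copy
    operations and replays them without calling back into model code."""

    def __init__(self):
        self.ops = []  # recorded (source, destination) pairs

    def record_copy(self, src, dst):
        self.ops.append((src, dst))

    def replay(self):
        for src, dst in self.ops:
            dst[:] = src  # in-place copy into the pre-allocated buffer


graph = ToyGraph()
dump_buffers = {}  # name -> pre-allocated buffer


def capture_tensor(name, tensor):
    """Capture phase: pre-allocate a buffer and record a copy into it."""
    buf = [0.0] * len(tensor)
    dump_buffers[name] = buf
    graph.record_copy(tensor, buf)


# Capture phase: the "model" registers its dump point exactly once.
static_q = [0.0, 0.0, 0.0]  # stands in for a static graph input/activation
capture_tensor("rope_q", static_q)

# Replay phase: new data is written into the static tensor, then the graph
# replays. capture_tensor is NOT called again, yet the copy still runs.
static_q[:] = [1.0, 2.0, 3.0]
graph.replay()

# Post-replay: ordinary Python reads the buffers (a real tool would save
# them to files at this point).
snapshot = {name: list(buf) for name, buf in dump_buffers.items()}
print(snapshot)
```

The point of the design is that the copy is part of the recorded work, so it runs on every replay even though no Python dump call is executed at that time.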

Usage:

  1. Set environment variables:
        export SGLANG_GRAPH_DEBUG=1                    # Enable debugging
        export SGLANG_GRAPH_DEBUG_DIR=/tmp/graph_debug # Dump directory (optional)
        export SGLANG_GRAPH_DEBUG_LAYERS=0,1,2         # Layers to dump (optional)
  2. Insert dump points in model code:
        from sglang.srt.utils.graph_debug_utils import gdebug

        # In model forward pass
        gdebug.capture_tensor("rope_q", q, layer_id=0)
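
As a rough illustration of how the three environment variables might be interpreted, here is a minimal sketch. Only the variable names come from the description above. Everything else is an assumption made for the example: `GraphDebugConfig`, `load_config`, and `should_dump` are hypothetical names, `"1"` is taken to be the only enabling value, `/tmp/graph_debug` is assumed as the default directory, and an unset layer list is assumed to mean "all layers".

```python
import os
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Optional


@dataclass(frozen=True)
class GraphDebugConfig:
    """Hypothetical container for the debug settings."""
    enabled: bool
    dump_dir: str
    layers: Optional[FrozenSet[int]]  # None is assumed to mean "all layers"


def load_config(env: Mapping[str, str] = os.environ) -> GraphDebugConfig:
    # Assumption: only the exact value "1" enables debugging.
    enabled = env.get("SGLANG_GRAPH_DEBUG", "0") == "1"
    # Assumption: the default directory matches the example value above.
    dump_dir = env.get("SGLANG_GRAPH_DEBUG_DIR", "/tmp/graph_debug")
    raw = env.get("SGLANG_GRAPH_DEBUG_LAYERS", "").strip()
    layers = (
        frozenset(int(part) for part in raw.split(",") if part.strip())
        if raw
        else None
    )
    return GraphDebugConfig(enabled, dump_dir, layers)


def should_dump(cfg: GraphDebugConfig, layer_id: int) -> bool:
    """Decide whether a dump point at layer_id is active."""
    if not cfg.enabled:
        return False
    return cfg.layers is None or layer_id in cfg.layers


cfg = load_config({
    "SGLANG_GRAPH_DEBUG": "1",
    "SGLANG_GRAPH_DEBUG_LAYERS": "0,1,2",
})
print(should_dump(cfg, 0), should_dump(cfg, 5))
```

Passing the environment as a mapping keeps the parsing logic easy to exercise without touching the real process environment.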

Copilot AI review requested due to automatic review settings February 13, 2026 06:35

Copilot AI left a comment

Pull request overview

This PR adds a debugging utility for CUDA Graph tensor dumping. The tool addresses the challenge that Python code doesn't execute during CUDA Graph replay by pre-allocating GPU buffers during the capture phase and recording copy operations into the graph. During replay, these operations execute automatically, and the tool can then save the buffer contents to disk.

Changes:

  • Added graph_utils.py module with GraphDebugger class and gdebug singleton for tensor capture/dump functionality
  • Integrated gdebug calls in graph.py to set phases during graph capture and replay
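
To make the phase-tracking idea concrete, the following sketch shows one way a debugger object could be told which phase is active. It is not taken from the PR: `Phase`, `PhaseTracker`, and `in_phase` are hypothetical names, and the actual calls made in graph.py may look quite different.

```python
from contextlib import contextmanager
from enum import Enum


class Phase(Enum):
    IDLE = "idle"
    CAPTURE = "capture"
    REPLAY = "replay"


class PhaseTracker:
    """Hypothetical helper that records which graph phase is active."""

    def __init__(self):
        self.phase = Phase.IDLE

    @contextmanager
    def in_phase(self, phase):
        previous = self.phase
        self.phase = phase
        try:
            yield self
        finally:
            # Restore the previous phase even if the body raises.
            self.phase = previous


tracker = PhaseTracker()
seen = []

with tracker.in_phase(Phase.CAPTURE):
    seen.append(tracker.phase)  # dump points would allocate buffers here

with tracker.in_phase(Phase.REPLAY):
    seen.append(tracker.phase)  # after replay, buffers could be saved

print(seen, tracker.phase)
```

A context manager is used so the phase is always reset once capture or replay finishes, including on error.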

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.

File | Description
python/minisgl/utils/graph_utils.py | New debugging utility that provides tools for dumping tensors during CUDA Graph replay phase with configurable environment variables
python/minisgl/engine/graph.py | Integration of gdebug to track capture and replay phases for debugging

wlc952 and others added 9 commits February 13, 2026 14:53
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

2 participants