Megatron Bridge in CloudAI #764

srivatsankrishnan · 2025-12-22T23:46:18Z

Summary

Added MegatronBridge as a native CloudAI SlurmSystem workload with a SlurmCommandGenStrategy that runs Megatron-Bridge’s scripts/performance/setup_experiment.py using CloudAI-managed installs (Git clone + dedicated venv). Implemented Slurm job-id retrieval for Megatron-Bridge by generating a readable wrapper script (megatron_bridge_submit_and_parse_jobid.sh) that redirects launcher output to megatron_bridge_launcher.log and extracts the Job id: value from that log so that CloudAI can track the Slurm job.
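
As a rough illustration of the job-ID extraction step, the sketch below scans launcher log text for a `Job id:` line. The exact log line shape and the helper name `extract_job_id` are assumptions made for this example; the PR's wrapper is a shell script, not this Python code.

```python
import re
from typing import Optional

# Assumed pattern: the launcher log contains a line such as "Job id: 586542".
JOB_ID_RE = re.compile(r"Job id:\s*(\d+)")


def extract_job_id(log_text: str) -> Optional[int]:
    """Return the last job ID found in the log text, or None if there is none."""
    matches = JOB_ID_RE.findall(log_text)
    return int(matches[-1]) if matches else None
```

Returning `None` rather than raising leaves it to the caller to decide how to report a launcher run that never printed a job ID.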

Known Issues: There are some issues with the Megatron-Bridge override logic when default values are passed to it. Once the issue is root-caused (on the Megatron-Bridge side), we should follow up with another PR.

Test Plan

CI/CD
Internal Cluster

 cloudai run --system-config ../cloudaix/conf/common/system/lyris.toml --tests-dir conf/experimental/megatron_bridge/test --test-scenario conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml 
[INFO] System Name: lyris
[INFO] Scheduler: slurm
[INFO] Test Scenario Name: megatron_bridge_qwen_30b
[INFO] Checking if workloads components are installed.
[INFO] Test Scenario: megatron_bridge_qwen_30b

Section Name: megatron_bridge_qwen_30b
  Test Name: megatron_bridge_qwen_30b
  Description: Megatron-Bridge run via CloudAI SlurmSystem for Qwen3 30B A3B
  No dependencies
[INFO] Initializing Runner [RUN] mode
[INFO] Creating SlurmRunner
[INFO] Scenario results will be stored at: results/megatron_bridge_qwen_30b_2025-12-22_15-26-23
[INFO] Starting test: megatron_bridge_qwen_30b (results at: results/megatron_bridge_qwen_30b_2025-12-22_15-26-23/megatron_bridge_qwen_30b/0)
[INFO] Running test: megatron_bridge_qwen_30b
[INFO] Submitted slurm job: 586542
[INFO] Job completed: megatron_bridge_qwen_30b (iteration 1 of 1)
[INFO] Generated scenario report at results/megatron_bridge_qwen_30b_2025-12-22_15-26-23/megatron_bridge_qwen_30b.html
[INFO] Scenario results                                                                                                       
┌──────────────────────────┬────────┬─────────────────────────────────────────────────────────────────────────────────┐
│ Case                     │ Status │ Details                                                                         │
├──────────────────────────┼────────┼─────────────────────────────────────────────────────────────────────────────────┤
│ megatron_bridge_qwen_30b │ PASSED │ results/megatron_bridge_qwen_30b_2025-12-22_15-26-23/megatron_bridge_qwen_30b/0 │
└──────────────────────────┴────────┴─────────────────────────────────────────────────────────────────────────────────┘

[INFO] All jobs are complete.


coderabbitai · 2025-12-22T23:46:24Z

📝 Walkthrough


Adds a Megatron-Bridge workload: new TOML configs and test scenario, a workload package (CmdArgs/TestDefinition), Slurm command generator, report generation strategy, registration updates, and unit tests for reporting and Slurm command generation.

Changes

Cohort / File(s)	Summary
Configuration `conf/experimental/megatron_bridge/test/megatron_bridge_qwen_30b.toml`, `conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml`	New test configuration and test scenario for a Qwen3 30B A3B Megatron-Bridge run (metadata, cmd_args, 2-node scenario).
Workload package exports `src/cloudai/workloads/megatron_bridge/__init__.py`	New package initializer re-exporting MegatronBridge public types and strategies (CmdArgs, TestDefinition, report & slurm strategies).
Core workload implementation `src/cloudai/workloads/megatron_bridge/megatron_bridge.py`	Adds `MegatronBridgeCmdArgs` and `MegatronBridgeTestDefinition`, including docker/python/git properties, installable resolution, ref inference/mapping, and extensive multi-rule constraint validation.
Reporting `src/cloudai/workloads/megatron_bridge/report_generation_strategy.py`	Adds `MegatronBridgeReportGenerationStrategy` to discover launcher logs, extract step time and TFLOP/s per GPU samples, compute statistics, and emit `report.txt` and get_metric API.
Slurm command generation `src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py`	Adds `MegatronBridgeSlurmCommandGenStrategy` to locate repo/venv, build and wrap launcher command, normalize flags (recompute/cuda-graph), enforce hf_token presence, write `generated_command.sh` and dump metadata.
Registration `src/cloudai/registration.py`	Registers MegatronBridge test definition, Slurm command-gen strategy, and report generation strategy with existing registries and SlurmSystem.
Tests — new `tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py`, `tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py`	New unit tests for report parsing/generation and Slurm command generation (hf_token validation, wrapper behavior, normalization, detach handling, generated command contents).
Tests — updated `tests/test_init.py`, `tests/test_test_scenario.py`, `tests/test_cloudaigym.py`, `tests/test_test_definitions.py`	Tests updated to include MegatronBridge registrations and expectations (counts, reporter list), minor assertion cleanup, and a conditional skip for MegatronBridge tests lacking hf_token.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

DeepEP benchmark #723 — overlapping changes to registration and workload export surfaces; likely touches same registries and exports.

Suggested reviewers

srinivas212
TaekyungHeo
amaslenn

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title "Megatron Bridge in CloudAI" directly and clearly describes the main change: adding Megatron Bridge support as a native CloudAI workload.
Description check	✅ Passed	The description is well-detailed and directly related to the changeset, covering the implementation of MegatronBridge as a SlurmSystem workload, the wrapper script functionality, test plans, and known issues.


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 120a3c5 and cfa33b9.

📒 Files selected for processing (4)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
tests/test_test_definitions.py

🧰 Additional context used

🧠 Learnings (8)

📓 Common learnings

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 764
File: src/cloudai/workloads/megatron_bridge/megatron_bridge.py:148-155
Timestamp: 2025-12-23T00:28:50.788Z
Learning: In src/cloudai/workloads/megatron_bridge/megatron_bridge.py, when constraint_check is called, CloudAIGym has already resolved list-valued fields (like tp, pp, cp, etc.) to static scalar values. The _as_int helper using cast() is safe because it will only receive int or None at runtime, never List[int].

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 764
File: src/cloudai/workloads/megatron_bridge/megatron_bridge.py:98-101
Timestamp: 2025-12-23T00:23:11.471Z
Learning: In src/cloudai/workloads/megatron_bridge/megatron_bridge.py, the nemo_run_repo GitRepo uses commit="main" intentionally. Nemo Run is a Slurm executor (not a framework) used by Megatron Bridge to launch recipes, and tracking the main branch is acceptable for this dependency.

📚 Learning: 2025-12-17T22:24:51.805Z

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 760
File: tests/standalone_command_gen_strategy/test_aiconfigurator_standalone_command_gen_strategy.py:33-122
Timestamp: 2025-12-17T22:24:51.805Z
Learning: In the NVIDIA/cloudai repository, avoid suggesting overly nitpick refactor comments such as test parametrization when there are only two test cases with different modes (e.g., agg vs disagg). Such refactoring suggestions are not needed unless explicitly requested.

Applied to files:

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py

📚 Learning: 2025-12-16T19:47:41.994Z

Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 754
File: src/cloudai/_core/registry.py:226-234
Timestamp: 2025-12-16T19:47:41.994Z
Learning: In this repository, prefer expressing behavioral documentation through tests rather than docstrings. Tests act as living, verified documentation. Reserve docstrings for interfaces or high-level descriptions, and avoid duplicating behavior that is already covered by tests.

Applied to files:

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
src/cloudai/workloads/megatron_bridge/megatron_bridge.py
tests/test_test_definitions.py

📚 Learning: 2025-12-23T00:23:11.471Z

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 764
File: src/cloudai/workloads/megatron_bridge/megatron_bridge.py:98-101
Timestamp: 2025-12-23T00:23:11.471Z
Learning: In src/cloudai/workloads/megatron_bridge/megatron_bridge.py, the nemo_run_repo GitRepo uses commit="main" intentionally. Nemo Run is a Slurm executor (not a framework) used by Megatron Bridge to launch recipes, and tracking the main branch is acceptable for this dependency.

Applied to files:

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
src/cloudai/workloads/megatron_bridge/megatron_bridge.py

📚 Learning: 2025-12-17T22:02:45.215Z

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 756
File: src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py:65-85
Timestamp: 2025-12-17T22:02:45.215Z
Learning: In CloudAI's DSE flow for the Aiconfigurator workload (src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py), list-valued parameters in AiconfiguratorCmdArgs (such as batch_size, ctx_tokens, tp, pp, dp, etc. in Agg and Disagg models) are scalarized by apply_params_set before gen_exec_command is called, so these fields are guaranteed to be scalar integers at command generation time.

Applied to files:

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
src/cloudai/workloads/megatron_bridge/megatron_bridge.py

📚 Learning: 2025-12-23T00:28:50.788Z

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 764
File: src/cloudai/workloads/megatron_bridge/megatron_bridge.py:148-155
Timestamp: 2025-12-23T00:28:50.788Z
Learning: In src/cloudai/workloads/megatron_bridge/megatron_bridge.py, when constraint_check is called, CloudAIGym has already resolved list-valued fields (like tp, pp, cp, etc.) to static scalar values. The _as_int helper using cast() is safe because it will only receive int or None at runtime, never List[int].

Applied to files:

src/cloudai/workloads/megatron_bridge/megatron_bridge.py

📚 Learning: 2025-12-05T13:59:40.479Z

Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 739
File: src/cloudai/workloads/ai_dynamo/report_generation_strategy.py:123-138
Timestamp: 2025-12-05T13:59:40.479Z
Learning: In the AI Dynamo workload for CloudAI, num_nodes fields in WorkerBaseArgs can be typed as `int | list[int]`, but lists are unrolled at the cmd_gen/json_gen level. By the time report generation runs, only scalar integer values are present in num_nodes fields. The Slurm command generation strategy enforces this with explicit assertions.

Applied to files:

src/cloudai/workloads/megatron_bridge/megatron_bridge.py

📚 Learning: 2025-12-05T13:58:27.113Z

Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 739
File: src/cloudai/workloads/ai_dynamo/ai_dynamo.py:34-63
Timestamp: 2025-12-05T13:58:27.113Z
Learning: In the AI Dynamo workload (src/cloudai/workloads/ai_dynamo/), list-valued fields in WorkerBaseArgs (e.g., num_nodes: int | list[int], data_parallel_size: int | list[int] | None) are unrolled at the cmd_gen/json_gen level, so downstream code only receives scalar values.

Applied to files:

src/cloudai/workloads/megatron_bridge/megatron_bridge.py

🧬 Code graph analysis (3)

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (3)

src/cloudai/systems/slurm/slurm_system.py (1)

SlurmSystem (96-743)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (5)

MegatronBridgeCmdArgs (26-91)

MegatronBridgeTestDefinition (94-433)

python_executable (116-119)

megatron_bridge_repo (122-129)

docker_image (110-113)

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (1)

MegatronBridgeSlurmCommandGenStrategy (32-286)

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (3)

src/cloudai/models/scenario.py (2)

TestRunDetails (201-253)

from_test_run (241-253)

src/cloudai/systems/slurm/slurm_command_gen_strategy.py (1)

SlurmCommandGenStrategy (32-472)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (5)

MegatronBridgeCmdArgs (26-91)

MegatronBridgeTestDefinition (94-433)

megatron_bridge_repo (122-129)

python_executable (116-119)

docker_image (110-113)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (2)

src/cloudai/_core/installables.py (4)

DockerImage (36-84)

GitRepo (87-115)

Installable (25-32)

PythonExecutable (119-145)

src/cloudai/models/workload.py (2)

CmdArgs (26-29)

TestDefinition (89-141)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Greptile Review
GitHub Check: Run pytest (3.10)

🔇 Additional comments (4)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (1)

26-433: LGTM! Comprehensive Megatron-Bridge workload implementation.

The implementation is well-structured with:

Proper field validation (hf_token requirement)

Lazy initialization of installables (docker_image, python_executable, megatron_bridge_repo)

Comprehensive constraint validation (17 constraints) with detailed error logging

Safe handling of potentially list-typed fields (confirmed by learnings that CloudAIGym resolves lists before constraint_check)

The constraint_check method is appropriately thorough given the complexity of distributed training configurations, with proper division-by-zero guards and clear error messages.

Based on learnings, the use of cast() in _as_int and bool() in _as_bool is safe because CloudAIGym resolves list-valued fields to static scalar values before constraint_check is called.

tests/test_test_definitions.py (1)

91-94: LGTM! Appropriate test skip for credential-required config.

The skip prevents test failures when the MegatronBridge example config has an empty hf_token placeholder. This is a reasonable approach for configs requiring user-provided credentials, with a clear message directing users to set the token.

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (1)

31-191: LGTM! Comprehensive test coverage for command generation.

The test suite thoroughly validates:

Schema validation (hf_token requirement)

Default value emission behavior (ensuring defaults aren't forced)

Container image path handling (local vs. installed paths)

Argument normalization (CUDA graph scope)

Flag handling (detach/no-detach/omit variations)

Command file generation

The detach flag test properly reconstructs cmd_args (lines 160-165) to ensure fields_set is correctly populated, addressing the past review concern about Pydantic's model_fields_set behavior.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (1)

32-286: LGTM! Robust Slurm command generation strategy.

The implementation demonstrates strong engineering practices:

Error handling: Clear RuntimeError messages for missing required fields and installation issues (lines 168-175, 237-240)

Troubleshooting support: Warnings for missing installs with actionable guidance (lines 51-54, 59-62)

Job tracking: Well-structured wrapper script that captures Slurm job ID for CloudAI integration (lines 104-150)

Path resolution: Proper fallback logic for local vs. installed container images (lines 177-182)

Normalization: Consistent handling of list/string arguments for recompute_modules and CUDA graph scopes

The command construction respects Pydantic's fields_set to avoid emitting default values, and the detach flag handling correctly uses flag presence rather than boolean values.


greptile-apps · 2025-12-23T00:12:57Z

Greptile Summary

Adds MegatronBridge as a native CloudAI workload for Slurm systems, enabling distributed AI training benchmarks through CloudAI's standardized test framework with automatic Git repository management and version mapping
Implements sophisticated Slurm job ID tracking by generating wrapper scripts that capture Megatron-Bridge launcher output and parse job IDs for CloudAI's job monitoring system
Introduces comprehensive constraint validation with 17 different checks for complex distributed training configurations including tensor/pipeline/context parallelism compatibility and FSDP requirements

Important Files Changed

Filename	Overview
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py	New file implementing complex wrapper script generation and job ID parsing logic with hardcoded log format dependencies
src/cloudai/workloads/megatron_bridge/megatron_bridge.py	New file with extensive constraint validation system for distributed training configs and CloudAI-managed installation logic
conf/experimental/megatron_bridge/test/megatron_bridge_qwen_30b.toml	New config file containing hardcoded placeholder HF token that requires manual replacement before use

Confidence score: 3/5

This PR requires careful review due to complex integration logic and hardcoded dependencies that could cause runtime failures
Score reflects sophisticated but brittle job ID parsing mechanism, extensive constraint validation system with potential edge cases, and hardcoded log filename dependencies between components
Pay close attention to slurm_command_gen_strategy.py for wrapper script logic and megatron_bridge.py for constraint validation edge cases

Sequence Diagram

sequenceDiagram
    participant User
    participant CloudAI_CLI as "CloudAI CLI"
    participant TestDefinition as "MegatronBridgeTestDefinition"
    participant CommandGenStrategy as "MegatronBridgeSlurmCommandGenStrategy"
    participant Installer as "SlurmInstaller"
    participant SlurmRunner as "SlurmRunner"
    participant MegatronBridge as "Megatron-Bridge Launcher"
    participant SlurmCluster as "Slurm Cluster"
    
    User->>CloudAI_CLI: "cloudai run --system-config system.toml --test-scenario scenario.toml"
    CloudAI_CLI->>TestDefinition: "Parse test configuration"
    TestDefinition->>TestDefinition: "Validate cmd_args (hf_token, constraints)"
    TestDefinition->>TestDefinition: "Setup installables (docker_image, nemo_run_repo, megatron_bridge_repo)"
    
    CloudAI_CLI->>Installer: "Install dependencies"
    Installer->>Installer: "Clone Megatron-Bridge repo"
    Installer->>Installer: "Create Python venv with NeMo-Run"
    Installer->>Installer: "Cache container image"
    
    CloudAI_CLI->>CommandGenStrategy: "Generate execution command"
    CommandGenStrategy->>CommandGenStrategy: "Build launcher command parts"
    CommandGenStrategy->>CommandGenStrategy: "Create wrapper script (megatron_bridge_submit_and_parse_jobid.sh)"
    CommandGenStrategy->>SlurmRunner: "Return wrapped command"
    
    SlurmRunner->>MegatronBridge: "Execute wrapper script"
    MegatronBridge->>MegatronBridge: "Run setup_experiment.py"
    MegatronBridge->>SlurmCluster: "Submit training job via sbatch"
    SlurmCluster-->>MegatronBridge: "Job ID"
    MegatronBridge->>MegatronBridge: "Log output to megatron_bridge_launcher.log"
    MegatronBridge->>SlurmRunner: "Echo 'Submitted batch job <ID>'"
    
    SlurmRunner->>SlurmRunner: "Parse job ID and track job"
    SlurmRunner-->>User: "Job completion status"

greptile-apps

Additional Comments (2)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 151-152 (link)

logic: _as_int accepts List[int] but only casts without checking type, so tp=[1,2,4] sweeps will pass through incorrectly as list objects
src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 154-155 (link)

logic: _as_bool has same issue - use_megatron_fsdp=[true, false] sweeps will incorrectly evaluate as truthy list

12 files reviewed, 2 comments


coderabbitai

Actionable comments posted: 10

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbf9891 and 78b919e.

📒 Files selected for processing (12)

conf/experimental/megatron_bridge/test/megatron_bridge_qwen_30b.toml
conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml
src/cloudai/registration.py
src/cloudai/workloads/megatron_bridge/__init__.py
src/cloudai/workloads/megatron_bridge/megatron_bridge.py
src/cloudai/workloads/megatron_bridge/report_generation_strategy.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
tests/test_cloudaigym.py
tests/test_init.py
tests/test_test_scenario.py

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-12-16T19:47:41.994Z

Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 754
File: src/cloudai/_core/registry.py:226-234
Timestamp: 2025-12-16T19:47:41.994Z
Learning: In this repository, prefer expressing behavioral documentation through tests rather than docstrings. Tests act as living, verified documentation. Reserve docstrings for interfaces or high-level descriptions, and avoid duplicating behavior that is already covered by tests.

Applied to files:

tests/test_cloudaigym.py
tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py
tests/test_init.py
src/cloudai/workloads/megatron_bridge/megatron_bridge.py
src/cloudai/workloads/megatron_bridge/report_generation_strategy.py
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
tests/test_test_scenario.py
src/cloudai/workloads/megatron_bridge/__init__.py
src/cloudai/registration.py

📚 Learning: 2025-12-17T22:02:45.215Z

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 756
File: src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py:65-85
Timestamp: 2025-12-17T22:02:45.215Z
Learning: In CloudAI's DSE flow for the Aiconfigurator workload (src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py), list-valued parameters in AiconfiguratorCmdArgs (such as batch_size, ctx_tokens, tp, pp, dp, etc. in Agg and Disagg models) are scalarized by apply_params_set before gen_exec_command is called, so these fields are guaranteed to be scalar integers at command generation time.

Applied to files:

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py

🧬 Code graph analysis (5)

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py (2)

src/cloudai/_core/test_scenario.py (1)

TestRun (58-174)

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py (3)

MegatronBridgeReportGenerationStrategy (28-163)

can_handle_directory (47-48)

generate_report (73-136)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (2)

src/cloudai/_core/installables.py (4)

DockerImage (36-84)

GitRepo (87-115)

Installable (25-32)

PythonExecutable (119-145)

src/cloudai/models/workload.py (2)

CmdArgs (26-29)

TestDefinition (89-141)

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py (1)

src/cloudai/_core/report_generation_strategy.py (1)

ReportGenerationStrategy (24-40)

src/cloudai/workloads/megatron_bridge/__init__.py (3)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (2)

MegatronBridgeCmdArgs (26-89)

MegatronBridgeTestDefinition (92-431)

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py (1)

MegatronBridgeReportGenerationStrategy (28-163)

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (1)

MegatronBridgeSlurmCommandGenStrategy (32-281)

src/cloudai/registration.py (4)

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py (1)

MegatronBridgeReportGenerationStrategy (28-163)

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (1)

MegatronBridgeSlurmCommandGenStrategy (32-281)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (1)

MegatronBridgeTestDefinition (92-431)

src/cloudai/_core/registry.py (2)

add_command_gen_strategy (251-259)

add_report (207-210)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🔇 Additional comments (21)

conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml (1)

17-22: LGTM!

The test scenario configuration is minimal and correctly references the test definition. The structure aligns with other scenario files in the codebase.

src/cloudai/registration.py (1)

94-98: LGTM!

The MegatronBridge registrations follow the established patterns in the codebase:

Import grouped with other workload imports

Command generation strategy registered for SlurmSystem

Test definition and report strategy properly registered

Also applies to: 193-195, 245-245, 262-262

conf/experimental/megatron_bridge/test/megatron_bridge_qwen_30b.toml (2)

17-35: LGTM for experimental configuration.

The test configuration is well-structured for a Qwen 30B model run. The parameters align with the 2-node scenario (8 GPUs / 4 GPUs per node = 2 nodes).

35-35: No action needed — hf_token validation is already implemented.

The placeholder hf_token = "REPLACE_ME_WITH_HF_TOKEN" is already caught at runtime. The _build_launcher_parts method in MegatronBridgeSlurmCommandGenStrategy explicitly validates this value at line 155 and raises a RuntimeError with a clear message: "HuggingFace token is required. Please set cmd_args.hf_token to a real token string (not 'REPLACE_ME_WITH_HF_TOKEN') in your local test TOML."

Likely an incorrect or invalid review comment.
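
The placeholder check described in this comment can be sketched roughly as follows. The helper name `require_hf_token` and the whitespace stripping are assumptions for illustration; only the placeholder string and the error message come from the comment above.

```python
PLACEHOLDER_HF_TOKEN = "REPLACE_ME_WITH_HF_TOKEN"


def require_hf_token(hf_token: str) -> str:
    """Return a usable token, or raise if it is empty or still the placeholder."""
    token = (hf_token or "").strip()  # stripping is an assumption of this sketch
    if not token or token == PLACEHOLDER_HF_TOKEN:
        raise RuntimeError(
            "HuggingFace token is required. Please set cmd_args.hf_token to a real "
            "token string (not 'REPLACE_ME_WITH_HF_TOKEN') in your local test TOML."
        )
    return token
```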

tests/test_test_scenario.py (1)

52-52: LGTM!

The test updates correctly integrate MegatronBridge:

Import added in proper alphabetical position

Reporter count incremented from 15 to 16

Parametrized test case added for the new test definition and report strategy mapping

Also applies to: 475-475, 485-485

tests/test_init.py (1)

52-55: LGTM! MegatronBridge registration follows the established pattern.

The import, command generation strategy mapping, test definition count update, and test definition entry are all consistent with the existing workload registrations.

Also applies to: 136-136, 220-220, 235-235

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py (2)

27-42: LGTM! The fixture provides a realistic test setup.

The log content accurately reflects the Megatron-Bridge output format with Step Time and GPU utilization metrics, which the report generation strategy parses.

45-57: LGTM! Tests cover the essential report generation behavior.

The tests verify both can_handle_directory() and the content of the generated report, aligning with the behavioral documentation approach mentioned in the learnings.
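
To make the parsing step concrete, here is a minimal sketch of extracting step time and per-GPU TFLOP/s samples from launcher log text and summarizing them. The log line shape, the regular expression, and the `summarize` helper are all assumptions; the thread does not show the real log format or the strategy's code.

```python
import re
import statistics
from typing import Dict, List

# Assumed line shape (not confirmed by this thread):
#   "Step Time : 1.25s GPU utilization: 410.5MODEL_TFLOP/s/GPU"
LINE_RE = re.compile(
    r"Step Time\s*:\s*([0-9.]+)\s*s.*?GPU utilization\s*:\s*([0-9.]+)"
)


def summarize(log_text: str) -> Dict[str, float]:
    """Collect (step time, TFLOP/s per GPU) samples and return simple statistics."""
    step_times: List[float] = []
    tflops: List[float] = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            step_times.append(float(match.group(1)))
            tflops.append(float(match.group(2)))
    if not step_times:
        return {}  # nothing parseable in the log
    return {
        "step_time_avg": statistics.mean(step_times),
        "step_time_median": statistics.median(step_times),
        "tflops_per_gpu_avg": statistics.mean(tflops),
    }
```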

src/cloudai/workloads/megatron_bridge/__init__.py (1)

17-26: LGTM! Clean package initialization.

The __all__ exports are alphabetically ordered and include all necessary public entities for the Megatron Bridge workload.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py (4)

104-150: LGTM! Well-designed wrapper script for job ID extraction.

The script properly:

Sets strict mode (set -euo pipefail)

Redirects launcher output to a log file

Parses the job ID with graceful fallback (|| true)

Emits CloudAI-compatible output or fails with diagnostic info
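
The four properties listed above could be produced by a generator along these lines. This is a hypothetical layout, not the PR's script: the function name `build_wrapper_script`, the grep pattern, and the `Submitted batch job` output line (borrowed from the sequence diagram in this thread) are assumptions.

```python
import shlex
from typing import List

LOG_NAME = "megatron_bridge_launcher.log"  # log file name from the PR summary


def build_wrapper_script(launcher_cmd: List[str], log_name: str = LOG_NAME) -> str:
    """Return the text of a bash wrapper that runs the launcher and reports a job ID."""
    cmd = " ".join(shlex.quote(part) for part in launcher_cmd)
    log = shlex.quote(log_name)
    lines = [
        "#!/usr/bin/env bash",
        "set -euo pipefail",  # strict mode
        f"{cmd} > {log} 2>&1",  # launcher output goes to the log file
        # '|| true' keeps a failed grep from aborting the script under strict mode.
        f"JOB_ID=$(grep -Eo 'Job id: [0-9]+' {log} | grep -Eo '[0-9]+' | tail -n1 || true)",
        'if [ -n "${JOB_ID}" ]; then',
        '  echo "Submitted batch job ${JOB_ID}"',
        "else",
        f'  echo "Could not parse a job id from" {log} >&2',
        f"  tail -n 20 {log} >&2 || true",  # diagnostic output before failing
        "  exit 1",
        "fi",
    ]
    return "\n".join(lines) + "\n"
```

Quoting each command part with `shlex.quote` keeps arguments containing spaces intact when the script text is later executed by bash.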

152-159: Good validation: Rejecting placeholder HF tokens prevents common misconfiguration.

Clear error message guides users to set a real token in their TOML configuration.

191-204: LGTM! Clean helper functions for flag handling.

The add() and add_field() helpers properly handle None, booleans (converted to "true"/"false"), and the fields_set logic to avoid emitting defaults.
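
A minimal sketch of helpers with the behavior described here, assuming a flat list of command parts and a set of explicitly provided field names. The signatures and the placeholder flag names used in the tests are hypothetical and are not taken from the PR.

```python
from typing import Any, List, Set


def add(parts: List[str], flag: str, value: Any) -> None:
    """Append 'flag value' unless value is None; booleans become 'true'/'false'."""
    if value is None:
        return
    if isinstance(value, bool):
        value = "true" if value else "false"
    parts.extend([flag, str(value)])


def add_field(parts: List[str], fields_set: Set[str], field: str, flag: str, value: Any) -> None:
    """Emit the flag only when the user explicitly set the field (no defaults)."""
    if field in fields_set:
        add(parts, flag, value)
```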

271-275: LGTM! Detach flag handling is correct.

The logic properly emits --detach for True, --no-detach for False, and nothing when the field is not explicitly set, avoiding Megatron-Bridge's default override issues mentioned in the PR notes.
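
The three-way behavior can be sketched as a small function. The name `detach_flags` and the `fields_set` argument are assumptions for this example; the `--detach` and `--no-detach` flags are the ones named in the comment above.

```python
from typing import List, Optional, Set


def detach_flags(detach: Optional[bool], fields_set: Set[str]) -> List[str]:
    """Return the flag to emit for 'detach', or nothing if it was not explicitly set."""
    if "detach" not in fields_set or detach is None:
        return []  # let the launcher apply its own default
    return ["--detach"] if detach else ["--no-detach"]
```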

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (3)

88-126: LGTM! Good coverage of the fields_set logic.

By constructing MegatronBridgeCmdArgs with only the required fields, this test correctly verifies that optional fields not in fields_set are not emitted in the command.

127-145: LGTM! Container path and CUDA graph scope tests verify key normalization behavior.

Both tests validate that user-provided values are correctly passed through or normalized in the generated wrapper script.

173-183: LGTM! Verifies the generated command file is written correctly.

The test confirms that generated_command.sh is created and contains the expected wrapper script invocation.

src/cloudai/workloads/megatron_bridge/megatron_bridge.py (5)

1-23: LGTM!

License header and imports are appropriate for this module.

26-90: LGTM!

The MegatronBridgeCmdArgs class is well-structured with appropriate field definitions supporting both scalar values and lists for sweep configurations. The hf_token validator correctly sanitizes input.

413-431: LGTM!

The constraint aggregation logic is correct. All 17 constraints are properly combined, and the approach of logging each failure individually before returning provides excellent debugging visibility.
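
A sketch of the log-then-aggregate pattern described here, assuming the individual results are collected in a dictionary keyed by a constraint name. The helper name and the dictionary shape are hypothetical.

```python
import logging
from typing import Dict


def all_constraints_hold(results: Dict[str, bool]) -> bool:
    """Log every failed constraint by name, then report whether all of them passed."""
    failed = [name for name, ok in results.items() if not ok]
    for name in failed:
        logging.error("Constraint failed: %s", name)
    return not failed
```

Logging each failure before returning means a single rejected configuration reports every violated rule at once, rather than only the first.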

232-235: Edge case: cuda_graph_scope with only empty/whitespace values.

The cuda_graphs determination checks len(scopes) > 0, but if _normalize_str_list filters out all empty segments, an original non-empty input like cuda_graph_scope=" " would result in scopes = []. This is likely the intended behavior, but worth confirming the edge case is handled as expected.
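
The edge case can be reproduced with a stand-in normalizer. This sketch assumes comma-separated strings or lists of strings as input; the real `_normalize_str_list` may differ, and the scope names in the tests are placeholders.

```python
from typing import List, Optional, Union


def normalize_str_list(value: Union[str, List[str], None]) -> List[str]:
    """Split on commas (for strings), strip whitespace, and drop empty segments."""
    if value is None:
        return []
    items = value.split(",") if isinstance(value, str) else list(value)
    return [item.strip() for item in items if item and item.strip()]


def cuda_graphs_enabled(cuda_graph_scope: Optional[Union[str, List[str]]]) -> bool:
    """Mirror the 'len(scopes) > 0' check mentioned in the comment above."""
    return len(normalize_str_list(cuda_graph_scope)) > 0
```

Under these assumptions a whitespace-only scope normalizes to an empty list, so CUDA graphs are treated as disabled.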

246-251: Consider simplifying the nested ternary for readability.

The constraint logic on line 247 is difficult to parse at a glance due to nested ternaries:

constraint6 = pp == 1 and cp == 1 and (vp == 1 if vp is not None else True) if fsdp else True

🔎 Proposed refactor for clarity

-        constraint6 = pp == 1 and cp == 1 and (vp == 1 if vp is not None else True) if fsdp else True
+        if fsdp:
+            constraint6 = pp == 1 and cp == 1 and (vp is None or vp == 1)
+        else:
+            constraint6 = True

⛔ Skipped due to learnings
Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 760
File: tests/standalone_command_gen_strategy/test_aiconfigurator_standalone_command_gen_strategy.py:33-122
Timestamp: 2025-12-17T22:24:51.805Z
Learning: In the NVIDIA/cloudai repository, avoid suggesting overly nitpick refactor comments such as test parametrization when there are only two test cases with different modes (e.g., agg vs disagg). Such refactoring suggestions are not needed unless explicitly requested.
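
For reference, the one-line form and the proposed refactor of constraint6 can be checked for equivalence over a small input grid. The function names and the grid are choices made for this sketch; the two expressions are copied from the comment above.

```python
from itertools import product
from typing import Optional


def constraint6_original(pp: int, cp: int, vp: Optional[int], fsdp: bool) -> bool:
    # Python parses this as: (pp == 1 and cp == 1 and (...)) if fsdp else True
    return pp == 1 and cp == 1 and (vp == 1 if vp is not None else True) if fsdp else True


def constraint6_refactored(pp: int, cp: int, vp: Optional[int], fsdp: bool) -> bool:
    if fsdp:
        return pp == 1 and cp == 1 and (vp is None or vp == 1)
    return True


# Compare both forms on every combination of a few representative values.
grid = product([1, 2], [1, 2], [None, 1, 2], [False, True])
mismatches = [args for args in grid if constraint6_original(*args) != constraint6_refactored(*args)]
```

The conditional expression binds more loosely than `and`, so the `if fsdp else True` applies to the whole conjunction in the original line.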

src/cloudai/workloads/megatron_bridge/megatron_bridge.py

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py

tests/test_cloudaigym.py

srivatsankrishnan · 2025-12-23T00:46:39Z

Additional Comments (2)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 151-152 (link)
logic: _as_int accepts List[int] but only casts without checking type, so tp=[1,2,4] sweeps will pass through incorrectly as list objects

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 154-155 (link)
logic: _as_bool has same issue - use_megatron_fsdp=[true, false] sweeps will incorrectly evaluate as truthy list

12 files reviewed, 2 comments


This is fine. List resolution happens in CloudAIGym before the constraint checks are called, so values never arrive here as sweeps (lists). You can ignore this.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 871f460 and a054f4a.

📒 Files selected for processing (1)

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 764
File: src/cloudai/workloads/megatron_bridge/megatron_bridge.py:98-101
Timestamp: 2025-12-23T00:23:11.471Z
Learning: In src/cloudai/workloads/megatron_bridge/megatron_bridge.py, the nemo_run_repo GitRepo uses commit="main" intentionally. Nemo Run is a Slurm executor (not a framework) used by Megatron Bridge to launch recipes, and tracking the main branch is acceptable for this dependency.

📚 Learning: 2025-12-16T19:47:41.994Z

Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 754
File: src/cloudai/_core/registry.py:226-234
Timestamp: 2025-12-16T19:47:41.994Z
Learning: In this repository, prefer expressing behavioral documentation through tests rather than docstrings. Tests act as living, verified documentation. Reserve docstrings for interfaces or high-level descriptions, and avoid duplicating behavior that is already covered by tests.

Applied to files:

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🔇 Additional comments (7)

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (7)

32-77: LGTM: Clean fixture setup.

The fixtures properly create a synthetic test environment with fake installation paths, avoiding dependencies on real installs. The approach is well-documented with inline comments.

79-86: LGTM: Proper validation test.

Correctly verifies that placeholder tokens are rejected at runtime with an appropriate error.

88-125: LGTM: Comprehensive default handling test.

Correctly verifies that optional fields not specified in TOML are omitted from the generated command. The intentional duplication of setup logic (rather than using fixtures) is appropriate for testing this minimal configuration.

127-139: LGTM: Container path handling validated.

Properly verifies that local container image paths are preserved verbatim and not overridden by cached paths.

141-145: LGTM: Normalization verified.

Correctly validates that cuda_graph_scope bracket notation is normalized in the generated wrapper.

165-170: LGTM: model_fields_set issue resolved.

The reconstruction approach using model_dump and model_validate correctly addresses the previous review concern. This ensures model_fields_set is properly populated based on which fields are present in the data dict, allowing the strategy code's "detach" in fields_set check to work correctly.

183-193: LGTM: Command file generation verified.

Properly validates that the generated command file is created with the expected content and format.

tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py

greptile-apps

Additional Comments (2)

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 219 (link)

style: check if self.system.account exists before using it, as account may be optional

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 237-243 (link)

style: consider moving required field validation to Pydantic model instead of runtime checks

If model_name and model_size are truly required, define them without defaults in MegatronBridgeCmdArgs so Pydantic validates them at construction time.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

12 files reviewed, 2 comments

greptile-apps

Additional Comments (3)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 201-205 (link)

style: Constraint 2 bypassed (assumes num_layers not available, set to True). VP validation incomplete. Constraint 3 (dp != 0) is redundant with Constraint 1 since if num_gpus % (tp*pp*cp) == 0 and num_gpus > 0, then dp will always be non-zero.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 153-157 (link)

style: Consider moving HF token validation (checking for empty/placeholder values) to a @field_validator in MegatronBridgeCmdArgs for earlier error detection rather than failing at command generation time.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 126-132 (link)

style: Job ID extraction relies on grepping for "Job id: <num>" pattern. Verify this pattern matches Megatron-Bridge's output format across different versions (currently mapped to r0.2.0 for container 25.11).

12 files reviewed, 3 comments

greptile-apps

Additional Comments (3)

tests/test_test_definitions.py, line 93 (link)

style: Check should also handle None values to be more robust

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 62-64 (link)

style: Limiting to last 10 data points is hardcoded and may not be suitable for all scenarios - consider making this configurable or removing the limit. Is there a specific reason for limiting to 10 data points, or should this be configurable based on the test requirements?

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 98 (link)

style: Using population standard deviation (pstdev) instead of sample standard deviation - verify this is the intended statistical measure. Should this be sample standard deviation (stdev) instead of population standard deviation (pstdev)?

13 files reviewed, 3 comments

srivatsankrishnan · 2025-12-23T05:04:06Z

Additional Comments (3)

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 201-205 (link)
style: Constraint 2 bypassed (assumes num_layers not available, set to True). VP validation incomplete. Constraint 3 (dp != 0) is redundant with Constraint 1 since if num_gpus % (tp*pp*cp) == 0 and num_gpus > 0, then dp will always be non-zero.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 153-157 (link)
style: Consider moving HF token validation (checking for empty/placeholder values) to a @field_validator in MegatronBridgeCmdArgs for earlier error detection rather than failing at command generation time.

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 126-132 (link)
style: Job ID extraction relies on grepping for "Job id: <num>" pattern. Verify this pattern matches Megatron-Bridge's output format across different versions (currently mapped to r0.2.0 for container 25.11).

12 files reviewed, 3 comments


Constraint checks are independent of each other, so you can ignore those. Moving the HF_token check to a field validator is addressed in this commit.
The Job ID extraction pattern will stay the same: it comes from Nemo Run and is unchanged across the versions we are using. Safe to ignore. @greptile
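
To make the reviewer's redundancy claim about Constraints 1 and 3 concrete, here is a minimal sketch. It assumes dp is computed as `num_gpus // (tp * pp * cp)`, which is a guess at the PR's logic; `constraint_flags` is a hypothetical helper, not a function from the codebase.

```python
from itertools import product


def constraint_flags(num_gpus, tp, pp, cp):
    """Return (constraint1, constraint3) under the assumed dp formula."""
    model_parallel = tp * pp * cp
    constraint1 = num_gpus % model_parallel == 0  # GPUs divide evenly
    dp = num_gpus // model_parallel               # assumed data-parallel size
    constraint3 = dp != 0
    return constraint1, constraint3


sizes = (1, 2, 4, 8)
# Look for cases where Constraint 1 passes but Constraint 3 fails.
counterexamples = [
    (n, tp, pp, cp)
    for n in range(1, 129)
    for tp, pp, cp in product(sizes, repeat=3)
    if constraint_flags(n, tp, pp, cp) == (True, False)
]
print("counterexamples:", counterexamples)

# The reverse does not hold: 12 GPUs with tp=8 gives dp=1 but fails divisibility.
print(constraint_flags(12, 8, 1, 1))
```

Under that assumption, Constraint 3 never fails on its own when num_gpus > 0, which is consistent with keeping it as an independent, harmless check.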

greptile-apps

Additional Comments (5)

conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml, line 22 (link)

syntax: num_nodes value should be an integer, not a string

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py, line 32 (link)

style: The log content appears to have a formatting issue - 'ain_fp8_mx/0' looks like a truncated prefix that should probably be a complete experiment name or identifier. Is this truncated log line intentional for testing, or should it include the full experiment identifier?

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 62-64 (link)

style: only last 10 samples used for statistics without explanation. Is this limitation intentional for performance reasons or to focus on steady-state metrics?

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 133 (link)

style: Job ID parsing relies on hardcoded regex pattern 'Job id[: ]+[0-9]+' which could break if Megatron-Bridge changes its output format. Is there a more robust way to get the job ID from Megatron-Bridge, such as through its API or a structured output format?

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 137-141 (link)

style: Version mapping only includes 25.11->r0.2.0, defaulting unknown versions to 'main' branch could cause compatibility issues

13 files reviewed, 5 comments

srivatsankrishnan · 2025-12-23T05:17:06Z

Additional Comments (3)

tests/test_test_definitions.py, line 93 (link)
style: Check should also handle None values to be more robust

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 62-64 (link)
style: Limiting to last 10 data points is hardcoded and may not be suitable for all scenarios - consider making this configurable or removing the limit. Is there a specific reason for limiting to 10 data points, or should this be configurable based on the test requirements?

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 98 (link)
style: Using population standard deviation (pstdev) instead of sample standard deviation - verify this is the intended statistical measure. Should this be sample standard deviation (stdev) instead of population standard deviation (pstdev)?

13 files reviewed, 3 comments


@greptile
Safe to ignore these comments. Using the last 10 data points is a requirement for stability in the training step. The standard deviation as currently computed is fine.

greptile-apps

Got it! I've noted these preferences for ML training workloads:

Last 10 data points limit: This is intentional for training stability metrics and shouldn't be flagged as hardcoded
Population std deviation (pstdev): This is the correct statistical measure for training metrics, not sample std deviation

Here's a link to the new rule that will be applied in future reviews.
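
As a rough illustration of the two preferences noted above, the sketch below takes the last 10 samples of a series and compares `statistics.pstdev` with `statistics.stdev`. The step-time values and the name `step_times_ms` are invented for the example and are not taken from any real run or from the report generation code.

```python
import math
import statistics

# Hypothetical per-step times; the first few include warm-up noise.
step_times_ms = [
    912.4, 655.1, 640.3, 638.9, 641.2, 639.5, 640.8,
    642.1, 638.2, 640.0, 641.7, 639.9, 640.4, 641.0,
]

# Keep only the last 10 samples so warm-up steps do not skew the stats.
window = step_times_ms[-10:]

mean = statistics.mean(window)
population_sd = statistics.pstdev(window)  # divides by n
sample_sd = statistics.stdev(window)       # divides by n - 1

print(f"mean={mean:.2f} pstdev={population_sd:.3f} stdev={sample_sd:.3f}")

# For n samples the two differ only by a constant factor sqrt(n / (n - 1)).
print(math.isclose(sample_sd, population_sd * math.sqrt(10 / 9)))
```

With a fixed window of 10, the choice between the two measures scales every reported value by the same factor, so it does not change comparisons between runs.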

srivatsankrishnan · 2025-12-23T05:20:20Z

Additional Comments (5)

conf/experimental/megatron_bridge/test_scenario/megatron_bridge_qwen_30b.toml, line 22 (link)
syntax: num_nodes value should be an integer, not a string

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py, line 32 (link)
style: The log content appears to have a formatting issue - 'ain_fp8_mx/0' looks like a truncated prefix that should probably be a complete experiment name or identifier. Is this truncated log line intentional for testing, or should it include the full experiment identifier?

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, line 62-64 (link)
style: only last 10 samples used for statistics without explanation. Is this limitation intentional for performance reasons or to focus on steady-state metrics?

src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py, line 133 (link)
style: Job ID parsing relies on hardcoded regex pattern 'Job id[: ]+[0-9]+' which could break if Megatron-Bridge changes its output format. Is there a more robust way to get the job ID from Megatron-Bridge, such as through its API or a structured output format?

src/cloudai/workloads/megatron_bridge/megatron_bridge.py, line 137-141 (link)
style: Version mapping only includes 25.11->r0.2.0, defaulting unknown versions to 'main' branch could cause compatibility issues

13 files reviewed, 5 comments


@greptile
These comments repeat earlier ones that are already addressed in this PR. Safe to ignore.

greptile-apps

Perfect! I've created a rule to remember your team's preferences for MegatronBridge workload implementations. I won't flag these patterns as issues in future reviews. Thanks for the feedback!
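
For reference, the job-ID pattern quoted in the review ('Job id[: ]+[0-9]+') can be exercised with a short Python sketch. The PR does this inside a generated shell wrapper, so this is only an illustration of the pattern's behaviour; `extract_job_id` and the sample log text are hypothetical.

```python
import re

# Same pattern as quoted in the review, with a capture group for the digits.
JOB_ID_PATTERN = re.compile(r"Job id[: ]+([0-9]+)")


def extract_job_id(log_text):
    """Return the first Slurm job ID found in the launcher log, or None."""
    match = JOB_ID_PATTERN.search(log_text)
    return int(match.group(1)) if match else None


# Made-up launcher output; only the "Job id:" line shape matters here.
sample_log = "Launching experiment...\nJob id: 586542\nDone.\n"
print(extract_job_id(sample_log))
print(extract_job_id("launcher failed before submission"))
```

Returning None when nothing matches lets the caller report a clear failure instead of tracking a bogus job ID if the launcher output format ever changes.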

conf/experimental/megatron_bridge/test/megatron_bridge_qwen_30b.toml

alexmanle

Great work!

srivatsankrishnan added 21 commits December 17, 2025 23:03

initial port m-bridge

ecf22e8

m-bridge using slurmcommand stratergy

1b42e01

latest m-bridge changes

59a9b14

tracked config

1a3dcc6

fix installation sequence

6e39fc3

update container url

7fba08b

fix url

d344e58

absolute path

87f03d8

fix the caching bug

1986de4

fix the container

dd9ed59

sanitize the flag/api to reflect changes

d32a50e

cuda_graph scope logic update

7199e4f

more api name change fixes

5eadb51

update to use local path as is

022cbc6

add unit test and clean up overrides

9960611

don't pass defaults yet (issue with M-bridge overrides).

462c6b7

keep optional none to they don't get set (M_bridge has override issues)

c58ccac

get job_id and redirect m-bridge to own logs

db742c1

fix job id retrival logic

52b3dd7

make report generation logic fix

08bb064

fix report parsing logic + unit test

ae12297

srivatsankrishnan added 2 commits December 22, 2025 15:51

Merge branch 'main' into m-bridge

dc6b76b

fix unit tests/liting etc

78b919e

srivatsankrishnan marked this pull request as ready for review December 23, 2025 00:09

srivatsankrishnan requested review from alexmanle, amaslenn and jeffnvidia as code owners December 23, 2025 00:09

greptile-apps bot reviewed Dec 23, 2025

coderabbitai bot reviewed Dec 23, 2025

srivatsankrishnan added 6 commits December 22, 2025 16:31

simplify log finding logic

593556c

fix silent failures during report generation

0074a6e

simplifying extracting shared log-finding and extraction logic.

62613cb

f-strings or consistent formatting.

2ab4b4f

fix produces "None" string if installed_path is None

871f460

fix Assigning detach after construction doesn't update model_fields_set

a054f4a

coderabbitai bot reviewed Dec 23, 2025

greptile-apps bot reviewed Dec 23, 2025

Incomplete test assertions

120a3c5

greptile-apps bot reviewed Dec 23, 2025

greptile fixes

cfa33b9

greptile-apps bot reviewed Dec 23, 2025

alexmanle reviewed Dec 23, 2025

alexmanle approved these changes Dec 23, 2025

srivatsankrishnan merged commit 99f9158 into NVIDIA:main Dec 23, 2025
5 checks passed


Greptile Summary: Confidence score: 3/5
