
Conversation

@AmberLJC (Owner) commented Dec 26, 2025

This PR adds 2 new papers to the Multi-Modal Serving Systems section:

  • HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving
  • Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing

Closes #46

Generated with Claude Code


Note

Updates the README.md paper list.

  • Adds HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving to the Multi-Modal Serving Systems section
  • Adds Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing to the Multi-Modal Serving Systems section

Written by Cursor Bugbot for commit 473028e. This will update automatically on new commits.

Added:
- HydraInfer: Hybrid disaggregated scheduling for MLLM serving
- Paper on disaggregated multi-stage MLLM inference with GPU scheduling

Both papers added to Multi-Modal Serving Systems section.

Co-authored-by: Jiachen LIU <AmberLJC@users.noreply.github.com>
@AmberLJC AmberLJC merged commit d62232c into main Dec 26, 2025
2 checks passed
claude bot commented Dec 26, 2025

PR Review: Add new multimodal LLM serving papers

Thank you for the contribution! I've reviewed the changes and have the following feedback:

✅ Strengths

  1. Correct Section Placement: Both papers are appropriately placed in the Multi-Modal Serving Systems section, which aligns with their focus on multimodal LLM inference.

  2. Accurate arXiv Links: The arXiv URLs are correctly formatted without the www. prefix, following the repository's formatting guidelines.

  3. Relevant Content: Both papers are highly relevant to the repository's focus on LLM systems research, specifically addressing disaggregated scheduling and GPU resource management for multimodal models.

⚠️ Issues Found

Line 319 (Second paper): The entry is missing the description and venue metadata used throughout the repository.

Current format:

- [Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing](https://arxiv.org/abs/2512.17574)

Expected format (according to CLAUDE.md):

- [Paper Title](URL): Brief description | Venue/Organization

📝 Recommendations

  1. Add description separator: The second paper entry should include a colon (:) followed by a brief description after the link.

  2. Add metadata: If available, include the venue or organization information after a pipe (|) separator.

Suggested correction for line 319:

- [Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing](https://arxiv.org/abs/2512.17574): Disaggregated multi-stage MLLM inference with GPU-internal scheduling
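To make this kind of check repeatable, here is a minimal sketch of a script that flags paper-list entries missing the ": description" separator, assuming the `- [Title](URL): Brief description | Venue` convention described in CLAUDE.md. The file path, regex, and script itself are illustrative and are not part of the repository's existing tooling.

```python
import re
import sys

# Hypothetical checker: flags paper-list entries that lack a description
# after the link, per the "- [Title](URL): Brief description | Venue" convention.
ENTRY = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>https?://[^)]+)\)(?P<rest>.*)$")

def check(path: str = "README.md") -> int:
    problems = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            m = ENTRY.match(line.rstrip("\n"))
            if not m:
                continue  # not a paper entry
            if not m.group("rest").startswith(": "):
                print(f"{path}:{lineno}: missing ': description' after link "
                      f"({m.group('title')})")
                problems += 1
    return problems

if __name__ == "__main__":
    sys.exit(1 if check() else 0)
```

Run from the repository root, this would print the line numbers of any non-conforming entries and exit non-zero, so it could serve as a simple CI step if the maintainers want one.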

📋 Minor Observations

  • Line 296: There is an existing inconsistency in the file: the Optimus entry's title itself contains a colon, which can be confused with the description separator. This is unrelated to your PR but worth noting for future consistency.

  • Line 307: The DistriFusion entry shows the same pattern. Your entries actually follow better practices than some existing entries.

✅ Overall Assessment

Code Quality: Good; follows most formatting conventions
Best Practices: Mostly adhered to, with one formatting issue
Security: N/A - documentation only
Test Coverage: N/A - documentation only
Performance: N/A - documentation only

Recommendation: Merge with minor revision to add the description separator and metadata to the second paper entry for consistency with repository guidelines.


Review generated following the guidelines in CLAUDE.md
