
Conversation

AmberLJC (Owner) commented Dec 12, 2025

| Category | Description | Papers |
|---|---|---|
| Architecture | Efficient attention, KV-cache systems, speculative decoding, sparse attention | ~15 |
| Compression | Quantization, pruning, KV cache compression | ~10 |
| Inference | LLM serving, scheduling, distributed inference, long-context | ~25 |
| Multi-Modality | VLM efficiency, diffusion optimization, token pruning | ~20 |
| RL | RL training infrastructure, policy optimization, scaling | ~15 |
| Training | Distributed training, memory efficiency, hyperparameter scaling | ~35 |

Note

Introduces a new `neurips25-mlsys` section aggregating curated NeurIPS 2025 papers and concise summaries across Architecture, Compression, Inference, Multi-Modality, RL, and Training.

  • New bundle: Adds `neurips25-mlsys/` with curated NeurIPS 2025 ML systems content.
    • `README.md`: Overview, stats graphic, table of contents, category summaries, related links.
    • `architecture.md`: Efficient attention/KV-cache/speculative decoding systems; sparse attention (e.g., Gated Attention), diffusion architectures; theory and benchmarks.
    • `compression.md`: Quantized attention, KV-cache compression; FP4 training, FP8 fine-tuning; sparsification and compression theory.
    • `inference.md`: Serving systems (scheduling, distributed, KV cache), energy benchmarking, multi-LoRA, TPU support, reliability; speculative decoding and long-context methods; KV cache algorithms.
    • `multi-modality.md`: Multimodal serving, video systems; token pruning/merging; efficient diffusion architectures/training; multimodal adaptation; diffusion theory.
    • `rl.md`: RL training infrastructure, communication-efficient training; efficient rollout/sampling; scalable policy optimization; scaling and analysis.
    • `training.md`: Distributed/communication-efficient training; memory and long-context training; compiler/hardware optimization; energy; stability and architectural tweaks; optimizer/data/weight-decay scaling laws and numerical stability.

Written by Cursor Bugbot for commit bd7ec06. This will update automatically on new commits.

AmberLJC merged commit 46f8b14 into main on Dec 12, 2025. 2 checks passed.
claude bot commented Dec 12, 2025

PR Review: NeurIPS 2025 MLSys Bundle

Thank you for this comprehensive contribution! This is an excellent curated collection of NeurIPS 2025 papers. I've reviewed the PR against the repository's CLAUDE.md guidelines and have the following feedback:

✅ Strengths

1. Excellent Organization & Structure

  • Clear hierarchical organization (Systems → System-Algorithm Co-design → Algorithm)
  • Comprehensive table of contents in each file
  • Well-categorized papers across 6 major domains
  • Stats graphic and summary tables provide quick overview

2. High-Quality Content

  • Rich, detailed summaries for each paper explaining motivation, approach, and results
  • Appropriate highlighting of important papers (🏆 Best Paper, 🔦 Spotlight, 🎤 Oral)
  • Quantitative metrics included (speedups, memory savings, etc.)
  • Author affiliations and organizations noted where relevant

3. Self-Contained Resource

  • Creates a standalone NeurIPS 2025 collection in neurips25-mlsys/ directory
  • All 8 files present (README + 6 category files + stats.png)
  • Cross-references to main repository

⚠️ Format Deviations from Repository Standards

While this is excellent work, the formatting differs from the main repository's established conventions (per CLAUDE.md):

1. Link Format Inconsistencies
The main repository uses a single-line format:

```markdown
- [Paper Title](URL): Brief description | Venue
```

This PR uses an extended format with multiple fields:

```markdown
#### Paper Title 🔦 **SPOTLIGHT**
- **Link:** URL
- **Authors:** Name (Org)
- **Summary:** Detailed multi-sentence summary
```

Examples from the main repo (a format-check sketch follows the list):

  • [MegaScale](https://arxiv.org/abs/2402.15627): Scaling Large Language Model Training to More Than 10,000 GPUs | ByteDance
  • [DISTMM](https://www.usenix.org/conference/nsdi24/presentation/huang): Accelerating distributed multimodal model training | NSDI' 24
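To make the convention concrete, here is a minimal sketch of a check for the one-line entry format. This is not part of the PR, and the regex is my assumption inferred from the two examples above:

```python
import re

# Hypothetical checker for the main repo's one-line entry convention:
#   - [Paper Title](URL): Brief description | Venue
ENTRY_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>https?://[^\s)]+)\): "
    r"(?P<desc>.+?) \| (?P<venue>.+)$"
)

def matches_entry_format(line: str) -> bool:
    """Return True if a bullet line follows the one-line entry format."""
    return ENTRY_RE.match(line.strip()) is not None

# The MegaScale example above passes; an extended-format heading does not.
assert matches_entry_format(
    "- [MegaScale](https://arxiv.org/abs/2402.15627): Scaling Large "
    "Language Model Training to More Than 10,000 GPUs | ByteDance"
)
assert not matches_entry_format("#### Paper Title")
```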

2. arXiv URL Format

  • Main repo: Uses https://arxiv.org/abs/XXXX.XXXXX (no www.)
  • This PR: A mix of formats, including https://www.researchgate.net/, https://neurips.cc/virtual/, GitHub URLs, etc.

3. Metadata Placement

  • Main repo: Venue/org after pipe | on same line as title
  • This PR: Separate fields with extensive formatting

4. Summary Length

  • Main repo: Brief, concise descriptions (1 sentence typically)
  • This PR: Multi-paragraph detailed summaries

💡 Recommendations

Option A: Standalone Conference Bundle (Recommended)
Keep the current detailed format as a special conference-specific resource separate from the main README. This approach:

  • Preserves the excellent detailed summaries and organization
  • Serves as a comprehensive NeurIPS 2025 reference guide
  • Avoids retrofitting entries into the main repo format
  • Can be linked from main README as a curated collection

Option B: Integration into Main README
If integrating papers into the main README sections (see the conversion sketch after this list):

  • Convert to the main repo format: [Title](URL): Brief desc | NeurIPS' 25
  • Place papers in the appropriate existing sections (Training, Serving, Multi-Modal, etc.)
  • Remove the extended summaries and author lists
  • Example: [SageAttention3](https://github.com/thu-ml/SageAttention): 8-bit attention with 2-5× speedup over FlashAttention | NeurIPS' 25
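If Option B is chosen, most of the conversion could be scripted. A rough sketch, under the assumption that each extended entry is a block with a `#### Title` heading followed by `- **Link:**` and `- **Summary:**` fields (field names taken from the extended format shown earlier):

```python
import re

def to_one_line(block: str, venue: str = "NeurIPS' 25") -> str | None:
    """Collapse one extended entry block into the one-line repo format.

    Assumed block shape (see the extended format above):
        #### Paper Title 🔦 **SPOTLIGHT**
        - **Link:** URL
        - **Summary:** First sentence. More detail...
    """
    title = re.search(r"^####\s+(.+?)(?:\s+[🏆🔦🎤].*)?$", block, re.M)
    link = re.search(r"\*\*Link:\*\*\s*(\S+)", block)
    summary = re.search(r"\*\*Summary:\*\*\s*(.+)", block)
    if not (title and link):
        return None  # entry lacks a usable title or URL; fix by hand
    # Keep only the first sentence of the summary as the brief description.
    desc = summary.group(1).split(". ")[0].rstrip(".") if summary else ""
    return f"- [{title.group(1)}]({link.group(1)}): {desc} | {venue}"
```

Entries that return None would need manual attention, which overlaps with the broken-link issues noted below.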

🔍 Minor Issues

1. Broken/Placeholder Links
Some entries have incomplete URLs:

  • AutoJudge (architecture.md:55): "NeurIPS 2025 Main Track" is not a URL
  • Several entries use NeurIPS schedule links instead of paper PDFs/arXiv

2. Missing arXiv Links
Many papers link to NeurIPS schedule pages, conference sites, or GitHub repos. Consider adding arXiv links where available for consistency with the main repo; a quick scan like the sketch below can flag the entries to check.
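A short sketch of such a scan, assuming the `neurips25-mlsys/` layout described in this PR; it prints every markdown link that is not a canonical arXiv abstract URL:

```python
import pathlib
import re

LINK_RE = re.compile(r"\((https?://[^)\s]+)\)")

# Print every link that is not a canonical arXiv abstract URL so it can be
# reviewed by hand (schedule pages, GitHub repos, www. variants, etc.).
for md in sorted(pathlib.Path("neurips25-mlsys").glob("*.md")):
    for lineno, line in enumerate(md.read_text().splitlines(), start=1):
        for url in LINK_RE.findall(line):
            if not url.startswith("https://arxiv.org/abs/"):
                print(f"{md}:{lineno}: {url}")
```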

3. Emoji Usage
The main CLAUDE.md states: "Only use emojis if the user explicitly requests it." This PR uses emojis extensively (🏆🔦🎤🏗️📦⚡🎨🎮🔧). While they can enhance readability, this deviates from repository conventions.

4. Stats Image Reference
Ensure stats.png is included in the PR and displays correctly.


📋 Code Quality Assessment

  • No code changes: Pure documentation/content addition ✓
  • No security concerns: All links appear to be to legitimate academic/research sources ✓
  • No performance issues: Static markdown files ✓
  • Test coverage: N/A for documentation ✓

🎯 Verdict

Overall: Strong contribution with formatting considerations

This is high-quality work that significantly enhances the repository. The main decision is whether to:

  1. Keep as a standalone detailed NeurIPS 2025 guide (preserving current format)
  2. Integrate individual papers into main README (requires reformatting)

I recommend Option A: treating this as a special curated collection that complements the main repository. The detailed summaries, organization, and comprehensive coverage make it valuable as a standalone resource.

Suggested Next Steps:

  1. Add a link from main README.md to the NeurIPS 2025 collection
  2. Fix placeholder/missing URLs (particularly AutoJudge and similar entries)
  3. Consider adding arXiv links where available for future reference
  4. Verify stats.png displays correctly

Great work overall! This will be a valuable resource for the research community.
