
Conversation

AmberLJC (Owner) commented Dec 12, 2025

| Category | Description | Papers |
|---|---|---|
| Architecture | Efficient attention, KV-cache systems, speculative decoding, sparse attention | ~15 |
| Compression | Quantization, pruning, KV cache compression | ~10 |
| Inference | LLM serving, scheduling, distributed inference, long-context | ~25 |
| Multi-Modality | VLM efficiency, diffusion optimization, token pruning | ~20 |
| RL | RL training infrastructure, policy optimization, scaling | ~15 |
| Training | Distributed training, memory efficiency, hyperparameter scaling | ~35 |

Note

Introduces a new `neurips25-mlsys` section aggregating curated NeurIPS 2025 papers and concise summaries across Architecture, Compression, Inference, Multi-Modality, RL, and Training.

  • New bundle: Adds `neurips25-mlsys/` with curated NeurIPS 2025 ML systems content.
    • `README.md`: Overview, stats graphic, table of contents, category summaries, related links.
    • `architecture.md`: Efficient attention/KV-cache/speculative decoding systems; sparse attention (e.g., Gated Attention), diffusion architectures; theory and benchmarks.
    • `compression.md`: Quantized attention, KV-cache compression; FP4 training, FP8 fine-tuning; sparsification and compression theory.
    • `inference.md`: Serving systems (scheduling, distributed, KV cache), energy benchmarking, multi-LoRA, TPU support, reliability; speculative decoding and long-context methods; KV cache algorithms.
    • `multi-modality.md`: Multimodal serving, video systems; token pruning/merging; efficient diffusion architectures/training; multimodal adaptation; diffusion theory.
    • `rl.md`: RL training infrastructure, communication-efficient training; efficient rollout/sampling; scalable policy optimization; scaling and analysis.
    • `training.md`: Distributed/communication-efficient training; memory and long-context training; compiler/hardware optimization; energy; stability and architectural tweaks; optimizer/data/weight-decay scaling laws and numerical stability.

Written by Cursor Bugbot for commit bd7ec06. This will update automatically on new commits.

AmberLJC merged commit 46f8b14 into main on Dec 12, 2025. 2 checks passed.
claude bot commented Dec 12, 2025

PR Review: NeurIPS 2025 MLSys Bundle

Thank you for this comprehensive contribution! This is an excellent curated collection of NeurIPS 2025 papers. I've reviewed the PR against the repository's CLAUDE.md guidelines and have the following feedback:

✅ Strengths

1. Excellent Organization & Structure

  • Clear hierarchical organization (Systems → System-Algorithm Co-design → Algorithm)
  • Comprehensive table of contents in each file
  • Well-categorized papers across 6 major domains
  • Stats graphic and summary tables provide quick overview

2. High-Quality Content

  • Rich, detailed summaries for each paper explaining motivation, approach, and results
  • Appropriate highlighting of important papers (🏆 Best Paper, 🔦 Spotlight, 🎤 Oral)
  • Quantitative metrics included (speedups, memory savings, etc.)
  • Author affiliations and organizations noted where relevant

3. Self-Contained Resource

  • Creates a standalone NeurIPS 2025 collection in neurips25-mlsys/ directory
  • All 8 files present (README + 6 category files + stats.png)
  • Cross-references to main repository

⚠️ Format Deviations from Repository Standards

While this is excellent work, the formatting differs from the main repository's established conventions (per CLAUDE.md):

1. Link Format Inconsistencies
The main repository uses a single-line format:

```markdown
- [Paper Title](URL): Brief description | Venue
```

This PR uses an extended format with multiple fields:

```markdown
#### Paper Title 🔦 **SPOTLIGHT**
- **Link:** URL
- **Authors:** Name (Org)
- **Summary:** Detailed multi-sentence summary
```

Examples from the main repo (a format-check sketch follows the list):

  • [MegaScale](https://arxiv.org/abs/2402.15627): Scaling Large Language Model Training to More Than 10,000 GPUs | ByteDance
  • [DISTMM](https://www.usenix.org/conference/nsdi24/presentation/huang): Accelerating distributed multimodal model training | NSDI' 24
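To make the convention concrete, here is a minimal sketch of a check for the one-line entry format. This is not part of the PR, and the regex is my assumption inferred from the two examples above:

```python
import re

# Hypothetical checker for the main repo's one-line entry convention:
#   - [Paper Title](URL): Brief description | Venue
ENTRY_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>https?://[^\s)]+)\): "
    r"(?P<desc>.+?) \| (?P<venue>.+)$"
)

def matches_entry_format(line: str) -> bool:
    """Return True if a bullet line follows the one-line entry format."""
    return ENTRY_RE.match(line.strip()) is not None

# The MegaScale example above passes; an extended-format heading does not.
assert matches_entry_format(
    "- [MegaScale](https://arxiv.org/abs/2402.15627): Scaling Large "
    "Language Model Training to More Than 10,000 GPUs | ByteDance"
)
assert not matches_entry_format("#### Paper Title")
```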

2. arXiv URL Format

  • Main repo: Uses https://arxiv.org/abs/XXXX.XXXXX (no www.)
  • This PR: A mix of formats, including https://www.researchgate.net/, https://neurips.cc/virtual/, GitHub URLs, etc.

3. Metadata Placement

  • Main repo: Venue/org after pipe | on same line as title
  • This PR: Separate fields with extensive formatting

4. Summary Length

  • Main repo: Brief, concise descriptions (1 sentence typically)
  • This PR: Multi-paragraph detailed summaries

💡 Recommendations

Option A: Standalone Conference Bundle (Recommended)
Keep the current detailed format as a special conference-specific resource separate from the main README. This approach:

  • Preserves the excellent detailed summaries and organization
  • Serves as a comprehensive NeurIPS 2025 reference guide
  • Avoids retrofitting entries into the main repo format
  • Can be linked from main README as a curated collection

Option B: Integration into Main README
If integrating papers into the main README sections (see the conversion sketch after this list):

  • Convert to the main repo format: [Title](URL): Brief desc | NeurIPS' 25
  • Place papers in the appropriate existing sections (Training, Serving, Multi-Modal, etc.)
  • Remove the extended summaries and author lists
  • Example: [SageAttention3](https://github.com/thu-ml/SageAttention): 8-bit attention with 2-5× speedup over FlashAttention | NeurIPS' 25
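If Option B is chosen, most of the conversion could be scripted. A rough sketch, under the assumption that each extended entry is a block with a `#### Title` heading followed by `- **Link:**` and `- **Summary:**` fields (field names taken from the extended format shown earlier):

```python
import re

def to_one_line(block: str, venue: str = "NeurIPS' 25") -> str | None:
    """Collapse one extended entry block into the one-line repo format.

    Assumed block shape (see the extended format above):
        #### Paper Title 🔦 **SPOTLIGHT**
        - **Link:** URL
        - **Summary:** First sentence. More detail...
    """
    title = re.search(r"^####\s+(.+?)(?:\s+[🏆🔦🎤].*)?$", block, re.M)
    link = re.search(r"\*\*Link:\*\*\s*(\S+)", block)
    summary = re.search(r"\*\*Summary:\*\*\s*(.+)", block)
    if not (title and link):
        return None  # entry lacks a usable title or URL; fix by hand
    # Keep only the first sentence of the summary as the brief description.
    desc = summary.group(1).split(". ")[0].rstrip(".") if summary else ""
    return f"- [{title.group(1)}]({link.group(1)}): {desc} | {venue}"
```

Entries that return None would need manual attention, which overlaps with the broken-link issues noted below.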

🔍 Minor Issues

1. Broken/Placeholder Links
Some entries have incomplete URLs:

  • AutoJudge (architecture.md:55): "NeurIPS 2025 Main Track" is not a URL
  • Several entries use NeurIPS schedule links instead of paper PDFs/arXiv

2. Missing arXiv Links
Many papers link to NeurIPS schedule pages, conference sites, or GitHub repos. Consider adding arXiv links where available for consistency with the main repo; a quick scan like the sketch below can flag the entries to check.
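A short sketch of such a scan, assuming the `neurips25-mlsys/` layout described in this PR; it prints every markdown link that is not a canonical arXiv abstract URL:

```python
import pathlib
import re

LINK_RE = re.compile(r"\((https?://[^)\s]+)\)")

# Print every link that is not a canonical arXiv abstract URL so it can be
# reviewed by hand (schedule pages, GitHub repos, www. variants, etc.).
for md in sorted(pathlib.Path("neurips25-mlsys").glob("*.md")):
    for lineno, line in enumerate(md.read_text().splitlines(), start=1):
        for url in LINK_RE.findall(line):
            if not url.startswith("https://arxiv.org/abs/"):
                print(f"{md}:{lineno}: {url}")
```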

3. Emoji Usage
The main CLAUDE.md states: "Only use emojis if the user explicitly requests it." This PR uses emojis extensively (🏆🔦🎤🏗️📦⚡🎨🎮🔧). While they can enhance readability, this deviates from repository conventions.

4. Stats Image Reference
Ensure stats.png is included in the PR and displays correctly.


📋 Code Quality Assessment

  • No code changes: Pure documentation/content addition ✓
  • No security concerns: All links appear to be to legitimate academic/research sources ✓
  • No performance issues: Static markdown files ✓
  • Test coverage: N/A for documentation ✓

🎯 Verdict

Overall: Strong contribution with formatting considerations

This is high-quality work that significantly enhances the repository. The main decision is whether to:

  1. Keep as a standalone detailed NeurIPS 2025 guide (preserving current format)
  2. Integrate individual papers into main README (requires reformatting)

I recommend Option A: treating this as a special curated collection that complements the main repository. The detailed summaries, organization, and comprehensive coverage make it valuable as a standalone resource.

Suggested Next Steps:

  1. Add a link from main README.md to the NeurIPS 2025 collection
  2. Fix placeholder/missing URLs (particularly AutoJudge and similar entries)
  3. Consider adding arXiv links where available for future reference
  4. Verify stats.png displays correctly

Great work overall! This will be a valuable resource for the research community.
