MLA-ViT introduces Multi-Head Latent Attention (MLA) to Vision Transformers, reducing time and memory usage during both training and inference while maintaining comparable accuracy.
A code deep-dive on one of the key innovations from DeepSeek: Multi-Head Latent Attention (MLA).
A compact, single-GPU optimized version of DeepSeek-V3, trained on FineWebEDU for research and experimentation.
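For readers new to the topic, the sketch below illustrates the core idea behind Multi-Head Latent Attention that the projects above build on: keys and values are down-projected into a small shared latent, and only that latent needs to be cached at inference time, which is where the memory savings come from. This is a simplified, hypothetical PyTorch sketch (the `SimpleMLA` module and its `latent_dim` parameter are illustrative names, not taken from any repository listed here), and it omits details such as DeepSeek's decoupled rotary-embedding keys and query compression.

```python
# Minimal illustrative sketch of Multi-Head Latent Attention (MLA).
# Hypothetical names; simplified relative to DeepSeek-V2/V3 (no RoPE).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMLA(nn.Module):
    def __init__(self, dim: int, num_heads: int, latent_dim: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads

        self.q_proj = nn.Linear(dim, dim)
        # Down-project hidden states into a small shared latent; at inference
        # only this latent would be cached, shrinking the KV cache.
        self.kv_down = nn.Linear(dim, latent_dim)
        # Up-project the latent back to per-head keys and values.
        self.k_up = nn.Linear(latent_dim, dim)
        self.v_up = nn.Linear(latent_dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        q = self.q_proj(x)
        latent = self.kv_down(x)   # (b, n, latent_dim): the tensor worth caching
        k = self.k_up(latent)
        v = self.v_up(latent)

        # Reshape to (b, heads, n, head_dim) and run standard attention.
        q, k, v = (t.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.out_proj(out)

# Example with ViT-style patch tokens (dimensions are illustrative).
tokens = torch.randn(2, 197, 384)                      # (batch, patches + CLS, dim)
attn = SimpleMLA(dim=384, num_heads=6, latent_dim=96)  # latent_dim << dim
print(attn(tokens).shape)                              # torch.Size([2, 197, 384])
```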