video-understanding-dataset

Here are 8 public repositories matching this topic...

westlake-repl / MicroLens

A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Videos (Talk Invited by DeepMind).

video video-understanding large short-video video-generation audio-recommendation image-recommendation video-recommendation foundation-models large-language-models llm text-recommendation llm-recommendation video-understanding-dataset video-generation-dataset

Updated Jan 27, 2025
Python

JIA-Lab-research / LSDBench

A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency of long-video VLMs. (ICCV 2025)

benchmark video-understanding sampling-strategies reasoning-agent vision-language-model long-video-understanding multimodal-large-language-models video-understanding-dataset

Updated Aug 7, 2025
Python

Hai-chao-Zhang / DenseVideoUnderstand

[arXiv 2509.14199] Dense Video Understanding with Gated Residual Tokenization

vqa video-understanding vqa-dataset video-understanding-dataset video-llm

Updated Sep 21, 2025
Python

DragonLiu1995 / MUSIC-AVQA-v2.0

Additional video data and QA pairs for balancing the original MUSIC-AVQA dataset

qa-dataset audio-visual-learning video-understanding-dataset audio-visual-question-answering audio-visual-qa

Updated Jun 26, 2025

yuanrr / SCVBench

SCVBench: A Benchmark with Multi-turn Dialogues for Story-Centric Video Understanding (IJCAI '25)

videoqa chain-of-thought video-understanding-dataset

Updated Oct 15, 2025
Python

asad14053 / AdCare-VLM

(Accepted: NeurIPS 2025 Workshop Mexico City 7HVU) AdCare-VLM: Leveraging Large Vision Language Models (LVLMs) to Monitor Long-Term Medication Adherence and Care

vllm video-understanding-dataset video-llava large-scale-holistic-video-understanding pre-alignment unified-latent-space