Share New Paper-VideoCoF

Hello! 👋

I would like to add our latest work, Unified Video Editing with Temporal Reasoner (VideoCoF), to this awesome list.

VideoCoF introduces a Chain-of-Frames (CoF) mechanism that gives video diffusion models temporal reasoning capabilities.
1. Unlike previous methods, it follows a "See -> Reason -> Edit" paradigm: the model first explicitly reasons about the editing region (by generating a reasoning frame) and only then executes the edit. This allows precise, mask-free video editing.
2. Trained on only 50k samples (33 frames), VideoCoF demonstrates robust multi-shot editing and length generalization (e.g., 4× length extrapolation).
3. Diverse editing tasks: it supports fine-grained (instance- and part-level, spatially aware) object removal, object addition, object swap, and local style transfer.
4. Thanks to a DMD LoRA, we have also released a 4-step fast inference script (~20-30 s per video). The full model and code have been released.

We believe this work fits perfectly into the scope of Video Reasoning.

Here is the information:

Paper Title: Unified Video Editing with Temporal Reasoner

Authors: Xiangpeng Yang, Ji Xie, Yiyuan Yang, Yan Huang, Min Xu, Qiang Wu

Links: 
[[Paper]](https://arxiv.org/abs/2512.07469) | [[Project Page]](https://videocof.github.io/) | [[Code]](https://github.com/knightyxp/VideoCoF)
