I'm an MSc Computer Vision student at MBZUAI doing machine learning and computer vision research. I work on developing self-evolving large multimodal models for generalizable multimodal intelligence, within the broader context of multimodal representation learning for reasoning. I'm also interested in unified large-scale models for image understanding and generation, and in controllable generation of long, coherent video sequences.
I also build AI tech for computer-aided diagnostics at Zestral, in collaboration with multiple hypergrowth startups. If you're passionate about building cutting-edge tech backed by deep research, let's connect!

