About Me

Biography [CV]

I am a Ph.D. student at

Shanghai Jiao Tong University and Shanghai AI Lab. My advisor is Yali Wang. I received my B.S. degree in Computer Science and Technology from China University of Mining and Technology (Beijing) in 2024.

Currently, I am a Research Intern at Alibaba (Taotian). I have also spent wonderful time as a research intern at Huawei Noah's Ark Lab, Shanghai AI Lab, SIAT, and Samsung. My research interests include:
  • Unified Multimodal Understanding and Generation
  • Omni-modal Representation Learning
  • Video Understanding & Generation
  • Multimodal Agents

🔥🔥🔥 I'm actively pursuing intern opportunities in Multimodal Understanding and Generation. Feel free to reach out for potential collaborations.

📑 Publications

* indicates equal contribution
  • UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation


    Zhengrong Yue, Haiyu Zhang, Xiangyu Zeng, Boyu Chen, Chenting Wang, Shaobin Zhuang, Lu Dong, Kunpeng Du, Yi Wang, Limin Wang, Yali Wang
    Arxiv, 2025.
    Paper Code

  • Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing


    Zhentao Zou*, Zhengrong Yue*, Kunpeng Du, Binlei Bao, Hanting Li, Haizhen Xie, Guozheng Xu, Yue Zhou, Yali Wang, Jie Hu, Xue Jiang, Xinghao Chen
    Arxiv, 2025.
    Paper Code

  • VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning


    Boyu Chen*, Zikang Wang*, Zhengrong Yue*, Kainan Yan*, Chenyun Yu, Yi Huang, Zijun Liu, Yafei Wen, Xiaoxin Chen, Yang Liu, Peng Li, Yali Wang
    Arxiv, 2025.
    Paper Code

  • VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning


    Zikang Wang*, Boyu Chen*, Zhengrong Yue*, Yi Wang, Yu Qiao, Limin Wang, Yali Wang
    AAAI 2026 (Oral).
    Paper Code

  • G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation


    Boyu Chen*, Siran Chen*, Zhengrong Yue*, Kainan Yan, Chenyun Yu, Beibei Kong, Cheng lei, Chengxiang Zhuo, Zang Li, Yali Wang
    AAAI 2026.
    Paper Code

  • VTTS: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception


    Ziang Yan*, Yinan He*, Xinhao Li*, Zhengrong Yue*, Xiangyu Zeng, Yali Wang, Yu Qiao, Limin Wang, Yi Wang
    NeurIPS 2025.
    Paper Code

  • LVAgent: Dynamic Round-by-round MLLM Agent Collaboration for Long Video Understanding


    Boyu Chen*, Zhengrong Yue*, Siran Chen*, Zikang Wang, Yang Liu, Peng Li, Yali Wang
    ICCV 2025.
    Paper Code

  • V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents


    Zhengrong Yue, Shaobin Zhuang, Kunchang Li, Yanbo Ding, Yali Wang
    CVPR 2025.
    Paper Code

  • TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning


    Xiangyu Zeng, Kunchang Li, Chenting Wang, Xinhao Li, Tianxiang Jiang, Ziang Yan, Songze Li, Yansong Shi, Zhengrong Yue, Yi Wang, Yali Wang, Yu Qiao, Limin Wang
    ICLR 2025.
    Paper Code

  • Muses: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration


    Yanbo Ding, Shaobin Zhuang, Kunchang Li, Zhengrong Yue, Yu Qiao, Yali Wang
    AAAI 2025.
    Paper Code

🤵🏻 Internships

  • Alibaba | Research Intern (Taotian Group) | Jan 2026 - Present
    Unified Tokenizer; Unified Reward Model for Generation and Editing.

  • Huawei | Research Intern (Noah's Ark Lab) | July 2025 - Dec 2025
    Image Editing Model Based on Text-Image Interwoven Thought Chains; Unified Multimodal Model.

  • Shanghai AI Lab | Research Intern (OpenGVLab) | July 2024 - June 2025
    Unified Tokenizer; Video Understanding and Generation tasks within multimodal frameworks.

  • SIAT | Research Intern (Multimedia Lab) | Nov 2023 - June 2024
    Explored video style editing based on MLLM Agents.

  • Samsung | Research Intern (LUL Lab) | Sept 2023 - Nov 2023
    Multilingual document question-answering large model for Galaxy Z-Fold based on RAG.

🏅 Honors & Awards

  • National Scholarship (Top 1%), 2023
  • Outstanding Graduate of Beijing, 2024
  • Beijing Municipal Triple-Excellent Student, 2023
  • 1st Prize, National Robot and Artificial Intelligence Competition, 2022
  • 1st Prize, National University Student Intelligent Vehicle Competition, 2022

🤝 Services

  • Conference Reviewer: CVPR 2026, ECCV 2026
  • Journal Reviewer: IEEE TPAMI