About Me

Biography [CV]

I am a Ph.D. student at Shanghai Jiao Tong University and Shanghai AI Lab. My advisor is Yali Wang. I received my B.S. degree in Computer Science and Technology from China University of Mining and Technology (Beijing) in 2024. Currently, I am a Research Intern at Shanghai AI Lab. I was fortunate to be involved in internship programs at Samsung and SIAT.

My research interests include:
  • Unified Multimodal Understanding and Generation
  • Video Understanding
  • Video Generation
  • Multimodal Agents
Most of my work focuses on unified multimodal understanding and generation foundation models, covering model design, large-scale pretraining, dataset collection, and benchmark evaluation.

🔥🔥🔥 I'm actively pursuing internship opportunities in Multimodal Understanding and Generation. Feel free to reach out about potential collaborations.

📑 Publications

  • UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation

    Zhengrong Yue, Haiyu Zhang, Xiangyu Zeng, Boyu Chen, Chenting Wang, Shaobin Zhuang, Lu Dong, KunPeng Du, Yi Wang, Limin Wang, Yali Wang

    arXiv, 2025.

    Paper Code

  • Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing

    Zhentao Zou, Zhengrong Yue, Kunpeng Du, Binlei Bao, Hanting Li, Haizhen Xie, Guozheng Xu, Yue Zhou, Yali Wang, Jie Hu, Xue Jiang, Xinghao Chen

    arXiv, 2025.

    Paper Code

  • G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation

    Boyu Chen, Siran Chen, Zhengrong Yue, Kainan Yan, Chenyun Yu, Beibei Kong, Cheng Lei, Chengxiang Zhuo, Zang Li, Yali Wang

    arXiv, 2025.

    Paper Code

  • VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning

    Zikang Wang*, Boyu Chen*, Zhengrong Yue*, Yi Wang, Yu Qiao, Limin Wang, Yali Wang.

    arXiv, 2025.

    Paper Code

  • VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

    Ziang Yan*, Yinan He*, Xinhao Li*, Zhengrong Yue*, Xiangyu Zeng, Yali Wang, Yu Qiao, Limin Wang, Yi Wang.

    NeurIPS 2025.

    Paper Code

  • LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

    Boyu Chen*, Zhengrong Yue*, Siran Chen*, Zikang Wang*, Yang Liu, Peng Li, Yali Wang.

    ICCV 2025.

    Paper Code

  • V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

    Zhengrong Yue*, Shaobin Zhuang*, Kunchang Li*, Yanbo Ding*, Yali Wang.

    CVPR 2025.

    Paper Code

  • TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

    Xiangyu Zeng, Kunchang Li, Chenting Wang, Xinhao Li, Tianxiang Jiang, Ziang Yan, Songze Li, Yansong Shi, Zhengrong Yue, Yi Wang, Yali Wang, Yu Qiao, Limin Wang.

    ICLR 2025.

    Paper Code

  • Muses: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

    Yanbo Ding*, Shaobin Zhuang*, Kunchang Li*, Zhengrong Yue*, Yu Qiao, Yali Wang.

    AAAI 2025.

    Paper Code

🤵🏻 Internships

  • Huawei, Foundation Model Department (Noah's Ark Lab)

    July 2025 Unified Multimodal Model and Benchmark.
  • Shanghai Artificial Intelligence Laboratory, General Vision Lab (OpenGVLab)

    July 2024 Pre-training and Representation Learning for Unified Understanding and Generation.
  • Shenzhen Institute of Advanced Technology, Multimedia Lab (MMLAB)

    November 2023 Explored video style editing based on MLLM Agents.
  • Samsung R&D Institute China-Beijing, Language Understanding Lab (LUL)

    September 2023 Developed a multilingual document question-answering large model for Galaxy Z-Fold smartphones based on RAG.

🏅 Honors

  • The 23rd China Robotics and Artificial Intelligence Competition Intelligent Sorting Challenge (National No. 1), National First Prize (2022)
  • The 17th National Undergraduate Smart Car Competition Xunfei Creative Group (National Top Four), National First Prize (2022)
  • The 15th National College Student Energy Conservation and Emission Reduction Social Practice and Science and Technology Contest, National Third Prize (2022)
  • The 15th China Undergraduate Computer Design Contest, Provincial Third Prize (2022)

🤝 Services

  • Attended the CVPR 2023 Beijing Workshop, June 2023