Mingshuang Luo | 罗明双
I am a final-year Ph.D. student at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), China, under the supervision of Prof. Hong Chang and Associate Prof. Ruibing Hou.
Before embarking on my doctoral studies, I worked as a Speech Recognition Algorithm Engineer at the Xiaomi Group AI Lab from July 2021 to August 2022, supervised by Dr. Daniel Povey (IEEE Fellow).
I obtained my master's degree in computer technology from the University of Chinese Academy of Sciences in July 2021, supervised by Prof. Shiguang Shan (IEEE Fellow) and Associate Prof. Shuang Yang.
My research interests lie in computer vision and computer graphics, with a particular focus on
multimodal AIGC, 2D/3D character animation, video/motion generation, world models, and vision-language-action (VLA) models.
I expect to graduate in 2026 and am open to postdoc and research scientist opportunities.
Email / Google Scholar / GitHub
News
[2026/02] 🎉 DreamActor-M2 and its project page were released.
[2026/01] 🎉 CLASP was accepted by IEEE TMM 2026.
[2025/06] 🎉 Morph was accepted by ICCV 2025.
[2024/09] 🎉 M3GPT was accepted by NeurIPS 2024.
[2024/09] 🎉 One paper was accepted by Briefings in Bioinformatics 2024.
[2023/06] 🎉 Two papers were accepted by ICASSP 2023.
[2022/06] 🎉 One paper was accepted by Interspeech 2022.
[2020/05] 🎉 One paper was accepted by BMVC 2020.
[2020/03] 🎉 One paper was accepted by IEEE FG 2020.
Research
DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning
Mingshuang Luo*, Shuang Liang*, Zhengkun Rong*, Yuxuan Luo†, Tianshu Hu§, Ruibing Hou§, Hong Chang, Yong Li, Yuan Zhang, Mingyuan Gao
Tech Report, 2026
project page / arXiv
We present DreamActor-M2, a universal character image animation framework that reformulates motion conditioning as a spatiotemporal in-context learning task. Our design harnesses the generative priors of video foundation models while enabling pose-free, end-to-end motion transfer directly from raw videos.
FlowAct-R1: Towards Interactive Humanoid Video Generation
FlowAct Team, ByteDance Intelligent Creation
Tech Report, 2026
project page / arXiv
We present FlowAct-R1, a novel framework that enables lifelike, responsive, and high-fidelity
humanoid video generation for seamless real-time interaction.
CLIP-Guided Adaptable Self-Supervised Learning for Human-Centric Visual Tasks
Mingshuang Luo, Ruibing Hou*, Bo Chao, Hong Chang, Zimo Liu*, Yaowei Wang, Shiguang Shan
IEEE TMM, 2026
project page / arXiv
We propose CLASP (CLIP-guided Adaptable Self-suPervised
learning), a novel framework designed for unsupervised pretraining in human-centric visual tasks.
Morph: A Motion-free Physics Optimization Framework for Human Motion Generation
Zhuo Li*, Mingshuang Luo*, Ruibing Hou†, Xin Zhao, Hao Liu, Hong Chang, Zimo Liu, Chen Li
ICCV, 2025
project page / arXiv
We propose Morph, a Motion-Free physics optimization framework, consisting of a Motion Generator and a Motion Physics Refinement module, for enhancing physical plausibility without relying on expensive real-world motion data.
M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation
Mingshuang Luo, Ruibing Hou*, Zhuo Li, Hong Chang, Zimo Liu, Yaowei Wang, Shiguang Shan
NeurIPS, 2024
project page / arXiv
We present M3GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation.
Experience and Education
This homepage's design is based on Jon Barron's website and is deployed on GitHub Pages. Last updated: Jan. 2026
© 2026 Mingshuang Luo 罗明双