"Research for curiosity, Engineering for impact."
Zhenhong Sun
My research explores open 3D world models with generative AI and embodied intelligence. My PhD training helps me think about fundamental questions in intelligent systems, while building the Engineering-AI Lab (led by A/P Huadong Mo) pushes me to connect research with real engineering. Together, these experiences shape how I explore intelligent agents in digital worlds.
My current focus is LingJing, a research direction for open 3D world models. I study how reasoning and decision-making, 3D representation, spatial world synthesis, and action dynamics can work together in one system. I am interested in this because I believe real intelligence should not come only from text or abstract reasoning, but also from vision, action, and feedback from the world, closer to how biological intelligence develops.
Research on 3D World Model LingJing
Foundation Representation and Manipulation
- Unified 3D asset representations
- Single and multi-view reconstruction
- Part-aware structure decomposition
Spatial Understanding and Synthesis
- Compositional 3D scene generation
- Engine-grounded spatial perception
- Simulation data and evaluation
Interaction Dynamics and Motion
- Expressive avatar motion
- Human-scene interaction modeling
- Multi-agent behavioral dynamics
Decision Intelligence and Reasoning
- LLM-based world reasoning
- Language-guided planning
- Decision making in 3D environments
News
- May 2026 Invited to serve as an Area Chair for NeurIPS 2026.
- Feb 2026 Co-author paper accepted by ICLR 2026 on Scalable In-Context Q-Learning.
- Feb 2026 First-author paper accepted by TMLR 2026 on Sketch-to-Scene generation.
- Oct 2025 Started an academic exchange at NTU with Prof. Daocheng Tao and at NUS with Dr. Yatao Bian.
- Sep 2025 Co-Corresponding-author paper accepted by NeurIPS 2025 on Text-to-Decision Agent.
- Feb 2025 Started building the Engineering-AI Lab on 3D world models under the leadership of A/P Huadong Mo.
- Oct 2024 First-author paper accepted by ACM MM 2024 on multi-entity text-to-image generation.
- Jun 2024 Co-author paper accepted by CVPR 2024 on diffusion-based human image generation.
- Jan 2024 Started Ph.D. study at the Australian National University, supervised by Prof. Daoyi Dong and Dr. Dong Gong.
- Feb 2023 Equal-first-author paper accepted by ICLR 2023 on efficient neural architecture search.
- Jul 2022 First-author papers accepted by ICML 2022 and NeurIPS 2022 on efficient neural architecture search.
- Aug 2021 First-author paper accepted by ACM MM 2021 on learned image compression.
Representative Publications
- StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard with Spatial-temporal Dynamics. (Preprint; Co-First; Leader) arXiv preprint arXiv:2604.03315, 2026.
- 3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars. (Preprint; Co-First; Leader) arXiv preprint arXiv:2602.10516, 2026.
- T3-S2S: Training-free Triplet Tuning for Sketch to Scene Generation. (Journal; First) Transactions on Machine Learning Research (TMLR), 2026.
- Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision. (Conference; Co-Corr) Conference on Neural Information Processing Systems (NeurIPS 2025), 2025.