HuskyDoge

Benhao Huang

MAITRIX
TRAIS
SJTU-XAI
Shanghai, China

About Me

Hello! Husky here! I'm an incoming student in the CMU MSML program. Throughout my research journey, I've explored a variety of topics. Currently, my primary focus is generative world modeling, where I have been working with Professor Zhiting Hu on World Model Projects. Prior to this, I collaborated with my amazing supervisor Jiaqi W. Ma at the TRAIS Lab on dataset curation using LLM agents. I also gained valuable research experience in AI interpretability at Professor Quanshi Zhang's XAI Lab.

Within world modeling, I'm currently most interested in the following directions:

  • How can we enable world models to operate effectively in long-sequence scenarios? This includes both looking ahead (simulating long trajectories) and looking back (designing effective memory mechanisms).
  • How can we make world models real-time interactive? This is especially crucial for real-world applications, and I find both the mathematical and ML systems perspectives fascinating.
  • How can we make world models more physically grounded? I'm excited to explore data-driven approaches, RLHF, and other emerging methods.

If you're also passionate about these areas, feel free to reach out! I'm always open to collaborations and eager to gain more experience along the way.

News

  • 2025/05: DCA-Bench has been accepted to KDD 2025 DB-Track as an oral paper! See you in Toronto 🎉

Education

Aug 2025 - Feb 2027 (expected)
M.S. in Machine Learning
Carnegie Mellon University
Sept 2021 - July 2025
B.S.E. in Computer Science
Shanghai Jiao Tong University
Sept 2018 - June 2021
High School
Zhejiang Ruian High School

Interests

World Model, Reasoning and Planning
Data-centric AI, AI Automation
Efficient ML, Long Sequence Modeling

Selected Works

PAN: Towards General World Model with Natural Language Actions and Video States
Benhao Huang, Pandora Team, Zhiting Hu, Eric P. Xing
#World Model
#Image to Video
#Diffusion

A step towards a General World Model (GWM) that can simulate complex video scenarios with natural language actions.

paper (in progress)
report-v1
code

DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang, Yingzhuo Yu, Jin Huang, Xingjian Zhang, Jiaqi W. Ma
KDD-2025 DB Track (Oral)
#LLM Agent
#Benchmark
#2025

A benchmark exploring the performance of LLM Agents on detecting issues in datasets hosted on popular platforms.

paper
code

Awards & Scholarships

National Scholarship (Top 0.2% nationwide)
2024
Rui Yuan Hong Shan Scholarship (Top 2%, SJTU)
2023
Shao Qiu Scholarship (Top 4%, SJTU)
2022
Meritorious Winner of MCM/ICM
2022
Last updated: 2025/05/18