Industry

Amazon

Applied Scientist · 2025.02 – Present
Building Amazon Nova (2.0 and beyond) post-training and reasoning systems.

Agentic Post-Training · 2025.06 – Present
  • Designed agentic SFT/RL recipes with tool use (Python interpreter + external tools).
  • Built efficient inference infrastructure for data curation and online rollouts.
  • Built and maintain an agentic RL framework (training + evaluation + debugging).
  • Ongoing: scaling and reliability improvements.
Multi-Modal Reasoning · 2025.02 – 2025.05
  • Built multi-modal reasoning SFT/RL recipes.
  • Developed an efficient inference framework for multi-modal data generation.
  • Extended text-only RL pipelines into a multi-modal RL framework.

Boson AI

Applied Scientist · 2024.08 – 2025.02
Post-training and evaluations for LLM alignment and usefulness.

  • Developed internal DPO variants to improve alignment.
  • Built diagnosis pipelines for preference alignment (data vs. reward-model issues).
  • Created RPGBench, a benchmark for evaluating LLMs as role-playing game engines.
  • Optimized data synthesis and evaluation pipelines for inference efficiency.

Education

University of Illinois Urbana–Champaign

PhD in Computer Science · 2019.08 – 2024.08
Advisor: Prof. Heng Ji

Committee (alphabetical): Prof. Jiawei Han, Prof. Derek Hoiem, Prof. Graham Neubig, Dr. Scott Wen-tau Yih

Tsinghua University

BE in Electronic Engineering; secondary BS in Mathematics · 2015.09 – 2019.07
NLP research advised by Prof. Zhiyuan Liu.