Experience
Industry
Amazon
Applied Scientist · 2025.02 – Present
Building Amazon Nova (2.0 and beyond) post-training and reasoning systems.
Agentic Post-Training · 2025.06 – Present
- Designed agentic SFT/RL recipes with tool use (Python interpreter + external tools).
- Built efficient inference infrastructure for data curation and online rollouts.
- Built and maintain an agentic RL framework (training + evaluation + debugging).
- Ongoing: scaling and reliability improvements.
Multi-Modal Reasoning · 2025.02 – 2025.05
- Built multi-modal reasoning SFT/RL recipes.
- Developed an efficient inference framework for multi-modal data generation.
- Extended text-only RL pipelines into a multi-modal RL framework.
Boson AI
Applied Scientist · 2024.08 – 2025.02
Post-training and evaluations for LLM alignment and usefulness.
- Developed internal DPO variants to improve alignment.
- Built diagnostic pipelines for preference alignment (data vs. reward-model issues).
- Created RPGBench, a benchmark for evaluating LLMs as role-playing game engines.
- Optimized data synthesis and evaluation pipelines for inference efficiency.
Education
University of Illinois Urbana–Champaign
PhD in Computer Science · 2019.08 – 2024.08
Advisor: Prof. Heng Ji
Committee (alphabetical): Prof. Jiawei Han, Prof. Derek Hoiem, Prof. Graham Neubig, Dr. Scott Wen-tau Yih
Tsinghua University
BE in Electronic Engineering; secondary BS in Mathematics · 2015.09 – 2019.07
NLP research advised by Prof. Zhiyuan Liu.