I’m currently a fourth-year undergraduate at Peking University (PKU), advised by Prof. Wentao Zhang. My primary research interests lie in data-centric machine learning and its applications to large language models (LLMs). In particular, I focus on how high-quality data can be leveraged to enhance models’ generalization and logical reasoning abilities. I am also interested in how LLM agents can make decisions and reason in complex, dynamic settings.

Prior to this, I worked in Prof. Yitao Liang’s CraftJarvis Team, which is committed to developing a generalist agent capable of mastering a wide range of tasks and challenges in the open-world game Minecraft.

🔥 News

2026.04: 🎉🎉 The project page for “FlipVQA: Scaling Multi-modal Instruction Tuning via Textbook-to-Knowledge Synthesis” is now live.
2026.04: 🎉🎉 Our survey “Data-Centric Perspectives on Agentic Retrieval-Augmented Generation: A Survey” has been accepted to ACL Findings 2026.
2025.06: 🎉🎉 Our paper “Open-World Skill Discovery from Unsegmented Demonstration Videos” has been accepted to ICCV 2025.

📝 Publications

ICCV 2025

Open-World Skill Discovery from Unsegmented Demonstration Videos

Jingwen Deng*, Zihao Wang*, Shaofei Cai, Anji Liu, Yitao Liang

Paper | Project

We propose a self-supervised method, Skill Boundary Detection (SBD), that segments unlabelled long videos into semantically aware, skill-consistent segments by detecting peaks in prediction error.
In Minecraft experiments, SBD significantly improves both short-term and long-horizon task performance, enabling effective use of diverse online videos to train instruction-following agents.

ACL Findings 2026

Data-Centric Perspectives on Agentic Retrieval-Augmented Generation: A Survey

Jingwen Deng, Jihao Huang, Zhen Hao Wong, Hao Liang, Quanqing Xu, Bin Cui, Wentao Zhang

Paper | Project

This survey provides a data-centric overview of Agentic RAG, outlining its full data lifecycle and offering guidance for building scalable datasets to power adaptive, knowledge-seeking LLM agents.

arXiv 2025

LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning

Zhen Hao Wong*, Jingwen Deng*, Runming He, Zirong Chen, Qijie You, Hejun Dong, Hao Liang, Chengyu Shen, Bin Cui, Wentao Zhang

Paper | Code

📝 Works in Progress

FlipVQA: Scaling Multi-modal Instruction Tuning via Textbook-to-Knowledge Synthesis

Zhen Hao Wong*, Jingwen Deng*, Yuzhao Wang, Wenkai Yu, Jihao Huang, Runming He, Chengyu Shen, Hao Liang, Wentao Zhang

Paper | Project

We propose an automated pipeline that extracts well-formed QA and VQA pairs from college textbooks by combining layout-aware OCR with LLM-based semantic parsing.

🎖 Honors and Awards

2025
- Leo KoGuan Scholarship, Peking University
2024
- Leo KoGuan Scholarship, Peking University
- First Prize, Mathematics Competition of Chinese College Students
2023
- YanChuang Capital Scholarship, Peking University

📖 Education

2022.09 - present, Peking University, BS in Computer Science
- GPA: 3.834/4.0 (rank 9/146, top 7% in major)

💻 Internships

2026.01 - present, Ubiquant, China.
2025.09 - 2025.12, DPTechnology, China.