Yifan Wu

Hi there! I am currently an AI Research Scientist at Meta, working on LLM post-training (Llama 3 & 4, Meta AI). Recently, I have been interested in LLM coding agents and automated research agents, in particular harness design and advancing reinforcement learning for long-horizon, multi-turn agentic tasks.

I did my PhD at the University of Pennsylvania, working at the intersection of AI, computer vision, and healthcare, where I was affiliated with the GRASP Lab and the Penn Image Computing and Science Laboratory. I was fortunate to be advised by Prof. James C. Gee, and I worked closely with Prof. Jianbo Shi and Prof. Mark Yatskar.

Email: yfwu@seas.upenn.edu  /  Google Scholar  /  LinkedIn  /  GitHub  /  X

Selected Publications/Blog

Selected publications are listed newest first; the full list lives on Google Scholar. Some works are highlighted.

Synthetic Sandbox for Training Machine Learning Engineering Agents
LLM  /  Agent
Yuhang Zhou*, Lizhu Zhang*, Yifan Wu, Jiayi Liu, Xiangjun Fan, Zhuokai Zhao†, Hong Yan†
arXiv, 2026.
Paper

We introduce SandMLE, a multi-agent framework that generates diverse, verifiable synthetic MLE environments at micro-scale (50–200 training samples per task), cutting the training time for one round of evolution by 13× and enabling large-scale on-policy trajectory-wise RL.

Scaling Agent Learning via Experience Synthesis
LLM  /  Agent
Zhaorun Chen, Zhuokai Zhao, Kai Zhang, Bo Liu, Qi Qi, Yifan Wu, Tarun Kalluri, Sara Cao, Yuanhao Xiong, Haibo Tong, Huaxiu Yao, Hengduo Li, Jiacheng Zhu, Xian Li, Dawn Song, Bo Li, Jason Weston†, Dat Huynh†
ICLR, 2026.
Paper

We introduce DreamGym, a unified framework that synthesizes diverse agent experiences via a reasoning-based experience model, enabling scalable online RL without costly real-environment rollouts and providing a strong warm-start for sim-to-real transfer.

Agent Learning via Early Experience
LLM  /  Agent
Kai Zhang, Xiangchao Chen, Bo Liu, Tianci Xue, Zeyi Liao, Zhihan Liu, Xiyao Wang, Yuting Ning, Zhaorun Chen, Xiaohan Fu, Jian Xie, Yuxuan Sun, Boyu Gou, Qi Qi, Zihang Meng, Jianwei Yang, Ning Zhang, Xian Li, Ashish Shah, Dat Huynh, Hengduo Li, Zi Yang, Sara Cao, Lawrence Jang, Shuyan Zhou, Jiacheng Zhu, Huan Sun, Jason Weston, Yu Su†, Yifan Wu
arXiv, 2025.
Paper

We introduce "early experience", a paradigm in which agents learn from interactions generated by their own actions, using the resulting future states as supervision in place of reward signals — bridging imitation learning and reinforcement learning.

The Llama 4 Herd: The Beginning of a New Era of Natively Multimodal AI Innovation
LLM
Llama team
Meta AI Blog, 2025.
Blog  /  Paper

A Concept-based Interpretable Model for the Diagnosis of Choroid Neoplasias using Multimodal Data
Medical AI  /  LLM  /  Multimodal
Yifan Wu*, Yang Liu*, Yue Yang, Michael S. Yao, Wenli Yang, Xuehui Shi, Lihong Yang, Dongjun Li, Yueming Liu, James C. Gee, Xuan Yang, Wenbin Wei, Shi Gu
Nature Communications, 2025.
Paper  /  Demo

We demonstrated how to encode the expertise of specialized clinicians into AI to build an interpretable machine learning model that produces outputs understandable by humans.

A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
Medical AI  /  LLM  /  Multimodal
Yue Yang, Mona Gandhi, Yufei Wang, Yifan Wu, Michael S. Yao, Chris Callison-Burch, James C. Gee, Mark Yatskar
NeurIPS, 2024 (Spotlight).
Paper  /  Website

We introduced KnoBo, which incorporates medical knowledge priors into interpretable models to improve robustness to distribution shifts across hospitals and demographic factors such as sex and race.

The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task
LLM  /  Multimodal
Yifan Wu, Pengchuan Zhang, Wenhan Xiong, Barlas Oguz, James C. Gee, Yixin Nie
arXiv, 2023.
Paper

We found that GPT-4V benefits significantly from Chain-of-Thought prompting. We present the "Description then Decision" strategy, which improves performance on the Winoground task by 50%.

Towards Establishing Dense Correspondence on Multiview Coronary Angiography: From Point-to-Point to Curve-to-Curve Query Matching
Medical AI  /  Computer Vision
Yifan Wu*, Rohit Jena*, Mehmet Gulsun, Vivek Singh, Puneet Sharma, James C. Gee
arXiv, 2023, under review.
Paper

We established dense correspondence in multi-view angiography by formulating it as a query matching problem and extending point matching to curve matching for enhanced topological awareness.

NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration
Medical AI  /  Computer Vision
Yifan Wu*, Tom Z Jiahao*, Jiancong Wang, Paul A Yushkevich, M Ani Hsieh, James C. Gee
CVPR, 2022.
Project Page  /  Paper  /  Supplementary  /  Sequential Registration Extension Work

We model each voxel as a moving particle and consider the set of all voxels in a 3D image as a high-dimensional dynamical system whose trajectory determines the targeted deformation field.

Interpretable Identification of Interstitial Lung Disease (ILD) Associated Findings from CT
Medical AI  /  Computer Vision
Yifan Wu, Jiancong Wang, William D. Lindsay, Tarmily Wen, Jianbo Shi, James C. Gee
MICCAI, 2020.
Paper

We formulated the identification of radiologic ILD findings as a multi-class classification problem on raw thoracic CT data.

From Image to Video Face Inpainting: Spatial-Temporal Nested GAN (STN-GAN) for Usability Recovery
Computer Vision
Yifan Wu, Vivek Singh, Ankur Kapoor
WACV, 2020.
Paper  /  Video Result

We propose constrained inpainting methods to recover the usability of images that were masked for privacy protection but are needed in complete form for further algorithm development.

Towards Generating Personalized Volumetric Phantom from Patient's Surface Geometry
Medical AI  /  Computer Vision
Yifan Wu, Vivek Singh, Brian Teixeira, Kai Ma, Birgi Tamersoy, Andreas Krauss, Terrence Chen
MICCAI, 2019.
Paper

This paper presents a method to generate a volumetric phantom with internal anatomical structures from the patient's skin surface geometry.

Privacy-Protective-GAN for Face De-identification
Computer Vision
Yifan Wu, Fan Yang, Haibin Ling
arXiv, 2018.
Paper

We defined the face de-identification task by establishing an effective de-identification measure: achieving privacy protection while preserving data utility. We proposed an end-to-end trainable framework to synthesize de-identified facial images.

Experiences

ibm Research Scientist Intern, May 2023 - Nov 2023, Menlo Park, CA, USA
Siemens Research Scientist, May 2017 - Dec 2018, Princeton, NJ, USA

Teaching

upenn Fall 2022: CIS537 Biomedical Image Analysis, Teaching Assistant.
upenn Fall 2021: CIS581 Computer Vision and Computational Photography, Teaching Assistant.

Website template from Jon Barron.