Yifei Wang

Postdoctoral researcher at MIT CSAIL, advised by Stefanie Jegelka.


I work on understanding and advancing the machine learning principles behind foundation models (especially LLMs), with major interests in self-supervised learning, long-context learning, reasoning, and safety (overview). My recent papers on these topics received four best paper awards and were featured by MIT News, CSAIL News, and Anthropic. I serve as an Area Chair for ICLR and as a reviewer for ICML, NeurIPS, JMLR, and PNAS.

I earned my Ph.D. in Applied Mathematics from Peking University, advised by Yisen Wang, Zhouchen Lin, and Jiansheng Yang. Before that, I received a B.S. in Data Science from the School of Mathematical Sciences, advised by Tong Lin, and a B.A. in Philosophy, advised by Zengding Wu, both at Peking University.

news

August, 2025 I gave an invited talk, Two New Dimensions of Sparsity for Scaling LLMs, to Google DeepMind’s Gemini team, covering our recent work on sparse long-context training (ICLR 2024) and sparse embedding (ICML 2025 Oral).
June, 2025 Our ICML 2025 paper was featured in an MIT News article, Unpacking the bias of large language models, where we identified the root causes of position bias in Transformers and proved them theoretically.
June, 2025 I gave an invited talk at the ASAP Seminar, Your Next-Token Prediction and Transformers Are Biased for Long-Context Modeling; the recording is available on YouTube.
May, 2025 Three papers were accepted to ICML 2025. Our oral presentation (top 1%) introduces contrastive sparse representations (CSR), which compress state-of-the-art embedding models to just 32 active dimensions, enabling ~100× faster retrieval for large-scale vector databases and RAG systems with minimal accuracy loss and low training cost.
April, 2025 Our recent work When More is Less: Understanding Chain-of-Thought Length in LLMs received the Best Paper Runner-Up Award 🏆 at ICLR 2025 Workshop on Reasoning and Planning for LLMs.

recent highlights

  1. arXiv
    G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning
    Xiaojun Guo*, Ang Li*, Yifei Wang*, Stefanie Jegelka, and Yisen Wang
    arXiv preprint arXiv:2505.18499, 2025
  2. ICML Oral
    Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
    Tiansheng Wen*, Yifei Wang*, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, and Chenyu You
    ICML Oral Presentation (top 1%), 2025
  3. ICML
    On the Emergence of Position Bias in Transformers
    Xinyi Wu, Yifei Wang, Stefanie Jegelka, and Ali Jadbabaie
    ICML, 2025
    Featured by MIT News 📰.
  4. ICLR Workshop Best Paper Runner-up
    When More is Less: Understanding Chain-of-Thought Length in LLMs
    Yuyang Wu*, Yifei Wang*, Ziyu Ye, Tianqi Du, Stefanie Jegelka, and Yisen Wang
    ICLR 2025 Workshop on Reasoning and Planning for LLMs, 2025
    🏆 Best Paper Runner-up Award
  5. ICLR
    What is Wrong with Perplexity for Long-context Language Modeling?
    Lizhe Fang*, Yifei Wang*, Zhaoyang Liu, Chenheng Zhang, Stefanie Jegelka, Jinyang Gao, Bolin Ding, and Yisen Wang
    ICLR, 2025