Yifei Wang

Postdoctoral researcher at MIT CSAIL, advised by Stefanie Jegelka.


I am interested in understanding and building AI models with good representations of the world, with a current focus on unsupervised learning and language models (see research). I obtained my PhD in Applied Mathematics from Peking University in 2023, advised by Yisen Wang, Zhouchen Lin, and Jiansheng Yang. I also did my undergraduate studies at the School of Mathematical Sciences at Peking University. My first-author papers have received four best paper awards, and I have served as an Area Chair for ICLR 2024 and 2025.

I like solving mysterious machine learning puzzles with major impact, such as why overthinking harms LLM reasoning, why Transformers have position bias, why DINO features won't collapse, why MAE learns good features, why adversarial training severely overfits, and why robust models become generative. Don’t wanna bother reading papers? Buy me a coffee and I’ll give you a 5-min walkthrough. :)

news

May, 2025 Three papers were accepted at ICML 2025. We proposed CSR, which uses sparse coding to build state-of-the-art shortened embedding models (image/text/multimodal). We also characterized the reasons behind Transformers’ position bias and showed how LLMs’ length generalization requires output alignment.
April, 2025 Our recent work When More is Less: Understanding Chain-of-Thought Length in LLMs received the Best Paper Runner-Up Award 🏆 at ICLR 2025 Workshop on Reasoning and Planning for LLMs.
April, 2025 I will be giving a tutorial on the Principles of Self-supervised Learning in the Foundation Model Era at IJCAI 2025 (Aug 16 - Aug 22). See you in Montreal.
April, 2025 I was invited to give a lightning talk titled Contextual Self-supervised Learning: A Lesson from LLMs (video) at the Self-Supervised Learning Workshop hosted by Flatiron Institute (Simons Foundation).