Yifei Wang

Postdoc at MIT CSAIL


I am currently a postdoctoral researcher at MIT CSAIL, advised by Stefanie Jegelka. I am interested in principled, scalable, and safety-aware machine learning algorithms for building self-supervised foundation models, with applications to vision, language, graph, and multimodal domains.

My first-author papers received the Best ML Paper Award at ECML-PKDD 2021, the Silver Best Paper Award at the ICML 2021 AdvML workshop, and the Best Paper Award at the ICML 2024 ICL workshop.

I obtained my PhD in Applied Mathematics from Peking University in 2023, advised by Yisen Wang, Zhouchen Lin, and Jiansheng Yang. Prior to that, I received bachelor’s degrees in mathematics and philosophy from PKU.

I am on the job market. Feel free to reach out if interested.

news

November, 2024 I will give a talk at the Department of Applied Mathematics and Statistics at Johns Hopkins University.
October, 2024 New preprints are out, addressing why perplexity fails to reflect long-context performance and how to fix it (paper), how interpretability techniques (e.g., SAEs) also improve model robustness (paper), and whether ICL can truly extrapolate to OOD scenarios (paper).
October, 2024 6 papers were accepted to NeurIPS 2024. In the foundation model world, we demystified how LLMs are capable of self-correction (paper), how to make wonderful joint embedding models capable of representation-space in-context learning (paper), why predicting data corruptions (e.g., noise) learns good representations (paper), and how Transformers avoid feature collapse with LayerNorm (paper).
September, 2024 I gave a talk at NYU Tandon on Building Safe Foundation Models from Principled Understanding.
August, 2024 I will be organizing the ML Tea seminar at MIT CSAIL this fall, a weekly 30-minute talk series by members of the machine learning community around MIT. Join us on Mondays at 32-G882!
August, 2024 I gave a talk at Princeton University on Reimagining Self-Supervised Learning with Context.
August, 2024 I will continue to serve as an Area Chair for ICLR 2025.
July, 2024 I will be organizing the NeurIPS 2024 workshop on Red Teaming GenAI: What Can We Learn from Adversaries? Join us to discuss the brighter side of red teaming.

selected publications

  1. NeurIPS
    A Theoretical Understanding of Self-Correction through In-context Alignment
    Yifei Wang*, Yuyang Wu*, Zeming Wei, Stefanie Jegelka, and Yisen Wang
    In NeurIPS, 2024
    Best Paper Award at ICML 2024 ICL Workshop
    We introduced the first theoretical explanation of how self-correction works in LLMs (as in o1) and showed its effectiveness against social bias and jailbreak attacks.
  2. NeurIPS
    In-Context Symmetries: Self-Supervised Learning through Contextual World Models
    Sharut Gupta*, Chenyu Wang*, Yifei Wang*, Tommi Jaakkola, and Stefanie Jegelka
    In NeurIPS, 2024
    Oral Presentation (top 4) at NeurIPS 2024 SSL Workshop
    We introduced in-context learning abilities to joint embedding methods, making them more general-purpose and efficiently adaptable to downstream tasks.
  3. ICLR
    Non-negative Contrastive Learning
    Yifei Wang*, Qi Zhang*, Yaoyu Guo, and Yisen Wang
    In ICLR, 2024
    Drawing inspiration from Non-negative Matrix Factorization (NMF), we introduced a principled one-line technique that significantly boosts representation interpretability.
  4. ICLR
    Do Generated Data Always Help Contrastive Learning?
    Yifei Wang*, Jizhe Zhang*, and Yisen Wang
    In ICLR, 2024
    We revealed both theoretically and practically that synthetic data introduces fundamental bias to SSL generalization, but, with an adaptive strategy of data mixing and augmentation, can yield substantial benefits.
  5. ICML
    On the Generalization of Multi-modal Contrastive Learning
    Qi Zhang*, Yifei Wang*, and Yisen Wang
    In ICML, 2023
    We established the first generalization analysis for multi-modal contrastive learning (e.g., CLIP) and explained how it outperforms self-supervised contrastive learning.
  6. ICLR
    A Message Passing Perspective on Learning Dynamics of Contrastive Learning
    Yifei Wang*, Qi Zhang*, Tianqi Du, Jiansheng Yang, Zhouchen Lin, and Yisen Wang
    In ICLR, 2023
    We revealed that contrastive learning performs message passing on the sample graph, which connects self-supervised learning and graph neural networks.
  7. ICLR
    Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism
    Zhijian Zhuo*, Yifei Wang*, Jinwen Ma, and Yisen Wang
    In ICLR, 2023
    We revealed that various asymmetric designs in non-contrastive learning (BYOL, SimSiam, DINO, SwAV) can be explained from a unified spectral filtering perspective.
  8. NeurIPS Spotlight
    How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
    Qi Zhang*, Yifei Wang*, and Yisen Wang
    In NeurIPS Spotlight (Top 5%), 2022
    We established the first generalization analysis of masked autoencoders and revealed an inherent connection to contrastive learning.
  9. ICLR
    Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap
    Yifei Wang*, Qi Zhang*, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In ICLR, 2022
    We established a new graph perspective to formulate how contrastive learning works, and derived practical generalization bounds and unsupervised measures.
  10. ICLR
    A Unified Contrastive Energy-based Model for Understanding the Generative Ability of Adversarial Training
    Yifei Wang, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In ICLR, 2022
    Silver Best Paper Award at ICML 2021 AdvML workshop
    From an energy-based perspective, we formulated contrastive learning as a generative model, and established the connection between adversarial training and maximum likelihood, thus bridging generative and discriminative models.
  11. ECML-PKDD
    Reparameterized Sampling for Generative Adversarial Networks
    Yifei Wang, Yisen Wang, Jiansheng Yang, and Zhouchen Lin
    In ECML-PKDD, 2021
    Best ML Paper Award (1/685), invited to Machine Learning
    We explored using the GAN discriminator (as a good reward model) to bootstrap sample quality through an efficient MCMC algorithm, which not only guarantees theoretical convergence but also improves sample efficiency and quality in practice.