Eric Wong

Assistant Professor, University of Pennsylvania

Office	Amy Gutmann Hall 621
Email	exwong@cis
Lab Blog	debugml.github.io

I am an assistant professor in the Department of Computer and Information Science at the University of Pennsylvania. I lead Brachio Lab on debugging machine learning and making systems actually do what we want them to do. I’m also part of the ASSET Center on safe, explainable, and trustworthy AI systems. Previously, I completed my PhD at CMU, advised by Zico Kolter, and did a postdoc with Aleksander Madry.

PhD applicants: If you’re interested, you will need to:

(1) Apply to the CIS department.
(2) Select me as a potential advisor in your application.

Undergraduate/master’s students: If you are a UPenn student interested in doing independent machine learning research, I recommend that you (1) take CIS 5200, (2) read this blog post, and (3) fill out this form. We will be in touch if there is a good fit. I strongly recommend that undergraduates take CIS 3333 Mathematics for Machine Learning, which will prepare you for the mathematics behind ML research. If you’re interested in doing advanced research or graduate coursework in AI/machine learning but have taken only the core math requirements of the CS degree, then this course is for you. If you are not at UPenn, I do not currently have opportunities for external students.

Recent News

June ‘25: Our paper “The FIX Benchmark: Extracting Features Interpretable to eXperts” was accepted to DMLR 2025.
May ‘25: Weiqiu You’s paper “Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups” and another collaboration, “DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning”, were accepted to ICML 2025. Our FIX Benchmark was also accepted to DMLR.
January ‘25: Our paper “Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference” has been accepted to ICLR 2025. In addition, “Avoiding Copyright Infringement via Machine Unlearning” has been accepted to NAACL-Findings 2025.
December ‘24: We will present three papers at NeurIPS 2024 this month: “AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties”, “Data-Efficient Learning with Neural Programs”, and “JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models”.
October ‘24: We’ve released the FIX benchmark for extracting features interpretable to experts! Check it out on our website.
July ‘24: We will present two papers at ICML 2024: “DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation” and “Towards Compositionality in Concept Learning”.
May ‘24: Our paper “Evaluating Groups of Features via Consistency, Contiguity, and Stability” will be presented at ICLR 2024 as an oral.
April ‘24: We’ve been given an Amazon Research Award. Thanks Amazon!
October ‘23: We’ve released new blog posts on faithful grouped attributions and certified jailbreak defenses, as well as new work on semantic jailbreaks
October ‘23: I gave a talk at the UCSB Responsible Machine Learning Summit
September ‘23: Our paper “Stability Guarantees for Feature Attributions with Multiplicative Smoothing” will be presented at NeurIPS 2023
July ‘23: We released a new blog post on certified stability guarantees for feature attributions
July ‘23: Our paper “Do Machine Learning Models Learn Statistical Rules Inferred from Data?” will be presented at ICML 2023
May ‘23: I gave a keynote talk at DLSP 2023 on adversarial prompting
Mar ‘23: I am on the organizing committee for the ICML 2023 2nd Workshop on New Frontiers in Adversarial Machine Learning
Mar ‘23: We’ve released a new blog post covering our recent work on adversarial prompting
Mar ‘23: We’ve released a new blog post covering our recent work on in-context influences
Jan ‘23: I am teaching CIS 5200 Machine Learning with Surbhi Goel
July ‘22: I am creating a new course on debugging the ML pipeline for the Fall 2022 semester
May ‘22: I will be moving to UPenn CIS as an Assistant Professor starting Fall 2022