7000-005: Debugging Data & Models (Fall 2022)

Do you trust your model? Despite their widespread adoption and impressive performance, modern machine learning models have a crucial flaw: it is extremely difficult to discern when and how they fail. This problem has given rise to the field of trustworthy machine learning, which aims to make these systems safe, responsible, and understandable.

This course will explore tools and methods for analyzing the machine learning pipeline and assessing its trustworthiness (or lack thereof) from the perspectives of datasets, models, and predictions. A tentative schedule of these topics can be found at the bottom of this page.

Instructor: Eric Wong (exwong@cis)

Class: Tues 1:45-3:15pm Eastern, DRLB 4C6 / Thurs 1:45-3:15pm Eastern, CHEM 514

Website: https://www.cis.upenn.edu/~exwong/debugml/

Ed discussion: Self sign-up link

Mask policy: Masks are required.

Students from all majors and degree levels are welcome. There are no specific prerequisite courses, but a background in machine learning at the level of an introductory course is expected, as well as basic programming experience for the course project.

Grading will be based on an 80% course project (15% proposal + 20% progress report + 25% final report + 20% presentation) and 20% participation (5% readings + 15% discussion). There will be no homework or exams.
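As a quick sanity check on the weights above, the breakdown can be sketched in Python. This is an illustration only: the names `WEIGHTS`, `PROJECT`, `PARTICIPATION`, and `weighted_grade` are hypothetical, and the sketch assumes each component is scored from 0 to 100 and combined as a simple weighted sum, which this page does not specify.

```python
# Grade weights in percent, copied from the breakdown above.
WEIGHTS = {
    "proposal": 15,
    "progress_report": 20,
    "final_report": 25,
    "presentation": 20,
    "readings": 5,
    "discussion": 15,
}

PROJECT = ("proposal", "progress_report", "final_report", "presentation")
PARTICIPATION = ("readings", "discussion")


def weighted_grade(scores):
    """Combine per-component scores (each 0-100) into a 0-100 course grade.

    Assumption: a plain weighted sum; the actual scoring rule is not stated.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# The sub-weights add up to the stated 80% / 20% split.
assert sum(WEIGHTS[name] for name in PROJECT) == 80
assert sum(WEIGHTS[name] for name in PARTICIPATION) == 20
assert sum(WEIGHTS.values()) == 100
```

For example, under these assumptions a student with full marks on every component gets `weighted_grade(...) == 100.0`, and the final report alone accounts for a quarter of the total.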

This class will combine lectures and discussions. The lectures will typically cover the core groundwork, followed by a student-led in-depth discussion based on assigned readings. Readings and lecture materials will be posted on the schedule.


As part of this course, students will inspect and debug machine learning problems for deficiencies in settings of their choice. All parts of the pipeline are fair game, including data collection, training algorithms, models and architectures, the resulting predictions, and even the debugging tools themselves. This can take the form of an audit (identifying the shortcomings of a fixed pipeline) or a patch/update (changing the pipeline to fix a problem). Example projects at various stages in the pipeline include the following:

Tentative schedule and topics

The schedule and topics can change based on students’ interests and as time permits. If you don’t see something you’d like to learn about, send me an email.

Date | Topic | Notes
August 30 | Overview | Slides; Lecture notes; Supplementary reading - Problems in health care
Failure modes
September 1 | Bias | Types of Bias; Lecture notes; The trouble with Bias - NeurIPS 2017 Keynote by Kate Crawford; Supplementary reading - Suresh & Guttag, 2019
September 6 | Bias | Assigned reading - Bolukbasi et al. 2016; Supplementary reading - Arteaga et al. 2019
September 8 | Out of distribution | Covariate, label & concept shifts; Lecture notes; Assigned reading - Rabanser et al. 2019; Supplementary reading - Ruan et al. 2022
September 13 | Out of distribution | Measuring distribution shift; Assigned reading - Riegar et al. 2019; Supplementary reading - Guo et al. 2022
September 15 | Adversarial | Adversarial attacks; Lecture notes; Assigned reading - Beery et al. 2018
September 20 | No class |
September 22 | Adversarial | Data poisoning, backdoors, Byzantine faults; Assigned reading - Li et al. 2020; Supplementary reading - Rice et al. 2021; Supplementary reading - Robey et al. 2022
September 27 | Adversarial | Model stealing & membership inference; Assigned reading - Nguyen et al. 2014; Assigned reading - Sinha et al. 2017; Supplementary reading - Tramer et al. 2016; Supplementary reading - Jagielski et al. 2020
September 29 | Explainability | Data visualization, feature visualization, & interpretable models; Assigned reading - Javanmard et al. 2020; Assigned reading - Shamir et al. 2021; Supplementary reading - Rudin 2019; Supplementary reading - Nguyen et al. 2016
Debugging models
October 4 | Explainability | Local & global explanations; Project proposal due; Proposal guidelines
October 6 | Fall term break |
October 11 | Explainability | Example-based & model visualizations
October 13 | Verification | Complete & incomplete
October 18 | Verification | Specifications and properties
October 20 | Scientific discovery | Finding correlations
October 25 | Scientific discovery | Influence functions & data models
October 27 | Robust learning | Robust training & overfitting
ML repair
November 1 | Robust learning | Provable defenses (bound propagation & smoothing); Progress report due
November 3 | Robust learning | Distributional robustness (Domain generalization, Group DRO, IRM, JTT)
November 8 | Election day | Reading group only
November 10 | Data interventions | Data balancing, source selection, pruning hard examples
November 15 | Data interventions | Data augmentations (classical, subgroups & generative)
November 17 | Model adjustments | Model editing and fine-tuning
November 22 | Model adjustments | Model patching & repair
November 24 | Thanksgiving |
November 29 | NeurIPS |
December 1 | NeurIPS |
December 6 | Presentations |
December 8 | Presentations |
December 13 | Reading period |
December 15 | Final examinations | Final report due
December 22 | Term ends |

There is no official textbook for this course, but you may find the following references to be useful: