Rethinking Backdoor Attacks

We introduce a new perspective on backdoor attacks and defenses in deep learning.

TRAK-ing Model Behavior with Data

We introduce TRAK, a new data attribution method that scales to large(r) models!

Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation

We introduce dataset interfaces, a scalable framework that synthesizes counterfactual examples under user-specified shifts.

Tailored Data Augmentation to Mitigate Model Failures

We demonstrate how we can use Stable Diffusion to target a model's failure modes.

ModelDiff: A Framework for Comparing Learning Algorithms

We introduce a framework for comparing ML models trained with different learning algorithms.

Raising the Cost of Malicious AI-Powered Image Editing

Inspired by an episode of The Daily Show, we hacked together a technique for "immunizing" images against being edited by diffusion models.

A Data-Based Perspective on Transfer Learning

We present a framework for pinpointing the impact of the source datasets in transfer learning.

When does Bias Transfer in Transfer Learning?

We demonstrate that biases from pre-trained models can persist even after fine-tuning.