When does Bias Transfer in Transfer Learning?

We demonstrate that biases from pre-trained models can persist even after fine-tuning.

Distilling Model Failures as Directions in Latent Space

We demonstrate how to distill patterns of model errors as directions in a latent space.

Uncovering Brittleness with Datamodels

In the second part of our datamodels series, we use datamodels to identify and study a new form of model brittleness.

Missingness Bias in Model Debugging

We demonstrate how current missingness approximations introduce biases into model debugging.

Predicting Predictions with Datamodels

In the first part of our datamodels series, we introduce datamodeling and its (linear) instantiation on CIFAR-10.

Editing a Classifier

We develop a methodology for directly editing the prediction rules of a pre-trained classifier with virtually no additional data collection.

Combining Diverse Feature Priors

We explore how a diverse set of feature priors can be leveraged to improve model generalization.

Certified Patch Robustness via Smoothed Vision Transformers (Part 2)

We demonstrate how vision transformers lead to strong certified patch defenses, with standard accuracy and inference times comparable to those of standard (non-robust) models.