Using ContextCite for LLM reliability

We use our method, ContextCite, to detect unverified statements and discover poisoned documents.

ContextCite: Attributing Model Generation to Context

We present ContextCite, a method for attributing statements generated by language models back to specific information provided in-context.

Editing Predictions by Modeling Model Computation

We use our component modeling framework to design targeted model edits.

Decomposing Predictions by Modeling Model Computation

We introduce a framework called component modeling for studying how model components collectively shape ML predictions.

How Can We Harness Pre-Training to Develop Robust Models?

We explore a simple principle for harnessing pre-training to develop robust models.

Ask Your Distribution Shift if Pre-Training is Right for You

We study the robustness benefits of pre-training and characterize failure modes that pre-training can and cannot address.

DsDm: Model-Aware Dataset Selection with Datamodels

We select better training data by approximating how models learn from data.

How Training Data Guides Diffusion Models

We introduce a new framework for data attribution in generative settings and propose an efficient method for attributing the outputs of diffusion models to their training data.