Using ContextCite for LLM reliability
We use our method ContextCite to detect unverified statements and discover poisoned documents.

ContextCite: Attributing Model Generation to Context
We present ContextCite, a method for attributing statements generated by language models back to specific information provided in-context.

Editing Predictions by Modeling Model Computation
We use our component modeling framework to design targeted model edits.

Decomposing Predictions by Modeling Model Computation
We introduce a framework called component modeling for studying how model components collectively shape ML predictions.

How Can We Harness Pre-Training to Develop Robust Models?
We explore a simple principle for harnessing pre-training to develop robust models.

Ask Your Distribution Shift if Pre-Training is Right for You
We study the robustness benefits of pre-training and characterize the failure modes that pre-training can and cannot address.

DsDm: Model-Aware Dataset Selection with Datamodels
Selecting better data by approximating how models learn from data.