Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals
The inevitable appearance of spurious correlations in training datasets hurts the generalization of NLP models on unseen data. Previous work has found that datasets with paired inputs are prone to…
Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation
Most works on gender bias focus on intrinsic bias: removing traces of information about a protected group from the model's internal representation. However, these works are often disconnected from…
Evaluating n-Gram Novelty of Language Models Using Rusty-DAWG
How novel are texts generated by language models (LMs) relative to their training corpora? In this work, we investigate the extent to which modern LMs generate n-grams from their training data,…
Detection and Measurement of Syntactic Templates in Generated Text
Recent work on evaluating the diversity of text generated by LLMs has focused on word-level features. Here we offer an analysis of syntactic features to characterize general repetition in models,…
CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization
Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup…
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
Today's most advanced multimodal models remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling…
Application of the AI2 Climate Emulator to E3SMv2's global atmosphere model, with a focus on precipitation fidelity
Can the current successes of global machine learning-based weather simulators be generalized beyond 2-week forecasts to stable and accurate multiyear runs? The recently developed AI2 Climate…
Pushing the frontiers in climate modelling and analysis with machine learning
Climate modelling and analysis are facing new demands to enhance projections and climate information. Here we argue that now is the time to push the frontiers of machine learning beyond…
Weather and climate predicted accurately — without using a supercomputer
A cutting-edge global model of the atmosphere combines machine learning with a numerical model based on the laws of physics. This ‘hybrid’ system accurately predicts the weather — and even shows…
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks
How can we train models to perform well on hard test data when hard training data is by definition difficult to label correctly? This question has been termed the scalable oversight problem and has…