Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to…
Global Precipitation Correction Across a Range of Climates Using CycleGAN
Accurate precipitation simulations for various climate scenarios are critical for understanding and predicting the impacts of climate change. This study employs a Cycle‐generative adversarial…
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation
Despite remarkable advancements in emulating human-like behavior through Large Language Models (LLMs), current textual simulations do not adequately address the notion of time. To this end, we…
OLMo: Accelerating the Science of Language Models
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off,…
Neural Network Parameterization of Subgrid‐Scale Physics From a Realistic Geography Global Storm‐Resolving Simulation
Parameterization of subgrid‐scale processes is a major source of uncertainty in global atmospheric model simulations. Global storm‐resolving simulations use a finer grid (less than 5 km) to reduce…
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often…
Selective Visual Representations Improve Convergence and Generalization for Embodied-AI
Embodied AI models often employ off-the-shelf vision backbones like CLIP to encode their visual observations. Although such general-purpose representations encode rich syntactic and semantic…
MARG: Multi-Agent Review Generation for Scientific Papers
We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. By…
Tropical Cirrus Are Highly Sensitive to Ice Microphysics Within a Nudged Global Storm‐Resolving Model
Cirrus dominate the longwave radiative budget of the tropics. For the first time, the variability in cirrus properties and longwave cloud radiative effects (CREs) that arises from using different…
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks,…