Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often…
Selective Visual Representations Improve Convergence and Generalization for Embodied-AI
Embodied AI models often employ off-the-shelf vision backbones like CLIP to encode their visual observations. Although such general-purpose representations encode rich syntactic and semantic…
MARG: Multi-Agent Review Generation for Scientific Papers
We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. By…
Tropical Cirrus Are Highly Sensitive to Ice Microphysics Within a Nudged Global Storm‐Resolving Model
Cirrus dominate the longwave radiative budget of the tropics. For the first time, the variability in cirrus properties and longwave cloud radiative effects (CREs) that arises from using different…
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks,…
Kilometer-scale global warming simulations and active sensors reveal changes in tropical deep convection
Changes in tropical deep convection with global warming are a leading source of uncertainty for future climate projections. A comparison of the responses of active sensor measurements of cloud ice…
Self-Refine: Iterative Refinement with Self-Feedback
Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for…
IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions
Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and…
ACE: A fast, skillful learned global atmospheric model for climate prediction
Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter,…
Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion
In climate science and meteorology, local precipitation predictions are limited by the immense computational cost incurred by the high spatial resolution that simulation methods require. A common…