Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Tropical Cirrus Are Highly Sensitive to Ice Microphysics Within a Nudged Global Storm‐Resolving Model

R. Atlas, C. Bretherton, A. Sokol, M. F. Khairoutdinov
2024
Geophysical Research Letters

Cirrus dominate the longwave radiative budget of the tropics. For the first time, the variability in cirrus properties and longwave cloud radiative effects (CREs) that arises from using different… 

Paloma: A Benchmark for Evaluating Language Model Fit

Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Jesse Dodge
2023
arXiv

Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains–varying distributions of…

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Jesse Dodge
2023
arXiv.org

The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks,… 

Kilometer-scale global warming simulations and active sensors reveal changes in tropical deep convection

Maximilien Bolot, Lucas M. Harris, Kai-Yuan Cheng, Linjiong Zhou & Stephan Fueglistaler
2023
NPJ Climate and Atmospheric Science

Changes in tropical deep convection with global warming are a leading source of uncertainty for future climate projections. A comparison of the responses of active sensor measurements of cloud ice… 

Self-Refine: Iterative Refinement with Self-Feedback

Aman Madaan, Niket Tandon, Prakhar Gupta, Peter Clark
2023
NeurIPS

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for… 

IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions

Wenhao Yu, Meng Jiang, Peter Clark, Ashish Sabharwal
2023
EMNLP

Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and… 

ACE: A fast, skillful learned global atmospheric model for climate prediction

Oliver Watt‐Meyer, Gideon Dresdner, J. McGibbon, Christopher S. Bretherton
2023
NeurIPS • Tackling Climate Change with Machine Learning

Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter,… 

Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion

Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, S. Mandt
2023
arXiv

In climate science and meteorology, local precipitation predictions are limited by the immense computational costs induced by the high spatial resolution that simulation methods require. A common… 

RealTime QA: What's the Answer Right Now?

Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Kentaro Inui
2023
NeurIPS

We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). RealTime QA inquires about the…

A Logic for Expressing Log-Precision Transformers

William Merrill, Ashish Sabharwal
2023
NeurIPS

One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed…