Ai2

Research Papers

Explore a selection of our published work on a variety of key research challenges in AI.


FloeNet: A mass-conserving global sea ice emulator that generalizes across climates

William Gregory, M. Bushuk, James P. C. Duncan, L. Zanna
2026
arXiv

We introduce FloeNet, a machine-learning emulator trained on the Geophysical Fluid Dynamics Laboratory global sea ice model, SIS2. FloeNet is a mass-conserving model, emulating 6-hour mass and area… 

Examining Fast Radiative Feedbacks Using Machine-Learning Weather Emulators

Ankur Mahesh, William D. Collins, Travis A. O'Brien, Da Yang
2026
arXiv

The response of the climate system to increased greenhouse gases and other radiative perturbations is governed by a combination of fast and slow feedbacks. Slow feedbacks are typically activated in… 

HiRO-ACE: Fast and skillful AI emulation and downscaling trained on a 3 km global storm-resolving model

Andre Perkins, Anna Kwa, Jeremy McGibbon, Lucas Harris
2025
arXiv

Kilometer-scale simulations of the atmosphere are an important tool for assessing local weather extremes and climate impacts, but computational expense limits their use to small regions, short… 

Leveraging In-Context Learning for Language Model Agents

Shivanshu Gupta, Sameer Singh, Ashish Sabharwal, Ben Bogin
2025
NeurIPS • Workshop on Multi-Turn Interactions in LLMs

In-context learning (ICL) with dynamically selected demonstrations combines the flexibility of prompting large language models (LLMs) with the ability to leverage training data to improve… 

A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers

William Merrill, Ashish Sabharwal
2025
NeurIPS

Recent theoretical results show transformers cannot express sequential reasoning problems over long inputs, intuitively because their computational *depth* is bounded. However, prior work treats the… 

Exact Expressive Power of Transformers with Padding

William Merrill, Ashish Sabharwal
2025
NeurIPS

Chain of thought is a natural inference-time method for increasing the computational power of transformer-based large language models (LLMs), but comes at the cost of sequential decoding. Are there… 

Language Modeling by Language Models

Junyan Cheng, Peter Clark, Kyle Richardson
2025
NeurIPS

Can we leverage LLMs to model the process of discovering novel language model (LM) architectures? Inspired by real research, we propose a multi-agent LLM approach that simulates the conventional… 

Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson, Peter Clark
2025
NeurIPS

The promise of autonomous scientific discovery (ASD) hinges not only on answering questions, but also on knowing which questions to ask. Most recent works in ASD explore the use of large language… 

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Yilun Zhao, Kaiyan Zhang, Tiansheng Hu, Arman Cohan
2025
NeurIPS

We present SciArena, an open and collaborative platform for evaluating foundation models on scientific literature tasks. Unlike traditional benchmarks for scientific literature understanding and… 

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

David Wadden, Kejian Shi, Jacob Daniel Morrison, Arman Cohan
2025
EMNLP

We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following instances for training and evaluation, covering 54 tasks. These tasks span… 
