An abstract illustration of swirling shapes, meant to denote a futuristic feeling.

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions

Wenhao YuMeng JiangPeter ClarkAshish Sabharwal

2023

EMNLP

Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and…

ACE: A fast, skillful learned global atmospheric model for climate prediction

Oliver Watt‐MeyerGideon DresdnerJ. McGibbonChristopher S. Bretherton

2023

NeurIPS • Tackling Climate Change with Machine Learning

Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter,…

Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion

Prakhar SrivastavaRuihan YangGavin KerriganS. Mandt

2023

arXiv

In climate science and meteorology, local precipitation predictions are limited by the immense computational costs induced by the high spatial resolution that simulation methods require. A common…

Harmonic Mobile Manipulation

Ruihan YangYejin KimAniruddha KembhaviKiana Ehsani

2023

IROS

Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently. However, robots are still impotent in many household tasks requiring…

RealTime QA: What's the Answer Right Now?

Jungo KasaiKeisuke SakaguchiYoichi TakahashiKentaro Inui

2023

NeurIPS

We introduce R EAL T IME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). R E AL T IME QA inquires about the…

A Logic for Expressing Log-Precision Transformers

William MerrillAshish Sabharwal

2023

NeurIPS

One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed…

Fine-Grained Human Feedback Gives Better Rewards for Language Model Training

Zeqiu WuYushi HuWeijia ShiHanna Hajishirzi

2023

NeurIPS

Language models (LMs) often exhibit undesirable text generation behaviors, including generating false, toxic, or irrelevant outputs. Reinforcement learning from human feedback (RLHF) - where human…

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Yizhong WangHamish IvisonPradeep DasigiHanna Hajishirzi

2023

NeurIPS

In this work we explore recent advances in instruction-tuning language models on a range of open instruction-following datasets. Despite recent claims that open models can be on par with…

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

Bill Yuchen LinYicheng FuKarina YangXiang Ren

2023

NeurIPS

We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition, designed to excel in action planning for complex interactive reasoning tasks. SwiftSage…

Faith and Fate: Limits of Transformers on Compositionality

Nouha DziriXiming LuMelanie SclarYejin Choi

2023

NeurIPS

Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures…

Previous132-141Next