Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Retrieval Data Augmentation Informed by Downstream Question Answering Performance

James Ferguson, Pradeep Dasigi, Tushar Khot, Hannaneh Hajishirzi
2022
ACL • FEVER

Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora. Since obtaining exhaustive manual annotations of all… 

Cross-Task Generalization via Natural Language Crowdsourcing Instructions

Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hanna Hajishirzi
2022
ACL

Can we enable NLP models to appropriately respond to instructional prompts and consequently generalize to new tasks? To study this question, we leverage the existing NLP datasets and the… 

Hey AI, Can You Solve Complex Tasks by Talking to Agents?

Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal
2022
Findings of ACL

Humans often solve complex problems by interacting (in natural language) with existing agents, such as AI assistants, that can solve simpler sub-tasks. These agents themselves can be powerful… 

NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

Swaroop Mishra, Arindam Mitra, Neeraj Varshney, A. Kalyan
2022
ACL

Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems. While many datasets and models have been developed to this… 

Better Retrieval May Not Lead to Better Question Answering

Zhengzhong Liang, Tushar Khot, Steven Bethard, Ashish Sabharwal
2022
arXiv

Considerable progress has been made recently in open-domain question answering (QA) problems, which require Information Retrieval (IR) and Reading Comprehension (RC). A popular approach to improve… 

Saturated Transformers are Constant-Depth Threshold Circuits

William Merrill, Ashish Sabharwal, Noah A. Smith
2022
TACL

Transformers have become a standard neural network architecture for many NLP problems, motivating theoretical analysis of their power in terms of formal languages. Recent work has shown that… 

Memory-assisted prompt editing to improve GPT-3 after deployment

Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
2022
ACL • Workshop on Commonsense Reasoning

Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to humans. For example, GPT-3 would mistakenly interpret "What word is similar to good?" to mean a homonym, while the… 

Multi-Modal Answer Validation for Knowledge-Based VQA

Jialin Wu, Jiasen Lu, Ashish Sabharwal, R. Mottaghi
2022
AAAI

The problem of knowledge-based visual question answering involves answering questions that require external knowledge in addition to the content of the image. Such knowledge typically comes in a… 

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Kyle Richardson, Ashish Sabharwal
2022
AAAI

Investigating the reasoning abilities of transformer models, and discovering new challenging tasks for them, has been a topic of much interest. Recent studies have found these models to be… 

MuSiQue: Multihop Questions via Single-hop Question Composition

Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
2022
TACL

Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by construction,…