Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Answering Questions by Meta-Reasoning over Multiple Chains of Thought

Ori Yoran, Tomer Wolfson, Ben Bogin, Jonathan Berant
2023
EMNLP

Modern systems for multi-hop question answering (QA) typically break questions into a sequence of reasoning steps, termed chain-of-thought (CoT), before arriving at a final answer. Often, multiple… 

Continued Pretraining for Better Zero- and Few-Shot Promptability

Zhaofeng Wu, Robert L. Logan IV, Pete Walsh, Iz Beltagy
2022
EMNLP

Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods… 

Exploring The Landscape of Distributional Robustness for Question Answering Models

Anas Awadalla, Mitchell Wortsman, Gabriel Ilharco, Ludwig Schmidt
2022
Findings of EMNLP

We conduct a large empirical evaluation to investigate the landscape of distributional robustness in question answering. Our investigation spans over 350 models and 16 question answering datasets,… 

Hyperdecoders: Instance-specific decoders for multi-task NLP

Hamish Ivison, Matthew E. Peters
2022
Findings of EMNLP

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This… 

Lila: A Unified Benchmark for Mathematical Reasoning

Swaroop Mishra, Matthew Finlayson, Pan Lu, Ashwin Kalyan
2022
EMNLP

Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling. Towards evaluating and improving AI systems in this… 

Abstract Visual Reasoning with Tangram Shapes

Anya Ji, Noriyuki Kojima, N. Rush, Yoav Artzi
2022
EMNLP

We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. Drawing on the history of tangram puzzles as stimuli in cognitive science, we build a richly… 

CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

Abhilasha Ravichander, Matt Gardner, Ana Marasović
2022
EMNLP

The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for… 

Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems

Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
2022
Findings of EMNLP

Large transformer models can highly improve Answer Sentence Selection (AS2) tasks, but their high computational costs prevent their use in many real-world applications. In this paper, we explore… 

Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning

Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
2022
EMNLP

Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning. Such a capability would allow better… 

GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation

Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Daniel S. Weld
2022
EMNLP

While often assumed a gold standard, effective human evaluation of text generation remains an important, open area for research. We revisit this problem with a focus on producing consistent…