Ai2 Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Transformer Feed-Forward Layers Are Key-Value Memories

Mor Geva, R. Schuster, Jonathan Berant, Omer Levy
2021
EMNLP

Feed-forward layers constitute two-thirds of a transformer model’s parameters, yet their role in the network remains underexplored. We show that feed-forward layers in transformer-based language… 

Understanding Mention Detector-Linker Interaction in Neural Coreference Resolution

Zhaofeng Wu, Matt Gardner
2021
EMNLP • CRAC

Despite significant recent progress in coreference resolution, the quality of current state-of-the-art systems still considerably trails behind human-level performance. Using the CoNLL-2012 and… 

Value-aware Approximate Attention

Ankit Gupta, Jonathan Berant
2021
EMNLP

Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. However, all… 

What's in your Head? Emergent Behaviour in Multi-Task Transformer Models

Mor Geva, Uri Katz, Aviv Ben-Arie, Jonathan Berant
2021
EMNLP

The primary paradigm for multi-task training in natural language processing is to represent the input with a shared pre-trained language model, and add a small, thin network (head) per task. Given… 

DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization

Zeqiu Wu, Bo-Ru Lu, Hannaneh Hajishirzi, Mari Ostendorf
2021
EMNLP

Identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation. We introduce a knowledge identification model… 

Finding needles in a haystack: Sampling Structurally-diverse Training Sets from Synthetic Data for Compositional Generalization

Inbar Oren, Jonathan Herzig, Jonathan Berant
2021
EMNLP

Modern semantic parsers suffer from two principal limitations. First, training requires expensive collection of utterance-program pairs. Second, semantic parsers fail to generalize at test time to… 

CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents

Sam Powers, Eliot Xing, Eric Kolve, A. Gupta
2021
CoLLAs

Progress in continual reinforcement learning has been limited due to several barriers to entry: missing code, high compute requirements, and a lack of suitable benchmarks. In this work, we present… 

Container: Context Aggregation Network

Peng Gao, Jiasen Lu, Hongsheng Li, Aniruddha Kembhavi
2021
arXiv

Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variations. Recently, Transformers – originally introduced in natural language… 

SciA11y: Converting Scientific Papers to Accessible HTML

Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Daniel S. Weld
2021
ASSETS

We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML. SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes… 

Can Machines Learn Morality? The Delphi Experiment

Liwei Jiang, Chandra Bhagavatula, Jenny Liang, Yejin Choi
2021
arXiv

As AI systems become increasingly powerful and pervasive, there are growing concerns about machines’ morality or a lack thereof. Yet, teaching morality to machines is a formidable task, as morality…