Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?

Jieyu Zhao, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, and Kai-Wei Chang

2021

ACL-IJCNLP

Is it possible to use natural language to intervene in a model’s behavior and alter its prediction in a desired way? We investigate the effectiveness of natural language interventions for…

Expected Validation Performance and Estimation of a Random Variable's Maximum

Jesse Dodge, Suchin Gururangan, D. Card, Noah A. Smith

2021

Findings of EMNLP

Research in NLP is often supported by experimental results, and improved reporting of such results can lead to better understanding and more reproducible science. In this paper we analyze three…

Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference

Hai Hu, He Zhou, Zuoyu Tian, Kyle Richardson

2021

Findings of ACL

Multilingual transformers (XLM, mT5) have been shown to have remarkable transfer skills in zero-shot settings. Most transfer studies, however, rely on automatically translated resources (XNLI,…

ReadOnce Transformers: Reusable Representations of Text for Transformers

Shih-Ting Lin, Ashish Sabharwal, Tushar Khot

2021

ACL

While large-scale language models are extremely effective when directly fine-tuned on many end-tasks, such models learn to extract information and solve the task simultaneously from end-task…

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

Jungo Kasai, Nikolaos Pappas, Hao Peng, Noah A. Smith

2021

ICLR

State-of-the-art neural machine translation models generate outputs autoregressively, where every step conditions on the previously generated tokens. This sequential nature causes inherent decoding…

Random Feature Attention

Hao Peng, Nikolaos Pappas, Dani Yogatama, Lingpeng Kong

2021

ICLR

Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their core is an attention function which models pairwise interactions between the inputs at every timestep.…

Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics

S. Welleck, Peter West, Jize Cao, Yejin Choi

2021

AAAI

Neural sequence models trained with maximum likelihood estimation have led to breakthroughs in many tasks, where success is defined by the gap between training and test performance. However, their…

S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

Shivashankar Subramanian, Daniel King, Doug Downey, Sergey Feldman

2021

JCDL

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library…

COVR: A test-bed for Visually Grounded Compositional Generalization with real images

Ben Bogin, Shivanshu Gupta, Matt Gardner, Jonathan Berant

2021

EMNLP

While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In…

Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules

Forough Arabshahi, Jennifer Lee, A. Bosselut, Tom Mitchell

2021

EMNLP

One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users’ commands, a task trivial for humans due to their common sense. In this…
