Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Recently, commonsense knowledge models — pretrained language models (LMs) finetuned on knowledge graph (KG) tuples — have shown that considerable amounts of commonsense knowledge can be encoded in the…
Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by…
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural language models can produce remarkably fluent and grammatical text. So much, in fact, that recent work by Clark et al. (2021) has reported that conventional crowdsourcing can no longer…
ParsiNLU: A Suite of Language Understanding Challenges for Persian
Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains concentrated on resource-rich languages like…
Measuring and Improving Consistency in Pretrained Language Models
Consistency of a model — that is, the invariance of its behavior under meaning-preserving alternations in its input — is a highly desirable property in natural language processing. In this paper we…
Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?
Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever “understand”…
Infusing Finetuning with Semantic Dependencies
For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on…
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition
Recent efforts to create challenge benchmarks that test the abilities of natural language understanding models have largely depended on human annotations. In this work, we introduce the “Break,…
Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes
We explore few-shot learning (FSL) for relation classification (RC). Focusing on the realistic scenario of FSL, in which a test instance might not belong to any of the target categories…
MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting
Citation context analysis (CCA) is an important task in natural language processing that studies how and why scholars discuss each other’s work. Despite being studied for decades, traditional…