Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Promoting Graph Awareness in Linearized Graph-to-Text Generation

Alexander M. HoyleAna MarasovićNoah A. Smith

2021

Findings of ACL

Generating text from structured inputs, such as meaning representations or RDF triples, has often involved the use of specialized graphencoding neural networks. However, recent applications of…

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Julia KreutzerIsaac CaswellLisa WangMofetoluwa Adeyemi

2021

TACL

With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering…

Shortformer: Better Language Modeling using Shorter Inputs

Ofir PressNoah A. SmithM. Lewis

2021

ACL

We explore the benefits of decreasing the input length of transformers. First, we show that initially training the model on short subsequences, before moving on to longer ones, both reduces overall…

Efficient Passage Retrieval with Hashing for Open-domain Question Answering

Ikuya YamadaAkari AsaiHanna Hajishirzi

2021

ACL

Most state-of-the-art open-domain question answering systems use a neural retrieval model to encode passages into continuous vectors and extract them from a knowledge source. However, such retrieval…

Prompting Contrastive Explanations for Commonsense Reasoning Tasks

Bhargavi ParanjapeJulian MichaelMarjan GhazvininejadHanna Hajishirzi

2021

Findings of ACL

Many commonsense reasoning NLP tasks involve choosing between one or more possible answers to a question or prompt based on knowledge that is often implicit. Large pretrained language models (PLMs)…

Scarecrow: A Framework for Scrutinizing Machine Text

Yao DouMaxwell ForbesRik Koncel-KedziorskiYejin Choi

2021

arXiv

Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by…

Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text

Yao DouMaxwell ForbesRik Koncel-KedziorskiYejin Choi

2021

arXiv

Modern neural language models can produce remarkably fluent and grammatical text. So much, in fact, that recent work by Clark et al. (2021) has reported that conventional crowdsourcing can no longer…

Infusing Finetuning with Semantic Dependencies

Zhaofeng WuHao PengNoah A. Smith

2021

TACL

Abstract For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on…

Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?

William MerrillYoav GoldbergRoy SchwartzNoah A. Smith

2021

TACL

Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever “understand”…

A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers

Pradeep DasigiKyle LoIz BeltagyMatt Gardner

2021

NAACL

Readers of academic research papers often read with the goal of answering specific questions. Question Answering systems that can answer those questions can make consumption of the content much more…

Previous211-220Next