Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations

Tianlu Wang, Jieyu Zhao, Mark Yatskar, Vicente Ordonez
2019
ICCV

In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables, such as gender, in visual recognition tasks. We show that trained models…

COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

Antoine Bosselut, Hannah Rashkin, Maarten Sap, Yejin Choi
2019
ACL

We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017).… 

Compositional Questions Do Not Necessitate Multi-hop Reasoning

Sewon Min, Eric Wallace, Sameer Singh, Luke Zettlemoyer
2019
ACL

Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs. We argue that it can be difficult to construct large multi-hop RC… 

GrapAL: Connecting the Dots in Scientific Literature

Christine Betts, Joanna Power, Waleed Ammar
2019
ACL

We introduce GrapAL (Graph database of Academic Literature), a versatile tool for exploring and investigating a knowledge base of scientific literature, that was semi-automatically constructed using… 

HellaSwag: Can a Machine Really Finish Your Sentence?

Rowan Zellers, Ari Holtzman, Yonatan Bisk, Yejin Choi
2019
ACL

Recent work by Zellers et al. (2018) introduced a new task of commonsense natural language inference: given an event description such as "A woman sits at a piano," a machine must select the most… 

The Risk of Racial Bias in Hate Speech Detection

Maarten Sap, Dallas Card, Saadia Gabriel, Noah A. Smith
2019
ACL

We investigate how annotators’ insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. We… 

Question Answering is a Format; When is it Useful?

Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Sewon Min
2019
arXiv

Recent years have seen a dramatic expansion of tasks and datasets posed as question answering, from reading comprehension, semantic role labeling, and even machine translation, to image and video… 

Robust Navigation with Language Pretraining and Stochastic Sampling

Xiujun Li, Chunyuan Li, Qiaolin Xia, Yejin Choi
2019
EMNLP

Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and… 

Shallow Syntax in Deep Water

Swabha Swayamdipta, Matthew E. Peters, Brendan Roof, Noah A. Smith
2019
arXiv

Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain. We investigate the role of shallow… 

Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets

Mor Geva, Yoav Goldberg, Jonathan Berant
2019
arXiv

Crowdsourcing has been the prevalent paradigm for creating natural language understanding datasets in recent years. A common crowdsourcing practice is to recruit a small number of high-quality…