Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Probing Factually Grounded Content Transfer with Factual Ablation

Peter West, Chris Quirk, Michel Galley, Yejin Choi
2022
Findings of ACL

Despite recent success, large neural models often generate factually incorrect text. Compounding this is the lack of a standard automatic evaluation for factuality – it cannot be meaningfully improved… 

ScienceWorld: Is your Agent Smarter than a 5th Grader?

Ruoyao Wang, Peter Alexander Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu
2022
arXiv

This paper presents a new benchmark, SCIENCEWORLD, to test agents’ scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science… 

Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

Kung-Hsiang Huang, Preslav Nakov, Yejin Choi, Heng Ji
2022
arXiv

While there has been a lot of research and many recent advances in neural fake news detection, defending against human-written disinformation remains underexplored. Upon analyzing current approaches… 

Knowledge is Power: Symbolic Knowledge Distillation, Commonsense Morality, & Multimodal Script Knowledge

Yejin Choi
2022
WSDM

Scale appears to be the winning recipe in today's AI leaderboards. And yet, extreme-scale neural models are still brittle, making errors that are often nonsensical and even counterintuitive. In this… 

Computational Lens on Cognition: Study Of Autobiographical Versus Imagined Stories With Large-Scale Language Models

Maarten Sap, A. Jafarpour, Yejin Choi, E. Horvitz
2022
arXiv

Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge enables people to interpret story narratives and identify salient… 

Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow

Maarten Sap, A. Jafarpour, Yejin Choi, E. Horvitz
2022
Sociology

Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge of narrative event flow enables people to weave together a story.… 

PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts

Daniel Khashabi, Shan Lyu, Sewon Min, Yejin Choi
2022
NAACL

Fine-tuning continuous prompts for target tasks has recently emerged as a compact alternative to full model fine-tuning. Motivated by these promising results, we investigate the feasibility of… 

UnifiedQA-v2: Stronger Generalization via Broader Cross-Format Training

Daniel Khashabi, Yeganeh Kordi, Hannaneh Hajishirzi
2022
arXiv

We present UNIFIEDQA-v2, a QA model built with the same process as UNIFIEDQA, except that it utilizes more supervision – roughly 3× the number of datasets used for UNIFIEDQA. This generally leads to… 

Inherently Explainable Reinforcement Learning in Natural Language

Xiangyu Peng, Mark O. Riedl, Prithviraj Ammanabrolu
2021
arXiv

We focus on the task of creating a reinforcement learning agent that is inherently explainable—with the ability to produce immediate local explanations by thinking out loud while performing a task… 

CommonsenseQA 2.0: Exposing the Limits of AI through Gamification

Alon Talmor, Ori Yoran, Ronan Le Bras, Jonathan Berant
2021
NeurIPS

Constructing benchmarks that test the abilities of modern natural language understanding models is difficult – pre-trained language models exploit artifacts in benchmarks to achieve human…