
Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
2022
arXiv

Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial… 
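The snippet cuts off, but the core finding is that each FFN layer's output is a weighted sum of learned "value vectors", and individual value vectors tend to promote a coherent set of tokens when read in the vocabulary space. Below is a minimal sketch of that reading, assuming a GPT-2-style model from Hugging Face transformers; the layer and vector indices are arbitrary, and this is not the authors' code:

```python
# Project one FFN "value vector" into the vocabulary space and inspect
# which tokens it promotes (in the spirit of Geva et al., 2022).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tok = GPT2TokenizerFast.from_pretrained("gpt2")

layer, idx = 10, 42  # arbitrary choice of layer and value vector
# In GPT-2, mlp.c_proj.weight has shape (inner_dim, hidden_dim); each row
# is one FFN value vector living in the residual-stream space.
v = model.transformer.h[layer].mlp.c_proj.weight[idx]   # (hidden_dim,)
with torch.no_grad():
    logits = model.lm_head.weight @ v                   # (vocab_size,)
top = torch.topk(logits, 10).indices.tolist()
print(tok.convert_ids_to_tokens(top))  # tokens this value vector promotes
```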

COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics

Lianhui Qin, S. Welleck, Daniel Khashabi, Yejin Choi
2022
arXiv

Many applications of text generation require incorporating different constraints to control the semantics or style of generated text. These constraints can be hard (e.g., ensuring certain keywords… 
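The abstract is truncated, but COLD's mechanism is to encode constraints as an energy function over a relaxed ("soft") token sequence and sample from it with Langevin dynamics: repeated gradient steps on the energy plus Gaussian noise, then discretization. Here is a toy sketch of that loop; the unigram "LM" and keyword term are illustrative stand-ins of my own, not the paper's energy functions:

```python
# Energy-based sampling over soft token logits with Langevin dynamics.
import torch

vocab, length = 50, 8
lm_logprobs = torch.log_softmax(torch.randn(vocab), dim=-1)  # stand-in unigram LM
keyword = 7                                                  # constraint: must appear

y = torch.randn(length, vocab, requires_grad=True)           # soft sequence
eta, noise_scale = 0.1, 0.01
for step in range(200):
    p = torch.softmax(y, dim=-1)                             # relax tokens onto simplex
    fluency = -(p * lm_logprobs).sum()                       # low energy = fluent
    constraint = -torch.log(p[:, keyword].max() + 1e-9)      # keyword soft constraint
    energy = fluency + constraint
    grad, = torch.autograd.grad(energy, y)
    with torch.no_grad():
        y -= eta * grad                                      # gradient step ...
        y += noise_scale * torch.randn_like(y)               # ... plus Langevin noise
tokens = y.argmax(dim=-1)                                    # discretize at the end
print(tokens.tolist())
```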

CiteRead: Integrating Localized Citation Contexts into Scientific Paper Reading

Napol Rachatasumrit, Jonathan Bragg, Amy X. Zhang, Daniel S. Weld
2022
IUI

When reading a scholarly paper, scientists oftentimes wish to understand how follow-on work has built on or engages with what they are reading. While a paper itself can only discuss prior work, some… 

Probing Factually Grounded Content Transfer with Factual Ablation

Peter West, Chris Quirk, Michel Galley, Yejin Choi
2022
Findings of ACL

Despite recent success, large neural models often generate factually incorrect text. Compounding this is the lack of a standard automatic evaluation for factuality; it cannot be meaningfully improved… 

Memory-assisted prompt editing to improve GPT-3 after deployment

Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
2022
ACL • Workshop on Commonsense Reasoning

Large LMs such as GPT-3 are powerful, but can make mistakes that are obvious to humans. For example, GPT-3 would mistakenly interpret "What word is similar to good?" to mean a homonym, while the… 
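The approach the title names, memory-assisted prompt editing, pairs the deployed model with a growing memory of user feedback: when a new question resembles one the user previously corrected, the recorded clarification is attached to the prompt. A minimal sketch under those assumptions; the string-similarity retrieval and helper names are mine, not the paper's implementation:

```python
# Store user feedback keyed by question; retrieve it for similar future
# questions and prepend it to the prompt before querying the LM.
from difflib import SequenceMatcher

memory = {}  # question -> clarifying feedback from the user

def remember(question, feedback):
    memory[question] = feedback

def edit_prompt(question, threshold=0.6):
    best, score = None, 0.0
    for past, feedback in memory.items():
        s = SequenceMatcher(None, question, past).ratio()
        if s > score:
            best, score = feedback, s
    if score >= threshold:                      # reuse relevant past feedback
        return f"{question}\n(Clarification: {best})"
    return question

remember("What word is similar to good?",
         "'similar to' asks for a synonym, not a homonym.")
print(edit_prompt("What word is similar to happy?"))
```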

Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search

Daniel King, Zejiang Shen, Nishant Subramani, Doug Downey
2022
GEM Workshop 2022

Abstractive summarization systems today produce fluent and relevant output, but often “hallucinate” statements not supported by the source text. We analyze the connection between hallucinations and… 
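The fix the title describes is to constrain beam search so that hypotheses stay supported by the source. The toy sketch below prunes any candidate token absent from the source text; this is a deliberately crude stand-in for the paper's actual consistency constraint, and all names are my own:

```python
# Beam search where ungrounded candidate tokens are dropped from the beam.
import heapq

def constrained_beam_search(step_logprobs, source_tokens, beam_size=3, steps=4):
    """step_logprobs(prefix) -> {token: logprob}; only tokens appearing
    in source_tokens may extend a hypothesis."""
    beams = [(0.0, [])]                          # (score, prefix)
    for _ in range(steps):
        candidates = []
        for score, prefix in beams:
            for tok, lp in step_logprobs(prefix).items():
                if tok not in source_tokens:     # the consistency constraint
                    continue
                candidates.append((score + lp, prefix + [tok]))
        beams = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return beams

source = {"the", "court", "ruled", "on", "friday"}
# Toy "LM" that always proposes the same next-token distribution; note
# "monday" is fluent but unsupported, so the constraint blocks it.
fake_lm = lambda prefix: {"the": -0.5, "court": -1.0, "monday": -0.2, "ruled": -1.2}
print(constrained_beam_search(fake_lm, source)[0])
```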

Object Manipulation via Visual Target Localization

Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi
2022
arXiv

Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion… 

ScienceWorld: Is your Agent Smarter than a 5th Grader?

Ruoyao Wang, Peter Alexander Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu
2022
arXiv

This paper presents a new benchmark, SCIENCEWORLD, to test agents’ scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science… 
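As a benchmark, SCIENCEWORLD casts each task as an interactive text game: the agent reads textual observations, issues text actions, and is scored on task completion. The loop below illustrates that interface with a hypothetical toy environment; the class, methods, and action strings are stand-ins of my own, not ScienceWorld's actual API:

```python
# A generic agent-environment loop for an interactive text benchmark.
import random

class TextScienceEnv:
    """Toy text environment: the agent must 'focus on' the right object."""
    def __init__(self):
        self.task = "find a living thing"
        self.actions = ["look around", "focus on plant", "focus on rock"]
    def reset(self):
        return f"Task: {self.task}. You are in a greenhouse."
    def step(self, action):
        done = action.startswith("focus on")     # episode ends on a guess
        reward = 1.0 if action == "focus on plant" else 0.0
        return f"You {action}.", reward, done

env = TextScienceEnv()
obs = env.reset()
done, total = False, 0.0
while not done:
    action = random.choice(env.actions)  # a real agent would condition on obs
    obs, reward, done = env.step(action)
    total += reward
print("score:", total)
```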

Staged Training for Transformer Language Models

Sheng Shen, Pete Walsh, K. Keutzer, Iz Beltagy
2022
ICML 2022

The current standard approach to scaling transformer language models trains each model size from a different random initialization. As an alternative, we consider a staged training setup that begins… 
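Concretely, staged training grows a smaller, already-trained model into a larger one via growth operators, so the larger stage inherits the smaller stage's training rather than restarting from a random initialization. The sketch below shows only the simplest flavor of the idea, naively doubling depth by duplicating trained blocks; the paper's operators are designed to preserve the loss and training dynamics, which this toy version does not guarantee:

```python
# Depth-growth sketch: warm-start a deeper model from a shallower one.
import copy
import torch.nn as nn

def grow_depth(blocks: nn.ModuleList) -> nn.ModuleList:
    grown = []
    for block in blocks:
        grown.append(block)
        grown.append(copy.deepcopy(block))   # duplicate each trained block
    return nn.ModuleList(grown)

small = nn.ModuleList(nn.TransformerEncoderLayer(d_model=64, nhead=4)
                      for _ in range(4))
large = grow_depth(small)                    # 8 layers, warm-started
print(len(small), "->", len(large))
```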

Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

Kung-Hsiang Huang, Preslav Nakov, Yejin Choi, Heng Ji
2022
arXiv

While there has been much research and many recent advances in neural fake news detection, defending against human-written disinformation remains underexplored. Upon analyzing current approaches…