Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation

Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
2022
Findings of EMNLP

A recurring challenge of crowdsourcing NLP datasets at scale is that human writers often rely on repetitive patterns when crafting examples, leading to a lack of linguistic diversity. We introduce a… 

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection

Suchin Gururangan, Dallas Card, Sarah K. Drier, Noah A. Smith
2022
EMNLP

Language models increasingly rely on massive web dumps for diverse text data. However, these sources are rife with undesirable content. As such, resources like Wikipedia, books, and news often… 

Modeling the Machine Learning Multiverse

Samuel J Bell, Onno P. Kampman, Jesse Dodge, Neil D. Lawrence
2022
NeurIPS

Amid mounting concern about the reliability and credibility of machine learning research, we present a principled framework for making robust and generalizable claims: the multiverse analysis. Our…

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Thomas Wolf
2022
arXiv

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption,… 

Quantifying the narrative flow of imagined versus autobiographical stories.

Maarten Sap, A. Jafarpour, Yejin Choi, E. Horvitz
2022
Proceedings of the National Academy of Sciences of the United States of America

Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge of narrative event flow enables people to weave together a story.… 

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

Yuling Gu, Yao Fu, Valentina Pyatkin, Peter Clark
2022
EMNLP • The Third Workshop on Figurative Language Processing

Figurative language (e.g., “he flew like the wind”) is challenging to understand, as it is hard to tell what implicit information is being conveyed from the surface form alone. We hypothesize that… 

SciFact-Open: Towards open-domain scientific claim verification

David Wadden, Kyle Lo, Bailey Kuehl, Hannaneh Hajishirzi
2022
EMNLP 2022

While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic… 

Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering

Jiacheng Liu, Skyler Hallinan, Ximing Lu, Yejin Choi
2022
Conference on Empirical Methods in Natural Language Processing

Knowledge underpins reasoning. Recent research demonstrates that when relevant knowledge is provided as additional context to commonsense question answering (QA), it can substantially enhance the… 

Transparency Helps Reveal When Language Models Learn Meaning

Zhaofeng Wu, Will Merrill, Hao Peng, Noah A. Smith
2022
arXiv

Many current NLP systems are built from language models trained to optimize unsupervised objectives on large amounts of raw text. Under what conditions might such a procedure acquire meaning? Our… 

Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

R. Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Yejin Choi
2022
arXiv

We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If we view text generation as a sequential decision-making problem, reinforcement learning (RL)…