Skip to main content ->
Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Filter papers

What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment

Matthew FinlaysonKyle RichardsonAshish SabharwalPeter Clark
2022
EMNLP

The instruction learning paradigm—where a model learns to perform new tasks from task descriptions alone—has become popular in general-purpose model research. The capabilities of large transformer… 

Teaching Broad Reasoning Skills via Decomposition-Guided Contexts

Harsh TrivediNiranjan BalasubramanianTushar KhotAshish Sabharwal
2022
EMNLP

Question-answering datasets require a broad set of reasoning skills. We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion.… 

On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization

Shruti PalaskarAkshita BhagiaYonatan BiskAna Marasović
2022
Findings of EMNLP

Integrating vision and language has gained no-table attention following the success of pretrained language models. Despite that, a fraction of emerging multimodal models is suitable for text… 

WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation

Alisa LiuSwabha SwayamdiptaNoah A. SmithYejin Choi
2022
Findings of EMNLP

A recurring challenge of crowdsourcing NLP datasets at scale is that human writers often rely on repetitive patterns when crafting examples, leading to a lack of linguistic diversity. We introduce a… 

Modeling Context With Linear Attention for Scalable Document-Level Translation

Zhaofeng WuHao PengNikolaos PappasNoah A. Smith
2022
Findings of EMNLP

Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are… 

Lexical Generalization Improves with Larger Models and Longer Training

Elron BandelYoav GoldbergYanai Elazar
2022
Finding of EMNLP

While fine-tuned language models perform well on many tasks, they were also shown to rely on superficial surface features such as lexical overlap. Excessive utilization of such heuristics can lead to… 

Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement

Bhavana Dalvi MishraOyvind TafjordPeter Clark
2022
EMNLP

Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time. Our… 

Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning

Oyvind TafjordBhavana Dalvi MishraPeter Clark
2022
EMNLP

Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning . Such a capability would allow better… 

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection

Suchin GururanganDallas CardSarah K. DrierNoah A. Smith
2022
EMNLP

Language models increasingly rely on massive web dumps for diverse text data. However, these sources are rife with undesirable content. As such, resources like Wikipedia, books, and news often… 

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

Tianbao XieChen Henry WuPeng ShiTao Yu
2022
EMNLP

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs…