Papers

  • Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

    Ximing Lu, Faeze Brahman, Peter West, Jaehun Jung, Khyathi Raghavi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian R. Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi. EMNLP 2023. Large language models excel at a variety of language tasks when prompted with examples or instructions, yet controlling these models through prompting alone is limited. Tailoring language models through fine-tuning (e.g., via reinforcement learning) can be…
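
    The decoding-time idea can be sketched as a product of distributions: a small trainable adapter policy nudges the frozen base LM's next-token distribution. The snippet below is a minimal illustration under that assumption; the function name and weighting scheme are hypothetical, not the paper's exact formulation.

    ```python
    import torch
    import torch.nn.functional as F

    def combine_policies(base_logits, adapter_logits, weight=1.0):
        # Sum log-scores from the frozen base LM and the lightweight adapter
        # (a product of distributions after softmax); `weight` scales the
        # adapter's influence. Names and weighting are illustrative.
        return F.softmax(base_logits + weight * adapter_logits, dim=-1)

    # Toy usage over a 5-token vocabulary.
    base = torch.randn(5)
    adapter = torch.randn(5)
    probs = combine_policies(base, adapter)
    next_token = torch.multinomial(probs, num_samples=1)
    ```
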
  • Language Models with Rationality

    Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, Peter Clark. EMNLP 2023. While large language models (LLMs) are proficient at question-answering (QA), the dependencies between their answers and other "beliefs" they may have about the world are typically unstated, and may even be in conflict. Our goal is to uncover such…
  • Machine Reading Comprehension using Case-based Reasoning

    Dung Ngoc Thai, Dhruv Agarwal, Mudit Chaudhary, Rajarshi Das, Manzil Zaheer, J. Lee, Hannaneh Hajishirzi, Andrew McCallum. EMNLP 2023. We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar…
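
    A minimal sketch of the case-retrieval step, assuming cases and queries are encoded as dense vectors (the encoder, data layout, and reuse step here are assumptions, not the paper's exact pipeline):

    ```python
    import numpy as np

    def retrieve_cases(query_vec, case_vecs, k=3):
        # Rank stored training cases by cosine similarity to the query
        # representation; the top-k cases' answer spans would then guide
        # answer extraction in the new context.
        sims = case_vecs @ query_vec / (
            np.linalg.norm(case_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
        )
        return np.argsort(-sims)[:k]

    # Toy usage: 10 stored cases with 16-dimensional representations.
    rng = np.random.default_rng(0)
    cases = rng.normal(size=(10, 16))
    query = rng.normal(size=16)
    print(retrieve_cases(query, cases))
    ```
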
  • Measuring and Narrowing the Compositionality Gap in Language Models

    Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis. EMNLP Findings 2023. We investigate the ability of language models to perform compositional reasoning tasks where the overall solution depends on correctly composing the answers to sub-problems. We measure how often models can correctly answer all sub-problems but not generate…
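
    The headline metric can be computed directly from per-question correctness records. A small sketch, assuming a record schema of our own invention (one boolean per sub-question and one for the composed question):

    ```python
    def compositionality_gap(records):
        # Fraction of 2-hop questions where the model answers both
        # sub-questions correctly but still fails the composed question,
        # among all questions with both sub-answers correct.
        both = [r for r in records if r["sub1_correct"] and r["sub2_correct"]]
        failed = [r for r in both if not r["composed_correct"]]
        return len(failed) / len(both) if both else 0.0

    records = [
        {"sub1_correct": True, "sub2_correct": True, "composed_correct": False},
        {"sub1_correct": True, "sub2_correct": True, "composed_correct": True},
        {"sub1_correct": False, "sub2_correct": True, "composed_correct": False},
    ]
    print(compositionality_gap(records))  # 0.5
    ```
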
  • PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents

    Kyle Lo, Zejiang Shen, Benjamin Newman, Joseph Chee Chang, Russell Authur, Erin Bransom, Stefan Candra, Yoganand Chandrasekhar, Regan Huff, Bailey Kuehl, Amanpreet Singh, Chris Wilhelm, Angele Zamarron, Marti A. Hearst, Daniel S. Weld, Doug Downey, Luca Soldaini. EMNLP 2023. Despite growing interest in applying natural language processing (NLP) and computer vision (CV) models to the scholarly domain, scientific documents remain challenging to work with. They're often in difficult-to-use PDF formats, and the ecosystem of models to…
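
    Typical usage looks roughly like the following, based on the project's public README; treat the recipe name and layer attributes as assumptions if the API has since changed.

    ```python
    from papermage.recipes import CoreRecipe

    # Parse a PDF into a layered document representation: raw text plus
    # structural layers such as sentences, paragraphs, and figures.
    recipe = CoreRecipe()
    doc = recipe.run("paper.pdf")

    # Layers are queryable like ordinary Python sequences.
    for sentence in doc.sentences[:5]:
        print(sentence.text)
    ```
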
  • SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks

    Mohammadreza Salehi, Sachin Mehta, Aditya Kusupati, Ali Farhadi, Hannaneh Hajishirzi. EMNLP 2023. We introduce SHARCS, a method for adaptive inference that accounts for the hardness of input samples. SHARCS can train a router on any transformer network, enabling the model to direct different samples to sub-networks of varying widths. Our experiments…
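
    The routing idea can be sketched as a light classifier over pooled representations that assigns each sample to a width bucket. The module below is illustrative only; SHARCS's actual router training and width-reduction scheme are more involved.

    ```python
    import torch
    import torch.nn as nn

    class WidthRouter(nn.Module):
        # Scores each sample's "hardness" and picks one of `num_widths`
        # sub-networks; harder samples would go to wider ones.
        def __init__(self, hidden_size, num_widths=3):
            super().__init__()
            self.scorer = nn.Linear(hidden_size, num_widths)

        def forward(self, pooled):  # pooled: [batch, hidden_size]
            return self.scorer(pooled).argmax(dim=-1)  # width index per sample

    router = WidthRouter(hidden_size=768)
    pooled = torch.randn(4, 768)
    print(router(pooled))  # e.g. tensor([2, 0, 1, 2])
    ```
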
  • SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

    Hyunwoo Kim, Jack Hessel, Liwei Jiang, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi. EMNLP 2023. We present SODA: the first publicly available, million-scale, high-quality social dialogue dataset. Using SODA, we train COSMO: a generalizable conversation agent that outperforms previous best-performing agents on both in- and out-of-domain datasets. In…
  • TaskWeb: Selecting Better Source Tasks for Multi-task NLP

    Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi. EMNLP 2023. Recent work in NLP has shown promising results in training models on large numbers of tasks to achieve better generalization. However, it is not well understood how tasks are related, or how to choose helpful training tasks for a new task. In this work…
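
    One way to operationalize source-task selection is to score each candidate by its measured pairwise transfer to tasks similar to the target. The sketch below assumes a `transfer` table of pairwise gains and is not the paper's exact method.

    ```python
    def rank_source_tasks(transfer, target_neighbors):
        # transfer[(src, tgt)] holds the measured gain from training on
        # `src` before `tgt`; score each source by its mean gain on tasks
        # similar to the new target, then rank sources by that score.
        scores = {}
        for src in {s for s, _ in transfer}:
            gains = [transfer[(src, t)] for t in target_neighbors if (src, t) in transfer]
            if gains:
                scores[src] = sum(gains) / len(gains)
        return sorted(scores, key=scores.get, reverse=True)

    transfer = {("nli", "qa1"): 1.2, ("nli", "qa2"): 0.8, ("sentiment", "qa1"): -0.3}
    print(rank_source_tasks(transfer, ["qa1", "qa2"]))  # ['nli', 'sentiment']
    ```
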
  • Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements

    Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi. EMNLP 2023. Despite the much-discussed capabilities of today's language models, they are still prone to silly and unexpected commonsense failures. We consider a retrospective verification approach that reflects on the correctness of LM outputs, and introduce Vera, a…
  • We're Afraid Language Models Aren't Modeling Ambiguity

    Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi. EMNLP 2023. Ambiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of human language understanding, allowing us to anticipate misunderstanding as communicators and revise our interpretations as listeners. As language models (LMs) are…