Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Sewon Min, Kalpesh Krishna, Xinxi Lyu, Hannaneh Hajishirzi
2023
EMNLP

Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of… 

TaskWeb: Selecting Better Source Tasks for Multi-task NLP

Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi
2023
EMNLP

Recent work in NLP has shown promising results in training models on large amounts of tasks to achieve better generalization. However, it is not well-understood how tasks are related, and how… 

Crystal: Introspective Reasoners Reinforced with Self-Feedback

Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Asli Celikyilmaz
2023
EMNLP

Extensive work has shown that the performance and interpretability of commonsense reasoning can be improved via knowledge-augmented reasoning methods, where the knowledge that underpins the… 

Machine Reading Comprehension using Case-based Reasoning

Dung Ngoc Thai, Dhruv Agarwal, Mudit Chaudhary, A. McCallum
2023
EMNLP

We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC)… 

SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks

Mohammadreza Salehi, Sachin Mehta, Aditya Kusupati, Hannaneh Hajishirzi
2023
EMNLP

We introduce SHARCS for adaptive inference that takes into account the hardness of input samples. SHARCS can train a router on any transformer network, enabling the model to direct different samples… 

"You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation

Allyson Ettinger, Jena D. Hwang, Valentina Pyatkin, Yejin Choi
2023
EMNLP

Large language models (LLMs) show amazing proficiency and fluency in the use of language. Does this mean that they have also acquired insightful linguistic knowledge about the language, to an extent… 

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yejin Choi
2023
EMNLP • Findings

Moral or ethical judgments rely heavily on the specific contexts in which they occur. Understanding varying shades of defeasible contextualizations (i.e., additional information that strengthens or… 

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

Jae Sung Park, Jack Hessel, Khyathi Raghavi Chandu, Yejin Choi
2023
NeurIPS

Instruction following vision-language (VL) models offer a flexible interface that supports a broad range of multimodal tasks in a zero-shot fashion. However, interfaces that operate on full images… 

RCT Rejection Sampling for Causal Estimation Evaluation

Katherine A. Keith, Sergey Feldman, David Jurgens, Rohit Bhattacharya
2023
Transactions on Machine Learning Research

Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. For settings with high-dimensional covariates -- such as text data, genomics, or the… 

CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies

Arie Cattan, Tom Hope, Doug Downey, Ido Dagan
2023
EMNLP

Various NLP tasks require a complex hierarchical structure over nodes, where each node is a cluster of items. Examples include generating entailment graphs, hierarchical cross-document coreference…