Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

Yushi Hu, Benlin Liu, Jungo Kasai, Noah A. Smith
2023
ICCV • Proceedings

Despite thousands of researchers, engineers, and artists actively working on improving text-to-image generation models, systems often fail to produce images that accurately align with the text… 

LEXPLAIN: Improving Model Explanations via Lexicon Supervision

Orevaoghene Ahia, Hila Gonen, Vidhisha Balachandran, Noah A. Smith
2023
*SEM • Proceedings

Model explanations that shed light on the model’s predictions are becoming a desired additional output of NLP models, alongside their predictions. Challenges in creating these explanations include… 

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

Alex Mallen, Akari Asai, Victor Zhong
2023
Annual Meeting of the Association for Computational Linguistics

Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the difficulty of encoding a wealth of world… 

Data-Efficient Finetuning Using Cross-Task Nearest Neighbors

Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi
2023
ACL Findings

Language models trained on massive prompted multitask datasets like T0 (Sanh et al., 2021) or FLAN (Wei et al., 2021a) can generalize to tasks unseen during training. We show that training on a… 

HINT: Hypernetwork Instruction Tuning for Efficient Few- and Zero-Shot Generalisation

Hamish Ivison, Akshita Bhagia, Yizhong Wang, Matthew E. Peters
2023
ACL

Recent NLP models have shown the remarkable ability to effectively generalise "zero-shot" to new tasks using only natural language instructions as guidance. However, many of these approaches suffer… 

Reproducibility in NLP: What Have We Learned from the Checklist?

Ian H. Magnusson, Noah A. Smith, Jesse Dodge
2023
Findings of ACL

Scientific progress in NLP rests on the reproducibility of researchers' claims. The *CL conferences created the NLP Reproducibility Checklist in 2020 to be completed by authors at submission to… 

CREPE: Open-Domain Question Answering with False Presuppositions

Xinyan Velocity Yu, Sewon Min, Luke Zettlemoyer, Hannaneh Hajishirzi
2023
ACL

When asking about unfamiliar topics, information-seeking users often pose questions with false presuppositions. Most existing question answering (QA) datasets, in contrast, assume all questions have… 

Efficient Methods for Natural Language Processing: A Survey

Marcos Vinícius Treviso, Tianchu Ji, Ji-Ung Lee, Roy Schwartz
2023
TACL

Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource… 

Elaboration-Generating Commonsense Question Answering at Scale

Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, Noah A. Smith
2023
ACL

In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working… 

Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Yanai Elazar
2023
Findings of ACL 2023

Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning…