Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Recently, commonsense knowledge models — pretrained language models (LMs) finetuned on knowledge graph (KG) tuples — have shown that considerable amounts of commonsense knowledge can be encoded in the…
Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by…
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural language models can produce remarkably fluent and grammatical text. So much, in fact, that recent work by Clark et al. (2021) has reported that conventional crowdsourcing can no longer…
ParsiNLU: A Suite of Language Understanding Challenges for Persian
Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains concentrated on resource-rich languages like…
Measuring and Improving Consistency in Pretrained Language Models
Consistency of a model — that is, the invariance of its behavior under meaning-preserving alternations in its input — is a highly desirable property in natural language processing. In this paper we…
Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?
Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever “understand”…
Infusing Finetuning with Semantic Dependencies
For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on…
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition
Recent efforts to create challenge benchmarks that test the abilities of natural language understanding models have largely depended on human annotations. In this work, we introduce the “Break,…
Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes
We explore few-shot learning (FSL) for relation classification (RC). Focusing on the realistic scenario of FSL, in which a test instance might not belong to any of the target categories…
MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting
Citation context analysis (CCA) is an important task in natural language processing that studies how and why scholars discuss each other’s work. Despite being studied for decades, traditional…