Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Correcting weather and climate models by machine learning nudged historical simulations
Due to limited resolution and inaccurate physical parameterizations, weather and climate models consistently develop biases compared to the observed atmosphere. Using the FV3GFS model at coarse…
Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills
Models pre-trained with a language modeling objective possess ample world knowledge and language skills, but are known to struggle in tasks that require reasoning. In this work, we propose to…
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Recently, commonsense knowledge models — pretrained language models (LMs) finetuned on knowledge graph (KG) tuples — have shown that considerable amounts of commonsense knowledge can be encoded in the…
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural language models can produce remarkably fluent and grammatical text. So much so, in fact, that recent work by Clark et al. (2021) has reported that conventional crowdsourcing can no longer…
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition
Recent efforts to create challenge benchmarks that test the abilities of natural language understanding models have largely depended on human annotations. In this work, we introduce the “Break,…
Infusing Finetuning with Semantic Dependencies
For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on…
Measuring and Improving Consistency in Pretrained Language Models
Consistency of a model — that is, the invariance of its behavior under meaning-preserving alternations in its input — is a highly desirable property in natural language processing. In this paper we…
MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting
Citation context analysis (CCA) is an important task in natural language processing that studies how and why scholars discuss each other’s work. Despite being studied for decades, traditional…
ParsiNLU: A Suite of Language Understanding Challenges for Persian
Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains concentrated on resource-rich languages like…