Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language
Transformers have been shown to emulate logical deduction over natural language theories (logical rules expressed in natural language), reliably assigning true/false labels to candidate…
fv3gfs-wrapper: a Python wrapper of the FV3GFS atmospheric model
Simulation software in geophysics is traditionally written in Fortran or C++ due to the stringent performance requirements these codes have to satisfy. As a result, researchers who use…
Correcting weather and climate models by machine learning nudged historical simulations
Due to limited resolution and inaccurate physical parameterizations, weather and climate models consistently develop biases compared to the observed atmosphere. Using the FV3GFS model at coarse…
Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills
Models pre-trained with a language modeling objective possess ample world knowledge and language skills, but are known to struggle in tasks that require reasoning. In this work, we propose to…
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Recently, commonsense knowledge models — pretrained language models (LMs) finetuned on knowledge graph (KG) tuples — showed that considerable amounts of commonsense knowledge can be encoded in the…
Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by…
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Modern neural language models can produce remarkably fluent and grammatical text. So much, in fact, that recent work by Clark et al. (2021) has reported that conventional crowdsourcing can no longer…
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition
Recent efforts to create challenge benchmarks that test the abilities of natural language understanding models have largely depended on human annotations. In this work, we introduce the “Break,…
Infusing Finetuning with Semantic Dependencies
For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on…
Measuring and Improving Consistency in Pretrained Language Models
Consistency of a model — that is, the invariance of its behavior under meaning-preserving alternations in its input — is a highly desirable property in natural language processing. In this paper we…