Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Generating Scientific Definitions with Controllable Complexity
Unfamiliar terminology and complex language can present barriers to understanding science. Natural language processing stands to help address these issues by automatically defining unfamiliar terms. …
Generating Scientific Claims for Zero-Shot Scientific Fact Checking
Automated scientific fact checking is difficult due to the complexity of scientific language and a shortage of training data, as annotation requires domain expertise. To address…
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
Humans often solve complex problems by interacting (in natural language) with existing agents, such as AI assistants, that can solve simpler sub-tasks. These agents themselves can be powerful…
Is GPT-3 Text Indistinguishable from Human Text? SCARECROW: A Framework for Scrutinizing Machine Text
Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by contemporary…
Large Scale Substitution-based Word Sense Induction
We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora. The result is a corpus which is…
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems. While many datasets and models have been developed to this…
Productive Performance Engineering for Weather and Climate Modeling with Python
Earth system models are developed with a tight coupling to target hardware, often containing highly specialized code predicated on processor characteristics. This coupling stems from using…
Reframing Instructional Prompts to GPTk's Language
How can model designers turn task instructions into effective prompts for language models? Backed by extensive empirical analysis on GPT3, we observe important features for successful instructional…
Situated Dialogue Learning through Procedural Environment Generation
We teach goal-driven agents to interactively act and speak in situated environments by training on generated curriculums. Our agents operate in LIGHT (Urbanek et al. 2019)—a large-scale…
ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts
Systems that can automatically define unfamiliar terms hold the promise of improving the accessibility of scientific texts, especially for readers who may lack prerequisite background knowledge. …