ARISTO

Build machines that read, learn and reason.

The Aristo Project aims to build systems that demonstrate a deep understanding of the world, integrating technologies for reading, learning, reasoning, and explanation.

Our research integrates multiple AI technologies, including:
  • Natural language processing
  • Information extraction
  • Knowledge representation
  • Machine reasoning
  • Commonsense knowledge

Research Areas

Probing Reasoning with Language Models

Language models (LMs) have dominated much of AI recently. But what kind(s) of reasoning are they capable of? And how can they be taught to do more? We are developing analytical datasets to probe LMs and help answer these questions.


Multihop Reasoning

Many questions require multiple pieces of information to be combined to arrive at an answer. We are developing new multihop models capable of identifying and combining relevant facts to answer such questions.
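As a toy illustration of what combining facts means, the sketch below chains two facts that share a term. The fact pairs and the `two_hop` helper are hypothetical and exist only for this example; the multihop models described here work over natural-language sentences, not hand-built pairs.

```python
# Toy two-hop composition: find fact pairs (a -> b) and (b -> c) that
# connect a starting term to a conclusion. The facts are hypothetical.

def two_hop(start, facts):
    """Return (a, b, c) chains where (a, b) and (b, c) are both known facts."""
    chains = []
    for subj1, obj1 in facts:
        if subj1 != start:
            continue
        for subj2, obj2 in facts:
            if subj2 == obj1:
                chains.append((subj1, obj1, obj2))
    return chains


facts = [
    ("differential heating of air", "wind"),
    ("wind", "electricity"),
    ("sunlight", "heat"),
]

chains = two_hop("differential heating of air", facts)
for a, b, c in chains:
    print(f"{a} -> {b} -> {c}")
```

Neither fact alone links the starting term to "electricity"; the answer only appears once the two are composed through the shared term "wind".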


Explanation

An intelligent system should not only answer questions correctly, but also be able to explain why its answers are correct. Such a capability is essential for practical acceptance of AI technology. It is also essential for the broader goals of communicating knowledge to a user, and receiving correction from the user when the system's answer is wrong.


Reasoning about Actions

A key aspect of intelligence is being able to reason about the dynamics of the world. This requires modeling what state the world might be in, and how different actions might affect that state. Such capabilities are essential for understanding what happens during a procedure or process, for planning, and for reasoning about "what if..." scenarios.


  • Transformers as Soft Reasoners over Language | Aristo

    RuleTaker determines whether statements are True or False based on rules given in natural language.

    Try the demo
  • Crossing format boundaries with a single QA system | Aristo

    UnifiedQA is a single pre-trained QA model that performs surprisingly well across 17 QA datasets spanning 4 diverse formats. Fine-tuning UnifiedQA into specialized models results in a new state-of-the-art on 6 datasets, establishing this model as a strong starting point for building QA systems.

    Try the demo
    • Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text

      Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark. EMNLP 2019.
      Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds on a prior process comprehension framework for predicting actions' effects, to also identify…
    • QASC: A Dataset for Question Answering via Sentence Composition

      Tushar Khot, Peter Clark, Michal Guerquin, Paul Edward Jansen, Ashish Sabharwal. AAAI 2020.
      Composing knowledge from multiple pieces of texts is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice…
    • Probing Natural Language Inference Models through Semantic Fragments

      Kyle Richardson, Hai Na Hu, Lawrence S. Moss, Ashish Sabharwal. AAAI 2020.
      Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity reasoning (i.e., reasoning about word substitutions in sentential contexts)? While such phenomena are…
    • TransOMCS: From Linguistic Graphs to Commonsense Knowledge

      Hongming Zhang, Daniel Khashabi, Yangqiu Song, Dan Roth. IJCAI 2020.
      Commonsense knowledge acquisition is a key problem for artificial intelligence. Conventional methods of acquiring commonsense knowledge generally require laborious and costly human annotations, which are not feasible on a large scale. In this paper, we explore a practical way of mining commonsense…
    • Not All Claims are Created Equal: Choosing the Right Approach to Assess Your Hypotheses

      Erfan Sadeqi Azer, Daniel Khashabi, Ashish Sabharwal, Dan Roth. ACL 2020.
      Empirical research in Natural Language Processing (NLP) has adopted a narrow set of principles for assessing hypotheses, relying mainly on p-value computation, which suffers from several known issues. While alternative proposals have been well-debated and adopted in other fields, they remain rarely…

    QuaRTz Dataset

    3864 questions about open domain qualitative relationships

    QuaRTz is a crowdsourced dataset of 3864 multiple-choice questions about open domain qualitative relationships. Each question is paired with one of 405 different background sentences (sometimes short paragraphs).

    QuaRel Dataset

    2771 story questions about qualitative relationships

    QuaRel is a crowdsourced dataset of 2771 multiple-choice story questions, including their logical forms.

    RuleTaker: Transformers as Soft Reasoners over Language

    Datasets used to teach transformers to reason

    Can transformers be trained to reason (or emulate reasoning) over rules expressed in language? In the associated paper and demo, we provide evidence that they can. Our models, which we call RuleTakers, are trained on datasets of synthetic rule bases plus derived conclusions, provided here. The resulting models provide the first demonstration that this kind of soft reasoning over language is indeed learnable.
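To make "rule bases plus derived conclusions" concrete, here is a minimal sketch of how a True/False label could be derived from a tiny rule base by forward chaining. The facts, rules, and helper names are hypothetical, and the closed-world treatment of unprovable statements is an assumption of this sketch. It shows the kind of label a RuleTaker is trained to predict, not how the transformer itself works or how the datasets were generated.

```python
# Toy forward chaining over a tiny, hypothetical rule base.
# Each rule is (premises, conclusion); all statements are plain strings.

def forward_chain(facts, rules):
    """Apply rules repeatedly until no new statement can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


def label(statement, facts, rules):
    """True if the statement is derivable; False otherwise (closed-world assumption)."""
    return statement in forward_chain(facts, rules)


facts = {"Erin is young", "Erin is kind"}
rules = [
    (("Erin is young", "Erin is kind"), "Erin is nice"),
    (("Erin is nice",), "Erin is happy"),
]

print(label("Erin is happy", facts, rules))  # derivable in two rule applications
print(label("Erin is sad", facts, rules))    # not derivable from this rule base
```

A symbolic prover like this needs the rules in a formal encoding; the point of the RuleTaker work is that a transformer can learn to produce the same labels directly from the rules written in English.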

    GenericsKB

    A large knowledge base of generic sentences

    The GenericsKB contains 3.4M+ generic sentences about the world, i.e., sentences expressing general truths such as "Dogs bark," and "Trees remove carbon dioxide from the atmosphere." Generics are potentially useful as a knowledge source for AI systems requiring general world knowledge. The GenericsKB is the first large-scale resource containing naturally occurring generic sentences (as opposed to extracted or crowdsourced triples), and is rich in high-quality, general, semantically complete statements. Generics were primarily extracted from three large text sources, namely the Waterloo Corpus, selected parts of Simple Wikipedia, and the ARC Corpus. A filtered, high-quality subset is also available in GenericsKB-Best, containing 1,020,868 sentences. We recommend you start with GenericsKB-Best.

    “Knowing is not enough, we must apply. Willing is not enough, we must do.”
    Johann Wolfgang von Goethe

    Paul Allen's 'Digital Aristotle' sets eyes on accomplishing practical tasks

    KOMO News
    February 5, 2020
    Read the Article

    Artificial Intelligence System Passed 8th Grade Science Test with High Honors

    Haaretz
    September 6, 2019
    Read the Article

    Allen Institute's Aristo AI makes breakthrough, passes eighth-grade science test

    TechSpot
    September 5, 2019
    Read the Article

    A Breakthrough for A.I. Technology: Passing an 8th-Grade Science Test

    The New York Times
    September 4, 2019
    Read the Article

    Allen Institute’s Aristo AI system finally passes an eighth-grade science test

    GeekWire
    September 4, 2019
    Read the Article

    How to tutor AI from an ‘F’ to an ‘A’

    Vulcan Inc
    September 4, 2019
    Read the Article

    AI assistants say dumb things, and we’re about to find out why

    MIT Tech Review
    March 14, 2018
    Read the Article

    Moving Beyond the Turing Test with the Allen AI Science Challenge

    CACM
    September 4, 2017
    Read the Article

    Team

    • Peter Clark, Research
    • Sumithra Bhakthavatsalam, Engineering
    • Bhavana Dalvi, Research
    • Michal Guerquin, Engineering
    • Daniel Khashabi, Young Investigator
    • Tushar Khot, Research
    • Kyle Richardson, Research
    • Ashish Sabharwal, Research
    • Carissa Schoenick, Product
    • Oyvind Tafjord, Research
    • Niket Tandon, Research