ARISTO

Build machines that read, learn and reason.

The Aristo Project aims to build systems that demonstrate a deep understanding of the world, integrating technologies for reading, learning, reasoning, and explanation.

Our research integrates multiple AI technologies, including:
  • Natural language processing
  • Information extraction
  • Knowledge representation
  • Machine reasoning
  • Commonsense knowledge

Research Areas

Probing Reasoning with Language Models

Language models (LMs) have dominated much of AI recently. But what kind(s) of reasoning are they capable of? And how can they be taught to do more? We are developing analytical datasets to probe LMs and help answer these questions.

Multihop Reasoning

Many questions require multiple pieces of information to be combined to arrive at an answer. We are developing new multihop models capable of identifying and combining relevant facts to answer such questions.
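As a toy illustration of what combining facts means, a two-hop question can be answered by chaining two lookups. This is only a sketch: the fact store, the `hop` helper, and the example question are made up here and do not represent Aristo's multihop models, which identify and combine relevant facts with learned components rather than a hand-built table.

```python
# Toy fact store: (subject, relation) -> object. All facts are illustrative.
FACTS = {
    ("iron nail", "made_of"): "iron",
    ("iron", "conducts"): "electricity",
}

def hop(subject, relations):
    """Follow a chain of relations from a subject.

    Returns (answer, facts_used); answer is None if a needed fact is missing.
    """
    used = []
    current = subject
    for relation in relations:
        key = (current, relation)
        if key not in FACTS:
            return None, used  # the chain breaks here
        used.append((current, relation, FACTS[key]))
        current = FACTS[key]
    return current, used

# "What does an iron nail conduct?" needs two facts combined.
answer, chain = hop("iron nail", ["made_of", "conducts"])
print(answer)
for fact in chain:
    print(fact)
```

Neither fact alone answers the question; the list of facts used doubles as a simple trace of how the answer was reached.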

Explanation

An intelligent system should not only answer questions correctly, but also be able to explain why its answers are correct. Such a capability is essential for practical acceptance of AI technology. It is also essential for the broader goals of communicating knowledge to a user, and receiving correction from the user when the system's answer is wrong.

Reasoning about Actions

A key aspect of intelligence is being able to reason about the dynamics of the world. This requires modeling what state the world might be in, and how different actions might affect that state. Such capabilities are essential for understanding what happens during a procedure or process, for planning, and for reasoning about "what if..." scenarios.
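The idea of modeling a world state and how actions change it can be sketched in a few lines. This is a minimal illustration under assumed names: the `apply_action` and `simulate` helpers, the `("move", entity, location)` action format, and the water example are all hypothetical and are not taken from any Aristo system.

```python
def apply_action(state, action):
    """Return a new state after one action; the input state is left unchanged."""
    kind, entity, location = action
    if kind != "move":
        raise ValueError(f"unknown action kind: {kind}")
    new_state = dict(state)
    new_state[entity] = location
    return new_state

def simulate(state, actions):
    """Return the list of states visited, starting with the initial state."""
    states = [state]
    for action in actions:
        states.append(apply_action(states[-1], action))
    return states

# State maps each entity to its current location.
initial = {"water": "soil"}
process = [("move", "water", "root"), ("move", "water", "leaf")]

final = simulate(initial, process)[-1]        # the process as described
what_if = simulate(initial, process[:1])[-1]  # "what if" the last step never happens?
print(final, what_if)
```

Because each step returns a fresh state, alternative scenarios can be explored from the same starting point without interfering with one another.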

  • ProofWriter
    Generating Implications, Proofs, and Abductive Statements over Natural Language

    Like RuleTaker, ProofWriter determines whether statements are True or False based on rules given in natural language, but it also generates the proof of its answers.
  • ModularQA
    ModularQA answers questions by breaking them down into a series of smaller, more specific ones. This produces answers in a human-like way that's more explainable than black-box systems.

    ModularQA is a neuro-symbolic question-answering system that answers complex questions by asking a series of sub-questions to existing simpler QA systems or symbolic modules. It explains each of its reasoning steps in language, in terms of a simple question and its answer as produced by a simpler model or a math…
    • BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

      Nora Kassner, Oyvind Tafjord, H. Schutze, P. Clark. EMNLP 2021. Although pretrained language models (PTLMs) have been shown to contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after using specialized training techniques to reduce inconsistency. As a…
    • Explaining Answers with Entailment Trees

      Bhavana Dalvi, Peter A. Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark. EMNLP 2021. Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by not just listing supporting textual evidence (“rationales”), but also showing how such evidence leads to the answer in a systematic way. If this could be done…
    • GooAQ: Open Question Answering with Diverse Answer Types

      Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hanna Hajishirzi, Chris Callison-Burch. Findings of EMNLP 2021. While day-to-day questions come with a variety of answer types, the current question-answering (QA) literature has failed to adequately address the answer diversity of questions. To this end, we present GooAQ, a large-scale dataset with a variety of answer…
    • proScript: Partially Ordered Scripts Generation

      Keisuke Sakaguchi, Chandra Bhagavatula, R. L. Bras, Niket Tandon, P. Clark, Yejin Choi. Findings of EMNLP 2021. Scripts, standardized event sequences describing typical everyday activities, have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information. However, to date they have proved hard to author or…
    • Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?

      Jieyu Zhao, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Kai-Wei Chang. ACL-IJCNLP 2021. Is it possible to use natural language to intervene in a model’s behavior and alter its prediction in a desired way? We investigate the effectiveness of natural language interventions for reading-comprehension systems, studying this in the context of social…

    The Fermi Challenge

    A challenge dataset of Fermi (estimation) problems, currently beyond the capabilities of modern methods.

    BeliefBank

    4,998 facts and 12,147 constraints to test a model's consistency

    Dataset of 4,998 simple facts and 12,147 constraints to test, and improve, a model's accuracy and consistency
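To make the idea of testing consistency with constraints concrete, here is a small sketch. Everything in it is an assumption for illustration: the example statements, the `(premise, conclusion, expected)` constraint format, and the `find_violations` helper are hypothetical and do not reproduce BeliefBank's actual data format or method.

```python
# A model's yes/no answers to simple statements (illustrative values).
beliefs = {
    "a swallow is a bird": True,
    "a swallow can fly": True,
    "a swallow is a fish": True,  # conflicts with being a bird
}

# Each constraint: if the premise is believed true, the conclusion
# should have the expected truth value.
constraints = [
    ("a swallow is a bird", "a swallow is a fish", False),
    ("a swallow is a bird", "a swallow can fly", True),
    ("a swallow is a fish", "a swallow has gills", True),
]

def find_violations(beliefs, constraints):
    """Return the constraints contradicted by the current beliefs."""
    violated = []
    for premise, conclusion, expected in constraints:
        if beliefs.get(premise) is not True:
            continue  # constraint only applies when the premise is believed
        if conclusion in beliefs and beliefs[conclusion] != expected:
            violated.append((premise, conclusion, expected))
    return violated

violations = find_violations(beliefs, constraints)
print(violations)
```

Constraints whose conclusion has no recorded belief are skipped in this sketch rather than counted as violations; a stricter check could treat them as open questions to ask the model.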

    EntailmentBank

    2k multi-step entailment trees, explaining the answers to ARC science questions

    StrategyQA

    2,780 implicit multi-hop reasoning questions

    StrategyQA is a question-answering benchmark focusing on open-domain questions where the required reasoning steps are implicit in the question and should be inferred using a strategy. StrategyQA includes 2,780 examples, each consisting of a strategy question, its decomposition, and evidence paragraphs.

    “Knowing is not enough, we must apply. Willing is not enough, we must do.”
    Johann Wolfgang von Goethe

    Paul Allen's 'Digital Aristotle' sets eyes on accomplishing practical tasks

    KOMO News
    February 5, 2020

    Allen Institute launches GENIE, a leaderboard for human-in-the-loop language model benchmarking

    VentureBeat
    January 20, 2021

    Artificial intelligence system passes eighth-grade science test with high distinction (translated from Hebrew)

    Haaretz
    September 6, 2019

    Allen Institute's Aristo AI makes breakthrough, passes eighth-grade science test

    TechSpot
    September 5, 2019

    How to tutor AI from an ‘F’ to an ‘A’

    Vulcan Inc
    September 4, 2019

    A Breakthrough for A.I. Technology: Passing an 8th-Grade Science Test

    The New York Times
    September 4, 2019

    Allen Institute’s Aristo AI system finally passes an eighth-grade science test

    GeekWire
    September 4, 2019

    AI assistants say dumb things, and we’re about to find out why

    MIT Tech Review
    March 14, 2018

    Team

    • Peter Clark, Research
    • Bhavana Dalvi, Research
    • Matt Finlayson, Predoctoral Young Investigator
    • Ashwin Kalyan, Research
    • Tushar Khot, Research
    • Kyle Richardson, Research
    • Ashish Sabharwal, Research
    • Carissa Schoenick, Product
    • Oyvind Tafjord, Research
    • Niket Tandon, Research