Towards teachable reasoning systems
Bhavana Dalvi Mishra / January 10, 2023
What are they thinking? You may find yourself asking this question when you're having a conversation with a friend or loved one and want to understand how that person came to a certain conclusion. But this question isn't limited to people — researchers are starting to ask it of their artificially intelligent models.
Traditionally, outputs from AI models were shrouded in mystery. Researchers might understand something about an output based on the dataset a model was trained on, but may wonder about the specific data that model was pulling from to answer a query or provide its output. But what if a model could show its work, transparently outlining the reasoning it used to produce an output — and then, in turn, be corrected by a user if that reasoning was faulty?
At AI2, a team of researchers from the Aristo project aspired to build a teachable reasoning system with two desired attributes: (1) users can understand why the system chose a certain answer, and (2) users can correct the system's behavior by providing natural language (NL) feedback.
Figure 1: Desiderata for a teachable reasoning system: users can understand why the system chose a certain answer, and can correct the system's behavior by providing NL feedback.
The Approach
The proposed system, called TeachMe, consists of two main components: (1) a T5-based machine reasoning model, Entailer, which can produce truthful and faithful chains of reasoning (enabling us to understand why the system chose a certain answer), and (2) a dynamic memory of user feedback (enabling us to correct the system's behavior).
1. Entailer
Pretrained language models contain a vast amount of knowledge, and can be thought of as a kind of knowledge base to tap into. Our approach articulates this latent knowledge in a goal-directed way, namely by producing a faithful chain of reasoning from facts the model validates ("believes") as true to an answer. The proposed model, Entailer, uses a combination of generation and verification to construct multi-step reasoning chains. For each step, Entailer over-generates candidate entailments, then filters out those that do not conform to its own internal knowledge ("beliefs") by self-querying: it asks itself whether (a) the generated premises (leaves of the proof step) are true, and (b) each entailment step is valid (Figure 2). It then recursively backward chains on premises until the overall proof confidence cannot be further improved (or a depth limit d is reached). Finally, the candidate answer supported by the highest-scoring chain of reasoning is returned. As a result, the system has materialized some of the latent knowledge from which the selected answer follows. Most significantly, the resulting proof is both faithful (the answer follows from the proof) and truthful (the proof reflects the system's beliefs), providing a previously unavailable window into the model's beliefs about the world and their implications.
Figure 2: When presented with a hypothesis, our approach, Entailer, searches for a 1-step entailment proof that supports it by first over-generating candidate proofs, then removing those that the model itself does not "believe" (i.e., it verifies via self-querying that it considers all the generated proof elements correct).
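To make the search concrete, here is a minimal, hypothetical sketch of this backward-chaining procedure in Python. The helper functions (`generate_candidate_proofs`, `belief_score`, `entailment_score`) are placeholder stand-ins for Entailer's T5-based generation and self-querying, not the actual model API; only the recursion, scoring, and depth limit mirror the description above.

```python
from dataclasses import dataclass, field
from typing import List


def generate_candidate_proofs(hypothesis: str) -> List[List[str]]:
    """Over-generate candidate premise sets that might entail `hypothesis`."""
    return []  # placeholder: produced by Entailer's generation model in practice


def belief_score(statement: str) -> float:
    """Self-query: how strongly does the model 'believe' `statement` is true?"""
    return 0.5  # placeholder


def entailment_score(premises: List[str], hypothesis: str) -> float:
    """Self-query: how valid is the entailment step `premises => hypothesis`?"""
    return 0.5  # placeholder


@dataclass
class Proof:
    hypothesis: str
    premises: List["Proof"] = field(default_factory=list)
    score: float = 0.0  # confidence in this (sub)proof


def prove(hypothesis: str, depth: int = 0, max_depth: int = 3) -> Proof:
    """Backward-chain on `hypothesis`, keeping the highest-confidence proof."""
    # Start from the model's direct belief in the bare hypothesis.
    best = Proof(hypothesis, [], belief_score(hypothesis))
    if depth >= max_depth:  # depth limit d reached
        return best

    for premises in generate_candidate_proofs(hypothesis):
        # Recursively prove each premise, then score the entailment step itself.
        subproofs = [prove(p, depth + 1, max_depth) for p in premises]
        step_score = entailment_score(premises, hypothesis)
        chain_score = min([step_score] + [sp.score for sp in subproofs])
        # Keep the chain only if it improves the overall proof confidence.
        if chain_score > best.score:
            best = Proof(hypothesis, subproofs, chain_score)
    return best
```

To answer a multiple-choice question under this sketch, one would turn each candidate answer into a hypothesis, run the search on each, and return the answer whose proof scores highest.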
2. TeachMe
People are typically able to provide a chain of reasoning for their decisions, and may change their mind if a flaw in their knowledge or reasoning is exposed. Our goal is for machines to similarly provide reasoned answers to questions, showing how the answer follows from their internal knowledge (and possibly externally available information), and to change their answer if errors in that knowledge are identified. Our approach has three components. First, the system produces answers supported by an entailment-based chain of reasoning, showing how the answer follows from the system's own internal beliefs. Second, if an answer is wrong, a user can inspect the reasoning to diagnose and correct the failure. For example, in Figure 3, the system incorrectly concludes that "a magnet can pick up a penny" from its over-general (false) belief that "metals are magnetic". The user can correct the mistake by asserting that "not all metals are magnetic" - in particular, copper is not. Finally, to store and apply the user's feedback, we augment the model with a dynamic memory. Given a new question (or an old question re-asked), TeachMe retrieves user-supplied facts from this memory. These are then used as context while generating an entailment-supported answer to the question, e.g., step (C) in Figure 3. This helps override prior, erroneous model beliefs, biasing TeachMe to avoid similar mistakes in the future - a novel application of memory-based continual learning to belief maintenance, in which the model itself remains fixed (frozen) and retraining is not required.
Figure 3: TeachMe augments the basic question-answering model with a memory of user feedback. (A) Given a new question, facts retrieved from memory are used as additional context for the model, influencing its answers and proofs. (B) If the user disagrees with an answer, they localize the error in the explanation and offer corrective feedback, which is added to memory. (C) These new facts can then be retrieved if the query is re-asked, helping the system avoid repeating mistakes. They also help improve answers on new, similar questions asked later, so the system improves over time.
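The loop below is a minimal sketch of this memory-augmented setup, under simplifying assumptions: `FeedbackMemory` uses a toy word-overlap retriever, and `answer_with_proof` is a placeholder for the frozen Entailer model rather than its real interface. It illustrates steps (A)-(C) from Figure 3: retrieve relevant user-supplied facts, pass them to the frozen model as extra context, and grow the memory whenever the user corrects a faulty belief.

```python
from typing import List, Tuple


class FeedbackMemory:
    """Dynamic memory of user-supplied corrective facts; the model itself stays frozen."""

    def __init__(self) -> None:
        self.facts: List[str] = []

    def add(self, fact: str) -> None:
        self.facts.append(fact)

    def retrieve(self, question: str, k: int = 3) -> List[str]:
        # Toy relevance score: number of words shared with the question.
        q_words = set(question.lower().split())
        ranked = sorted(
            self.facts,
            key=lambda fact: len(q_words & set(fact.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def answer_with_proof(question: str, context: List[str]) -> Tuple[str, List[str]]:
    """Placeholder for the frozen Entailer model: returns (answer, proof chain)."""
    return "unknown", list(context)


def ask(question: str, memory: FeedbackMemory) -> Tuple[str, List[str]]:
    # (A) Retrieved facts are passed as additional context, overriding prior beliefs.
    context = memory.retrieve(question)
    return answer_with_proof(question, context)


memory = FeedbackMemory()
# (B) The user spots a faulty belief in a proof and adds a correction.
memory.add("Not all metals are magnetic; copper is not magnetic.")
# (C) Re-asking the question (or a similar one) now retrieves the correction.
answer, proof = ask("Can a magnet pick up a penny?", memory)
```

Because the correction lives in memory rather than in the model weights, the same fact can influence any later question that retrieves it, which is what lets the system improve over time without retraining.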
The Results
To our knowledge, Entailer is the first system to generate multi-step chains that are both faithful (the answer follows from the reasoning) and truthful (the chain reflects the system's own internal beliefs). In evaluations on two different datasets, users judged that a majority (70%+) of generated chains clearly show how an answer follows from a set of facts - substantially better than a high-performance baseline - while preserving answer accuracy. Materializing the model beliefs that systematically support an answer enables interaction in which users can understand the model's beliefs and correct its misunderstandings when an answer is wrong. Our TeachMe architecture augments Entailer with a dynamic memory of user feedback, containing user-supplied corrections to erroneous model beliefs identified during such interaction. Retrievals from memory are used as additional context for QA, to help avoid previous mistakes in similar new situations - a novel application of memory-based continual learning. With simulated feedback, we find that TeachMe continually improves over time without model retraining, requiring feedback on only 25% of training examples to reach within 1% of the upper bound (feedback on all examples). In experiments with real users, we observe a similar trend, with performance improving by over 15% on a hidden test set after teaching. This suggests new opportunities for using frozen language models in an interactive setting where users can inspect, debug, and correct the model's beliefs, leading to improved system performance over time.
The Impact
Our goal is a teachable reasoning system, where users can interact with it to see its beliefs and reasoning, and correct it when it is wrong. We have shown that by embedding an entailment-based QA model in a larger system with a dynamic, persistent memory, users can correct and override model beliefs, resulting in an overall system that can improve over time without retraining. To our knowledge, this is the first system to show that user-provided and model-internal beliefs can be integrated for systematic reasoning. This is significant as it is a step towards systems that can not only interact with users, but also continually learn from them.
The Next Steps
We hope that our TeachMe effort will inspire the research community to build AI systems that can learn directly from users in a conversational way, rather than solely training on large datasets. It also suggests a way of overcoming the opaqueness of neural systems, by viewing models as components in a larger system with a persistent memory that can systematically reason.
To read more, see our papers authored by Bhavana Dalvi Mishra, Oyvind Tafjord, and Peter Clark:
Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning
Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement