Digital Socrates

Aristo • 2023
DS Critique Bank (DSCB) is a dataset of multiple-choice questions with associated answers and explanations provided by "student models", along with "critiques" of the explanations provided by "critique models". Many of the instances have human annotations.
License: ODC-BY

The student models are: gpt-4-0613, gpt-3.5-turbo-0613, Llama-2-70b-Chat, and Llama-2-7b-Chat.

The critique models are: gpt-4-0613, DS-13B, and DS-7B (the latter two are Digital Socrates models fine-tuned on the DSCB training data starting from Llama-2-Chat models).

The trained critique models can be accessed from the Hugging Face Model Hub. The recommended model is DS-13B; DS-7B is a smaller alternative.

The following files are in the dataset:

  • DSCB-train-silver.jsonl: 3240 instances with silver GPT-4 critiques
  • DSCB-train-crowd-anno.jsonl: 3240 instances with human-annotated GPT-4 critiques
  • DSCB-train-expert.jsonl: 198 instances with human-edited critiques
  • DSCB-dev-crowd-anno.jsonl: 270 instances with human-annotated critiques from GPT-4, DS-13B, and DS-7B
  • DSCB-dev-non-anno.jsonl: 6330 instances with critiques from GPT-4, DS-13B, and DS-7B
  • DSCB-prompts.json: The prompts used for querying student model explanations and critique model critiques

The prompts have placeholders in double brackets, like [[QUESTION]], for inserting the different variables.
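Filling a template is a simple string substitution. A minimal sketch (the template text below is illustrative, not copied from DSCB-prompts.json):

```python
def fill_prompt(template: str, variables: dict) -> str:
    """Replace each [[NAME]] placeholder with its value."""
    for name, value in variables.items():
        template = template.replace(f"[[{name}]]", value)
    return template

# Illustrative template with a [[QUESTION]] placeholder
template = "Answer the following question.\n[[QUESTION]]\nAnswer:"
prompt = fill_prompt(template, {"QUESTION": "What is 2 + 2? (A) 3 (B) 4"})
```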

The jsonl files have the following fields:

  • id: Unique id of instance (combining qid, student_model and student_prompt)
  • qid: Question id from original dataset
  • dataset: Which dataset the question comes from
  • question: Full text of question, with answer choices
  • gold_answer: The label of the correct answer to the question
  • student_model: Which student model was used
  • student_prompt: Which prompt was used for student model (see DSCB-prompts.json for actual prompt)
  • student_llm_options: Options (like temperature) used by student model
  • student_answer: Answer predicted by student model
  • student_accuracy: Whether answer is correct (1) or incorrect (0)
  • student_explanation: Explanation text provided by student model
  • student_raw_output: Raw output from student model (which was parsed into student_answer and student_explanation)
  • critiques: A list of critiques of the student explanation, with the following fields for each critique:
    • critique_model: Which critique model was used
    • critique_llm_options: Options (like temperature) used by critique model
    • critique_text: The full text of the critique
    • critique_elements: A dictionary of the elements of the critique, namely main_flaw, dimension, general_feedback, specific_feedback, and explanation_score (number from 0 to 5)
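Since the files are jsonl (one json object per line), they can be read line by line with the standard library. A sketch, using a minimal illustrative record shaped like the fields above (not real dataset content):

```python
import json

# Minimal illustrative record with a subset of the documented fields
sample_line = json.dumps({
    "id": "q1-gpt-4-0613-p1",
    "qid": "q1",
    "student_model": "gpt-4-0613",
    "student_answer": "B",
    "student_accuracy": 1,
    "student_explanation": "Because ...",
    "critiques": [
        {"critique_model": "DS-13B",
         "critique_text": "...",
         "critique_elements": {"explanation_score": 4}},
    ],
})

def load_records(lines):
    """Parse jsonl: one json object per line."""
    return [json.loads(line) for line in lines]

# In practice, pass open("DSCB-dev-non-anno.jsonl") instead of [sample_line]
records = load_records([sample_line])
for rec in records:
    for crit in rec["critiques"]:
        score = crit["critique_elements"]["explanation_score"]
```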

In addition, some instances will have human annotations from crowd workers, both at the explanation level and for each critique. At the top level there will then be an explanation_annotations field, which is a list of json objects with the following fields:

  • explanation_score: Explanation score assigned by worker
  • dimensions: A list of major flaw dimensions identified by worker
  • worker: A unique ID associated with each worker

For each critique, there might be a critique_annotations field, which is again a list of json objects with these fields:

  • critique_score: The quality of the critique (on 0-3 scale) according to worker
  • worker: A unique ID associated with each worker
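Since each annotation list can contain entries from multiple workers, a common step is averaging their scores per instance. A sketch, with illustrative annotation lists shaped like the fields above:

```python
def mean_score(annotations, key):
    """Average a numeric score across worker annotations (None if empty)."""
    scores = [a[key] for a in annotations]
    return sum(scores) / len(scores) if scores else None

# Illustrative annotations (not real dataset content)
explanation_annotations = [
    {"explanation_score": 4, "dimensions": [], "worker": "w1"},
    {"explanation_score": 2, "dimensions": ["some_flaw"], "worker": "w2"},
]
critique_annotations = [
    {"critique_score": 3, "worker": "w1"},
    {"critique_score": 2, "worker": "w2"},
]

avg_explanation = mean_score(explanation_annotations, "explanation_score")  # 3.0
avg_critique = mean_score(critique_annotations, "critique_score")           # 2.5
```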

Please cite this work as:

      Yuling Gu, Oyvind Tafjord, and Peter Clark. Digital Socrates: Evaluating LLMs through Explanation Critiques. 2023.