Digital Socrates

Aristo • 2023

DS Critique Bank (DSCB) is a dataset of multiple-choice questions with associated answers and explanations provided by "student models", along with "critiques" of the explanations provided by "critique models". Many of the instances have human annotations.


License: ODC-BY

The student models are: gpt-4-0613, gpt-3.5-turbo-0613, Llama-2-70b-Chat, and Llama-2-7b-Chat.

The critique models are: gpt-4-0613, DS-13B, and DS-7B. The latter two are Digital Socrates models fine-tuned on the DSCB training data, starting from Llama-2-Chat models.

The trained critique models can be accessed from the Hugging Face Model Hub. DS-13B is recommended over the smaller DS-7B model.

The following files are in the dataset:

DSCB-train-silver.jsonl: 3240 instances with silver GPT-4 critiques
DSCB-train-crowd-anno.jsonl: 3240 instances with human-annotated GPT-4 critiques
DSCB-train-expert.jsonl: 198 instances with human-edited critiques
DSCB-dev-crowd-anno.jsonl: 270 instances with human-annotated critiques from GPT-4, DS-13B, and DS-7B
DSCB-dev-non-anno.jsonl: 6330 instances with critiques from GPT-4, DS-13B, and DS-7B
DSCB-prompts.json: The prompts used for querying student model explanations and critique model critiques

The prompts have placeholders in double brackets, like [[QUESTION]], for inserting the different variables.
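As a rough illustration of how the double-bracket placeholders could be filled, here is a minimal sketch. The helper name fill_prompt and the example template are hypothetical; the actual templates and the full set of placeholder names are defined in DSCB-prompts.json, and the sketch assumes placeholder names consist of uppercase letters and underscores, as in [[QUESTION]].

```python
import re


def fill_prompt(template: str, variables: dict) -> str:
    """Replace each [[NAME]] placeholder in template with variables["NAME"]."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value supplied for placeholder [[{name}]]")
        return variables[name]

    # Assumption: placeholder names are uppercase letters and underscores.
    return re.sub(r"\[\[([A-Z_]+)\]\]", substitute, template)


# Hypothetical template for illustration; the real ones are in DSCB-prompts.json.
template = "Question: [[QUESTION]]\nAnswer with the letter of the best choice."
filled = fill_prompt(
    template,
    {"QUESTION": "Which gas do plants absorb? (A) Oxygen (B) Carbon dioxide"},
)
print(filled)
```

Raising an error on an unknown placeholder, rather than leaving it in place, makes a misspelled variable name visible before the prompt is sent to a model.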

The jsonl files have the following fields:

id: Unique id of instance (combining qid, student_model and student_prompt)
qid: Question id from original dataset
dataset: Which dataset the question comes from
question: Full text of question, with answer choices
gold_answer: The label of the correct answer to the question
student_model: Which student model was used
student_prompt: Which prompt was used for student model (see DSCB-prompts.json for actual prompt)
student_llm_options: Options (like temperature) used by student model
student_answer: Answer predicted by student model
student_accuracy: Whether answer is correct (1) or incorrect (0)
student_explanation: Explanation text provided by student model
student_raw_output: Raw output from student model (which was parsed into student_answer and student_explanation)
critiques: A list of critiques of the student explanation, with the following fields for each critique:
- critique_model: Which critique model was used
- critique_llm_options: Options (like temperature) used by critique model
- critique_text: The full text of the critique
- critique_elements: A dictionary of the elements of the critique, namely main_flaw, dimension, general_feedback, specific_feedback, and explanation_score (number from 0 to 5)
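The fields above can be read with the standard json module, one object per line. The following is a minimal sketch that computes accuracy per student model and lists the explanation scores given by each critique. The helper names and the sample records are hypothetical, only a subset of the fields is shown, and explanation_score is assumed to be stored as a number.

```python
import json
from collections import defaultdict


def read_jsonl(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def accuracy_by_student(instances):
    """Return the fraction of correct answers for each student_model."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for inst in instances:
        model = inst["student_model"]
        totals[model] += 1
        correct[model] += int(inst["student_accuracy"])  # 1 = correct, 0 = incorrect
    return {model: correct[model] / totals[model] for model in totals}


def critique_scores(instance):
    """Return (critique_model, explanation_score) pairs for one instance."""
    return [
        (c["critique_model"], c["critique_elements"].get("explanation_score"))
        for c in instance["critiques"]
    ]


# Hypothetical records for illustration; real files contain many more fields.
sample_lines = [
    json.dumps({"student_model": "gpt-4-0613", "student_accuracy": 1,
                "critiques": [{"critique_model": "DS-13B",
                               "critique_elements": {"explanation_score": 4}}]}),
    json.dumps({"student_model": "gpt-4-0613", "student_accuracy": 0,
                "critiques": [{"critique_model": "DS-13B",
                               "critique_elements": {"explanation_score": 1}}]}),
    json.dumps({"student_model": "Llama-2-7b-Chat", "student_accuracy": 0,
                "critiques": []}),
]

instances = list(read_jsonl(sample_lines))
accuracy = accuracy_by_student(instances)
scores = critique_scores(instances[0])
print(accuracy, scores)
```

To run this on a downloaded file, pass an open file handle instead of the sample list, for example read_jsonl(open("DSCB-dev-non-anno.jsonl", encoding="utf-8")).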

In addition, some instances have human annotations from crowd workers, both at the explanation level and for each critique. Annotated instances have a top-level explanation_annotations field, which is a list of JSON objects with the following fields:

explanation_score: Explanation score assigned by worker
dimensions: A list of major flaw dimensions identified by worker
worker: A unique ID associated with each worker
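Since explanation_annotations holds one object per worker, a per-instance summary usually means aggregating across workers. The sketch below averages the worker scores and counts how often each flaw dimension was selected. The function name, the sample record, and the dimension labels are hypothetical, and explanation_score is assumed to be numeric.

```python
from collections import Counter
from statistics import mean


def summarize_explanation_annotations(instance):
    """Aggregate worker annotations for one instance, or return None if absent."""
    annotations = instance.get("explanation_annotations", [])
    if not annotations:
        return None
    return {
        "mean_score": mean(a["explanation_score"] for a in annotations),
        "dimension_counts": Counter(
            dim for a in annotations for dim in a["dimensions"]
        ),
        "num_workers": len({a["worker"] for a in annotations}),
    }


# Hypothetical record; the dimension labels are placeholders, not the real ones.
annotated = {
    "explanation_annotations": [
        {"explanation_score": 2, "dimensions": ["dimension_a"], "worker": "w1"},
        {"explanation_score": 3, "dimensions": ["dimension_a", "dimension_b"],
         "worker": "w2"},
    ]
}

summary = summarize_explanation_annotations(annotated)
print(summary)
```

Using instance.get with a default keeps the function safe on instances that were not annotated, which the dataset description says is common.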

Each critique may also have a critique_annotations field, which is likewise a list of JSON objects with these fields:

critique_score: The quality of the critique (on 0-3 scale) according to worker
worker: A unique ID associated with each worker
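One possible use of critique_annotations is to compare critique models by their average worker-assigned quality. The following is a minimal sketch under that assumption: the function name and the sample record are hypothetical, and critiques without annotations are simply skipped.

```python
from collections import defaultdict
from statistics import mean


def mean_critique_score_by_model(instances):
    """Average the 0-3 critique_score over all worker annotations per critique_model."""
    scores = defaultdict(list)
    for inst in instances:
        for critique in inst.get("critiques", []):
            # critique_annotations is optional, so default to an empty list.
            for annotation in critique.get("critique_annotations", []):
                scores[critique["critique_model"]].append(annotation["critique_score"])
    return {model: mean(values) for model, values in scores.items()}


# Hypothetical record for illustration only.
sample = [
    {
        "critiques": [
            {"critique_model": "gpt-4-0613",
             "critique_annotations": [
                 {"critique_score": 3, "worker": "w1"},
                 {"critique_score": 2, "worker": "w2"},
             ]},
            {"critique_model": "DS-13B",
             "critique_annotations": [{"critique_score": 2, "worker": "w1"}]},
            {"critique_model": "DS-7B"},  # no annotations on this critique
        ]
    }
]

model_scores = mean_critique_score_by_model(sample)
print(model_scores)
```

Models with no annotated critiques do not appear in the result, so a missing key signals missing annotations and not a score of zero.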

Please cite this work as:

@misc{gu2024digital,
      title={Digital {Socrates}: Evaluating {LLMs} through Explanation Critiques}, 
      author={Yuling Gu and Oyvind Tafjord and Peter Clark},
      year={2024},
      eprint={2311.09613},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Authors

Yuling Gu, Oyvind Tafjord, Peter Clark
