Digital Socrates

Aristo • 2023
DS Critique Bank (DSCB) is a dataset of multiple-choice questions with associated answers and explanations provided by "student models", along with "critiques" of the explanations provided by "critique models". Many of the instances have human annotations.
License: ODC-BY

The student models are: gpt-4-0613, gpt-3.5-turbo-0613, Llama-2-70b-Chat, and Llama-2-7b-Chat.

The critique models are: gpt-4-0613, DS-13B, and DS-7B (the latter two are Digital Socrates models fine-tuned on the DSCB training data starting from Llama-2-Chat models).

The trained critique models can be accessed from the Hugging Face Model Hub. The recommended model is DS-13B; DS-7B is a smaller alternative.

The following files are in the dataset:

  • DSCB-train-silver.jsonl: 3240 instances with silver GPT-4 critiques
  • DSCB-train-crowd-anno.jsonl: 3240 instances with human-annotated GPT-4 critiques
  • DSCB-train-expert.jsonl: 198 instances with human-edited critiques
  • DSCB-dev-crowd-anno.jsonl: 270 instances with human-annotated critiques from GPT-4, DS-13B, and DS-7B
  • DSCB-dev-non-anno.jsonl: 6330 instances with critiques from GPT-4, DS-13B, and DS-7B
  • DSCB-prompts.json: The prompts used for querying student model explanations and critique model critiques

The prompts have placeholders in double brackets, like [[QUESTION]], for inserting the different variables.
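Filling a template is a simple string substitution. A minimal sketch (the template text below is illustrative, not copied from DSCB-prompts.json):

```python
def fill_prompt(template: str, variables: dict) -> str:
    """Replace each [[NAME]] placeholder with its value."""
    for name, value in variables.items():
        template = template.replace(f"[[{name}]]", value)
    return template

# Illustrative template with a [[QUESTION]] placeholder
template = "Answer the following question.\n[[QUESTION]]\nAnswer:"
prompt = fill_prompt(template, {"QUESTION": "What is 2 + 2? (A) 3 (B) 4"})
```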

The jsonl files have the following fields:

  • id: Unique id of instance (combining qid, student_model and student_prompt)
  • qid: Question id from original dataset
  • dataset: Which dataset the question comes from
  • question: Full text of question, with answer choices
  • gold_answer: The label of the correct answer to the question
  • student_model: Which student model was used
  • student_prompt: Which prompt was used for student model (see DSCB-prompts.json for actual prompt)
  • student_llm_options: Options (like temperature) used by student model
  • student_answer: Answer predicted by student model
  • student_accuracy: Whether answer is correct (1) or incorrect (0)
  • student_explanation: Explanation text provided by student model
  • student_raw_output: Raw output from student model (which was parsed into student_answer and student_explanation)
  • critiques: A list of critiques of the student explanation, with the following fields for each critique:
    • critique_model: Which critique model was used
    • critique_llm_options: Options (like temperature) used by critique model
    • critique_text: The full text of the critique
    • critique_elements: A dictionary of the elements of the critique, namely main_flaw, dimension, general_feedback, specific_feedback, and explanation_score (number from 0 to 5)
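Since the files are jsonl (one json object per line), they can be read line by line with the standard library. A sketch, using a minimal illustrative record shaped like the fields above (not real dataset content):

```python
import json

# Minimal illustrative record with a subset of the documented fields
sample_line = json.dumps({
    "id": "q1-gpt-4-0613-p1",
    "qid": "q1",
    "student_model": "gpt-4-0613",
    "student_answer": "B",
    "student_accuracy": 1,
    "student_explanation": "Because ...",
    "critiques": [
        {"critique_model": "DS-13B",
         "critique_text": "...",
         "critique_elements": {"explanation_score": 4}},
    ],
})

def load_records(lines):
    """Parse jsonl: one json object per line."""
    return [json.loads(line) for line in lines]

# In practice, pass open("DSCB-dev-non-anno.jsonl") instead of [sample_line]
records = load_records([sample_line])
for rec in records:
    for crit in rec["critiques"]:
        score = crit["critique_elements"]["explanation_score"]
```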

In addition, some instances will have human annotations from crowd workers, both at the explanation level and for each critique. At the top level there will then be an explanation_annotations field, which is a list of json objects with the following fields:

  • explanation_score: Explanation score assigned by worker
  • dimensions: A list of major flaw dimensions identified by worker
  • worker: A unique ID associated with each worker

For each critique, there might be a critique_annotations field, which is again a list of json objects with these fields:

  • critique_score: The quality of the critique (on 0-3 scale) according to worker
  • worker: A unique ID associated with each worker
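Since each annotation list can contain entries from multiple workers, a common step is averaging their scores per instance. A sketch, with illustrative annotation lists shaped like the fields above:

```python
def mean_score(annotations, key):
    """Average a numeric score across worker annotations (None if empty)."""
    scores = [a[key] for a in annotations]
    return sum(scores) / len(scores) if scores else None

# Illustrative annotations (not real dataset content)
explanation_annotations = [
    {"explanation_score": 4, "dimensions": [], "worker": "w1"},
    {"explanation_score": 2, "dimensions": ["some_flaw"], "worker": "w2"},
]
critique_annotations = [
    {"critique_score": 3, "worker": "w1"},
    {"critique_score": 2, "worker": "w2"},
]

avg_explanation = mean_score(explanation_annotations, "explanation_score")  # 3.0
avg_critique = mean_score(critique_annotations, "critique_score")           # 2.5
```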

Please cite this work as:

      Yuling Gu, Oyvind Tafjord, and Peter Clark. Digital Socrates: Evaluating LLMs through Explanation Critiques. 2023.