Charades and Charades-Ego Datasets

These datasets guide our research into unstructured video activity recognition and commonsense reasoning for daily human activities.

Charades-Ego v1.0! Update, April 1st 2018
We are happy to announce that our new dataset has been released! Please refer to the new publications for details [*,*]. This work will be presented as a spotlight presentation at CVPR'18. Teaser Video:

Charades-Ego is a dataset composed of 7,860 videos of daily indoor activities collected through Amazon Mechanical Turk, recorded from both third-person and first-person viewpoints. The dataset contains 68,536 temporal annotations for 157 action classes.

Code Update! Update, Feb 5th 2018
We have added PyTorch baselines to

Dataset Update! Update, Dec 1st 2017
We have added additional attributes and code to generate the visualizations in "What Actions are Needed for Understanding Human Actions in Videos?"

Evaluation Server is still open for submissions. Update, Dec 1st 2017
The Charades Challenge evaluation server is still accepting new submissions. The 2017 challenge is over, but the leaderboard is still changing. This allows everyone to compare their method with the publicly available results of other algorithms on a held-out test set. Currently we allow 5 submissions per month, for each sufficiently unique algorithm. Do not hesitate to contact us if you have any questions. For more information visit

New analysis paper on activity recognition! Update, Sep 1st 2017
We have just released our work analyzing the state of activity recognition (arXiv). This paper will be presented at ICCV2017 in Venice.

Dataset Update! Update, May 15th 2017
We have added precomputed Two-Stream features using the available code and models; see the README for more details. More accurate scene annotations have also been added.

Charades Challenge at CVPR2017

Update, March 1st 2017
The Charades Challenge has two tracks: Video Classification and Activity Localization. The top team in each track will be invited to give an oral presentation, and all teams are encouraged to present their work in the poster session. There will also be monetary rewards for the top teams. The Charades Challenge is part of the CVPR 2017 Workshop on Visual Understanding Across Modalities. For more information:

Dataset Update (Localization)! Update, February 27th 2017
Charades has been updated to include action localization, RGB frames, Optical Flow frames, and more detailed object and verb annotations.

New paper on activity recognition! Update, December 19th 2016
We have just released our work demonstrating how to use the rich structure of the dataset (objects, actions, scenes, etc.) to get significant gains on activity recognition. This work also introduces the problem of localizing activities. (arXiv) (Update: This paper will be presented at CVPR2017 in Honolulu, Hawaii.)

Charades v1.0! Update, July 7th 2016
We are happy to announce that the dataset has been released! Each video has been exhaustively annotated using consensus from 4 workers on the training set, and from 8 workers on the test set. Please refer to the updated accompanying publication for details. Updated paper draft: PDF

Charades is a dataset composed of 9,848 videos of daily indoor activities collected through Amazon Mechanical Turk. 267 different users were each presented with a sentence that includes objects and actions from a fixed vocabulary, and they recorded a video acting out the sentence (as in a game of Charades). The dataset contains 66,500 temporal annotations for 157 action classes, 41,104 labels for 46 object classes, and 27,847 textual descriptions of the videos. This work was presented at ECCV2016.
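The temporal annotations ship as CSV, where each video's actions are listed as semicolon-separated `c<class> start end` triples (class id, start time in seconds, end time in seconds). A minimal parsing sketch, assuming that field format — check the shipped README for the exact column layout:

```python
# Hedged sketch: parse a Charades-style "actions" CSV field such as
# "c092 11.9 21.2;c147 0.0 12.6" into (class, start, end) tuples.
# The exact field name and layout are assumptions; the README is authoritative.

def parse_actions(field):
    """Parse an actions field into a list of (class_id, start_s, end_s)."""
    if not field:          # videos with no labeled actions have an empty field
        return []
    out = []
    for item in field.split(';'):
        cls, start, end = item.split(' ')
        out.append((cls, float(start), float(end)))
    return out
```

Each tuple can then be matched against the 157-class vocabulary for training or evaluation.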

Please contact us with questions about the dataset.

The Charades-Ego Dataset

  • README (Updated 4/1/18)
  • License
  • Annotations & Evaluation Code (2 MB) (Updated 4/1/18)
  • Data (scaled to 480p, 11 GB) (Updated 4/1/18)
  • Data (original size) (47 GB)
  • RGB frames at 24fps (53 GB)
  • Code @ GitHub

    The Charades Dataset

    Charades Classification performance

    • AlexNet: 11.2% mAP
    • C3D: 10.9% mAP
    • Two-Stream: 14.2% mAP
    • IDT: 17.2% mAP
    • Combined: 18.6% mAP
    • Asynchronous Temporal Fields: 22.4% mAP [*]
    • (Uses the official Charades_v1_classify.m evaluation code. More details may be found in the README and the papers below.)
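The classification metric is mean average precision over the 157 classes. The sketch below mirrors the usual multi-label mAP computation in Python; it is illustrative only — the official Charades_v1_classify.m script remains the authoritative evaluation.

```python
# Hedged sketch of multi-label classification mAP (illustrative;
# Charades_v1_classify.m is the official evaluation).
import numpy as np

def average_precision(scores, labels):
    """AP for one class: scores (n_videos,), labels in {0, 1}."""
    order = np.argsort(-scores)          # rank videos by descending confidence
    labels = labels[order]
    hits = np.cumsum(labels)             # true positives seen at each rank
    ranks = np.arange(1, len(labels) + 1)
    if not labels.any():                 # no positives for this class
        return 0.0
    return (hits[labels == 1] / ranks[labels == 1]).mean()

def mean_average_precision(score_matrix, label_matrix):
    """mAP over classes; both matrices are (n_videos, n_classes)."""
    aps = [average_precision(score_matrix[:, c], label_matrix[:, c])
           for c in range(score_matrix.shape[1])]
    return float(np.mean(aps))
```

With such a function, a (n_videos, 157) score matrix and the ground-truth label matrix are all that is needed to reproduce a classification number.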

    Charades Localization performance

    • Random: 2.42% mAP
    • VGG-16: 7.89% mAP
    • Two-Stream: 8.94% mAP
    • LSTM: 9.60% mAP
    • LSTM w/ post-processing: 10.4% mAP
    • Two-Stream w/ post-processing: 10.9% mAP
    • Asynchronous Temporal Fields: 12.8% mAP [*]
    • (Uses the official Charades_v1_localize.m evaluation code. More details may be found in the README and the papers below.)
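Localization is scored at the frame level: to keep evaluation tractable, a fixed number of equally spaced timepoints per video is scored against the ground-truth intervals. A sketch of building those per-frame labels, assuming 25 timepoints per video (the exact protocol is defined by Charades_v1_localize.m):

```python
# Hedged sketch: binary per-frame labels at equally spaced timepoints,
# for one action class. The 25-timepoint sampling is an assumption;
# Charades_v1_localize.m defines the official protocol.
import numpy as np

def frame_labels(intervals, video_length, num_frames=25):
    """intervals: list of (start, end) in seconds for one class."""
    times = np.linspace(0, video_length, num_frames)
    labels = np.zeros(num_frames, dtype=int)
    for start, end in intervals:
        labels[(times >= start) & (times <= end)] = 1
    return labels
```

Predicted scores at the same timepoints can then be evaluated with the same average-precision machinery as classification, treating each (timepoint, class) pair as an instance.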

    If this work helps your research, please consider citing the relevant publications:

      @inproceedings{sigurdsson2016hollywood,
        author = {Gunnar A. Sigurdsson and G{\"u}l Varol and Xiaolong Wang and Ali Farhadi and Ivan Laptev and Abhinav Gupta},
        title = {Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding},
        booktitle = {European Conference on Computer Vision},
        year = {2016},
        pdf = {},
        web = {}
      }

      @inproceedings{sigurdsson2018actor,
        author = {Gunnar A. Sigurdsson and Abhinav Gupta and Cordelia Schmid and Ali Farhadi and Karteek Alahari},
        title = {Actor and Observer: Joint Modeling of First and Third-Person Videos},
        booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
        year = {2018}
      }


    • Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
      Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta

      Computer vision has a great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily dynamic scenes. While most of such scenes are not particularly exciting, they typically do not appear on YouTube, in movies or TV broadcasts. So how do we collect sufficiently many diverse but boring samples representing our lives? We propose a novel Hollywood in Homes approach to collect such data. Instead of shooting videos in the lab, we ensure diversity by distributing and crowdsourcing the whole process of video creation from script writing to video recording and annotation. Following this procedure we collect a new dataset, Charades, with hundreds of people recording videos in their own homes, acting out casual everyday activities. The dataset is composed of 9,848 annotated videos with an average length of 30 seconds, showing activities of 267 people from three continents. Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects. In total, Charades provides 27,847 video descriptions, 66,500 temporally localized intervals for 157 action classes and 41,104 labels for 46 object classes. Using this rich data, we evaluate and provide baseline results for several tasks including action recognition and automatic description generation. We believe that the realism, diversity, and casual nature of this dataset will present unique challenges and new opportunities for computer vision community.

    • Much Ado About Time: Exhaustive Annotation of Temporal Data
      Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta

      Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input image takes a negligible amount of time to perceive. In contrast, we investigate and determine the most cost-effective way of obtaining high-quality multi-label annotations for temporal data such as videos. Watching even a short 30-second video clip requires a significant time investment from a crowd worker; thus, requesting multiple annotations following a single viewing is an important cost-saving strategy. But how many questions should we ask per video? We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments). We demonstrate that while workers may not correctly answer all questions, the cost-benefit analysis nevertheless favors consensus from multiple such cheap-yet-imperfect iterations over more complex alternatives. When compared with a one-question-per-video baseline, our method is able to achieve a 10% improvement in recall (76.7% ours versus 66.7% baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline). We demonstrate the effectiveness of our method by collecting multi-label annotations of 157 human activities on 1,815 videos.

    • Asynchronous Temporal Fields for Action Recognition
      Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta

      Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it. A thorough understanding of videos requires going beyond appearance modeling and necessitates reasoning about the sequence of activities, as well as the higher-level constructs such as intentions. But how do we model and reason about these? We propose a fully-connected temporal CRF model for reasoning over various aspects of activities that includes objects, actions, and intentions, where the potentials are predicted by a deep network. End-to-end training of such structured models is a challenging endeavor: For inference and learning we need to construct mini-batches consisting of whole videos, leading to mini-batches with only a few videos. This causes high-correlation between data points leading to breakdown of the backprop algorithm. To address this challenge, we present an asynchronous variational inference method that allows efficient end-to-end training. Our method achieves a classification mAP of 22.4% on the Charades benchmark, outperforming the state-of-the-art (17.2% mAP), and offers equal gains on the task of temporal localization.

    • What Actions are Needed for Understanding Human Actions in Videos?
      Gunnar A. Sigurdsson, Olga Russakovsky, Abhinav Gupta

      What is the right way to reason about human activities? What directions forward are most promising? In this work, we analyze the current state of human activity understanding in videos. The goal of this paper is to examine datasets, evaluation metrics, algorithms, and potential future directions. We look at the qualitative attributes that define activities such as pose variability, brevity, and density. The experiments consider multiple state-of-the-art algorithms and multiple datasets. The results demonstrate that while there is inherent ambiguity in the temporal extent of activities, current datasets still permit effective benchmarking. We discover that fine-grained understanding of objects and pose when combined with temporal reasoning is likely to yield substantial improvements in algorithmic accuracy. We present the many kinds of information that will be needed to achieve substantial gains in activity understanding: objects, verbs, intent, and sequential reasoning. The software and additional information will be made available to provide other researchers detailed diagnostics to understand their own algorithms.

    • Actor and Observer: Joint Modeling of First and Third-Person Videos
      Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

      Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor). Despite this, learning such models for human action recognition has not been achievable due to the lack of data. This paper takes a step in this direction, with the introduction of Charades-Ego, a large-scale dataset of paired first-person and third-person videos, involving 112 people, with 4000 paired videos. This enables learning the link between the two, actor and observer perspectives. Thereby, we address one of the biggest bottlenecks facing egocentric vision research, providing a link from first-person to the abundant third-person data on the web. We use this data to learn a joint representation of first and third-person videos, with only weak supervision, and show its effectiveness for transferring knowledge from the third-person to the first-person domain.

    • Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos
      Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

      In Actor and Observer we introduced a dataset linking the first- and third-person video understanding domains, the Charades-Ego Dataset. In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68,536 activity instances in 68.8 hours of first- and third-person video, making it one of the largest and most diverse egocentric datasets available. Charades-Ego furthermore shares activity classes, scripts, and methodology with the Charades dataset, which consists of an additional 82.3 hours of third-person video with 66,500 activity instances. Charades-Ego has temporal annotations and textual descriptions, making it suitable for egocentric video classification, localization, captioning, and new tasks utilizing the cross-modal nature of the data.