Action Recognition in Temporally Untrimmed Videos!
Automatically recognizing and localizing a large number of action categories from videos in the wild is of significant importance for video understanding and multimedia event detection. The THUMOS workshop and challenge aims at exploring new challenges and approaches for large-scale action recognition with a large number of classes from open source videos in a realistic setting.
Most of the existing action recognition datasets are composed of videos that have been manually trimmed to bound the action of interest. This has been identified as a considerable limitation, as it poorly matches how action recognition is applied in practical settings. Therefore, THUMOS 2015 will conduct the challenge on temporally untrimmed videos. The participants may train their methods using trimmed clips but will be required to test their systems on untrimmed data.
A new forward-looking dataset containing over 430 hours of video data and 45 million frames (70% larger than THUMOS'14) is made available for this challenge; its components are listed in the data release below.
All videos are collected from YouTube. We will evaluate the success of the proposed methods based on their performance on the new THUMOS 2015 Dataset in two tasks: (1) action recognition, i.e., video-level classification over the 101 action classes, and (2) temporal action detection, i.e., localizing instances of a subset of 20 action classes in time.
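The exact evaluation protocol for both tasks is specified in the Evaluation Setup document and implemented in the Development Kit below. As a hedged illustration of what the temporal action detection task involves (the 0.5 temporal-overlap threshold and the greedy matching used here are common conventions assumed for this sketch, not the official rules), a predicted action instance can be scored by its temporal overlap with ground truth:

```python
# Illustrative sketch only: the official protocol is defined in the Evaluation
# Setup document and implemented in the Development Kit. We assume a predicted
# instance counts as correct when its temporal intersection-over-union (IoU)
# with an unmatched ground-truth instance of the same class exceeds a threshold.

def temporal_iou(pred, gt):
    """IoU of two (start, end) intervals given in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def match_detections(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match score-sorted predictions to ground-truth instances.

    predictions: list of dicts with "start", "end", "score" (one class, one video).
    ground_truth: list of dicts with "start", "end".
    Returns a True/False flag per prediction (true positive / false positive).
    """
    used = [False] * len(ground_truth)
    flags = []
    for pred in sorted(predictions, key=lambda p: p["score"], reverse=True):
        best, best_iou = -1, iou_threshold
        for i, gt in enumerate(ground_truth):
            if used[i]:
                continue
            iou = temporal_iou((pred["start"], pred["end"]), (gt["start"], gt["end"]))
            if iou >= best_iou:
                best, best_iou = i, iou
        if best >= 0:
            used[best] = True
            flags.append(True)
        else:
            flags.append(False)
    return flags
```

From such true/false-positive flags, an average precision per class (and a mean over classes) can then be computed, but the Development Kit should be treated as the authoritative implementation.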
Participants may either submit a notebook paper that briefly describes their system, or a research paper detailing their approach. All of the submission results will be summarized during the workshop and included in the workshop/conference proceedings. Additionally, the top performers will be invited to give oral presentations, with remaining entries encouraged to present their work in the poster session.
For more details, please see the Evaluation Setup document or the released resources.
Please fill out this form to receive the password required for unzipping some of the shared data.
Training Data (13320 trimmed videos) -- each includes one action:
- UCF101 videos (zipped folder): [Download] (updated Apr. 08, 2015)
- UCF101 videos (individual files): [Link] (updated Apr. 08, 2015)
- Description of UCF101: [Link]
- List of action classes and their numerical index: [Download](http://crcv.ucf.edu/THUMOS14/Class Index.txt)
Background Data (2980 untrimmed videos) -- each guaranteed to not include any instance of the 101 actions:
- Videos (zipped folder - complete): [Download] (updated Apr. 13, 2015)
- Videos (zipped folder - 25GB splits): [Part0] [Part1] [Part2] [Part3] [Part4] (updated Apr. 08, 2015)
- Videos (individual files): [Link] (updated Apr. 08, 2015)
- Metadata and annotations (primary action class for each video): [Download] (updated Apr. 08, 2015)
Validation Data (2104 untrimmed videos) -- each includes one/multiple instances of one/multiple actions:
- Videos (zipped folder - complete): [Download] (updated Apr. 13, 2015)
- Videos (zipped folder - 25GB splits): [Part0] [Part1] [Part2] [Part3] [Part4] [Part5] [Part6] (updated Apr. 08, 2015)
- Videos (individual files): [Link] (updated Apr. 08, 2015)
- Metadata and class-level annotations (action classes in each video): [Download] (updated Apr. 08, 2015)
- Temporal annotations of actions (videos of [20 classes](http://crcv.ucf.edu/THUMOS14/Class Index_Detection.txt)): [Download] (updated Apr. 08, 2015)
Development Kit (evaluation code & additional software): [Link] (updated Apr. 08, 2015)
Additional Data:
- Class-level attributes: [Download]
- Bounding box annotations of humans (24 classes of UCF101): [Download]
- Evaluation Setup Document: [Download] (updated May 04, 2015)
- Sample submission files: [Temporal Action Detection], [Action Recognition] (updated May 04, 2015) -- a hedged format sketch follows the data listing below
Test Data (5613 untrimmed videos):
- Videos (zipped folder - complete): [Download] (updated May 01, 2015)
- Videos (individual files): [Link] (updated May 01, 2015)
- Metadata and class-level annotations (action classes in each video): (*)
- Temporal annotations of actions (videos of [20 classes](http://crcv.ucf.edu/THUMOS14/Class Index_Detection.txt)): (*)
* The ground truth and metadata for the test set are being withheld.
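The sample submission files and the Evaluation Setup document define the exact output format expected for each task. Purely as a hedged sketch (the per-line field layout below is an assumption made for illustration, not the official specification; please verify it against the released sample files), a temporal action detection result file might list one detected instance per line:

```python
# Hedged sketch of writing a temporal action detection result file.
# ASSUMED per-line layout: "<video_name> <start_sec> <end_sec> <class_index> <confidence>".
# The released sample submission files and the Evaluation Setup document are authoritative.

def write_detection_submission(detections, path):
    """detections: iterable of (video_name, start_sec, end_sec, class_index, confidence)."""
    with open(path, "w") as f:
        for video, start, end, cls, score in detections:
            f.write(f"{video} {start:.1f} {end:.1f} {cls} {score:.4f}\n")

# Example with made-up values: one detection of class index 7 between 12.3s and 18.7s.
write_detection_submission(
    [("video_test_0000001", 12.3, 18.7, 7, 0.9231)],
    "detection_submission.txt",
)
```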
If you make use of the data and resources shared for the competition, e.g., annotations or the attribute lexicon, or want to cite the THUMOS challenge, please use the following references:
@article{idrees2017thumos,
  title = {The {THUMOS} challenge on action recognition for videos ``in the wild''},
  author = {Idrees, H. and Zamir, A. R. and Jiang, Y.-G. and Gorban, A. and Laptev, I. and Sukthankar, R. and Shah, M.},
  journal = {Computer Vision and Image Understanding},
  volume = {155},
  pages = {1--23},
  year = {2017},
  publisher = {Elsevier}}
@misc{THUMOS15,
  author = {Gorban, A. and Idrees, H. and Jiang, Y.-G. and Roshan Zamir, A. and Laptev, I. and Shah, M. and Sukthankar, R.},
  title = {{THUMOS} Challenge: Action Recognition with a Large Number of Classes},
  howpublished = {\url{http://www.thumos.info/}},
  year = {2015}}
The UCF101 dataset can be cited as:
@inproceedings{UCF101,
  author = {Soomro, K. and Roshan Zamir, A. and Shah, M.},
  booktitle = {CRCV-TR-12-01},
  title = {{UCF101}: A Dataset of 101 Human Actions Classes From Videos in The Wild},
  year = {2012}}
Participants are strongly encouraged to read the *Evaluation Setup Document* for the details of the competition.
We truly appreciate the following people, who spent many hours performing annotation and verification for THUMOS:
- INRIA: Anastasia Syromyatnikova, Ivan Laptev.
- UCF: Xavier Banks, Ann Dang, Alec Dilanchian, Gabrielle Garcia, Noel Gayle, Christopher Grenci, Cory Kinberger, Camilo Montoya, Lucas Pasqualin, Jose Rodriguez, Alejandro Torroella, Iramis Valentin, Fatemeh Yazdian.
- FUDAN: Qi Dai, Xue Guo, Huijuan Liu, Chao Ma, Xiaoxin Qiu, Guorui Sun, Jian Tu, Jiajun Wang, Qiang Wang, Zuxuan Wu, Nina Xiang, Yangjin Yao, Yiye Zhu.