YouTube-8M: A Large-Scale Video Classification Benchmark
Abstract
Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have lowered the barrier to entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no video classification datasets of comparable size. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video annotation system, which labels videos with their main topics. While the labels are machine-generated, they have high precision and are derived from a variety of human-based signals, including metadata and query click signals. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including asking human raters whether the labels are visually recognizable. Then, we decoded each video at one frame per second and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and made both the features and video-level labels available for download. We trained various (modest) classification models on the dataset, evaluated them using popular evaluation metrics, and report the results as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using TensorFlow. We plan to release code for training a TensorFlow model and for computing metrics.
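To give a feel for the scale implied by decoding at one frame per second, here is a rough sketch in Python. It is not the authors' pipeline: the function names `estimate_sampled_frames` and `sample_frame_indices` are hypothetical, the inputs are the rounded figures quoted in the abstract (~8 million videos, 500K hours), and the frame-picking rule (take the frame at or just before each whole-second tick) is an assumption made for illustration.

```python
import math


def estimate_sampled_frames(num_videos, total_hours, target_fps=1.0):
    """Back-of-the-envelope estimate from rounded dataset totals.

    Returns (average video length in seconds, total frames sampled
    at target_fps). Inputs are approximations, so the outputs are too.
    """
    total_seconds = total_hours * 3600
    avg_seconds = total_seconds / num_videos
    return avg_seconds, total_seconds * target_fps


def sample_frame_indices(num_frames, native_fps, target_fps=1.0):
    """Indices of the frames kept when subsampling a clip to target_fps.

    Assumed rule: keep the frame at or just before each 1/target_fps tick.
    """
    if num_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    step = native_fps / target_fps  # native frames per kept frame
    count = math.ceil(num_frames / step)
    indices = (int(k * step) for k in range(count))
    # Guard against floating-point overshoot at the end of the clip.
    return [i for i in indices if i < num_frames]


avg_seconds, total_frames = estimate_sampled_frames(8_000_000, 500_000)
print(f"average length: {avg_seconds:.0f} s, frames at 1 fps: {total_frames:.2e}")
print(sample_frame_indices(num_frames=100, native_fps=30))
```

With the abstract's rounded numbers this works out to an average video length of about 225 seconds and on the order of 1.8 billion sampled frames; the exact counts for the released dataset may differ.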
YouTube-8M: A Large-Scale Video Classification Benchmark. Available from: https://www.researchgate.net/publication/308716424_YouTube-8M_A_Large-Scale_Video_Classification_Benchmark [accessed Jun 26, 2017].