[深度学习论文笔记][Video Classification] Large-scale Video Classification with Convolutional Neural Networks
Source: Internet | Time: 2024/05/18 03:47
1 Spatio-Temporal CNN
We treat every video as a bag of short, fixed-sized clips (15 frames in our case). Since each clip contains several contiguous frames in time, we can extend the connectivity of the network in the time dimension to learn spatio-temporal features. Four models are considered for handling information across the temporal domain: the single-frame baseline and the three fusion variants described here. See Fig.
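The bag-of-clips view can be sketched as follows. This is a minimal illustration, not the paper's data pipeline: the function name `split_into_clips`, the non-overlapping layout, and the choice to drop leftover frames are all assumptions made here for clarity.

```python
from typing import List


def split_into_clips(num_frames: int, clip_len: int = 15) -> List[range]:
    """Return frame-index ranges of consecutive, non-overlapping clips.

    Trailing frames that do not fill a whole clip are dropped. That is an
    assumption of this sketch; the notes do not say how leftovers are handled.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    last_start = num_frames - clip_len
    return [range(s, s + clip_len) for s in range(0, last_start + 1, clip_len)]


# A hypothetical 50-frame video yields three 15-frame clips; 5 frames are dropped.
clips = split_into_clips(num_frames=50)
print([(c.start, c.stop) for c in clips])  # [(0, 15), (15, 30), (30, 45)]
```

Each clip is then presented to the network on its own, so a video contributes several training or test examples.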
[Late Fusion] Place two separate single-frame networks with shared parameters a distance of 15 frames apart, and merge the two streams in the first fully connected layer, which can compute global motion characteristics by comparing the outputs of both networks.
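[Early Fusion] Modify the filters on conv1 in the single-frame network by extending them to size (D·T) × F_H × F_W, where D is the input depth, T is the temporal extent in frames, and F_H and F_W are the filter height and width.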
[Slow Fusion] This is a balance between late fusion and early fusion, in which higher layers get access to progressively more global information in both the spatial and temporal dimensions. It is implemented by extending the connectivity of all convolutional layers in the time dimension and carrying out temporal convolutions in addition to spatial convolutions to compute activations. This model turns out to work best.
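The three fusion variants differ mainly in where frames meet along the time axis, which a little shape arithmetic makes concrete. The sketch below is illustrative only: all function names are hypothetical, and the numbers used in the example calls (3 input channels, 11 × 11 filters, T = 10, a 10-frame clip with temporal extents 4, 2, 2 and stride 2) are assumptions chosen for the example, not values taken from these notes.

```python
from typing import List, Sequence, Tuple


def late_fusion_frames(start: int, gap: int = 15) -> Tuple[int, int]:
    """Indices of the two frames fed to the two shared-weight towers."""
    return (start, start + gap)


def early_fusion_filter_shape(depth: int, t: int, fh: int, fw: int) -> Tuple[int, int, int]:
    """conv1 filter shape (D*T, F_H, F_W) when T frames are stacked in depth."""
    return (depth * t, fh, fw)


def temporal_out_len(in_len: int, extent: int, stride: int) -> int:
    """Output length along time of a valid (unpadded) temporal convolution."""
    if in_len < extent:
        raise ValueError("temporal extent exceeds input length")
    return (in_len - extent) // stride + 1


def slow_fusion_lengths(clip_len: int, layers: Sequence[Tuple[int, int]]) -> List[int]:
    """Temporal length after each conv layer, given (extent, stride) per layer."""
    lengths = [clip_len]
    for extent, stride in layers:
        lengths.append(temporal_out_len(lengths[-1], extent, stride))
    return lengths


print(late_fusion_frames(0))                               # (0, 15)
print(early_fusion_filter_shape(3, 10, 11, 11))            # (30, 11, 11)
print(slow_fusion_lengths(10, [(4, 2), (2, 2), (2, 2)]))   # [10, 4, 2, 1]
```

In the slow-fusion example the temporal length shrinks gradually (10, 4, 2, 1), so only the last temporal layer sees the whole clip. Early fusion collapses time in the first layer, and late fusion leaves it untouched until the fully connected layer.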
2 Multi-resolution CNNs
We want to speed up the networks. However, simply reducing the number of layers or neurons, or training at a lower resolution, hurts performance. We propose a multi-resolution CNN composed of two separate streams. The context stream receives the downsampled frames at half the original spatial resolution (89 × 89 pixels), while the fovea stream receives the center 89 × 89 region at the original resolution. In this way, the total input dimensionality is halved. Notably, this design takes advantage of the camera bias present in many online videos, since the object of interest often occupies the center region. The activations from both streams are concatenated and fed into the first fully connected layer with dense connections. See Fig.
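The two input streams and the "halved dimensionality" claim can be checked with a small sketch. It assumes a 178 × 178 single-channel frame (implied by 89 × 89 being half the original resolution) and downsamples by keeping every second pixel. The notes do not specify the downsampling method, so that choice, like the function names, is an assumption.

```python
from typing import List

Frame = List[List[int]]  # one channel, stored as rows of pixel values


def context_stream(frame: Frame) -> Frame:
    """Whole frame at half resolution (naive subsampling, an assumption)."""
    return [row[::2] for row in frame[::2]]


def fovea_stream(frame: Frame, size: int = 89) -> Frame:
    """Center size x size crop at the original resolution."""
    height, width = len(frame), len(frame[0])
    top, left = (height - size) // 2, (width - size) // 2
    return [row[left:left + size] for row in frame[top:top + size]]


# Synthetic 178 x 178 frame whose pixel value encodes its position.
frame = [[r * 178 + c for c in range(178)] for r in range(178)]
context = context_stream(frame)
fovea = fovea_stream(frame)

pixels_in = len(frame) * len(frame[0])
pixels_out = len(context) * len(context[0]) + len(fovea) * len(fovea[0])
print(len(context), len(context[0]))   # 89 89
print(len(fovea), len(fovea[0]))       # 89 89
print(pixels_out / pixels_in)          # 0.5
```

Two 89 × 89 inputs hold 2 × 7921 = 15842 values against 31684 for the full frame, which is exactly half.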
3 Results
The single-frame model already displays strong performance, suggesting that local motion cues may not be critically important.