An Overview of Deep Learning

Source: Internet. Published by: 农村淘宝站点查询. Editor: 程序博客网. Time: 2024/05/29 23:22

- What is deep learning
- Applications of deep learning
- Deep learning architectures
- CNN

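What is deep learning

Deep learning is a branch of machine learning and one of the most active research areas in AI today. It uses multiple non-linear processing layers to learn abstract representations of data at multiple levels of features. The model is trained with the backpropagation (BP) algorithm, and supervised or weakly supervised feature learning with hierarchical feature extraction replaces hand-crafted features.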
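To make the idea of non-linear layers trained by backpropagation concrete, here is a minimal sketch in plain Python. The tiny network (one tanh hidden unit, one linear output), the squared-error loss, the starting weights, and the function names `forward`, `loss_and_grads`, and `train` are all assumptions chosen for illustration, not something taken from a particular paper or library.

```python
import math

def forward(x, w1, b1, w2, b2):
    """Scalar input -> tanh hidden unit (non-linear layer) -> linear output."""
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y

def loss_and_grads(x, target, params):
    """Squared-error loss and its gradients, computed with the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(x, w1, b1, w2, b2)
    loss = 0.5 * (y - target) ** 2
    dy = y - target            # dL/dy
    dw2 = dy * h               # dL/dw2
    db2 = dy                   # dL/db2
    dh = dy * w2               # error propagated back to the hidden unit
    dz = dh * (1.0 - h * h)    # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    dw1 = dz * x               # dL/dw1
    db1 = dz                   # dL/db1
    return loss, [dw1, db1, dw2, db2]

def train(x, target, params, lr=0.1, steps=200):
    """Plain gradient descent on a single training example."""
    losses = []
    for _ in range(steps):
        loss, grads = loss_and_grads(x, target, params)
        losses.append(loss)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, losses

# Illustrative starting weights (hypothetical values).
final_params, losses = train(x=1.0, target=0.8, params=[0.5, 0.0, -0.3, 0.1])
print("first loss:", losses[0], "last loss:", losses[-1])
```

Real networks apply the same backward pass to whole layers of weights at once, but the chain-rule structure is the same.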

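Applications of deep learning

Because deep learning does not require hand-crafted feature extraction and needs only a large amount of labelled data for training, it has excelled in domains such as image, video, speech, text, and audio. Applications include localization, recognition, matching, speech-to-text, and product recommendation in e-commerce. And, of course, there is Google's AlphaGo.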

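Deep learning architectures

Many deep learning architectures now exist, such as deep neural networks (DNN), convolutional neural networks (CNN), deep belief networks (DBN), recurrent neural networks (RNN), and generative adversarial networks (GAN). CNNs and GANs attract the most attention. The former can perform localization by producing candidate regions, and because the same filter weights are shared across the image, it has fewer parameters, which speeds up training. The latter can produce output data (such as an image) from an input such as a label, which in a sense is the inverse of what other common deep learning architectures do.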
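The claim about CNNs having fewer parameters can be checked with simple arithmetic. The sketch below compares a fully connected layer with a convolutional layer on the same input. The input size (32 x 32 RGB), the 64 output units or filters, the 3 x 3 kernel, and the helper names `fc_params` and `conv_params` are assumptions picked for illustration.

```python
def fc_params(in_h, in_w, in_c, out_units):
    """Weights plus biases of a fully connected layer on a flattened input."""
    return in_h * in_w * in_c * out_units + out_units

def conv_params(kernel_h, kernel_w, in_c, out_c):
    """Weights plus biases of a convolutional layer.

    The count does not depend on the input height or width, because the
    same kernel weights are reused at every spatial position.
    """
    return kernel_h * kernel_w * in_c * out_c + out_c

fc = fc_params(32, 32, 3, 64)    # 32*32*3*64 + 64 = 196672
conv = conv_params(3, 3, 3, 64)  # 3*3*3*64 + 64 = 1792
print(fc, conv, fc / conv)
```

Under these assumed sizes the convolutional layer has roughly a hundredth of the parameters of the fully connected one, and the gap widens as the input image grows.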

CNN

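The convolutional neural network (CNN) is currently the most widely used deep learning architecture. Its classic network models include LeNet, AlexNet [1], GoogLeNet [2], VGG [3], and ResNet [4].
The following four figures describe the structure of a CNN clearly (taken from [5]):
[Figure]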

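[Figure]
In the convolution layer, a batch contains N input feature maps, each of width W, height H, and C channels. Every item in the batch is processed with the same M filters (convolution kernels), each with C channels and a spatial size of R x S. Each filter is convolved channel by channel with the input feature map, and the per-channel results are summed to give a single output channel of size E x F. The M filters therefore produce an M-channel output feature map, and N inputs produce N outputs of M channels each. Further details of convolution include zero padding at the borders and the stride.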
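The shape bookkeeping described above can be written out as a naive loop. This is a sketch under stated assumptions: R and E are taken to be the filter and output heights, S and F the filter and output widths, the output size follows E = (H + 2*padding - R) // stride + 1 (and likewise for F), the bias term is omitted, and the kernel is not flipped (cross-correlation, which is what most deep learning libraries call convolution). The function name `conv2d` and its arguments are hypothetical.

```python
def conv2d(inputs, filters, stride=1, padding=0):
    """Naive convolution layer.

    inputs:  nested lists of shape [N][C][H][W]
    filters: nested lists of shape [M][C][R][S]
    returns: nested lists of shape [N][M][E][F]
    """
    n_batch, c_in = len(inputs), len(inputs[0])
    h, w = len(inputs[0][0]), len(inputs[0][0][0])
    m_out = len(filters)
    r, s = len(filters[0][0]), len(filters[0][0][0])
    e = (h + 2 * padding - r) // stride + 1
    f = (w + 2 * padding - s) // stride + 1

    def pixel(n, c, i, j):
        # Read from the zero-padded input: positions outside the map are 0.
        i -= padding
        j -= padding
        if 0 <= i < h and 0 <= j < w:
            return inputs[n][c][i][j]
        return 0.0

    out = [[[[0.0] * f for _ in range(e)] for _ in range(m_out)]
           for _ in range(n_batch)]
    for n in range(n_batch):              # each item in the batch
        for m in range(m_out):            # each filter -> one output channel
            for y in range(e):
                for x in range(f):
                    acc = 0.0
                    for c in range(c_in):  # sum over input channels
                        for dy in range(r):
                            for dx in range(s):
                                acc += (filters[m][c][dy][dx]
                                        * pixel(n, c, y * stride + dy, x * stride + dx))
                    out[n][m][y][x] = acc
    return out

# Illustrative sizes: N=2, C=3, H=W=5, M=4, R=S=3.
batch = [[[[1.0] * 5 for _ in range(5)] for _ in range(3)] for _ in range(2)]
kernels = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(4)]
result = conv2d(batch, kernels)
print(len(result), len(result[0]), len(result[0][0]), len(result[0][0][0]))
```

Practical implementations replace these seven nested loops with matrix multiplication or other optimized kernels, but the input and output shapes are the same.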
[Figure]

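[Figure]
The BP algorithm only computes gradients. It must be combined with an optimization algorithm to update the network's parameters according to the error. Optimization algorithms that currently perform well include Adagrad [6], AdagradDA [6], Adadelta [7], and Adam [8].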
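As an example of how an optimizer turns gradients into parameter updates, here is a sketch of one Adam step following the update rule in [8], with the default hyperparameters suggested there. The function names `adam_step` and `minimize_quadratic`, the list-based parameter layout, and the toy objective (x - 3)^2 are assumptions made for illustration.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. t is the 1-based step count; m and v are running moments."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g       # first moment (mean of gradients)
        v_i = beta2 * v_i + (1 - beta2) * g * g   # second moment (mean of squared gradients)
        m_hat = m_i / (1 - beta1 ** t)            # bias correction for the zero start
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

def minimize_quadratic(steps=500, lr=0.1):
    """Toy problem: minimize (x - 3)^2 starting from x = 0."""
    params, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grads = [2.0 * (params[0] - 3.0)]  # gradient that backpropagation would supply
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params[0]

print(minimize_quadratic())
```

In a real network, `grads` would come from the backward pass over a mini-batch, and the same update would be applied to every weight.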

[1]: Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]. Advances in neural information processing systems. 2012: 1097-1105.
[2]: Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.
[3]: Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
[4]: He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
[5]: Sze V, Chen Y H, Yang T J, et al. Efficient processing of deep neural networks: A tutorial and survey[J]. arXiv preprint arXiv:1703.09039, 2017.
[6]: Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization[J]. Journal of Machine Learning Research, 2011, 12(Jul): 2121-2159.
[7]: Zeiler M D. ADADELTA: an adaptive learning rate method[J]. arXiv preprint arXiv:1212.5701, 2012.
[8]: Kingma D, Ba J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
