Deep learning: prevent overfitting && speed up training
Source: Internet  Editor: 程序博客网  Time: 2024/06/06 11:41
Prevent overfitting:
1. Add an L1 or L2 regularization term to the loss function
2. Dropout
3. Data augmentation
4. Early stopping
All four have corresponding implementations in Keras.
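To make the four techniques concrete, here is a minimal sketch in plain Python rather than Keras. All function names are made up for illustration. In Keras the corresponding pieces are, to my knowledge, `keras.regularizers.l1` / `l2`, `keras.layers.Dropout`, and `keras.callbacks.EarlyStopping`; the horizontal flip below stands in for data augmentation in general.

```python
import random

def l1_penalty(weights, lam):
    """L1 term added to the loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 term added to the loss: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`,
    and rescale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def hflip(image):
    """One simple augmentation: mirror an image (list of rows) left to right."""
    return [row[::-1] for row in image]

def early_stopping_epoch(val_losses, patience):
    """Return the epoch index at which training stops: the first epoch where
    the validation loss has failed to improve `patience` times in a row."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1

rng = random.Random(0)
print(l2_penalty([1.0, -2.0], lam=0.1))
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
print(hflip([[1, 2], [3, 4]]))
print(early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71], patience=2))
```

In a real training loop the penalty is added to the data loss before backpropagation, and dropout is applied only during training, not at inference time.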
Speed up training:
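1. Normalization (zero-centering plus scaling to unit variance). Why normalize? If the input features differ widely in scale, the loss surface tends to be elongated, with much steeper curvature along some directions than others. That forces a very small learning rate (a larger one jumps straight out of the region around the optimum), and even then gradient descent oscillates back and forth inside that region.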
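The learning-rate argument can be checked on a toy quadratic loss. The sketch below assumes L(w) = 0.5 * (c1*w1^2 + c2*w2^2), where the curvatures c1 = 100 and c2 = 1 are hypothetical values standing in for badly scaled inputs; `gradient_descent` is an illustrative helper, not a library function.

```python
def gradient_descent(curvatures, lr, steps, start=1.0):
    """Minimize L(w) = 0.5 * sum(c_i * w_i**2) and return every iterate of w."""
    w = [start] * len(curvatures)
    path = [list(w)]
    for _ in range(steps):
        # dL/dw_i = c_i * w_i, so each coordinate is multiplied by (1 - lr * c_i)
        w = [wi - lr * c * wi for wi, c in zip(w, curvatures)]
        path.append(list(w))
    return path

# Badly scaled inputs: curvature 100 along w1, 1 along w2
stretched = [100.0, 1.0]
too_large = gradient_descent(stretched, lr=0.021, steps=50)
small = gradient_descent(stretched, lr=0.019, steps=50)

# Normalized inputs: the same curvature in every direction
round_bowl = gradient_descent([1.0, 1.0], lr=0.5, steps=50)

print("stretched, lr=0.021:", too_large[-1])
print("stretched, lr=0.019:", small[-1])
print("round,     lr=0.5:  ", round_bowl[-1])
```

With lr = 0.021 the steep coordinate is multiplied by -1.1 at every step, so it diverges. With lr = 0.019 the factor is -0.9: w1 flips sign on every step while shrinking, and the flat coordinate w2 barely moves. Once the curvatures are equal, a much larger learning rate converges without oscillation.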
Unintuitive effects and their consequences. Notice that if one of the inputs to the multiply gate is very small and the other is very big, then the multiply gate will do something slightly unintuitive: it will assign a relatively huge gradient to the small input and a tiny gradient to the large input. Note that in linear classifiers where the weights are dot producted
http://cs231n.github.io/optimization-2/
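The multiply-gate effect from the quoted passage is easy to reproduce. In this sketch the weight and input values are hypothetical, and `multiply_gate_backward` and `standardize` are illustrative helpers rather than functions from any library.

```python
import statistics

def multiply_gate_backward(x, y, upstream=1.0):
    """Backward pass of f = x * y: each input receives the OTHER input
    times the upstream gradient."""
    return y * upstream, x * upstream

def standardize(column):
    """Zero-mean, unit-variance scaling of one feature column."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]

# A small weight multiplied by a large raw input
w, x = 0.01, 1000.0
grad_w, grad_x = multiply_gate_backward(w, x)
print("raw input:    grad wrt w =", grad_w, " grad wrt x =", grad_x)

# The same feature after standardization
raw_feature = [900.0, 1000.0, 1100.0, 1000.0]
scaled_feature = standardize(raw_feature)
grad_w_scaled, _ = multiply_gate_backward(w, scaled_feature[0])
print("scaled input: grad wrt w =", grad_w_scaled)
```

Because the gradient with respect to the weight equals the input value, the scale of the data directly sets the scale of the weight gradient. Standardizing the feature brings that gradient back to order one.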
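2. Random weight initialization. This addresses a problem in training deep models: with many hidden layers, the chain rule turns the gradient into a product of many factors. If each factor is larger than 1 the gradient explodes (exploding gradient); if each factor is smaller than 1 it vanishes (vanishing gradient).
Setting the initial weights to zero mean and variance 1/n (where n is the layer's input dimension) keeps W*X in a reasonable range, and since the derivatives depend on W*X, they are kept in check as well.
Different activation functions generally call for different variances.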
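As a rough check of the 1/n rule, the sketch below pushes a unit-variance input through a stack of purely linear layers and measures the variance of the output. The width, depth, and the two alternative variances (4/n and 0.25/n) are arbitrary choices for illustration, and activations are left out on purpose. As a side note that is not from the original post, a commonly cited activation-specific choice is 2/n for ReLU (He initialization).

```python
import math
import random
import statistics

def init_weights(n_in, n_out, variance, rng):
    """Draw an n_out x n_in weight matrix from N(0, variance)."""
    std = math.sqrt(variance)
    return [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]

def linear(weights, x):
    """Compute W * x for a weight matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def output_variance(n, depth, variance, rng):
    """Variance of the signal after `depth` linear layers of width n,
    starting from a unit-variance random input."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(depth):
        x = linear(init_weights(n, n, variance, rng), x)
    return statistics.pvariance(x)

rng = random.Random(0)
n, depth = 100, 10
results = {}
for label, var in [("1/n", 1.0 / n), ("4/n", 4.0 / n), ("0.25/n", 0.25 / n)]:
    results[label] = output_variance(n, depth, var, rng)
    print(f"weight variance {label:>6}: output variance = {results[label]:.3g}")
```

Each layer multiplies the signal variance by roughly n times the weight variance. With variance 1/n that factor is about 1, so the signal stays in a reasonable range; with 4/n it grows by about 4 per layer, and with 0.25/n it shrinks by about 4 per layer.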