Key Notes on Andrew Ng's Deep Learning Course, Part 3
Source: Internet | Editor: 程序博客网 | Date: 2024/06/05 16:18
1-Using a single-number evaluation metric:
e.g. the trade-off between Precision and Recall --> F1 Score;
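The precision/recall trade-off above can be collapsed into one number in a few lines (the precision and recall values are made up for illustration):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall -- a single number
    that summarizes the trade-off between the two metrics."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# A classifier with high precision but mediocre recall:
print(f1_score(0.95, 0.50))  # ~0.655
```

The harmonic mean punishes imbalance: a model cannot buy a high F1 with one strong metric while the other collapses.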
2-Satisficing & optimizing metrics:
one optimizing metric (optimized as far as possible) with multiple satisficing metrics (each only needs to meet a specified threshold);
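A sketch of how such a selection rule might look; the candidate models, their numbers, and the 100 ms latency threshold are all hypothetical:

```python
# Hypothetical candidates: (name, accuracy, running time in ms).
candidates = [
    ("A", 0.90, 80),
    ("B", 0.92, 95),
    ("C", 0.95, 1500),  # most accurate, but far too slow
]

MAX_LATENCY_MS = 100  # satisficing metric: must run within 100 ms

# Keep only the models that satisfy the threshold,
# then optimize accuracy among those.
feasible = [m for m in candidates if m[2] <= MAX_LATENCY_MS]
best = max(feasible, key=lambda m: m[1])
print(best[0])  # "B"
```

Note that model C's higher accuracy is irrelevant once it fails the satisficing constraint.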
3-Dev set & test set:
dev set (hold-out cross-validation set) + one single metric;
make sure the dev set and test set come from the same distribution;
4-Sizes of the training, dev and test sets:
-small total data (<10,000 examples): 60% + 20% + 20%;
-large total data (millions of examples): 98% + 1% + 1%;
-sometimes there are only a training set and a dev set (beware of overfitting to the dev set);
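A minimal sketch of such a split, using the 98/1/1 ratio from the large-data case:

```python
import random

def split_dataset(examples, dev_frac, test_frac, seed=0):
    """Shuffle and partition into train/dev/test; with millions of
    examples, dev_frac = test_frac = 0.01 is usually plenty."""
    examples = examples[:]
    random.Random(seed).shuffle(examples)
    n = len(examples)
    n_test = int(n * test_frac)
    n_dev = int(n * dev_frac)
    train = examples[: n - n_dev - n_test]
    dev = examples[n - n_dev - n_test : n - n_test]
    test = examples[n - n_test :]
    return train, dev, test

train, dev, test = split_dataset(list(range(1_000_000)), 0.01, 0.01)
print(len(train), len(dev), len(test))  # 980000 10000 10000
```

The point of the small fractions is that 10,000 dev examples are already enough to distinguish models whose accuracies differ by a fraction of a percent.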
5-Change the metric and/or the dev/test data to fit the actual application:
-orthogonalization: first define an evaluation metric, then separately optimize performance on that metric;
e.g. in cat recognition, add a much larger penalty for misclassifying porn images as cats;
e.g. user-uploaded images are less sharp than the images in the training data;
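The heavier penalty can be folded directly into the evaluation metric; a sketch with a hypothetical 10x weight on porn images:

```python
def weighted_error(preds, labels, is_porn, porn_weight=10):
    """Weighted misclassification rate: a mistake on a porn image
    counts porn_weight times as much (the weight is hypothetical)."""
    weights = [porn_weight if p else 1 for p in is_porn]
    wrong = [w for w, y_hat, y in zip(weights, preds, labels)
             if y_hat != y]
    return sum(wrong) / sum(weights)

# Two mistakes, one of them on a porn image:
preds   = [1, 1, 0, 1]
labels  = [1, 0, 0, 0]
is_porn = [0, 1, 0, 0]
print(weighted_error(preds, labels, is_porn))  # (10 + 1) / 13 ~ 0.846
```

With plain (unweighted) error the same predictions would score 2/4 = 0.5; the weighted metric makes the porn mistake dominate, which is exactly the behavior the application wants to penalize.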
6-Bayes Optimal Error (best possible error) & human-level performance;
7-Avoidable bias = Training error - Bayes error;
Variance = Dev error - Training error;
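Plugging hypothetical error levels into these two formulas shows how they point to the next thing to work on:

```python
human_error = 0.01  # used as a proxy for Bayes error
train_error = 0.08
dev_error   = 0.10

avoidable_bias = train_error - human_error  # 0.07
variance       = dev_error - train_error    # 0.02

# The larger gap is avoidable bias, so a bigger model or longer
# training is a better next step than more regularization.
print(avoidable_bias > variance)  # True
```

If the same dev error of 0.10 came with a training error of 0.02, the conclusion would flip: variance (0.08) would dominate and more data or regularization would be the priority.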
8-Human-level error as a proxy for Bayes error;
9-To reduce avoidable bias:
train a bigger model;
train longer / use a better optimization algorithm (momentum, RMSprop, Adam);
NN architecture / hyperparameter search;
-To reduce variance:
more data;
regularization (L2, dropout, data augmentation);
NN architecture / hyperparameter search;
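Of the regularization options listed, dropout is the easiest to sketch; a minimal inverted-dropout pass in plain Python (the keep probability is illustrative):

```python
import random

def inverted_dropout(activations, keep_prob=0.8, seed=0):
    """Inverted dropout: zero each unit with probability
    1 - keep_prob, and scale the survivors by 1/keep_prob so the
    expected activation is unchanged (no rescaling needed at test time)."""
    rng = random.Random(seed)
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

a = [0.5, 1.2, -0.3, 0.9]
print(inverted_dropout(a, keep_prob=0.8))
```

Because of the 1/keep_prob scaling, the average activation over many units stays close to its undropped value, which is what lets the same network run unchanged at inference time.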
10-Error Analysis:
pick a number of misclassified examples from the dev or test set, examine them by hand by error type (false positives / false negatives), and count how many examples fall into each error category;
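The hand tally described above often amounts to a small spreadsheet; a sketch with hypothetical error categories for a cat classifier:

```python
from collections import Counter

# One tag per misclassified dev-set example, assigned by hand
# while eyeballing the images (categories are hypothetical):
mislabeled = [
    "dog", "blurry", "dog", "great_cat", "blurry", "blurry",
    "incorrect_label", "dog", "blurry",
]

counts = Counter(mislabeled)
total = len(mislabeled)
for category, n in counts.most_common():
    print(f"{category:16s} {n:3d}  {100 * n / total:.0f}%")
```

The percentages give a ceiling on each fix: here, solving the "dog" confusion perfectly could remove at most a third of the errors, so blurry images are the more promising direction.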
11-Correcting incorrectly labeled data (for the dev/test set):
-use error analysis to compare the errors due to incorrect labels with the overall dev/test set errors;
-apply the same correction process to your dev set & test set to make sure they still come from the same distribution;
-consider examining examples your algorithm got right as well as ones it got wrong;
-the training and dev/test data may now come from slightly different distributions;
12-Training & testing on different distributions;
13-Bias & variance with mismatched data distributions:
Train-dev set: same distribution as the training set, but not used for training;
-> if the error on the train-dev set is almost as high as on the dev set, it is a variance (generalization) problem;
-> if the error on the train-dev set is almost as low as on the training set while the dev error stays high, it is a data mismatch problem;
training error - human-level error = avoidable bias;
train-dev error - training error = variance;
dev error - train-dev error = data mismatch;
test error - dev error = degree of overfitting to the dev set;
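These four differences are easy to compute side by side; with the hypothetical error levels below, the largest gap (and hence the next thing to fix) is data mismatch:

```python
# Hypothetical error levels on each split (fractions):
human_level   = 0.01
train_error   = 0.03
train_dev_err = 0.045
dev_error     = 0.09
test_error    = 0.095

gaps = {
    "avoidable bias":  train_error - human_level,    # 0.02
    "variance":        train_dev_err - train_error,  # 0.015
    "data mismatch":   dev_error - train_dev_err,    # 0.045
    "dev overfitting": test_error - dev_error,       # 0.005
}
# The largest gap tells you what to work on next.
print(max(gaps, key=gaps.get))  # "data mismatch"
```

The key is the ordering of the splits: each adjacent pair isolates exactly one failure mode, so the table of gaps reads like a diagnosis.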
14-Addressing data mismatch:
-do manual error analysis to understand the differences between the training set and the dev/test set;
-collect more training data similar to the dev/test set;
-artificial data synthesis (e.g. adding car noise to clean audio; might cause overfitting to the synthesized data);
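A toy sketch of additive synthesis on raw sample values (the clip contents and the 0.1 mixing factor are made up); the overfitting risk mentioned above comes from `noise_clips` being a far smaller set than the clean data:

```python
import random

def synthesize(clean, noise_clips, mix=0.1, seed=0):
    """Overlay a randomly chosen noise clip onto a clean signal.
    If noise_clips is tiny (say 1 hour of car noise against
    10,000 hours of speech), the model can overfit to those
    few clips even though they sound fine to a human."""
    rng = random.Random(seed)
    noise = rng.choice(noise_clips)
    return [c + mix * n for c, n in zip(clean, noise)]

clean = [0.0, 1.0, 0.5]
noise_clips = [[0.2, -0.1, 0.4]]  # a single clip -> high overfitting risk
print(synthesize(clean, noise_clips))  # ~ [0.02, 0.99, 0.54]
```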
15-Transfer learning:
pre-training & fine-tuning;
-low-level features from the pre-trained network can be helpful for the fine-tuned network;
-useful when you have lots of data for the problem you're transferring from
& usually relatively little data for the problem you're transferring to;
e.g. take an already-trained image-recognition network, replace the output layer (or the last few layers) and retrain it for another specific task,
such as transferring a cat-recognition network trained on 1 million photos to an X-ray diagnosis network;
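A toy numerical sketch of the same idea, with everything hypothetical: the "pre-trained" feature extractor is just a fixed function standing in for the frozen early layers, and only the weights of a freshly initialized output layer are trained on the small target dataset:

```python
import math

def pretrained_features(x):
    """Stand-in for the frozen early layers of a pre-trained network."""
    return [x, x * x]

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_new_head(data, lr=0.5, epochs=200):
    """Fine-tuning step: keep pretrained_features fixed; train only
    a new logistic output layer via SGD on the logistic loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the logistic loss w.r.t. the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    f = pretrained_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

# Tiny target-task dataset: label 1 iff |x| > 1
# (separable thanks to the x^2 feature the "pre-training" provides).
data = [(-2.0, 1), (-0.5, 0), (0.3, 0), (1.5, 1), (2.5, 1), (0.8, 0)]
w, b = train_new_head(data)
```

The target task is learnable from six examples only because the frozen features already encode the useful structure; that is the whole bet behind transfer learning.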
16-Multi-task learning:
one example can have multiple labels (e.g. marking multiple objects in a single image);
-training on a set of tasks that can benefit from having shared lower-level features;
-works when you can train a big enough network to do well on all the tasks;
17-End-to-end deep learning:
-traditional hand-designed pipeline vs the end-to-end approach;
-pros: lets the data speak; less hand-designing of components needed;
-cons: may need a large amount of data; excludes potentially useful hand-designed components (human knowledge).
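Section 16's "multiple labels per example" typically means one sigmoid output per label with the losses summed; a sketch where the logits and labels are hypothetical, and unannotated labels (`None`) are skipped so partially labeled examples still contribute:

```python
import math

def multitask_loss(logits, labels):
    """Summed per-label logistic loss for multi-task learning.
    A label of None (never annotated) is simply skipped, so an
    image labeled for only some objects is still usable."""
    loss = 0.0
    for z, y in zip(logits, labels):
        if y is None:  # this label was not annotated
            continue
        p = 1 / (1 + math.exp(-z))
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss

# One image, four tasks (pedestrian, car, stop sign, traffic light);
# the stop-sign label is missing for this image:
loss = multitask_loss([2.0, -1.0, 0.5, -3.0], [1, 0, None, 0])
print(loss)  # ~0.489
```

Unlike softmax classification, the labels here are not mutually exclusive: each output is an independent yes/no, which is what lets one image carry several objects at once.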