Large Scale Machine Learning



Before learning with a large dataset, plot a learning curve on a much smaller sample and check whether the model appears to have high bias. If it does, adding more data will not help (add features instead); only a high-variance model benefits from more data.
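A minimal sketch of this sanity check, assuming scikit-learn is available; SGDRegressor, the synthetic X and y, and the printout are illustrative choices, not from the course:

```python
# Sanity check before training on the full dataset: learning curves on a
# small sample. SGDRegressor and the synthetic data are illustrative.
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                       # small sample of the data
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=1000)

sizes, train_scores, val_scores = learning_curve(
    SGDRegressor(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
    scoring="neg_mean_squared_error")

# High bias: train and validation error converge to the same high value,
# so more data will not help. High variance: a persistent gap between them.
for m, tr, va in zip(sizes, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    print(f"m={m:4d}  train={tr:.3f}  val={va:.3f}")
```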


Stochastic Gradient Descent
(taking linear regression as an example)

(Figures: the per-example cost function and the stochastic gradient descent algorithm for linear regression)

(The outer loop over the whole shuffled training set is typically repeated 1 to 10 times.)

As the picture shows, the parameters wander in somewhat random directions and never converge to a single point, but stochastic gradient descent is much faster than batch gradient descent because each update uses only one example.
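A minimal sketch of the algorithm in NumPy, assuming the linear hypothesis h(x) = θᵀx; the function name and default values are illustrative:

```python
# A sketch of stochastic gradient descent for linear regression.
import numpy as np

def sgd_linear_regression(X, y, alpha=0.01, epochs=10):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):                  # the outer loop, repeated 1-10 times
        for i in np.random.permutation(m):   # randomly shuffle the dataset
            err = X[i] @ theta - y[i]        # h(x_i) - y_i for ONE example
            theta -= alpha * err * X[i]      # theta_j := theta_j - alpha*err*x_ij
    return theta
```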


Mini-Batch Gradient Descent

(Figures: the mini-batch gradient descent algorithm)

The mini-batch size b is typically chosen between 2 and 100.

Mini-batch gradient descent is faster than stochastic gradient descent only when the sum over the b examples is vectorized, as in the sketch below.
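A minimal sketch, again assuming NumPy; the vectorized line `X[idx].T @ err` is where the speedup over one-example updates comes from:

```python
# A sketch of mini-batch gradient descent with batch size b.
import numpy as np

def minibatch_gd(X, y, alpha=0.01, b=10, epochs=10):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        order = np.random.permutation(m)
        for start in range(0, m, b):
            idx = order[start:start + b]          # the next b examples
            err = X[idx] @ theta - y[idx]         # b residuals at once
            theta -= (alpha / len(idx)) * (X[idx].T @ err)  # vectorized step
    return theta
```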


Checking for Convergence
(without periodically scanning the entire training set)
(Figure: four example plots of the cost averaged over the last 1000 examples)
In plot 1, with a smaller α, the red line converges to a slightly better solution, but more slowly.
In plots 2 and 3, averaging over more examples makes the red line smoother; this helps especially in plot 3, where the original curve was too noisy to show any trend, though the feedback also arrives more slowly.
In plot 4, the curve is increasing, i.e. the algorithm is diverging, so we should use a smaller α.
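A minimal sketch of how such a plot can be produced, assuming NumPy: the per-example cost is recorded just before each update and averaged over a window of 1000 examples (names and defaults are illustrative):

```python
# A sketch of the convergence check: record cost(theta, (x_i, y_i)) just
# BEFORE each update and average the last 1000 recorded costs.
import numpy as np

def sgd_with_cost_trace(X, y, alpha=0.01, window=1000):
    theta = np.zeros(X.shape[1])
    costs, averaged = [], []
    for i in np.random.permutation(X.shape[0]):
        err = X[i] @ theta - y[i]
        costs.append(0.5 * err ** 2)         # cost before the parameters move
        theta -= alpha * err * X[i]
        if len(costs) % window == 0:
            averaged.append(np.mean(costs[-window:]))   # one plotted point
    return theta, averaged                   # plot `averaged` vs. iteration
```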


Learning Rate

To converge closer to the optimum, one can slowly decrease α over time:

(Figure: a slowly decreasing learning rate, α = const1 / (iterationNumber + const2))

However, choosing these two extra constants well is fiddly work, so this may not be a good choice.
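For illustration, the schedule as code; const1 and const2 are the two hand-tuned constants, and the values here are arbitrary:

```python
# The decaying learning rate as code; these constant values are arbitrary.
const1, const2 = 1.0, 50.0

def alpha(iteration_number):
    return const1 / (iteration_number + const2)
```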


Online Learning
(learning from a continuous stream of data)

(Figure: the online learning loop: repeat forever, get the next (x, y) from a user, update θ using that single example)

Note that the (x, y) pair carries no superscript (i): once an example has been used for one update, it is discarded.
Online learning can also adapt to changing user preferences over time.
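A minimal sketch of such an online learner, assuming logistic regression and a Python iterable standing in for the data stream (names are illustrative):

```python
# A sketch of an online learner (logistic regression): each (x, y) from the
# stream is used for exactly one update and then discarded.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_learner(stream, n_features, alpha=0.1):
    theta = np.zeros(n_features)
    for x, y in stream:                   # on a real site this loop never ends
        err = sigmoid(x @ theta) - y      # h(x) - y for this single example
        theta -= alpha * err * x          # update, then the example is dropped
    return theta
```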

(Figure: product search example, predicting the CTR (click-through rate) of each of the 10 results shown)

Each search thus yields 10 (x, y) pairs, i.e. 10 training examples with which to update the parameters, as in the sketch below.
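For illustration, a hypothetical helper that turns one search (10 results shown, some clicked) into those 10 labelled examples; the feature vectors here are stand-ins:

```python
# Hypothetical helper: one search shows 10 results, so it yields 10 (x, y)
# examples, with y = 1 if that result was clicked. Features are stand-ins.
import numpy as np

def examples_from_search(result_features, clicked_indices):
    return [(x, 1 if i in clicked_indices else 0)
            for i, x in enumerate(result_features)]

# e.g. 10 results shown, the user clicked results 0 and 3:
pairs = examples_from_search([np.ones(4)] * 10, {0, 3})
```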


Map Reduce

(Figures: the batch-gradient sum split across 4 machines, each computing a partial sum over its share of the data; a master server combining the partial sums; the same idea applied to the cores of a single machine)
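A minimal sketch of the idea, assuming NumPy and simulating the machines with array chunks; in a real deployment each chunk's partial sum would be computed on a separate machine or core:

```python
# A sketch of map-reduce for the batch gradient: split the sum over 4
# "machines" (array chunks here) and combine the partial sums on a master.
import numpy as np

def mapreduce_gradient(X, y, theta, n_workers=4):
    chunks = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    # "map": each worker computes a partial sum of (h(x_i) - y_i) * x_i
    partials = [Xc.T @ (Xc @ theta - yc) for Xc, yc in chunks]
    # "reduce": the master combines them into the full gradient
    return sum(partials) / len(X)

# one batch step: theta -= alpha * mapreduce_gradient(X, y, theta)
```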
