CS229 Lecture Notes(2): Logistic Regression
Logistic Regression
Binary classification problem
Failure of OLS regression in the binary classification problem:
- hard to define the threshold
- the predicted value makes no sense if $h_\theta(x) > 1$ or $h_\theta(x) < 0$, since $y \in \{0, 1\}$
Hypothesis:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$

where $g(z) = \frac{1}{1 + e^{-z}}$ is called the logistic function or the sigmoid function.

A useful property of the sigmoid function: $g'(z) = g(z)(1 - g(z))$.

In theory, it seems that any smooth, monotonically increasing function with range $[0, 1]$ could serve as the $g(z)$ in the hypothesis. However, after studying GLMs and generative learning algorithms, we will see why the sigmoid function is the natural choice here.
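The derivative identity $g'(z) = g(z)(1 - g(z))$ is easy to verify numerically; a minimal sketch (NumPy assumed, test point chosen arbitrarily):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

# Check g'(z) = g(z)(1 - g(z)) against a central finite difference
# at an arbitrary point z = 0.7.
z = 0.7
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = sigmoid(z) * (1 - sigmoid(z))
print(abs(numeric - analytic))  # should be tiny
```

This identity is what makes the gradient of the log likelihood below come out so cleanly.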
Maximum Likelihood Estimation
Probabilistic assumption: Bernoulli distribution

$$p(y \mid x; \theta) = (h_\theta(x))^y (1 - h_\theta(x))^{1-y}$$

Likelihood function:

$$L(\theta) = \prod_{i=1}^m p(y^{(i)} \mid x^{(i)}; \theta) = \prod_{i=1}^m (h_\theta(x^{(i)}))^{y^{(i)}} (1 - h_\theta(x^{(i)}))^{1 - y^{(i)}}$$

Log likelihood:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^m y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)}))$$

Gradient ascent (since we're maximizing rather than minimizing a function now):

$$\theta := \theta + \alpha \nabla_\theta \ell(\theta)$$

where, for a single training example, $\frac{\partial}{\partial \theta_j} \ell(\theta) = (y - h_\theta(x))\, x_j$.

In logistic regression we thus obtain an update rule that looks identical to the one for linear regression, except that here $h_\theta(x)$ is a nonlinear function of $\theta^T x$. Is this just a coincidence, or is there a deeper reason? We will answer this when we study the GLM framework.
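The batch gradient-ascent update above can be sketched as follows; the toy data and learning rate are illustrative assumptions, not from the notes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D data with an intercept term x0 = 1 (assumed for illustration).
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta = np.zeros(X.shape[1])
alpha = 0.1  # learning rate (assumed)

for _ in range(5000):
    h = sigmoid(X @ theta)           # h_theta(x^(i)) for every example
    theta += alpha * X.T @ (y - h)   # theta := theta + alpha * grad l(theta)

print(np.round(sigmoid(X @ theta)))  # rounded predictions
```

Note the vectorized form `X.T @ (y - h)` sums the per-example gradient $(y^{(i)} - h_\theta(x^{(i)}))\, x^{(i)}$ over the whole training set.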
Digression: The perceptron learning algorithm
Hypothesis:

$$h_\theta(x) = g(\theta^T x)$$

where

$$g(z) = \begin{cases} 1 & z \ge 0 \\ 0 & z < 0 \end{cases}$$

Note that this $g(z)$ is not differentiable at $z = 0$, so it is hard to give the perceptron a probabilistic interpretation and fit it by maximum likelihood.

Perceptron learning algorithm:

$$\theta_j := \theta_j + \alpha\, (y^{(i)} - h_\theta(x^{(i)}))\, x_j^{(i)}$$
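The per-example update can be sketched as below; the threshold function follows the definition above, while the toy data, learning rate, and number of passes are illustrative assumptions:

```python
import numpy as np

def g(z):
    """Threshold function: 1 if z >= 0, else 0."""
    return 1.0 if z >= 0 else 0.0

# Toy linearly separable data with intercept term x0 = 1 (assumed).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta = np.zeros(2)
alpha = 0.5

# Cycle through the training set, applying the per-example update.
for _ in range(20):
    for i in range(len(y)):
        h = g(theta @ X[i])
        theta += alpha * (y[i] - h) * X[i]  # theta_j += alpha (y - h) x_j

print([g(theta @ x) for x in X])
```

Because $y^{(i)} - h_\theta(x^{(i)}) \in \{-1, 0, 1\}$, the parameters only move when an example is misclassified; on separable data the loop stops changing $\theta$ once every example is classified correctly.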
Newton’s method for maximizing l(θ)
Newton's method: to find a value of $\theta$ so that $f(\theta) = 0$, we perform the following update:

$$\theta := \theta - \frac{f(\theta)}{f'(\theta)}$$

To maximize $\ell(\theta)$, we apply Newton's method to its stationarity condition $f(\theta) = \ell'(\theta) = 0$:

$$\theta := \theta - \frac{\ell'(\theta)}{\ell''(\theta)}$$

The Newton-Raphson method (also called Fisher scoring when applied to the logistic regression problem) is the vector-valued generalization of this update:

$$\theta := \theta - H^{-1} \nabla_\theta \ell(\theta)$$

where $H$ is the Hessian matrix, $H_{ij} = \frac{\partial^2 \ell(\theta)}{\partial \theta_i \partial \theta_j}$.

Although computing the Hessian is relatively expensive, the second-order information it carries means Newton's method typically converges in far fewer iterations than gradient descent when maximizing the log likelihood.
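A sketch of the Newton-Raphson update for logistic regression. For this model the gradient and Hessian of the log likelihood have the closed forms $\nabla_\theta \ell = X^T(y - h)$ and $H = -X^T S X$ with $S = \mathrm{diag}(h_i(1 - h_i))$; the toy data (deliberately non-separable so the MLE is finite) is an illustrative assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Non-separable toy data with intercept term x0 = 1 (assumed).
X = np.column_stack([np.ones(6), [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

theta = np.zeros(X.shape[1])

# Newton-Raphson: theta := theta - H^{-1} grad l(theta)
for _ in range(25):
    h = sigmoid(X @ theta)
    grad = X.T @ (y - h)       # gradient of the log likelihood
    S = np.diag(h * (1 - h))
    H = -X.T @ S @ X           # Hessian of the log likelihood
    theta -= np.linalg.solve(H, grad)

print(np.round(theta, 3))
```

In practice only a handful of iterations are needed: each step solves a weighted least-squares problem, which is why this scheme is also known as iteratively reweighted least squares.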