Linear Regression (3): Gradient Descent in Practice I/II (Feature Scaling / Learning Rate)

Source: Internet  Editor: 程序博客网  Time: 2024/05/22 04:52

Gradient Descent in Practice I - Feature Scaling


goal: speed up gradient descent by bringing each input feature into roughly the same range

x_i := (x_i − μ_i) / s_i

where μ_i is the average of all the values of feature i, and s_i is either the range of those values (max − min) or their standard deviation.
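The update above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name scale_feature, the use_std flag, and the sample house sizes are all made up for the example, and the zero-spread guard is an added assumption for constant features.

```python
from statistics import mean, pstdev


def scale_feature(values, use_std=False):
    """Apply x := (x - mu) / s to one feature column.

    s is the range (max - min) by default, or the population
    standard deviation when use_std is True.
    """
    mu = mean(values)
    s = pstdev(values) if use_std else max(values) - min(values)
    if s == 0:
        # Assumption: a constant feature carries no spread, so map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / s for x in values]


# Hypothetical feature column (e.g. house sizes).
sizes = [1000, 1500, 2000]
scaled_by_range = scale_feature(sizes)
scaled_by_std = scale_feature(sizes, use_std=True)
print(scaled_by_range)
print(scaled_by_std)
```

With range scaling the values end up spread over an interval of width 1 centered on 0; with the standard deviation the spread is somewhat wider, but the feature is still centered on 0.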

Gradient Descent in Practice II - Learning Rate


goal: choose a learning rate α for which J(θ) decreases on every iteration

summary:

If α is too small: slow convergence.

If α is too large: J(θ) may not decrease on every iteration and thus may not converge.
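One way to see both cases is to record J(θ) after every iteration and compare several values of α. The sketch below assumes a single-feature linear hypothesis, batch gradient descent, and a tiny made-up data set; the function names and the three α values are chosen only for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta) for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, iterations):
    """Run batch gradient descent from theta = (0, 0) and record J each step."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    history = [cost(theta0, theta1, xs, ys)]
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        history.append(cost(theta0, theta1, xs, ys))
    return theta0, theta1, history


# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

for alpha in (0.001, 0.1, 1.0):
    _, _, history = gradient_descent(xs, ys, alpha, 50)
    decreasing = all(b <= a for a, b in zip(history, history[1:]))
    print(f"alpha={alpha}: J decreases every step: {decreasing}, final J: {history[-1]:.3g}")
```

On this data, the smallest α decreases J steadily but is still far from the minimum after 50 steps, the middle value gets much closer, and the largest value makes J grow instead of shrink. Plotting the recorded history against the iteration number is a common way to make the same check visually.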

Polynomial Regression

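goal: improve the features and the form of the hypothesis function so that it fits the data better

One option is to combine multiple features into one, for example by using the product of two features as a new feature. Another is to change the shape of the hypothesis function.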

For example, if our hypothesis function is h_θ(x) = θ_0 + θ_1 x_1

then we can create additional features based on x_1, to get the quadratic function h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_1^2

or the cubic function h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_1^2 + θ_3 x_1^3

In the cubic version, we have created new features x_2 and x_3, where x_2 = x_1^2 and x_3 = x_1^3.

To make it a square root function, we could do: h_θ(x) = θ_0 + θ_1 x_1 + θ_2 √x_1
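The point of these constructions is that the model stays linear in θ; only the feature list changes. A minimal sketch, with hypothetical helper names and made-up θ values:

```python
import math


def cubic_features(x1):
    """Derived features [x1, x1^2, x1^3] for the cubic hypothesis."""
    return [x1, x1 ** 2, x1 ** 3]


def sqrt_features(x1):
    """Derived features [x1, sqrt(x1)] for the square root hypothesis."""
    return [x1, math.sqrt(x1)]


def hypothesis(theta, features):
    """h(x) = theta[0] + theta[1] * f1 + theta[2] * f2 + ..."""
    return theta[0] + sum(t * f for t, f in zip(theta[1:], features))


# Made-up parameter values, for illustration only.
cubic_value = hypothesis([1.0, 2.0, 0.5, 0.1], cubic_features(2.0))
sqrt_value = hypothesis([1.0, 2.0, 3.0], sqrt_features(4.0))
print(cubic_value, sqrt_value)
```

Because hypothesis() never needs to know how the features were produced, the same gradient descent procedure used for ordinary linear regression can be applied to the derived features unchanged.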

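One important thing to keep in mind: if you choose your features this way, then feature scaling becomes very important. For example, if x_1 ranges from 1 to 1000, then x_1^2 ranges from 1 to 1,000,000 and x_1^3 ranges from 1 to 1,000,000,000.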
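The effect on feature ranges is easy to check numerically. The sketch below uses made-up values for x1 and hypothetical helper names, and applies mean normalization with the range as the divisor; it prints the range of each polynomial feature before and after scaling.

```python
def value_range(values):
    """Spread of a feature column: max - min."""
    return max(values) - min(values)


def mean_normalize(values):
    """Apply (x - mean) / (max - min) to every value in the column."""
    mu = sum(values) / len(values)
    s = value_range(values)
    return [(v - mu) / s for v in values]


# Made-up raw feature values between 1 and 1000.
x1 = [1, 10, 100, 1000]
features = {
    "x1": x1,
    "x1^2": [v ** 2 for v in x1],
    "x1^3": [v ** 3 for v in x1],
}

for name, column in features.items():
    scaled = mean_normalize(column)
    print(name, "raw range:", value_range(column), "scaled range:", value_range(scaled))
```

The raw ranges differ by several orders of magnitude, while after normalization every column spans an interval of width 1, which is the situation gradient descent handles well.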



