A quick piece of code: linear regression using gradient descent
Let's use linear regression to predict house prices. The original data comes from the Coursera machine learning course.
When I finished the exercise in MATLAB, the idea of implementing the algorithm in Python came to mind,
so I'd like to refresh the knowledge and have fun with the data :)
The algorithm is simple enough that you can scan it quickly and save your time. :)
1. Focus on the data
As a data scientist, the data you have determines how deep you can dig beneath its surface.
First I need to load the data and keep an eye on the scheme in which it is stored.
```python
import numpy as np

def load_data(filename):
    data = []
    with open(filename, 'rb') as f:
        for line in f:
            line = line.decode('utf-8').strip().split(',')
            data.append([int(_) for _ in line])
    return data

filename = r'ex1data2.txt'
data = load_data(filename)

# look at the first three lines of the data
print('\n'.join([str(data[i]) for i in range(3)]))
```

```
[2104, 3, 399900]
[1600, 3, 329900]
[2400, 3, 369000]
```
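As an aside, NumPy can do the same parsing in one call with `np.loadtxt`. A minimal sketch; the sample file written here is made up so the snippet is self-contained:

```python
import os
import tempfile

import numpy as np

# write a few sample rows in the same comma-separated scheme as ex1data2.txt
sample = "2104,3,399900\n1600,3,329900\n2400,3,369000\n"
path = os.path.join(tempfile.gettempdir(), "ex1data2_sample.txt")
with open(path, "w") as f:
    f.write(sample)

# np.loadtxt parses the whole file into a float array in one call
arr = np.loadtxt(path, delimiter=",")
print(arr.shape)  # (3, 3)
```

This gives a float array instead of a list of integer lists, which is convenient since the code below converts to `np.array` anyway.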
2. Math model
So, what do the three integers in the first line mean?
The first element, 2104, is the size of the house in square feet; the second, 3, is the number of bedrooms; the last one is the price.
Now it is time to choose a math model for the data.
As the title makes clear, this article is about linear regression, so that is our model.
To find the parameters, initialize the vector

θ = [θ0, θ1, θ2]

and minimize the error:

error = 1/(2m) · Σ_{i=1}^{m} (price(x_i) − y_i)²

To achieve the minimization we use the gradient descent algorithm, which is guaranteed to work here because the cost function is convex.
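The gradient of this cost is ∇J(θ) = (1/m)·Xᵀ(Xθ − y), which is exactly the term the update rule below uses. A quick sanity check of that formula against numerical differentiation; the tiny design matrix and parameter values here are made up purely for illustration:

```python
import numpy as np

def cost(X, y, theta):
    # J(theta) = 1/(2m) * sum((X @ theta - y)^2)
    m = len(y)
    r = X @ theta - y
    return (r @ r) / (2 * m)

def gradient(X, y, theta):
    # analytic gradient: (1/m) * X^T (X theta - y)
    m = len(y)
    return X.T @ (X @ theta - y) / m

# tiny made-up design matrix: an intercept column plus one feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0, 5.0])
theta = np.array([0.5, -0.5])

# compare with a central finite difference in each coordinate
eps = 1e-6
num = np.zeros_like(theta)
for j in range(len(theta)):
    e = np.zeros_like(theta)
    e[j] = eps
    num[j] = (cost(X, y, theta + e) - cost(X, y, theta - e)) / (2 * eps)

# the analytic and numerical gradients should agree very closely
print(np.max(np.abs(num - gradient(X, y, theta))))
```

Because the cost is quadratic, the central difference has no truncation error and the two gradients agree to floating-point precision.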
Talk is cheap, show me the code.
3. Implementation
```python
# normalization
data = np.array(data)
x = data[:, [0, 1]]
y = data[:, 2]
mu = np.mean(x, axis=0)
std = np.std(x, axis=0)
x = (x - mu) / std
row = x.shape[0]

# build the design matrix X with an intercept column of ones
X = np.ones((row, 3))
X[:, [1, 2]] = x
X = np.matrix(X)

theta = np.matrix(np.zeros((3, 1)))
y = np.matrix(y)

# implement the gradient descent method
def grad_descent(X, y, theta, iter_num, alpha):
    m = y.shape[1]  # number of training examples (y is a 1 x m row matrix)
    for _ in range(iter_num):
        theta -= alpha / m * (X.T * X * theta - X.T * y.T)
    return theta

# initialize the parameters
iter_num = 9000
alpha = 0.01
new_theta = grad_descent(X, y, theta, iter_num, alpha)
print('the theta parameter is:')
print(new_theta)

# estimate the price of a 1650 sq-ft, 3 br house
price = np.dot(np.array([1, (1650 - mu[0]) / std[0], (3 - mu[1]) / std[1]]), new_theta)
print('for a 1650 sq-ft, 3 br house, the price is')
print(price)
```

```
the theta parameter is:
[[ 340412.65957447]
 [ 109447.79646964]
 [  -6578.35485416]]
for a 1650 sq-ft, 3 br house, the price is
[[ 293081.4643349]]
```
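To convince yourself the routine works, you can run it on synthetic data whose true parameters are known; the numbers and random seed below are made up for the check, not part of the house-price data:

```python
import numpy as np

# hypothetical synthetic problem: y = 4 + 2*x1 - 3*x2, no noise
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 2))
X = np.matrix(np.hstack([np.ones((200, 1)), x]))
y = np.matrix(4 + 2 * x[:, 0] - 3 * x[:, 1])  # a 1 x 200 row matrix, like y above

theta = np.matrix(np.zeros((3, 1)))

def grad_descent(X, y, theta, iter_num, alpha):
    m = y.shape[1]  # number of training examples
    for _ in range(iter_num):
        theta -= alpha / m * (X.T * X * theta - X.T * y.T)
    return theta

theta = grad_descent(X, y, theta, 2000, 0.1)
print(np.round(theta, 4))  # close to [4, 2, -3]
```

The recovered parameters match the ones used to generate the data, which is a cheap regression test before trusting the real run.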
4. Normal equation
When the number of features n in the data is below about 1000, we usually compute theta directly with the normal equation:

θ = (XᵀX)⁻¹ Xᵀ y

What is the relationship between the two methods? They minimize the same cost, but the normal equation has to invert an n×n matrix; as n grows very large that inversion becomes too expensive, and gradient descent wins. Here n = 2, so:
```python
new_X = np.ones((47, 3))
new_X[:, 1:] = data[:, :2]
new_X = np.matrix(new_X)
new_theta1 = np.linalg.pinv(new_X.T * new_X) * new_X.T * y.T
print(new_theta1)

new_price = np.dot(np.array([1, 1650, 3]), new_theta1)
print(new_price)
```

```
[[ 89597.90954435]
 [   139.21067402]
 [ -8738.01911278]]
[[ 293081.46433506]]
```
The two results are close enough.
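That agreement is not a coincidence: a fully converged gradient descent and the closed-form solution minimize the same cost, so they land on the same theta. A sketch on made-up synthetic data, using `np.linalg.lstsq` as the closed-form solver (it is numerically safer than forming (XᵀX)⁻¹ explicitly):

```python
import numpy as np

# hypothetical data: true theta = [1, -2, 0.5] plus a little noise
rng = np.random.default_rng(1)
A = np.hstack([np.ones((100, 1)), rng.standard_normal((100, 2))])
b = A @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(100)

# closed form: least-squares solution of A theta = b
theta_ne, *_ = np.linalg.lstsq(A, b, rcond=None)

# gradient descent on the same objective, run long enough to converge fully
theta_gd = np.zeros(3)
for _ in range(5000):
    theta_gd -= 0.1 / len(b) * A.T @ (A @ theta_gd - b)

print(np.max(np.abs(theta_gd - theta_ne)))  # the two solutions agree closely
```

The remaining gap is at floating-point level, mirroring the near-identical house prices printed above.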