Stanford Coursera Andrew Ng Machine Learning Course Programming Assignment (Exercise 1), Python 3.x (Supplement)

Source: Internet  Editor: 程序博客网  Time: 2024/05/21 10:49

Exercise 1: Linear Regression (implementing multivariate linear regression)

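The housing price dataset has 2 variables (the size of the house and the number of bedrooms) and a target (the price of the house). We analyze this dataset with the techniques we have already applied.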

2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
1985,4,299900
1534,3,314900
1427,3,198999
1380,3,212000

Code

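(Much the same as the first one, except that the features need to be normalized: the 3 columns differ so much in magnitude that the result would otherwise be affected.)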
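To make the scale problem concrete, here is a minimal sketch that normalizes only the nine sample rows listed above. The helper name `normalize` is hypothetical, and plain NumPy is used so the snippet runs without the data file; `ddof=1` is chosen to match the default of pandas `std()`.

```python
import numpy as np

# The nine sample rows shown above: size, bedrooms, price.
rows = np.array([
    [2104, 3, 399900],
    [1600, 3, 329900],
    [2400, 3, 369000],
    [1416, 2, 232000],
    [3000, 4, 539900],
    [1985, 4, 299900],
    [1534, 3, 314900],
    [1427, 3, 198999],
    [1380, 3, 212000],
], dtype=float)


def normalize(data):
    """Scale each column to zero mean and unit sample standard deviation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # ddof=1 matches the pandas default
    return (data - mean) / std


scaled = normalize(rows)

# Compare the spread of each column before and after scaling.
print("column ranges before:", rows.max(axis=0) - rows.min(axis=0))
print("column ranges after: ", scaled.max(axis=0) - scaled.min(axis=0))
```

After scaling, all three columns have comparable ranges, so a single learning rate can suit every parameter.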

import numpy as np
import pandas as pd

data2 = pd.read_csv('ex1data2.txt', names=['size', 'house', 'cost'])
print(data2.describe())

# Feature normalization (applied to every column, including the target)
data2 = (data2 - data2.mean()) / data2.std()

# Bias unit x0
data2.insert(0, 'Ones', 1)

# Select the feature columns and the target column
cols = data2.shape[1]
X = data2.iloc[:, 0:cols-1]
y = data2.iloc[:, cols-1:cols]

# Convert to matrices
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0, 0, 0]))  # initial parameters

# Cost function
def J(x, y, theta):
    temp = np.power(((x * theta.T) - y), 2)  # be careful with .T
    return np.sum(temp) / (2 * len(x))

print(J(X, y, theta))

# Gradient descent
def gradDescent(x, y, theta, alpa, iters):
    temp = np.matrix(np.zeros(theta.shape))  # all-zero matrix with the same shape as theta
    parameters = int(theta.shape[1])         # number of columns, i.e. the number of theta values; ravel() is not needed here
    cost = np.zeros(iters)                   # cost after each iteration
    for i in range(iters):
        error = (x * theta.T) - y
        for j in range(parameters):
            temp1 = np.multiply(error, x[:, j])  # x[:, j] is all the data in column j
            temp[0, j] = theta[0, j] - ((alpa / len(x)) * np.sum(temp1))  # updated value of each theta
        theta = temp
        cost[i] = J(x, y, theta)  # cost after this gradient descent step
    return theta, cost  # return both the parameters and the cost history

theta1, cost = gradDescent(X, y, theta, alpa=0.1, iters=1000)
print(theta1)
print(J(X, y, theta1))
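As a cross-check on the loop above, the same update can be written in vectorized form on plain NumPy arrays and compared with a direct least-squares solve. This is only a sketch under stated assumptions: it uses the nine sample rows listed earlier rather than the full `ex1data2.txt`, and the names `cost` and `gradient_descent_vectorized` are hypothetical, not part of the original script.

```python
import numpy as np

# The nine sample rows listed earlier: size, bedrooms, price.
rows = np.array([
    [2104, 3, 399900],
    [1600, 3, 329900],
    [2400, 3, 369000],
    [1416, 2, 232000],
    [3000, 4, 539900],
    [1985, 4, 299900],
    [1534, 3, 314900],
    [1427, 3, 198999],
    [1380, 3, 212000],
], dtype=float)

# Normalize every column (features and target), as the script above does.
data = (rows - rows.mean(axis=0)) / rows.std(axis=0, ddof=1)
X = np.column_stack([np.ones(len(data)), data[:, :2]])  # prepend bias column x0
y = data[:, 2]


def cost(X, y, theta):
    """Squared-error cost: sum of squared residuals divided by 2m."""
    residual = X @ theta - y
    return residual @ residual / (2 * len(X))


def gradient_descent_vectorized(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent that updates all parameters in one step."""
    theta = np.zeros(X.shape[1])
    history = np.zeros(iters)
    for i in range(iters):
        gradient = X.T @ (X @ theta - y) / len(X)
        theta = theta - alpha * gradient
        history[i] = cost(X, y, theta)
    return theta, history


theta_gd, history = gradient_descent_vectorized(X, y)

# Direct least-squares solution for comparison.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print("gradient descent:", theta_gd)
print("least squares:   ", theta_ls)
```

If the two parameter vectors agree closely, the learning rate and iteration count are adequate for this data; a large gap would suggest lowering `alpha` or raising `iters`.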
