Machine Learning in Action -- regression (errata corrected)
Source: Internet  Editor: 程序博客网  Time: 2024/06/01 09:55
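I have been teaching myself machine learning recently. At my advisor's request, I first worked through "Machine Learning with R" hands-on. My impression is that R hardly counts as a programming language; it feels more like a calculator with a lot of features. So this time I decided to use the classic textbook "Machine Learning in Action". I have to switch workstations when the term starts and I am afraid of losing the code again, so I simply started a blog and uploaded the code here.
The original code in the book contains many errors, and the code in many blog posts online has not been corrected either, so here I am posting my corrected version.
Version: Python 3.5
Talk is cheap, show me the code.
Function definitions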
#coding=utf-8
from numpy import *
import matplotlib.pyplot as plt


def loadDataSet(fileName):
    # Read all rows first so that counting the columns does not
    # consume (and silently drop) the first data row.
    with open(fileName) as fr:
        lines = fr.readlines()
    numFeat = len(lines[0].split('\t'))
    dataMat = []; labelMat = []
    for line in lines:
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat - 1):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat


def standRegres(xArr, yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0:
        print("This matrix is singular, cannot do inverse")
        return
    ws = xTx.I * (xMat.T * yMat)
    return ws


def lwlr(testPoint, xArr, yArr, k=1.0):
    xMat = mat(xArr); yMat = mat(yArr).T
    m = shape(xMat)[0]
    weights = mat(eye((m)))  # diagonal weight matrix
    for j in range(m):
        diffMat = testPoint - xMat[j, :]
        weights[j, j] = exp(diffMat * diffMat.T / (-2.0 * k ** 2))
    xTx = xMat.T * (weights * xMat)
    if linalg.det(xTx) == 0.0:
        print("This matrix is singular, cannot do inverse")
        return
    ws = xTx.I * (xMat.T * (weights * yMat))
    return testPoint * ws


def lwlrTest(testArr, xArr, yArr, k=1.0):
    m = shape(testArr)[0]
    yHat = zeros(m)
    for i in range(m):
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)
    return yHat


def rssError(yArr, yHatArr):
    return ((yArr - yHatArr) ** 2).sum()


def ridgeRegres(xMat, yMat, lam=0.2):
    xTx = xMat.T * xMat
    denom = xTx + eye(shape(xMat)[1]) * lam
    if linalg.det(denom) == 0:
        print("This matrix is singular, cannot do inverse")
        return
    ws = denom.I * (xMat.T * yMat)
    return ws


def ridgeTest(xArr, yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    yMean = mean(yMat, 0)    # column means
    # standardize the data
    yMat = yMat - yMean
    xMeans = mean(xMat, 0)   # column means
    xVar = var(xMat, 0)      # column variances
    xMat = (xMat - xMeans) / xVar
    numTestPts = 30
    wMat = zeros((numTestPts, shape(xMat)[1]))
    for i in range(numTestPts):
        ws = ridgeRegres(xMat, yMat, exp(i - 10))
        wMat[i, :] = ws.T
    return wMat


def regularize(xMat):
    inMat = xMat.copy()
    inMeans = mean(inMat, 0)
    inVar = var(inMat, 0)
    inMat = (inMat - inMeans) / inVar
    return inMat


def stageWise(xArr, yArr, eps=0.01, numIt=100):
    xMat = mat(xArr); yMat = mat(yArr).T
    yMean = mean(yMat, 0)
    yMat = yMat - yMean
    xMat = regularize(xMat)
    m, n = shape(xMat)
    returnMat = zeros((numIt, n))
    ws = zeros((n, 1)); wsTest = ws.copy(); wsMax = ws.copy()
    for i in range(numIt):
        print("ws.T: ", ws.T)
        lowestError = inf
        for j in range(n):
            for sign in [-1, 1]:
                wsTest = ws.copy()
                wsTest[j] += eps * sign
                yTest = xMat * wsTest
                rssE = rssError(yMat.A, yTest.A)
                if rssE < lowestError:
                    lowestError = rssE
                    wsMax = wsTest
        # Keep the best candidate of this iteration; copying ws onto
        # itself here would leave the weights at zero forever.
        ws = wsMax.copy()
        returnMat[i, :] = ws.T
    return returnMat
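The code block above only defines the main functions; it does not run anything yet. The book runs everything from the IPython command line, but I am too lazy for that, so I dropped that approach and put each experiment in a __main__ block instead.
Enough talk, here is the code.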
Experiment 1
if __name__ == "__main__":
    xArr, yArr = loadDataSet('ex0.txt')
    ws = standRegres(xArr, yArr)
    print(ws)
    xMat = mat(xArr)
    yMat = mat(yArr)
    yHat = xMat * ws
Experiment 2:
if __name__ == "__main__":
    # Reuses xMat, yMat and ws computed in the previous experiment.
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xMat[:, 1].flatten().A[0], yMat.T[:, 0].flatten().A[0])
    xCopy = xMat.copy()
    xCopy.sort(0)
    yHat = xCopy * ws
    ax.plot(xCopy[:, 1], yHat)
    plt.show()
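The least-squares fit above can be sanity-checked without ex0.txt. The following is only a minimal sketch on synthetic data: the data layout (a constant column of 1.0, then x, then y), the helper name ols_normal_equation and the chosen slope and intercept are my assumptions, not something taken from the book. It solves the same normal equations as standRegres(), but with numpy.linalg.solve instead of an explicit inverse, and compares the result with numpy.linalg.lstsq.

```python
import numpy as np


def ols_normal_equation(X, y):
    """Solve (X^T X) w = X^T y without forming an explicit inverse."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Synthetic stand-in for ex0.txt (assumed layout: constant 1.0, x, y).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = 3.0 + 1.7 * x + rng.normal(0.0, 0.05, size=200)

w = ols_normal_equation(X, y)                    # normal equations
w_ref, *_ = np.linalg.lstsq(X, y, rcond=None)    # library reference

# Correlation between predictions and targets, a quick measure of fit.
y_hat = X @ w
corr = np.corrcoef(y_hat, y)[0, 1]

print("weights:", w)
print("reference weights:", w_ref)
print("correlation:", corr)
```

If the two weight vectors agree, the normal-equation code is probably doing what it should; the correlation gives a rough idea of how well a straight line describes the data.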
Experiment 3:
if __name__ == "__main__":
    xArr, yArr = loadDataSet('ex0.txt')
    print("actual yArr[0]:", yArr[0])
    print(lwlr(xArr[0], xArr, yArr, 1.0))
Experiment 4:
if __name__ == "__main__":
    xArr, yArr = loadDataSet('ex0.txt')
    print(lwlrTest(xArr, xArr, yArr, 0.003))
Experiment 5:
if __name__ == "__main__":
    xArr, yArr = loadDataSet('ex0.txt')
    xMat = mat(xArr)
    print("xMat: ", xMat)
    yMat = mat(yArr)
    yHat = lwlrTest(xArr, xArr, yArr, 0.01)
    srtInd = xMat[:, 1].argsort(0)  # indices that sort the column in ascending order
    print("srtInd: ", srtInd)
    xSort = xMat[srtInd][:, 0, :]   # rows sorted in ascending order
    print("xSort: ", xSort)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(xSort[:, 1], yHat[srtInd])
    ax.scatter(xMat[:, 1].flatten().A[0], mat(yArr).T.flatten().A[0], s=2, c='red')
    plt.show()
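The locally weighted experiments above differ only in the kernel width k, so it may help to see what k does in isolation. This is a rough sketch on synthetic data, not the book's code: the helper names gaussian_weights, lwlr_point and rss, the wiggly target function and the two k values are all assumptions made for illustration. It uses the same Gaussian weight exp(d^2 / (-2 k^2)) as lwlr(), written with plain NumPy arrays.

```python
import numpy as np


def gaussian_weights(query, X, k):
    """Weight of every training row for one query point, as in lwlr()."""
    sq_dist = np.sum((X - query) ** 2, axis=1)
    return np.exp(sq_dist / (-2.0 * k ** 2))


def lwlr_point(query, X, y, k):
    """Locally weighted linear prediction for a single query point."""
    w = gaussian_weights(query, X, k)
    XtW = X.T * w                      # scales each training row by its weight
    theta = np.linalg.solve(XtW @ X, XtW @ y)
    return query @ theta


# Synthetic data (assumption): a line plus a small high-frequency wiggle.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, size=200))
X = np.column_stack([np.ones_like(x), x])
y = 3.0 + 1.7 * x + 0.1 * np.sin(30.0 * x) + rng.normal(0.0, 0.02, size=200)


def rss(k):
    """Training residual sum of squares for a given kernel width."""
    y_hat = np.array([lwlr_point(q, X, y, k) for q in X])
    return float(np.sum((y - y_hat) ** 2))


rss_wide = rss(1.0)     # nearly uniform weights, close to ordinary regression
rss_narrow = rss(0.03)  # only nearby points count, so the wiggle is followed

print("RSS with k=1.0 :", rss_wide)
print("RSS with k=0.03:", rss_narrow)
```

A smaller k lowers the training error here because the fit follows the local structure, but the same mechanism is what makes a very small k chase noise, so training error alone is not a good way to choose k.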
Experiment 6:
if __name__ == "__main__":
    abX, abY = loadDataSet('abalone.txt')
    ridgeWeights = ridgeTest(abX, abY)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(ridgeWeights)
    plt.show()
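The plot from the ridge experiment shows the weights shrinking toward zero as lambda grows. The sketch below reproduces that behaviour on synthetic data so it can be checked without abalone.txt. The helper name ridge_weights, the synthetic features and the true weights are assumptions for illustration; the lambda grid exp(i - 10) and the centering and variance scaling follow ridgeTest() above.

```python
import numpy as np


def ridge_weights(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Synthetic data (assumption): four features, one of them irrelevant.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
true_w = np.array([2.0, -1.0, 0.5, 0.0])
y = X @ true_w + rng.normal(0.0, 0.1, size=100)

# Same preprocessing as ridgeTest(): center y, center X and divide by variance.
X = (X - X.mean(axis=0)) / X.var(axis=0)
y = y - y.mean()

# Same lambda grid as ridgeTest(): exp(-10) up to exp(19).
lams = [np.exp(i - 10) for i in range(30)]
norms = [float(np.linalg.norm(ridge_weights(X, y, lam))) for lam in lams]

print("norm at smallest lambda:", norms[0])
print("norm at largest lambda :", norms[-1])
```

The norm of the weight vector never increases along the grid, which is the same shrinkage that shows up as the curves collapsing to zero on the right side of the plot.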
Experiment 7:
if __name__ == "__main__":
    xArr, yArr = loadDataSet('abalone.txt')
    print(stageWise(xArr, yArr, 0.001, 5000))
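Forward stagewise regression is easy to get subtly wrong, so here is a small self-contained sketch of the same greedy loop with plain NumPy arrays. The function name stagewise, the synthetic data and the step settings are assumptions for illustration only. Each iteration tries moving every weight by +eps and -eps, keeps the single change with the lowest squared error, and carries it into the next iteration.

```python
import numpy as np


def stagewise(X, y, eps=0.01, num_iter=400):
    """Greedy forward stagewise regression; returns the weight history."""
    n = X.shape[1]
    w = np.zeros(n)
    history = np.zeros((num_iter, n))
    for it in range(num_iter):
        best_err = np.inf
        best_w = w
        for j in range(n):
            for sign in (-1.0, 1.0):
                trial = w.copy()
                trial[j] += eps * sign
                err = np.sum((y - X @ trial) ** 2)
                if err < best_err:
                    best_err, best_w = err, trial
        w = best_w            # carry the best candidate forward
        history[it] = w
    return history


# Synthetic data (assumption): three features, the middle one irrelevant.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 0.0, -0.5]) + rng.normal(0.0, 0.05, size=100)

# Same preprocessing as regularize(): center, then divide by variance.
X = (X - X.mean(axis=0)) / X.var(axis=0)
y = y - y.mean()

history = stagewise(X, y)
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print("stagewise weights:", history[-1])
print("least-squares weights:", w_ols)
```

With enough iterations the last row of the history ends up close to the ordinary least-squares weights and then wobbles around them by roughly eps, which is a handy check that the weights are really being updated.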
For more, see GitHub:
https://github.com/Edgis/Machine-learning-in-action/blob/master/regression.py