Logistic Regression
Source: 程序博客网 (reposted from the Internet) | 2024/06/13
1. Choosing the sigmoid function for binary classification
For a binary classification problem we want an output between 0 and 1 that can be read as a probability, so we choose the sigmoid function: σ(z) = 1 / (1 + e^(-z)). If we extend this function to a multidimensional input and add a parameter vector w, it becomes: h_w(x) = σ(wᵀx) = 1 / (1 + e^(-wᵀx)).
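A quick numeric check of the sigmoid (a minimal sketch using NumPy; the function name here is illustrative):

```python
import numpy as np

def sigmoid(z):
    """Map any real input to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))                       # 0.5 exactly: the decision boundary
print(sigmoid(np.array([-6.0, 6.0])))   # close to 0 and close to 1 at the tails
```

Note that σ(0) = 0.5, which is why 0.5 is the natural classification threshold later on.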
Fitting w is a maximization problem (maximizing the likelihood of the observed labels), so this time we use gradient ascent (gradient ascent finds a function's maximum, while gradient descent finds its minimum). The idea behind gradient ascent is: to find the maximum of a function, search along the direction of its gradient, since the gradient always points in the direction of fastest increase:

w := w + α ∇_w f(w)

where α is the step size of each ascent move and ∇_w f(w) is the gradient of f(w).
Next, to maximize the log-likelihood, we plug its gradient into this update rule; for logistic regression the batch update works out to w := w + α Xᵀ(y − h), which is what the code below implements.
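The update rule can be illustrated on a one-dimensional toy function first (not the logistic likelihood itself; the function f, the step size, and the iteration count below are arbitrary choices for illustration):

```python
# Toy gradient ascent: maximize f(w) = -(w - 3)**2, whose gradient is -2*(w - 3).
def grad_ascent_1d(alpha=0.1, iters=100):
    w = 0.0
    for _ in range(iters):
        grad = -2.0 * (w - 3.0)    # derivative of f at the current w
        w = w + alpha * grad       # step along the gradient (uphill)
    return w

print(round(grad_ascent_1d(), 4))  # converges to the maximizer w = 3.0
```

Each step shrinks the distance to the maximizer by a constant factor (here 0.8), so the iterate converges to w = 3 geometrically.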
from numpy import *

def loaddata():  # load the toy dataset: two features plus a label per line
    datamat = []; labels = []
    fr = open('D:/DATA/python/机器学习/machinelearninginaction/Ch05/testSet.txt')
    for line in fr.readlines():
        linearr = line.strip().split()
        datamat.append([1.0, float(linearr[0]), float(linearr[1])])  # prepend 1.0 for the bias term
        labels.append(float(linearr[2]))
    return datamat, labels

def sigmoid(inx):  # the sigmoid function
    return 1.0 / (1 + exp(-inx))

def gradascen(datamatin, classlabel):  # batch gradient ascent
    datamatrix = mat(datamatin)
    labelmat = mat(classlabel).transpose()
    m, n = shape(datamatrix)
    weight = ones((n, 1))
    maxcy = 500      # number of iterations
    alpha = 0.001    # step size
    for k in range(maxcy):
        h = sigmoid(datamatrix * weight)   # predicted probabilities for all samples
        error = labelmat - h               # actual minus predicted
        weight = weight + alpha * datamatrix.transpose() * error
    return weight

def stocgradscen(datamatin, classlabel):  # stochastic gradient ascent, single pass
    m, n = shape(datamatin)
    weight = ones(n)
    alpha = 0.001
    for i in range(m):
        h = sigmoid(sum(datamatin[i] * weight))  # predicted probability for one sample
        error = classlabel[i] - h
        weight = weight + alpha * error * datamatin[i]
    return weight

def stocgradscen1(datamatin, classlabel, numlter=150):  # improved stochastic gradient ascent
    m, n = shape(datamatin)
    weight = ones(n)
    for j in range(numlter):
        dataindex = list(range(m))
        for i in range(m):
            alpha = 4 / (1.0 + i + j) + 0.0001    # step size decays over iterations
            randindex = int(random.uniform(0, len(dataindex)))
            sample = dataindex[randindex]         # index through dataindex so each pass samples without replacement
            h = sigmoid(sum(datamatin[sample] * weight))  # predicted label
            error = classlabel[sample] - h                # actual minus predicted
            weight = weight + alpha * error * datamatin[sample]
            del dataindex[randindex]
    return weight

def classifyvector(inx, weight):  # classify a single example
    prob = sigmoid(sum(inx * weight))
    return 1.0 if prob > 0.5 else 0.0

def colictest():  # full example: the horse-colic dataset
    frtrain = open('D:/DATA/python/机器学习/machinelearninginaction/Ch05/horseColicTraining.txt')
    frtest = open('D:/DATA/python/机器学习/machinelearninginaction/Ch05/horseColicTest.txt')
    traindata = []; trainlabel = []
    for line in frtrain.readlines():
        curline = line.strip().split('\t')
        linarr = [float(curline[i]) for i in range(21)]  # 21 features per line
        traindata.append(linarr)
        trainlabel.append(float(curline[21]))
    weight = stocgradscen1(array(traindata), trainlabel, 1000)  # fit the weights
    numtest = 0.0; errorcount = 0.0
    for line in frtest.readlines():
        numtest += 1  # count test samples
        curline = line.strip().split('\t')
        linearr1 = [float(curline[i]) for i in range(21)]
        if int(classifyvector(array(linearr1), weight)) != int(curline[21]):
            errorcount += 1
    errorrate = errorcount / numtest
    print('the error rate is: %f' % errorrate)
    return errorrate  # return the rate, not the raw count, so mulitest can average it

def mulitest():  # average the error rate over several runs to see whether it is stable
    numruns = 10
    errorsum = 0.0
    for i in range(numruns):
        errorsum += colictest()
    print('after %d iterations the average error rate is: %f' % (numruns, errorsum / numruns))
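To see the batch update w := w + α Xᵀ(y − h) working end to end without the book's data files, here is a self-contained sketch on a tiny synthetic dataset (the data, step size, and iteration count are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_ascent(X, y, alpha=0.1, iters=1000):
    """Batch gradient ascent on the log-likelihood: w += alpha * X^T (y - h)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        h = sigmoid(X @ w)          # predicted probabilities
        w += alpha * X.T @ (y - h)  # move uphill on the likelihood
    return w

# Tiny linearly separable set: label is 1 when the feature is positive.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])  # bias column first
y = np.array([0.0, 0.0, 1.0, 1.0])
w = grad_ascent(X, y)
preds = (sigmoid(X @ w) > 0.5).astype(int)
print(preds)  # [0 0 1 1]
```

Because the data are separable, the weight on the feature keeps growing toward the separating direction, and all four training points are classified correctly.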