Machine Learning in Action: Notes on Stochastic Gradient Ascent for Logistic Regression


This week I worked through Chapter 5 of Machine Learning in Action (logistic regression) alongside the book. Below are the problems I ran into with stochastic gradient ascent, along with some observations.
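All of the snippets below lean on helpers from the book's logRegres.py (loadDataSet, sigmoid, plotBestFit) together with from numpy import *. For completeness, here is a minimal sketch of those helpers, assuming the book's testSet.txt format of two features plus a 0/1 label per line:

from numpy import *
import matplotlib.pyplot as plt

def loadDataSet():
    # testSet.txt: "x1<TAB>x2<TAB>label" per line; prepend 1.0 as a bias term
    dataMat, labelMat = [], []
    for line in open('testSet.txt'):
        parts = line.strip().split()
        dataMat.append([1.0, float(parts[0]), float(parts[1])])
        labelMat.append(int(parts[2]))
    return dataMat, labelMat

def sigmoid(inX):
    return 1.0 / (1.0 + exp(-inX))

def plotBestFit(weights):
    # scatter the two classes and draw the boundary w0 + w1*x + w2*y = 0
    dataMat, labelMat = loadDataSet()
    dataArr = array(dataMat)
    for i in range(shape(dataArr)[0]):
        color = 'red' if labelMat[i] == 1 else 'green'
        plt.scatter(dataArr[i, 1], dataArr[i, 2], c=color)
    x = arange(-3.0, 3.0, 0.1)
    y = (-weights[0] - weights[1] * x) / weights[2]
    plt.plot(x, y)
    plt.show()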

Method 1: random draws that may repeat, with many passes
def stocGradAscent1(dataMat, labels, numIter=150):
    m, n = shape(dataMat)
    weights = ones(n)
    for j in range(numIter):
        dataIndex = list(range(m))
        for i in range(m):
            # step size decays with the iteration count but never reaches zero
            alpha = 4 / (1.0 + i + j) + 0.0001
            randomIndex = int(random.uniform(0, len(dataIndex)))
            # note: dataMat is indexed with randomIndex itself, not
            # dataIndex[randomIndex], so the same sample can be drawn
            # more than once within a single pass
            h = sigmoid(sum(dataMat[randomIndex] * weights))
            error = labels[randomIndex] - h
            weights = weights + alpha * error * dataMat[randomIndex]
            del(dataIndex[randomIndex])
    return weights

m, labels = loadDataSet()
#wei = gradAscent(m, labels)
#wei = stocGradAscent0(array(m), labels)
wei = stocGradAscent1(array(m), labels)
plotBestFit(wei)
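Why "random but with repetition"? randomIndex is drawn from a range that shrinks as entries are deleted from dataIndex, but dataMat is then indexed with randomIndex itself. Low-numbered samples can therefore be drawn several times in one pass, while high-numbered ones may never be seen. A quick counting sketch (illustration only, using the stdlib random module rather than numpy's):

import random as pyrandom

m = 100
counts = [0] * m
dataIndex = list(range(m))
for i in range(m):
    randomIndex = int(pyrandom.uniform(0, len(dataIndex)))
    counts[randomIndex] += 1      # Method 1 uses randomIndex directly
    del dataIndex[randomIndex]    # the deletion only shrinks the draw range

print(sum(1 for c in counts if c > 1), "samples drawn more than once")
print(sum(1 for c in counts if c == 0), "samples never drawn")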




Method 2: traverse the samples one by one in order, a single pass
def stocGradAscent0(dataMat, labels):
    m, n = shape(dataMat)
    weights = ones(n)
    alpha = 0.01
    #for j in range(150):
    #    li = list(range(m))
    #    random.shuffle(li)
    # simply traverse the whole sample set once, one sample at a time
    for i in range(m):
        print(i)
        #alpha = 4 / (1.0 + i + j) + 0.0001
        h = sigmoid(sum(dataMat[i] * weights))
        error = labels[i] - h
        weights = weights + alpha * error * dataMat[i]
    return weights



Because this makes only a single pass over the data, the resulting separating boundary is not ideal.
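To put a number on "not ideal", a small helper (my addition, not from the book) can measure the fraction of training samples the learned weights misclassify:

def errorRate(dataArr, labels, weights):
    # classify with the 0.5 sigmoid threshold and count mistakes
    m = shape(dataArr)[0]
    wrong = 0
    for i in range(m):
        pred = 1 if sigmoid(sum(dataArr[i] * weights)) > 0.5 else 0
        if pred != labels[i]:
            wrong += 1
    return wrong / float(m)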
Method 3: traverse every sample once per pass (in shuffled order), with many passes
def stocGradAscent0(dataMat, labels):
    m, n = shape(dataMat)
    weights = ones(n)
    alpha = 0.01
    for j in range(150):
        # reshuffle the index list so each pass visits every sample exactly once
        li = list(range(m))
        random.shuffle(li)
        for i in li:
            print(i)
            #alpha = 4 / (1.0 + i + j) + 0.0001
            h = sigmoid(sum(dataMat[i] * weights))
            error = labels[i] - h
            weights = weights + alpha * error * dataMat[i]
    return weights
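The shuffle-then-traverse pattern can be written more compactly with numpy's random.permutation. A sketch of the same idea (my rewrite, not the book's code; the name stocGradAscentShuffled is mine):

def stocGradAscentShuffled(dataMat, labels, numIter=150):
    m, n = shape(dataMat)
    weights = ones(n)
    alpha = 0.01
    for j in range(numIter):
        for i in random.permutation(m):   # one full pass in random order
            h = sigmoid(sum(dataMat[i] * weights))
            error = labels[i] - h
            weights = weights + alpha * error * dataMat[i]
    return weights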



The result is essentially a correct split: a few points land on the wrong side, but the two classes are clearly separated overall.
Method 4: random draws without repetition, with many passes:
def stocGradAscent1(dataMat, labels, numIter=150):
    m, n = shape(dataMat)
    weights = ones(n)
    alpha = 0.01
    for j in range(numIter):
        dataIndex = list(range(m))
        for i in range(m):
            #alpha = 4/(1.0+i+j)+0.0001
            randomIndex = int(random.uniform(0, len(dataIndex)))
            # dereference through dataIndex so each sample is used exactly
            # once per pass
            h = sigmoid(sum(dataMat[dataIndex[randomIndex]] * weights))
            #print("dataIndex:", dataIndex[randomIndex], "random", randomIndex)
            error = labels[dataIndex[randomIndex]] - h
            weights = weights + alpha * error * dataMat[dataIndex[randomIndex]]
            #plotBestFit(weights)
            del(dataIndex[randomIndex])
    return weights
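The only difference from Method 1 is the indirection through dataIndex. Repeating the earlier counting sketch with that indirection shows every sample drawn exactly once per pass:

import random as pyrandom  # stdlib random, for the illustration only

m = 100
counts = [0] * m
dataIndex = list(range(m))
for i in range(m):
    randomIndex = int(pyrandom.uniform(0, len(dataIndex)))
    counts[dataIndex[randomIndex]] += 1   # dereference before counting
    del dataIndex[randomIndex]

print(all(c == 1 for c in counts))   # True: no repeats, no omissions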



The result is about the same as above.
The figure below shows the result after switching to the dynamic step size alpha = 4/(1.0+i+j)+0.0001:

The change is not very noticeable.
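That is consistent with how the dynamic step size behaves: it drops from about 4 at the very first sample to a few hundredths later on, the same order of magnitude as the fixed alpha = 0.01 used above, while the 0.0001 term only keeps it from ever reaching zero. A quick printout of its values:

# sample the dynamic step size at a few (pass, sample) positions
for j in [0, 10, 149]:
    for i in [0, 50, 99]:
        print("j=%d  i=%d  alpha=%.4f" % (j, i, 4 / (1.0 + i + j) + 0.0001))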
Conclusion: for this sample set, the number of passes is the single most effective factor for improving the separation. Whether samples are drawn randomly with repetition, randomly without repetition, or visited in order with more passes, once the number of passes is large enough the final results are essentially the same. Extending this to other sample sets, I believe the key to a correct split is making enough passes over the data; random sampling and a dynamic alpha do make the weights settle down with smaller oscillations late in training, but as far as the final separation goes, their influence is marginal.
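To check this conclusion numerically rather than by eye, a small harness can compare the variants. It assumes the errorRate helper sketched earlier, plus hypothetical renames to avoid the stocGradAscent0 name clash (stocGradAscentOnce for Method 2's single-pass version, stocGradAscentShuffled from the permutation sketch):

m, labels = loadDataSet()
dataArr = array(m)

# the first two names are hypothetical renames of the functions above
for name, weights in [
        ("1 pass, in order", stocGradAscentOnce(dataArr, labels)),
        ("150 passes, shuffled", stocGradAscentShuffled(dataArr, labels)),
        ("150 passes, random no-repeat", stocGradAscent1(dataArr, labels)),
]:
    print(name, "-> training error rate:", errorRate(dataArr, labels, weights))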
