DNN-based semi-supervised learning
Source: Internet  Editor: 程序博客网  Time: 2024/05/17 07:43
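Semi-supervised learning uses the part of a data set that has labels to learn labels for the rest, which has none.
This post uses self-training, one of the low-density separation methods. Low-density separation classifies on a "black-or-white" basis: every sample is assumed to belong entirely to one class, so the decision boundary should fall where the data are sparse. The model fitted to the labelled data is a DNN. The overall procedure is:
repeat:
1) Train the DNN on the labelled data to obtain a model f*
2) Feed the unlabelled data into f* to get the predictions u
3) Take the samples in u that meet the hard-label condition and add them to the training data to update f*
until every unlabelled sample has been given a label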
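The steps above can be sketched as a plain loop. This is a minimal illustration, not the post's implementation: a nearest-centroid classifier stands in for the DNN so that the example runs with NumPy alone, and the names `fit_centroids`, `predict_proba`, `self_train`, the `max_rounds` cap and the stop-on-no-progress rule are all assumptions added here. It also assumes every class has at least one labelled sample.

```python
import numpy as np


def fit_centroids(X, y, n_classes):
    """Stand-in for step 1 (train f*): one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])


def predict_proba(centroids, X):
    """Stand-in for step 2 (predict u): softmax over negative squared distances."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def self_train(X_lab, y_lab, X_unl, n_classes, threshold=0.7, max_rounds=10):
    """Repeat train -> predict -> absorb confident samples until none remain."""
    X_lab, y_lab, X_unl = X_lab.copy(), y_lab.copy(), X_unl.copy()
    centroids = fit_centroids(X_lab, y_lab, n_classes)            # step 1
    for _ in range(max_rounds):
        if len(X_unl) == 0:
            break                                                 # everything is labelled
        probs = predict_proba(centroids, X_unl)                   # step 2
        confident = probs.max(axis=1) >= threshold                # step 3: hard-label condition
        if not confident.any():
            break                                                 # no progress, stop early
        X_lab = np.vstack((X_lab, X_unl[confident]))
        y_lab = np.concatenate((y_lab, probs[confident].argmax(axis=1)))
        X_unl = X_unl[~confident]
        centroids = fit_centroids(X_lab, y_lab, n_classes)        # update f*
    return centroids, X_lab, y_lab, X_unl


# Toy usage: two well-separated clusters, one labelled point per class.
X_lab = np.array([[0.0, 0.0], [10.0, 10.0]])
y_lab = np.array([0, 1])
X_unl = np.array([[0.5, 0.2], [9.5, 10.1], [0.1, -0.3], [10.2, 9.8]])
_, X_out, y_out, X_rest = self_train(X_lab, y_lab, X_unl, n_classes=2)
print(y_out, len(X_rest))
```

Swapping the two stand-in functions for a network's fit and predict calls gives the same control flow with a DNN as f*.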
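Note: predictions must be converted to hard labels before they are fed back into the model; for example, a prediction of [0.7,0.3] becomes [1,0]. The experiment again uses the Iris data set; the full code follows.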
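Before the full listing, the hard-label conversion in the note can be sketched in vectorised NumPy. The function name `hard_label` and its return shape are illustrative assumptions; the 0.7 default matches the cut-off used in the listing.

```python
import numpy as np


def hard_label(probs, threshold=0.7):
    """Turn soft predictions into one-hot rows where the top class is confident.

    Returns (one_hot_rows_for_confident_samples, boolean_mask_of_confident_samples).
    """
    probs = np.asarray(probs, dtype=float)
    confident = probs.max(axis=1) >= threshold
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(probs)), probs.argmax(axis=1)] = 1.0
    return one_hot[confident], confident


labels, mask = hard_label([[0.7, 0.3], [0.6, 0.4]])
print(labels)  # [[1. 0.]]
print(mask)    # [ True False]
```

Only the rows selected by the mask are added to the training set; the others stay unlabelled for the next round.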
    # -*- coding: utf-8 -*-
    """
    Created on Mon Sep 25 10:22:49 2017
    @author: wjw
    """
    # DNN-based semi-supervised learning
    import keras
    import numpy as np
    from keras.layers import Dense, Activation  # import the layer classes directly


    def readText(filePath):
        data = []
        dataClass = []
        with open(filePath, 'r') as f:
            for line in f:
                line = line.strip()
                if not line:  # skip blank lines
                    continue
                dataList = line.split(',')
                data.append([float(v) for v in dataList[:4]])
                dataClass.append(dataList[4])
        new_class = []
        for name in dataClass:
            if name == "Iris-setosa":
                new_class.append(0)
            elif name == "Iris-versicolor":
                new_class.append(1)
            else:
                new_class.append(2)
        return np.array(data), np.array(new_class)


    class network:
        def createDnn(self):
            model = keras.models.Sequential()  # initialise a neural network
            model.add(Dense(units=10, input_dim=4))  # Dense is a fully connected layer
            model.add(Activation("sigmoid"))
            model.add(Dense(units=10))
            model.add(Activation("sigmoid"))
            model.add(Dense(units=3))
            model.add(Activation("softmax"))
            # "categorical_crossentropy" defines the loss as cross-entropy;
            # the optimiser is "adam"
            model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
            return model


    def low_density_seperation(classes):
        # Rows whose top probability is at least 0.7 are turned into one-hot rows (in place)
        for class_line in classes:
            if max(class_line) >= 0.7:
                max_index = np.argmax(class_line)
                class_line[max_index] = 1
                class_line[list(filter(lambda x: x != max_index, range(class_line.size)))] = 0
        return classes


    def getTraindata():
        filePath = r"E:\学习\研究生\李宏毅\data\iris.txt"
        data, Class = readText(filePath)
        # randint samples with replacement, so train_index can contain duplicates;
        # np.random.choice(n, size, replace=False) would avoid that
        train_index = np.random.randint(0, data.shape[0], size=int(data.shape[0] * 0.2))
        traindata = data[train_index]
        dataClass = Class[train_index]
        # filter works like map; here it drops the samples that already have labels
        rest_index = list(filter(lambda x: x not in train_index, range(data.shape[0])))
        restdata = data[rest_index]
        restClass = Class[rest_index]
        return traindata, dataClass, restdata, restClass


    def train(model, traindata, dataClass, restdata):
        if len(restdata) == 0:
            return
        # traindata is the input and dataClass the target output; each column maps to one neuron
        model.fit(traindata, dataClass, batch_size=15, epochs=1000)
        testClasses = model.predict(restdata)
        testClasses = low_density_seperation(testClasses)
        low_density_index = list(filter(lambda x: max(testClasses[x]) == 1, range(len(testClasses))))
        if not low_density_index:
            return  # no confident prediction this round; stop instead of recursing forever
        traindata = np.vstack((traindata, restdata[low_density_index]))  # np.vstack stacks two arrays vertically
        dataClass = np.vstack((dataClass, testClasses[low_density_index]))  # (np.hstack would stack them horizontally)
        rest_index = list(filter(lambda x: x not in low_density_index, range(len(restdata))))
        if not rest_index:  # the original compared the list with '', which is never true
            return
        restdata = restdata[rest_index]
        train(model, traindata, dataClass, restdata)


    if __name__ == '__main__':
        traindata, dataClass, restdata, restClass = getTraindata()
        # convert to one-hot encoding, e.g. with three classes 2 becomes [0,0,1];
        # num_classes=3 keeps the width fixed even if a class is missing from the sample
        dataClass = keras.utils.to_categorical(dataClass, num_classes=3)
        restClass = keras.utils.to_categorical(restClass, num_classes=3)
        model = network().createDnn()
        train(model, traindata, dataClass, restdata)
        score = model.evaluate(np.vstack((traindata, restdata)), np.vstack((dataClass, restClass)), batch_size=20)
        print(score)

About three rounds of recursion are needed. The resulting score, printed as [loss, accuracy], is:
    20/155 [==>...........................] - ETA: 2s
    [0.22960705333980513, 0.94838709792783182]
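For reference, `model.evaluate` returns the loss first and then each metric passed to `compile`, so the printed pair is the categorical cross-entropy followed by the accuracy. The NumPy sketch below shows how those two quantities are computed from one-hot targets and predicted probabilities; the function names and the `eps` clipping value are illustrative assumptions, not Keras internals.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)  # avoid log(0)
    return float(-(y_true * np.log(y_pred)).sum(axis=1).mean())


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the one-hot target."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true.argmax(axis=1) == y_pred.argmax(axis=1)).mean())


y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
print([categorical_crossentropy(y_true, y_pred), accuracy(y_true, y_pred)])
```

A lower first number and a higher second number both indicate a better fit; the first is unbounded above, while the second lies between 0 and 1.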