Machine Learning: Smile Recognition with scikit-learn (SVM, KNN, Logistic Regression)


scikit-learn is a very powerful machine learning package for Python. This post introduces a few commonly used classifiers in scikit-learn (SVM, KNN, and logistic regression) and applies them to smile recognition.
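
All three classifiers expose the same fit/predict estimator interface, which is why the script later in this post can switch between them by swapping a single line. Here is a minimal sketch of that interface; the toy data is randomly generated purely for illustration and has nothing to do with the GENKI4K experiment:

import numpy as np
from sklearn import neighbors, linear_model, svm

# Toy data purely for illustration: 100 samples, 10 features, binary labels
X = np.random.rand(100, 10)
y = np.random.randint(0, 2, 100)

# The three classifiers compared in this post, all with fit()/predict()
classifiers = {
    'SVM': svm.LinearSVC(),
    'KNN': neighbors.KNeighborsClassifier(n_neighbors=15),
    'Logistic regression': linear_model.LogisticRegression(),
}

for name, clf in classifiers.items():
    clf.fit(X, y)          # train on the toy data
    pred = clf.predict(X)  # predict on the training data (demo only)
    print(name, np.mean(pred == y))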

The dataset used here is the GENKI4K database. Each image first goes through face detection and cropping, and HOG features are then extracted from the cropped face. The database contains 4,000 images, which are split into 4 folds for cross validation; the average accuracy over the folds is taken as the final recognition rate.
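
The face detection and HOG extraction step is not shown in the original script, which loads precomputed features from .mat files. For illustration only, here is a minimal sketch of that preprocessing, assuming OpenCV's Haar cascade for face detection and skimage.feature.hog for the descriptor; the 64x64 crop size and the HOG parameters are assumptions, not necessarily the settings behind the results reported below:

import cv2
from skimage.feature import hog

def extract_face_hog(image_path):
    # Detect the largest face, crop it, and compute a HOG descriptor.
    # The cascade file, crop size and HOG parameters here are illustrative
    # assumptions, not the exact settings used for the reported results.
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face found; skip this image
    # Keep the largest detection and resize the crop to a fixed size
    x, y, w, h = max(faces, key=lambda r: r[2] * r[3])
    face = cv2.resize(gray[y:y + h, x:x + w], (64, 64))
    return hog(face, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

The full training and evaluation script below loads the precomputed HOG features, labels, and 4-fold split indices from .mat files, then reports the per-fold and mean accuracy: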

import os

import numpy as np
import scipy.io
from sklearn import neighbors, linear_model, svm

data_dir = '/GENKI4K/Feature_Data'

print('----------- no sub dir')

# Prepare the data: list the feature/label/split files in the directory
files = os.listdir(data_dir)
for f in files:
    print(data_dir + os.sep + f)

# HOG features (one row per image)
file_path = data_dir + os.sep + files[14]
dic_mat = scipy.io.loadmat(file_path)
data_mat = dic_mat['Hog_Feat']
print('feature: ', data_mat.shape)

# Smile / non-smile labels
file_path2 = data_dir + os.sep + files[15]
dic_label = scipy.io.loadmat(file_path2)
label_mat = dic_label['Label']

# Predefined 4-fold split indices (MATLAB 1-based, so subtract 1)
file_path3 = data_dir + os.sep + files[16]
print('file 3 path: ', file_path3)
dic_T = scipy.io.loadmat(file_path3)
T = dic_T['T']
T = T - 1
print(T.shape)

label = label_mat.ravel()

Acc = [0, 0, 0, 0]
for i in range(0, 4):
    print("the fold %d" % (i + 1))
    # Fold i is the test set; the other three folds form the training set
    train_ind = []
    for j in range(0, 4):
        if j == i:
            test_ind = T[j]
        else:
            train_ind.extend(T[j])

    train_x = data_mat[train_ind, :]
    test_x = data_mat[test_ind, :]
    train_y = label[train_ind]
    test_y = label[test_ind]

    # SVM
    clf = svm.LinearSVC()
    # KNN
    # clf = neighbors.KNeighborsClassifier(n_neighbors=15)
    # Logistic regression
    # clf = linear_model.LogisticRegression()

    clf.fit(train_x, train_y)
    predict_y = clf.predict(test_x)
    Acc[i] = np.mean(predict_y == test_y)
    print("Accuracy: %.2f" % Acc[i])

print("The mean average classification accuracy: %.2f" % np.mean(Acc))

# SVM results
(4, 1000)
the fold 1
Accuracy: 0.89
the fold 2
Accuracy: 0.88
the fold 3
Accuracy: 0.89
the fold 4
Accuracy: 0.90
The mean average classification accuracy: 0.89

# KNN results
(4, 1000)
the fold 1
Accuracy: 0.83
the fold 2
Accuracy: 0.84
the fold 3
Accuracy: 0.84
the fold 4
Accuracy: 0.85
The mean average classification accuracy: 0.84

# Logistic regression results
(4, 1000)
the fold 1
Accuracy: 0.91
the fold 2
Accuracy: 0.91
the fold 3
Accuracy: 0.90
the fold 4
Accuracy: 0.92
The mean average classification accuracy: 0.91
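
The manual fold loop above can also be written with scikit-learn's own cross-validation helper. Here is a sketch that reuses the data_mat, label and T arrays loaded in the script above; with a predefined split array such as T, the cv argument can be given as an iterable of (train, test) index pairs:

import numpy as np
from sklearn import svm
from sklearn.model_selection import cross_val_score

def predefined_folds(T):
    # Yield (train_indices, test_indices) pairs from the 4 x 1000 split array
    for i in range(len(T)):
        test_ind = T[i]
        train_ind = np.concatenate([T[j] for j in range(len(T)) if j != i])
        yield train_ind, test_ind

# data_mat, label and T are the arrays loaded in the main script above
scores = cross_val_score(svm.LinearSVC(), data_mat, label,
                         cv=list(predefined_folds(T)))
print("Accuracy per fold:", scores)
print("Mean accuracy: %.2f" % scores.mean())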