Studying the Machine Learning Basics Series

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 21:09

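I need to use neural networks, which require some machine learning background, so I looked online for videos to study from. Over the past few days I have been watching Peng Liang's (彭亮) videos, which come in a basics series and an advanced series. The content is substantial, and each piece of theory is followed by a matching coding exercise, which helps in understanding and mastering the algorithms. Working through the exercises gave me a real sense of how powerful Python is.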

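The experiment environment is Anaconda + Eclipse. Both can be downloaded directly from their official sites, and installation is fairly simple.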
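Before running the SVM code further down, it can help to confirm that the environment actually provides the packages the script imports. The following is only a minimal sketch using the standard library; the `REQUIRED` list and the `find_missing` helper are my own illustrative names, not something from the videos.

```python
import importlib.util

# Top-level packages imported by the SVM script in this post.
# This list is an assumption; adjust it to whatever your own code imports.
REQUIRED = ["numpy", "matplotlib", "sklearn"]


def find_missing(packages):
    """Return the names of the packages that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If anything is reported missing, installing it through Anaconda's package manager is usually the simplest fix.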

Video resources:
Basics series videos: https://pan.baidu.com/s/1bLlwNc
Advanced series videos: https://pan.baidu.com/s/1gfhvOMv

Overall framework:
[Figure: overall framework]

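If watching the videos feels too slow, I found a blog whose notes match the video content: http://blog.csdn.net/ewfwewef/article/category/6458216. It is missing the code for the SVM application (part 2), however, so I am posting that code below. I hope we can learn together.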

Code for the SVM application (part 2):

from __future__ import print_function

from time import time
import logging

import matplotlib.pyplot as plt

# Imports updated for recent scikit-learn releases (an assumption about your
# installed version): sklearn.cross_validation and sklearn.grid_search were
# merged into sklearn.model_selection, and RandomizedPCA was replaced by
# PCA(svd_solver='randomized').
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.datasets import fetch_lfw_people
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.decomposition import PCA
from sklearn.svm import SVC

print(__doc__)

# Show progress logs on stdout
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')

# Download the LFW faces (if not already cached) and load them as numpy arrays
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

# Image dimensions, needed later for plotting
n_samples, h, w = lfw_people.images.shape

# Feature matrix: one flattened image per row
X = lfw_people.data
n_features = X.shape[1]

# Labels: the id of the person in each image
y = lfw_people.target
target_names = lfw_people.target_names
n_classes = target_names.shape[0]

print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)

# Split into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Reduce dimensionality with PCA (eigenfaces)
n_components = 150

print("Extracting the top %d eigenfaces from %d faces"
      % (n_components, X_train.shape[0]))
t0 = time()
pca = PCA(n_components=n_components, svd_solver='randomized',
          whiten=True).fit(X_train)
print("done in %0.3fs" % (time() - t0))

eigenfaces = pca.components_.reshape((n_components, h, w))

print("Projecting the input data on the eigenfaces orthonormal basis")
t0 = time()
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
print("done in %0.3fs" % (time() - t0))

# Train an RBF-kernel SVM, searching over C and gamma
print("Fitting the classifier to the training set")
t0 = time()
param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5],
              'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}
clf = GridSearchCV(SVC(kernel='rbf', class_weight='balanced'), param_grid)
clf = clf.fit(X_train_pca, y_train)
print("done in %0.3fs" % (time() - t0))
print("Best estimator found by grid search:")
print(clf.best_estimator_)

# Evaluate on the test set
print("Predicting people's names on the test set")
t0 = time()
y_pred = clf.predict(X_test_pca)
print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=list(range(n_classes))))


def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    """Plot a gallery of portraits, n_row * n_col images in total."""
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())


def title(y_pred, y_test, target_names, i):
    """Build a 'predicted / true' caption from the surnames for sample i."""
    pred_name = target_names[y_pred[i]].rsplit(' ', 1)[-1]
    true_name = target_names[y_test[i]].rsplit(' ', 1)[-1]
    return 'predicted: %s\ntrue:      %s' % (pred_name, true_name)


# Plot the predictions for part of the test set
prediction_titles = [title(y_pred, y_test, target_names, i)
                     for i in range(y_pred.shape[0])]
plot_gallery(X_test, prediction_titles, h, w)

# Plot the most significant eigenfaces
eigenfaces_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenfaces_titles, h, w)

plt.show()
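The script above prints a classification report and a confusion matrix, and these can be hard to read at first. As a rough sketch of what the numbers mean, the pure-Python example below counts a confusion matrix by hand and derives precision and recall from it. The helper names (`count_confusions`, `precision_recall`) and the toy labels are made up for illustration and are not part of scikit-learn or of the course material. It follows scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
def count_confusions(y_true, y_pred, n_classes):
    """Build a matrix where entry [t][p] counts samples of true class t
    that were predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def precision_recall(matrix, k):
    """Return (precision, recall) for class k from a confusion matrix."""
    true_positives = matrix[k][k]
    predicted_as_k = sum(row[k] for row in matrix)  # column sum
    actually_k = sum(matrix[k])                     # row sum
    precision = true_positives / predicted_as_k if predicted_as_k else 0.0
    recall = true_positives / actually_k if actually_k else 0.0
    return precision, recall


# Hypothetical toy labels for three classes (not real LFW results)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = count_confusions(y_true, y_pred, n_classes=3)
for row in cm:
    print(row)

for k in range(3):
    p, r = precision_recall(cm, k)
    print("class %d: precision=%.2f recall=%.2f" % (k, p, r))
```

In this toy case every sample of class 1 is found (recall 1.0), but one class 0 sample is also labelled as class 1, so the precision for class 1 drops to 2/3.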
