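You can look up the matplotlib.pyplot functions here. Presenting results clearly matters too, and matplotlib handles that job well. On the page linked above, the Quick Search box on the right-hand side is very handy.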



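import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

xx, yy = np.meshgrid(np.linspace(-3, 3, 500),
                     np.linspace(-3, 3, 500))
# linspace generates an evenly spaced ndarray: the first two arguments are the
# interval endpoints and the third is the number of elements.
# arange is similar, except that its third argument is the step size.
# meshgrid generates coordinates; the examples mostly use it for plotting, so
# only the 2-D case is considered here.
# meshgrid(a, b) returns xx, yy: every row of xx is the vector a, repeated
# len(b) times; every column of yy is the vector b, repeated len(a) times.

np.random.seed(0)
X = np.random.randn(300, 2)
# randn draws samples from the standard normal distribution; the two
# arguments give the shape of the resulting matrix.
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)  # element-wise XOR

# fit the model
clf = svm.NuSVC()  # create a NuSVC estimator; it is not trained yet
clf.fit(X, Y)      # train on the data; most estimators have a fit method

# plot the decision function for each datapoint on the grid
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
# ravel() flattens a multi-dimensional array into a 1-D array. c_ joins the
# elements at the same position of two ndarrays, which here assembles the
# (x, y) coordinates. Example:
# >>> np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]
# array([[1, 4],
#        [2, 5],
#        [3, 6]])
# Note that c_ is followed directly by [].
# decision_function returns the signed distance from each point to the
# hyperplane.
# Finally, reshape(shape) turns the 1-D array back into the given shape.

plt.imshow(Z, interpolation='nearest',
           extent=(xx.min(), xx.max(), yy.min(), yy.max()), aspect='auto',
           origin='lower', cmap=plt.cm.PuOr_r)
# draw the image; cmap selects the color scheme
contours = plt.contour(xx, yy, Z, levels=[0], linewidths=2,
                       linestyles='dashed')
# contour draws contour lines. About levels, the API says: "A list of floating
# point numbers indicating the level curves to draw, in increasing order;
# e.g., to draw just the zero contour pass levels=[0]".
# As I understand it, this value is the right-hand side of wx+b=0 once the SVM
# is fitted: 0 is the separating hyperplane, and +1 and -1 are the planes
# through the support vectors. My experiments suggest this is correct.
plt.scatter(X[:, 0], X[:, 1], s=30, c=Y, cmap=plt.cm.Paired,
            edgecolors='k')
# scatter plot. I did not fully follow the API explanation of c ("c can be a
# single color format string, or a sequence of color specifications of length
# N, or a sequence of N numbers to be mapped to colors using the cmap and norm
# specified via kwargs"), but experiments show that passing the labels Y as c
# colors the points of each class differently.
plt.xticks(())
plt.yticks(())
plt.axis([-3, 3, -3, 3])
plt.show()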
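The array helpers used in the listing above are easier to follow on tiny inputs. The following is a minimal sketch; the arrays `xs`, `ys` and the per-point `values` are made up purely for illustration.

```python
import numpy as np

# linspace takes the number of points; arange takes the step size.
a = np.linspace(0, 1, 5)      # 5 evenly spaced points, endpoint included
b = np.arange(0, 1, 0.25)     # step 0.25, endpoint excluded

# meshgrid repeats the first vector along rows and the second along columns.
xs = np.array([1, 2, 3])
ys = np.array([10, 20])
xx, yy = np.meshgrid(xs, ys)  # both have shape (len(ys), len(xs))

# ravel flattens to 1-D; np.c_ pairs the flattened values up as (x, y) rows.
points = np.c_[xx.ravel(), yy.ravel()]

# reshape restores the grid layout from a flat array of per-point values.
values = points.sum(axis=1).reshape(xx.shape)

print(points)
print(values)
```

This is the same pattern the plot uses: build a grid, flatten it into a list of points for decision_function, then reshape the result back to the grid.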


class sklearn.svm.NuSVC(nu=0.5, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)
Nu-Support Vector Classification.
Similar to SVC but uses a parameter to control the number of support vectors.
The implementation is based on libsvm.
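The parameter that controls the number of support vectors is nu. As a rough sketch of its effect, the snippet below refits NuSVC on the same XOR data with a few nu values and reports the fraction of training points that end up as support vectors. The particular nu values and gamma='auto' are choices made for this illustration.

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(0)
X = rng.randn(300, 2)
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

fractions = {}
for nu in (0.1, 0.3, 0.5):
    clf = svm.NuSVC(nu=nu, gamma='auto').fit(X, Y)
    # support_ holds the indices of the support vectors in the training set
    fractions[nu] = len(clf.support_) / len(X)
    print("nu=%.1f -> fraction of support vectors: %.2f" % (nu, fractions[nu]))
```

Since nu is a lower bound on the fraction of support vectors, each reported fraction should be at least the corresponding nu.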

SVC stands for Support Vector Classification.
nu:float, optional (default=0.5)
An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Should be in the interval (0, 1].

kernel:string, optional (default='rbf')
Specifies the kernel type to be used in the algorithm. It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or a callable. If none is given, 'rbf' will be used. If a callable is given it is used to precompute the kernel matrix.

degree:int, optional (default=3)
Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.

gamma:float, optional (default=’auto’)
Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
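To check the statement about gamma='auto', one can compare it with an explicit 1/n_features. This is only a sketch on random data; note that the signature quoted here comes from an older scikit-learn release, and newer releases changed the default to 'scale' while still accepting 'auto'.

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

n_features = X.shape[1]
clf_auto = svm.SVC(kernel='rbf', gamma='auto').fit(X, Y)
clf_manual = svm.SVC(kernel='rbf', gamma=1.0 / n_features).fit(X, Y)

# Both models should produce the same decision values on the training data.
same = np.allclose(clf_auto.decision_function(X),
                   clf_manual.decision_function(X))
print(same)
```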



class_weight:{dict, 'balanced'}, optional
Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies as n_samples / (n_classes * np.bincount(y)).
For example, in class_weight = {1:10}, 1 is the class label and 10 is its weight.
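The "balanced" formula can be reproduced by hand. The sketch below uses a made-up imbalanced label vector (90 samples of class 0, 10 of class 1) and compares the manual result with scikit-learn's compute_class_weight helper; it also shows how a dict such as {1: 10} is passed to SVC.

```python
import numpy as np
from sklearn import svm
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 90 samples of class 0, 10 of class 1.
y = np.array([0] * 90 + [1] * 10)
rng = np.random.RandomState(0)
X = rng.randn(100, 2) + y[:, None]  # shift class 1 so the classes differ

# n_samples / (n_classes * np.bincount(y)), as in the docstring
manual = len(y) / (len(np.unique(y)) * np.bincount(y))
balanced = compute_class_weight('balanced', classes=np.unique(y), y=y)
print(manual, balanced)

# Errors on class 1 are penalised ten times as heavily as errors on class 0.
clf = svm.SVC(class_weight={1: 10}, gamma='auto').fit(X, y)
```

The rare class gets the larger weight, which is what "inversely proportional to class frequencies" means in practice.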



import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

X = np.c_[(.4, -.7), (-1.5, -1), (-1.4, -.9), (-1.3, -1.2), (-1.1, -.2),
          (-1.2, -.4), (-.5, 1.2), (-1.5, 2.1), (1, 1),
          (1.3, .8), (1.2, .5), (.2, -2), (.5, -2.4), (.2, -2.3),
          (0, -2.7), (1.3, 2.1)].T
# c_ turns each tuple into a column, and the transpose makes X a matrix with
# one sample per row
Y = [0] * 8 + [1] * 8  # Y is a list: eight 0s followed by eight 1s

fignum = 1
for kernel in ('linear', 'poly', 'rbf'):
    # fit and plot once for each of the three kernels
    clf = svm.SVC(kernel=kernel, gamma=2)
    # SVC is described below
    clf.fit(X, Y)

    plt.figure(fignum, figsize=(4, 3))  # the figure to draw on
    plt.clf()
    plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=80,
                facecolors='none', zorder=10, edgecolors='k')
    # circle the support vectors
    plt.scatter(X[:, 0], X[:, 1], c=Y, zorder=10, cmap=plt.cm.Paired,
                edgecolors='k')
    plt.axis('tight')

    x_min = -3
    x_max = 3
    y_min = -3
    y_max = 3
    XX, YY = np.mgrid[x_min:x_max:200j, y_min:y_max:200j]
    # mgrid also builds a grid, much like meshgrid. The slice
    # x_min:x_max:200j produces an array of 200 points; note that this
    # complex-step slice syntax only works inside index helpers such as mgrid.
    Z = clf.decision_function(np.c_[XX.ravel(), YY.ravel()])
    Z = Z.reshape(XX.shape)

    plt.figure(fignum, figsize=(4, 3))
    plt.pcolormesh(XX, YY, Z > 0, cmap=plt.cm.Paired)
    # pcolormesh: "Plot a quadrilateral mesh." The argument C may be a masked
    # array.
    plt.contour(XX, YY, Z, colors=['k', 'k', 'k'],
                linestyles=['--', '-', '--'], levels=[-.5, 0, .5])
    # three lines are drawn here, for wx+b equal to -0.5, 0 and 0.5
    plt.xlim(x_min, x_max)
    plt.ylim(y_min, y_max)
    plt.xticks(())
    plt.yticks(())
    fignum = fignum + 1
plt.show()
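The relationship between mgrid and meshgrid mentioned in the comments can be checked directly. In this sketch, a complex step such as 200j asks for 200 points with both endpoints included, and the result matches meshgrid with indexing='ij' (meshgrid's default 'xy' indexing gives the transposed layout).

```python
import numpy as np

XX, YY = np.mgrid[-3:3:200j, -3:3:200j]

axis = np.linspace(-3, 3, 200)
MX, MY = np.meshgrid(axis, axis, indexing='ij')

print(XX.shape)
print(np.allclose(XX, MX), np.allclose(YY, MY))

# With the default 'xy' indexing, meshgrid returns the transposed arrays.
DX, DY = np.meshgrid(axis, axis)
print(np.allclose(XX, DX.T))
```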

SVC:class sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)
C-Support Vector Classification.
The implementation is based on libsvm. The fit time complexity is more than quadratic with the number of samples which makes it hard to scale to dataset with more than a couple of 10000 samples.
The multiclass support is handled according to a one-vs-one scheme.
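As the name suggests, SVC has one key parameter, C:float, optional (default=1.0). "Penalty parameter C of the error term." In other words, it is the coefficient in front of the penalty term in the objective function.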
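For reference, the standard soft-margin SVM objective is sketched below. The symbols w, b, the slack variables ξ_i and the feature map φ follow the usual textbook notation and are an assumption here, not something quoted from the scikit-learn documentation.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .
```

A larger C penalises margin violations more heavily, so the model fits the training data more tightly; a smaller C tolerates more violations in exchange for a wider margin.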



decision_function_shape:‘ovo’, ‘ovr’, default=’ovr’
Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2).
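The two shapes are easy to confirm on a toy problem. The sketch below uses make_blobs with four classes (an arbitrary choice, so that n_classes and n_classes * (n_classes - 1) / 2 differ) and prints the shape of decision_function under each setting.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Four classes, so 'ovr' gives 4 columns and 'ovo' gives 4 * 3 / 2 = 6.
X, y = make_blobs(n_samples=80, centers=4, random_state=0)
n_classes = len(np.unique(y))

ovr = SVC(gamma='auto', decision_function_shape='ovr').fit(X, y)
ovo = SVC(gamma='auto', decision_function_shape='ovo').fit(X, y)

ovr_shape = ovr.decision_function(X).shape
ovo_shape = ovo.decision_function(X).shape
print(ovr_shape)  # (80, 4)
print(ovo_shape)  # (80, 6)
```

With three classes, as in iris, both settings happen to give three columns, which is why four classes are used here.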
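Another multiclass strategy, which SVC itself does not offer, is many-vs-many (MvM). A common MvM technique is error-correcting output codes (ECOC), which predicts the class through an encoding and decoding step; scikit-learn provides it separately as sklearn.multiclass.OutputCodeClassifier. The output is shown below: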



import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.model_selection import GridSearchCV


# Utility function to move the midpoint of a colormap to be around
# the values of interest.
class MidpointNormalize(Normalize):

    def __init__(self, vmin=None, vmax=None, midpoint=None, clip=False):
        self.midpoint = midpoint
        Normalize.__init__(self, vmin, vmax, clip)

    def __call__(self, value, clip=None):
        x, y = [self.vmin, self.midpoint, self.vmax], [0, 0.5, 1]
        return np.ma.masked_array(np.interp(value, x, y))

# #############################################################################
# Load and prepare data set
#
# dataset for grid search
iris = load_iris()
X = iris.data
y = iris.target

# Dataset for decision function visualization: we only keep the first two
# features in X and sub-sample the dataset to keep only 2 classes and
# make it a binary classification problem.
X_2d = X[:, :2]  # keep only the first two features
X_2d = X_2d[y > 0]
y_2d = y[y > 0]
y_2d -= 1

# It is usually a good idea to scale the data for SVM training.
# We are cheating a bit in this example in scaling all of the data,
# instead of fitting the transformation on the training set and
# just applying it on the test set.
scaler = StandardScaler()
# StandardScaler standardizes each feature to zero mean and unit variance;
# at this point it is only an object and nothing has been computed yet
X = scaler.fit_transform(X)  # fit_transform standardizes the data
X_2d = scaler.fit_transform(X_2d)

# #############################################################################
# Train classifiers
#
# For an initial search, a logarithmic grid with basis
# 10 is often helpful. Using a basis of 2, a finer
# tuning can be achieved but at a much higher cost.
C_range = np.logspace(-2, 10, 13)
gamma_range = np.logspace(-9, 3, 13)
param_grid = dict(gamma=gamma_range, C=C_range)
cv = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
grid = GridSearchCV(SVC(), param_grid=param_grid, cv=cv)
# GridSearchCV is described below
grid.fit(X, y)

print("The best parameters are %s with a score of %0.2f"
      % (grid.best_params_, grid.best_score_))

# Now we need to fit a classifier for all parameters in the 2d version
# (we use a smaller set of parameters here because it takes a while to train)
C_2d_range = [1e-2, 1, 1e2]
gamma_2d_range = [1e-1, 1, 1e1]
classifiers = []
for C in C_2d_range:
    for gamma in gamma_2d_range:
        clf = SVC(C=C, gamma=gamma)
        clf.fit(X_2d, y_2d)
        classifiers.append((C, gamma, clf))
# this is only a small parameter set

# #############################################################################
# Visualization
#
# draw visualization of parameter effects
plt.figure(figsize=(8, 6))
xx, yy = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
for (k, (C, gamma, clf)) in enumerate(classifiers):
    # evaluate decision function in a grid
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    # visualize decision function for these parameters
    plt.subplot(len(C_2d_range), len(gamma_2d_range), k + 1)
    plt.title("gamma=10^%d, C=10^%d" % (np.log10(gamma), np.log10(C)),
              size='medium')

    # visualize parameter's effect on decision function
    plt.pcolormesh(xx, yy, -Z, cmap=plt.cm.RdBu)
    plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y_2d, cmap=plt.cm.RdBu_r,
                edgecolors='k')
    plt.xticks(())
    plt.yticks(())
    plt.axis('tight')

scores = grid.cv_results_['mean_test_score'].reshape(len(C_range),
                                                     len(gamma_range))

# Draw heatmap of the validation accuracy as a function of gamma and C
#
# The score are encoded as colors with the hot colormap which varies from dark
# red to bright yellow. As the most interesting scores are all located in the
# 0.92 to 0.97 range we use a custom normalizer to set the mid-point to 0.92 so
# as to make it easier to visualize the small variations of score values in the
# interesting range while not brutally collapsing all the low score values to
# the same color.
plt.figure(figsize=(8, 6))
plt.subplots_adjust(left=.2, right=0.95, bottom=0.15, top=0.95)
plt.imshow(scores, interpolation='nearest', cmap=plt.cm.hot,
           norm=MidpointNormalize(vmin=0.2, midpoint=0.92))
plt.xlabel('gamma')
plt.ylabel('C')
plt.colorbar()
plt.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
plt.yticks(np.arange(len(C_range)), C_range)
plt.title('Validation accuracy')
plt.show()

ShuffleSplit:class sklearn.model_selection.ShuffleSplit(n_splits=10, test_size='default', train_size=None, random_state=None)
Random permutation cross-validator
Yields indices to split data into training and test sets.
Note: contrary to other cross-validation strategies, random splits do not guarantee that all folds will be different, although this is still very likely for sizeable datasets.
n_splits : int, default 10
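The listing above actually uses StratifiedShuffleSplit, the stratified variant of ShuffleSplit. The sketch below runs both on iris with the same settings and counts the classes in each test set; it is meant only to show the difference between the two, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import ShuffleSplit, StratifiedShuffleSplit

X, y = load_iris(return_X_y=True)

ss = ShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)

# Count how many samples of each class land in every test set.
plain_counts = [np.bincount(y[test], minlength=3) for _, test in ss.split(X)]
strat_counts = [np.bincount(y[test], minlength=3) for _, test in sss.split(X, y)]

for plain, strat in zip(plain_counts, strat_counts):
    print("ShuffleSplit:", plain, " StratifiedShuffleSplit:", strat)
```

Each split yields 30 test samples out of 150. The stratified version keeps the three classes in equal proportion, while the plain version lets the class counts vary from split to split.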

GridSearchCV:sklearn.model_selection.GridSearchCV(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise', return_train_score='warn')
Exhaustive search over specified parameter values for an estimator.
Important members are fit, predict.
GridSearchCV implements a “fit” and a “score” method. It also implements “predict”, “predict_proba”, “decision_function”, “transform” and “inverse_transform” if they are implemented in the estimator used.
The parameters of the estimator used to apply these methods are optimized by cross-validated grid-search over a parameter grid.

estimator:the classifier being tuned (or, translated literally, the "predictor"). Every estimator instance implements the estimator interface; either the estimator must provide a score method or a scoring argument must be passed. For classifiers the default score is the mean accuracy on the given data, so higher is better.
param_grid:dict or list of dictionaries
Dictionary with parameters names (string) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grids spanned by each dictionary in the list are explored.
scoring:string, callable, list/tuple, dict or None, default: None
cv:int, cross-validation generator or an iterable, optional
cv stands for cross-validation, which is used to evaluate the model more thoroughly.
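Where the score comes from can be checked directly: for a classifier such as SVC, the score method returns the mean accuracy. A minimal sketch on iris, assuming default settings apart from gamma='auto':

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(gamma='auto').fit(X, y)

# score() on a classifier is the fraction of correctly predicted labels.
default_score = clf.score(X, y)
accuracy = accuracy_score(y, clf.predict(X))
print(default_score, accuracy)
```

To rank parameter settings by a different metric, a scorer name such as 'f1_macro' can be passed as the scoring argument of GridSearchCV.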

cv_results_:dict of numpy (masked) ndarrays
best_estimator_:estimator or dict
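A small grid makes these attributes easier to inspect. The sketch below searches a hypothetical 3×3 grid on iris (far smaller than the 13×13 grid in the listing above) and reads back cv_results_ and best_estimator_.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}
cv = StratifiedShuffleSplit(n_splits=3, test_size=0.2, random_state=0)

grid = GridSearchCV(SVC(), param_grid=param_grid, cv=cv).fit(X, y)

# One mean score per parameter combination; rows follow C, columns gamma.
scores = grid.cv_results_['mean_test_score'].reshape(3, 3)
print(scores)

# With refit=True (the default), the best setting is refitted on all the data.
best = grid.best_estimator_
print(grid.best_params_, grid.best_score_)
```

Because best_estimator_ is an ordinary fitted SVC, it can be used for prediction right away, for example best.predict(X).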

The best parameters are {'C': 1.0, 'gamma': 0.10000000000000001} with a score of 0.97


