CS231n Assignment 1: Image Features (HSV, HOG)

At the end of the previous post I bolded a note about classifying with features instead of raw pixels; it turns out that is exactly what the next part of the assignment does.


Image Representation

The previous four sub-assignments all used raw pixels as the image representation for classification; the best result was the two-layer neural network at about 50%. Now we describe each image with features instead.
In the assignment's own words:

For each image we will compute a Histogram of Oriented Gradients (HOG) as well as a color histogram using the hue channel in HSV color space. We form our final feature vector for each image by concatenating the HOG and color histogram feature vectors.
Roughly speaking, HOG should capture the texture of the image while ignoring color information, and the color histogram represents the color of the input image while ignoring texture. As a result, we expect that using both together ought to work better than using either alone. Verifying this assumption would be a good thing to try for the bonus section.
The hog_feature and color_histogram_hsv functions both operate on a single image and return a feature vector for that image. The extract_features function takes a set of images and a list of feature functions and evaluates each feature function on each image, storing the results in a matrix where each column is the concatenation of all feature vectors for a single image.

We describe each image with HOG and HSV features: HOG captures texture while ignoring color, and the HSV color histogram captures color while ignoring texture, so the two are concatenated into a single feature vector. Below, HOG alone, HSV alone, and HOG+HSV are tested in turn.
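The quoted description of extract_features can be summarized with a short sketch. This is only an illustration of the behaviour described above; the name extract_features_sketch and the loop structure are mine, not the assignment's actual implementation:

import numpy as np

def extract_features_sketch(imgs, feature_fns):
  # Evaluate every feature function on every image; column i of the result
  # is the concatenation of all feature vectors for image i, as described above.
  columns = []
  for img in imgs:
    per_fn = [np.asarray(fn(img)).ravel() for fn in feature_fns]
    columns.append(np.concatenate(per_fn))
  return np.stack(columns, axis=1)  # shape: (total_feature_dim, num_images)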


HOG

HOG stands for Histogram of Oriented Gradients, a feature descriptor widely used in computer vision and image processing for object detection; see the literature for details. The assignment provides the following function for computing the HOG feature of a single image.

import numpy as np
from scipy.ndimage import uniform_filter  # used for the per-cell averaging below


def hog_feature(im):
  """Compute Histogram of Gradient (HOG) feature for an image

       Modified from skimage.feature.hog
       http://pydoc.net/Python/scikits-image/0.4.2/skimage.feature.hog

     Reference:
       Histograms of Oriented Gradients for Human Detection
       Navneet Dalal and Bill Triggs, CVPR 2005

    Parameters:
      im : an input grayscale or rgb image

    Returns:
      feat: Histogram of Gradient (HOG) feature
  """

  # convert rgb to grayscale if needed
  if im.ndim == 3:
    image = rgb2gray(im)  # rgb2gray is a small helper defined alongside this function in cs231n/features.py
  else:
    image = np.atleast_2d(im)  # fixed: the correct numpy name is atleast_2d

  sx, sy = image.shape  # image size
  orientations = 9      # number of gradient bins
  cx, cy = (8, 8)       # pixels per cell

  gx = np.zeros(image.shape)
  gy = np.zeros(image.shape)
  gx[:, :-1] = np.diff(image, n=1, axis=1)  # compute gradient on x-direction
  gy[:-1, :] = np.diff(image, n=1, axis=0)  # compute gradient on y-direction
  grad_mag = np.sqrt(gx ** 2 + gy ** 2)  # gradient magnitude
  grad_ori = np.arctan2(gy, (gx + 1e-15)) * (180 / np.pi) + 90  # gradient orientation

  n_cellsx = int(np.floor(sx / cx))  # number of cells in x
  n_cellsy = int(np.floor(sy / cy))  # number of cells in y
  # compute orientations integral images
  orientation_histogram = np.zeros((n_cellsx, n_cellsy, orientations))
  for i in range(orientations):
    # create new integral image for this orientation
    # isolate orientations in this range
    temp_ori = np.where(grad_ori < 180 / orientations * (i + 1),
                        grad_ori, 0)
    temp_ori = np.where(grad_ori >= 180 / orientations * i,
                        temp_ori, 0)
    # select magnitudes for those orientations
    cond2 = temp_ori > 0
    temp_mag = np.where(cond2, grad_mag, 0)
    # average the selected magnitudes over each cell and sample one value per cell
    # (cx // 2 keeps the slice indices integers under Python 3 as well)
    orientation_histogram[:, :, i] = uniform_filter(temp_mag, size=(cx, cy))[cx // 2::cx, cy // 2::cy].T

  return orientation_histogram.ravel()

Note: for convenience the returned feature is flattened to a 1D vector so it can easily be concatenated with other features. For a 32x32 CIFAR-10 image with 8x8 cells and 9 orientation bins, the feature is 4 x 4 x 9 = 144-dimensional.
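As a quick sanity check of that number, assuming the hog_feature above is defined (a random grayscale input avoids the rgb2gray helper entirely):

import numpy as np

# Random 32x32 grayscale image, purely to exercise the code path above.
dummy = np.random.rand(32, 32)
feat = hog_feature(dummy)
print(feat.shape)  # (144,) = 4 cells x 4 cells x 9 orientation bins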
In the test code, set feature_fns to hog_feature alone:

feature_fns = [hog_feature]

With the two-layer neural network, the best validation accuracy is 56%:

lr 5.000000e-01 reg 1.000000e-03 val accuracy: 0.560000
best validation accuracy achieved during cross-validation: 0.560000

Test accuracy is 56.4%:

net = best_net
test_acc = (net.predict(X_test_feats) == y_test).mean()
print test_acc
0.564

When the SVM is used as the classifier, it is worth looking at the misclassified images.
[Figure: examples of test images misclassified by the SVM, grouped by predicted class]
Trucks and cars have similar outlines, so they are easily confused with each other.
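A sketch of how such a panel of misclassified examples can be drawn, along the lines of the assignment notebook's visualization cell; it assumes a trained SVM classifier best_svm, the raw test images X_test, the test features X_test_feats, and the labels y_test are in scope, and that every class has at least examples_per_class mistakes:

import numpy as np
import matplotlib.pyplot as plt

y_test_pred = best_svm.predict(X_test_feats)
examples_per_class = 8
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for cls, cls_name in enumerate(classes):
  # test images that are NOT class cls but were predicted as cls
  idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
  idxs = np.random.choice(idxs, examples_per_class, replace=False)
  for i, idx in enumerate(idxs):
    plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
    plt.imshow(X_test[idx].astype('uint8'))
    plt.axis('off')
    if i == 0:
      plt.title(cls_name)
plt.show()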


HSV

In the HSV color model the three components are hue (H), saturation (S), and value (V). The assignment also provides the code for this feature:

import numpy as np
import matplotlib.colors  # for the RGB -> HSV conversion below


def color_histogram_hsv(im, nbin=10, xmin=0, xmax=255, normalized=True):
  """
  Compute color histogram for an image using hue.

  Inputs:
  - im: H x W x C array of pixel data for an RGB image.
  - nbin: Number of histogram bins. (default: 10)
  - xmin: Minimum pixel value (default: 0)
  - xmax: Maximum pixel value (default: 255)
  - normalized: Whether to normalize the histogram (default: True)

  Returns:
    1D vector of length nbin giving the color histogram over the hue of the
    input image.
  """
  ndim = im.ndim
  bins = np.linspace(xmin, xmax, nbin + 1)
  hsv = matplotlib.colors.rgb_to_hsv(im / xmax) * xmax  # hue channel rescaled to [0, xmax]
  imhist, bin_edges = np.histogram(hsv[:, :, 0], bins=bins, density=normalized)
  imhist = imhist * np.diff(bin_edges)  # with density=True this makes the histogram sum to 1

  # return histogram
  return imhist

The returned imhist is 10-dimensional, i.e. 10 bins over the 0~255 range. Because the histogram is normalized, imhist.sum() = 1.0.
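Both claims are easy to verify on a dummy image (random pixel data, purely illustrative; assumes the color_histogram_hsv above is defined):

import numpy as np

dummy = np.random.randint(0, 256, size=(32, 32, 3)).astype(np.float64)
hist = color_histogram_hsv(dummy, nbin=10)
print(hist.shape)  # (10,)
print(hist.sum())  # 1.0, since density=True combined with the bin widths normalizes the histogram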
In the test code, set feature_fns to the single color-histogram function lambda img: color_histogram_hsv(img, nbin=num_color_bins):

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [lambda img: color_histogram_hsv(img, nbin=num_color_bins)]

With the two-layer neural network, the best validation accuracy is 30.2%:

lr 5.000000e-01 reg 1.000000e-03 val accuracy: 0.302000
best validation accuracy achieved during cross-validation: 0.302000

Test accuracy is 26.9%:

net = best_net
test_acc = (net.predict(X_test_feats) == y_test).mean()
print test_acc
0.269

Compared with HOG, the HSV feature gives much lower accuracy. One interpretation is that classes such as truck and car are not distinguished by color, so texture is the stronger cue here.


HOG+HSV

In the test code, set feature_fns to both feature functions:

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]

With the two-layer neural network, the best validation accuracy is 58.9%. Note that with poorly chosen hyperparameters the accuracy drops sharply (a sketch of the search loop is given after the log below):

lr 1.000000e-02 reg 1.000000e-03 val accuracy: 0.091000
lr 1.000000e-02 reg 5.000000e-03 val accuracy: 0.144000
lr 1.000000e-02 reg 1.000000e-02 val accuracy: 0.134000
lr 1.000000e-02 reg 1.000000e-01 val accuracy: 0.161000
lr 1.000000e-02 reg 5.000000e-01 val accuracy: 0.079000
lr 1.000000e-02 reg 1.000000e+00 val accuracy: 0.078000
lr 1.000000e-01 reg 1.000000e-03 val accuracy: 0.521000
lr 1.000000e-01 reg 5.000000e-03 val accuracy: 0.510000
lr 1.000000e-01 reg 1.000000e-02 val accuracy: 0.507000
lr 1.000000e-01 reg 1.000000e-01 val accuracy: 0.431000
lr 1.000000e-01 reg 5.000000e-01 val accuracy: 0.079000
lr 1.000000e-01 reg 1.000000e+00 val accuracy: 0.087000
lr 5.000000e-01 reg 1.000000e-03 val accuracy: 0.589000
lr 5.000000e-01 reg 5.000000e-03 val accuracy: 0.567000
lr 5.000000e-01 reg 1.000000e-02 val accuracy: 0.568000
lr 5.000000e-01 reg 1.000000e-01 val accuracy: 0.404000
lr 5.000000e-01 reg 5.000000e-01 val accuracy: 0.079000
lr 5.000000e-01 reg 1.000000e+00 val accuracy: 0.107000
lr 1.000000e+00 reg 1.000000e-03 val accuracy: 0.586000
lr 1.000000e+00 reg 5.000000e-03 val accuracy: 0.571000
lr 1.000000e+00 reg 1.000000e-02 val accuracy: 0.529000
lr 1.000000e+00 reg 1.000000e-01 val accuracy: 0.420000
lr 1.000000e+00 reg 5.000000e-01 val accuracy: 0.154000
lr 1.000000e+00 reg 1.000000e+00 val accuracy: 0.102000
lr 5.000000e+00 reg 1.000000e-03 val accuracy: 0.087000
lr 5.000000e+00 reg 5.000000e-03 val accuracy: 0.078000
lr 5.000000e+00 reg 1.000000e-02 val accuracy: 0.079000
lr 5.000000e+00 reg 1.000000e-01 val accuracy: 0.078000
lr 5.000000e+00 reg 5.000000e-01 val accuracy: 0.087000
lr 5.000000e+00 reg 1.000000e+00 val accuracy: 0.087000
best validation accuracy achieved during cross-validation: 0.589000
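The grid above comes from a plain search over learning rate and regularization strength. Here is a minimal sketch of such a loop, assuming the assignment's TwoLayerNet class (with train and predict methods) and the precomputed feature matrices X_train_feats, y_train, X_val_feats, y_val are in scope; the hidden size and iteration counts below are illustrative choices, not necessarily the exact values used for the log:

import numpy as np

learning_rates = [1e-2, 1e-1, 5e-1, 1e0, 5e0]
regularization_strengths = [1e-3, 5e-3, 1e-2, 1e-1, 5e-1, 1e0]

input_dim = X_train_feats.shape[1]  # dimensionality of the feature vectors
hidden_dim = 500                    # assumed hidden layer size
num_classes = 10

best_val, best_net = -1.0, None
for lr in learning_rates:
  for reg in regularization_strengths:
    net = TwoLayerNet(input_dim, hidden_dim, num_classes)
    net.train(X_train_feats, y_train, X_val_feats, y_val,
              num_iters=1500, batch_size=200,
              learning_rate=lr, reg=reg, verbose=False)
    val_acc = (net.predict(X_val_feats) == y_val).mean()
    print('lr %e reg %e val accuracy: %f' % (lr, reg, val_acc))
    if val_acc > best_val:
      best_val, best_net = val_acc, net
print('best validation accuracy achieved during cross-validation: %f' % best_val)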

Test accuracy is 57.3%, a small improvement over the 56.4% obtained with HOG alone.

net = best_net
test_acc = (net.predict(X_test_feats) == y_test).mean()
print test_acc
0.573

Other kinds of features could also be tried as the basis for classification; noting that in bold here so I don't forget (it would make a good bonus experiment).

References

https://zhuanlan.zhihu.com/p/21441838?refer=intelligentunit
