(Repost) Non-Maximum Suppression

Source: Internet | Published by: 文森特梵高知乎 | Editor: 程序博客网 | Time: 2024/06/09 16:34

Reposted from the article "非极大抑制 (Non-Maximum Suppression)".

Reference articles:
1. Non-Maximum Suppression for Object Detection in Python
2. NMS非极大值抑制 (NMS: Non-Maximum Suppression)

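I have recently been working on a face recognition project. In its face detection stage, the MTCNN algorithm uses NMS to filter the candidate face regions and obtain the best face locations.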

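NMS is widely used and appears in many popular detection algorithms, including RCNN and SPP-Net, because its main job is to pick the best regions out of a pile of candidate regions.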

The rough principle is as follows:

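Suppose 2000 region proposals are extracted from an image. After passing them through RCNN or SPP-net, we get a 2000×4096 feature matrix. N SVMs then score each region against each of the N classes: the SVM weight matrix has size 4096×N, so the result is a 2000×N score matrix (where N is the number of classes).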
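The shapes in the paragraph above can be checked with a short NumPy sketch. The features and SVM weights here are random stand-ins, the class count of 20 is an assumed value for N, and bias terms are left out; only the matrix dimensions come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2000 proposals and 4096-dim features come from the text;
# num_classes (N) = 20 is an assumed value for illustration.
num_regions, feat_dim, num_classes = 2000, 4096, 20

# Random stand-ins for the CNN features and the N linear SVM weight vectors.
features = rng.standard_normal((num_regions, feat_dim), dtype=np.float32)
svm_weights = rng.standard_normal((feat_dim, num_classes), dtype=np.float32)

# One matrix product scores every region against every class.
scores = features @ svm_weights
print(scores.shape)  # (2000, 20)
```

Column c of `scores` holds the class-c score of every region, which is the per-class input that NMS works on.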

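Non-Maximum Suppression uses the score matrix and the region coordinates to find the bounding boxes with the highest confidence. First, NMS computes the area of every bounding box and sorts the boxes by score, then moves the box with the highest score into the list of kept boxes. Next, it computes the IoU between each remaining box and that highest-scoring box, and removes every box whose IoU exceeds a set threshold. This process repeats on the boxes that remain until no candidate boxes are left. The detection process therefore involves two thresholds: one on the IoU, and one applied after the procedure above, which discards any kept box whose score is below a score threshold. Note that Non-Maximum Suppression handles one class at a time: with N classes, it has to be run N times.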
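The procedure described above can be sketched as follows. This is a minimal illustration rather than the implementation of any particular detector: the function names `iou_one_to_many`, `nms`, and `nms_per_class`, the default thresholds, and the example scores are all assumptions, and box areas use the inclusive-pixel (+1) convention.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    xx1 = np.maximum(box[0], boxes[:, 0])
    yy1 = np.maximum(box[1], boxes[:, 1])
    xx2 = np.minimum(box[2], boxes[:, 2])
    yy2 = np.minimum(box[3], boxes[:, 3])
    # Inclusive pixel coordinates, hence the +1 (an assumed convention).
    inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
    area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
    areas = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_thresh=0.5, score_thresh=0.0):
    """Greedy NMS for one class; returns indices of kept boxes, best first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Drop every remaining box that overlaps the current best too much.
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_thresh]
    # Second threshold, applied after suppression: drop low-score boxes.
    return [i for i in keep if scores[i] >= score_thresh]


def nms_per_class(boxes, score_matrix, iou_thresh=0.5, score_thresh=0.0):
    """Run NMS once per column (class) of a regions-by-classes score matrix."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    return {c: nms(boxes, score_matrix[:, c], iou_thresh, score_thresh)
            for c in range(score_matrix.shape[1])}


boxes = np.array([(12, 30, 76, 94), (12, 36, 76, 100),
                  (72, 36, 200, 164), (84, 48, 212, 176)])
scores = np.array([0.9, 0.8, 0.6, 0.7])  # made-up scores for illustration
print(nms(boxes, scores, iou_thresh=0.5))  # [0, 3]

score_matrix = np.column_stack([scores, [0.1, 0.95, 0.9, 0.2]])
print(nms_per_class(boxes, score_matrix))  # {0: [0, 3], 1: [1, 2]}
```

In the example, boxes 0 and 1 overlap heavily, as do boxes 2 and 3, so each class keeps one box per cluster: whichever has the higher score for that class.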

The Python implementation is as follows (adapted from Non-Maximum Suppression for Object Detection in Python):

# import the necessary packages
import numpy as np
import cv2


# Felzenszwalb et al.
def non_max_suppression_slow(boxes, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []

    # initialize the list of picked indexes
    pick = []

    # grab the coordinates of the bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    # compute the area of the bounding boxes and sort the bounding
    # boxes by the bottom-right y-coordinate of the bounding box
    # (this variant takes no scores, so y2 stands in for the ranking)
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)

    # keep looping while some indexes still remain in the indexes list
    while len(idxs) > 0:
        # grab the last index in the indexes list, add the index
        # value to the list of picked indexes, then initialize
        # the suppression list (i.e. indexes that will be deleted)
        # using the last index
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        suppress = [last]

        # loop over all indexes in the indexes list
        for pos in range(0, last):
            # grab the current index
            j = idxs[pos]

            # find the largest (x, y) coordinates for the start of
            # the bounding box and the smallest (x, y) coordinates
            # for the end of the bounding box
            xx1 = max(x1[i], x1[j])
            yy1 = max(y1[i], y1[j])
            xx2 = min(x2[i], x2[j])
            yy2 = min(y2[i], y2[j])

            # compute the width and height of the intersection
            w = max(0, xx2 - xx1 + 1)
            h = max(0, yy2 - yy1 + 1)

            # compute the ratio of overlap between the intersection
            # and the area of box j (note: this is intersection over
            # the area of box j, not intersection over union)
            overlap = float(w * h) / area[j]

            # if there is sufficient overlap, suppress the
            # current bounding box
            if overlap > overlapThresh:
                suppress.append(pos)

        # delete all indexes from the index list that are in the
        # suppression list
        idxs = np.delete(idxs, suppress)

    # return only the bounding boxes that were picked
    return boxes[pick]


# construct a list containing the images that will be examined
# along with their respective bounding boxes
images = [
    ("images/audrey.jpg", np.array([
        (12, 84, 140, 212),
        (24, 84, 152, 212),
        (36, 84, 164, 212),
        (12, 96, 140, 224),
        (24, 96, 152, 224),
        (24, 108, 152, 236)])),
    ("images/bksomels.jpg", np.array([
        (114, 60, 178, 124),
        (120, 60, 184, 124),
        (114, 66, 178, 130)])),
    ("images/gpripe.jpg", np.array([
        (12, 30, 76, 94),
        (12, 36, 76, 100),
        (72, 36, 200, 164),
        (84, 48, 212, 176)]))]

# loop over the images
for (imagePath, boundingBoxes) in images:
    # load the image and clone it
    print("[x] %d initial bounding boxes" % (len(boundingBoxes)))
    image = cv2.imread(imagePath)
    if image is None:
        # cv2.imread returns None when the file cannot be read
        print("[x] could not read %s, skipping" % imagePath)
        continue
    orig = image.copy()

    # loop over the bounding boxes for each image and draw them
    # (coordinates are cast to plain ints before being passed to OpenCV)
    for (startX, startY, endX, endY) in boundingBoxes:
        cv2.rectangle(orig, (int(startX), int(startY)),
                      (int(endX), int(endY)), (0, 0, 255), 2)

    # perform non-maximum suppression on the bounding boxes
    pick = non_max_suppression_slow(boundingBoxes, 0.3)
    print("[x] after applying non-maximum, %d bounding boxes" % (len(pick)))

    # loop over the picked bounding boxes and draw them
    for (startX, startY, endX, endY) in pick:
        cv2.rectangle(image, (int(startX), int(startY)),
                      (int(endX), int(endY)), (0, 255, 0), 2)

    # display the images
    cv2.imshow("Original", orig)
    cv2.imshow("After NMS", image)
    cv2.waitKey(0)
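One detail worth noting in the listing above: it divides the intersection by the area of box j rather than by the union of the two boxes, so the value it thresholds is not strictly the IoU described earlier. The sketch below illustrates the difference. The helper names `overlap_ratio` and `iou` and the two boxes are made up for the example, and areas use the same +1 pixel convention as the listing.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box with inclusive pixel coordinates."""
    x1, y1, x2, y2 = box
    return (x2 - x1 + 1) * (y2 - y1 + 1)


def intersection_area(a, b):
    """Area of the overlap between boxes a and b (0 if they do not overlap)."""
    w = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    h = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    return w * h


def overlap_ratio(box_i, box_j):
    """Criterion used in the listing: intersection over the area of box j."""
    return intersection_area(box_i, box_j) / box_area(box_j)


def iou(box_i, box_j):
    """Intersection over union, as described in the text."""
    inter = intersection_area(box_i, box_j)
    return inter / (box_area(box_i) + box_area(box_j) - inter)


big = (0, 0, 99, 99)      # 100 x 100 pixels
small = (10, 10, 19, 19)  # 10 x 10 pixels, fully inside `big`

print(overlap_ratio(big, small))  # 1.0
print(overlap_ratio(small, big))  # 0.01
print(iou(big, small))            # 0.01
```

The overlap ratio is asymmetric: a small box fully inside a picked large box scores 1.0 and is always suppressed, while the IoU of the same pair is only 0.01. For boxes of similar size, as in the example images, the two criteria behave much more alike.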

The results are shown in the figures below:

[Figure 1: image missing from source]

[Figure 2: image missing from source]
