Object Detection: Focal Loss for Dense Object Detection


Focal Loss for Dense Object Detection
ICCV2017
https://arxiv.org/abs/1708.02002

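This paper solves a hard problem with a simple method; good ideas tend to be simple.

Object detection currently has two mainstream families of algorithms: two-stage detectors and one-stage detectors. Two-stage detectors are more accurate but slower; one-stage detectors are faster but somewhat less accurate. The paper argues that one-stage detectors lose accuracy mainly because of class imbalance during training: not only the imbalance between positive and negative samples, but above all the severely skewed ratio of easy to hard examples. It then redefines the loss function to shrink the contribution of the many easy background examples to the total loss, which raises the relative weight of hard examples.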

In this work, we identify class imbalance as the primary obstacle preventing one-stage object detectors from surpassing top-performing, two-stage methods, such as Faster R-CNN variants. To address this, we propose the focal loss which applies a modulating term to the cross entropy loss in order to focus learning on hard examples and down-weight the numerous easy negatives.


Related Work
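First, a look at the history of object detection.
Classic Object Detectors: the most classic approach is the sliding window, e.g. AdaBoost for face detection, HOG, DPMs, and so on.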

Two-stage Detectors: candidate regions are proposed first, then a CNN classifies them. The line runs from the original R-CNN through Fast R-CNN to the now-classic Faster R-CNN framework.

One-stage Detectors: One stage detectors are applied over a regular, dense sampling of object locations, scales, and aspect ratios
Representative methods: OverFeat, SSD, YOLO

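Class Imbalance: traditional one-stage object detection methods such as boosted detectors and DPMs, as well as the more recent SSD, all face a large class imbalance during training. These detectors evaluate roughly 10,000-100,000 candidate locations per image, but only a few of those locations contain objects. The imbalance causes two problems: 1) training is inefficient, because most locations are easy negatives that carry no useful learning signal; 2) easy negatives make up the vast majority of all samples, which degrades the learned model. Earlier solutions were typically hard negative mining or more complex sampling/reweighing schemes.

The focal loss proposed in this paper handles class imbalance well: it trains efficiently on all examples, with no need to design a sampling strategy that cuts down the easy negatives.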

Focal Loss
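We start from the cross entropy (CE) loss for binary classification and work up to the focal loss:

$$\mathrm{CE}(p, y) = \begin{cases} -\log(p) & \text{if } y = 1 \\ -\log(1 - p) & \text{otherwise} \end{cases}$$

Here p ∈ [0, 1] is the model's estimated probability for the class y = 1. Written in a different guise, with

$$p_t = \begin{cases} p & \text{if } y = 1 \\ 1 - p & \text{otherwise} \end{cases}$$

the same loss becomes:

$$\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$$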

One notable property of this loss, which can be easily seen in its plot, is that even examples that are easily classified (p_t ≫ .5) incur a loss with non-trivial magnitude. When summed over a large number of easy examples, these small loss values can overwhelm the rare class.
In other words, the large number of easy negatives has too much influence on the total loss.
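To make the "overwhelm" argument concrete, here is a minimal sketch that sums CE(p_t) = -log(p_t) over a pool of easy examples and a handful of hard ones. The counts and probabilities are hypothetical, picked only to mirror the order of magnitude of candidate locations mentioned above; they are not measurements from the paper.

```python
import math


def cross_entropy(p_t: float) -> float:
    """Cross entropy in terms of p_t, the predicted probability of the true class."""
    return -math.log(p_t)


# Hypothetical numbers, chosen only for illustration.
N_EASY, P_T_EASY = 100_000, 0.95  # easy background locations, already well classified
N_HARD, P_T_HARD = 100, 0.10      # rare hard examples, badly classified

easy_total = N_EASY * cross_entropy(P_T_EASY)
hard_total = N_HARD * cross_entropy(P_T_HARD)
easy_share = easy_total / (easy_total + hard_total)

print(f"per-example CE: easy={cross_entropy(P_T_EASY):.4f}, hard={cross_entropy(P_T_HARD):.4f}")
print(f"summed CE:      easy={easy_total:.1f}, hard={hard_total:.1f}")
print(f"share of total loss from easy examples: {easy_share:.1%}")
```

Each hard example costs far more than each easy one, yet with these made-up counts the easy examples still supply more than 90% of the summed loss.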

3.1. Balanced Cross Entropy
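A common way to address class imbalance is to introduce a weighting factor α ∈ [0, 1] for the positive class and 1 - α for the negative class. Defining α_t analogously to p_t, the α-balanced CE loss is:

$$\mathrm{CE}(p_t) = -\alpha_t \log(p_t)$$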

3.2. Focal Loss Definition
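The focal loss adds a modulating factor (1 - p_t)^γ to the cross entropy loss, with a tunable focusing parameter γ ≥ 0:

$$\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$$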

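The focal loss has two properties:
1) When an example is misclassified and p_t is small, the modulating factor is near 1 and the loss is unaffected. As p_t approaches 1, the factor goes to 0 and the loss for well-classified examples is down-weighted.
2) The focusing parameter γ smoothly adjusts the rate at which easy examples are down-weighted. With γ = 0 the focal loss reduces to CE.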
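The two properties can be checked numerically. The sketch below assumes the non-balanced form FL(p_t) = -(1 - p_t)^γ log(p_t) from the paper and tabulates the loss for a few values of p_t and γ; the function names and the grid of values are my own choices for illustration.

```python
import math


def modulating_factor(p_t: float, gamma: float) -> float:
    """The (1 - p_t)^gamma term that scales the cross entropy loss."""
    return (1.0 - p_t) ** gamma


def focal_loss(p_t: float, gamma: float) -> float:
    """Focal loss without alpha-balancing: -(1 - p_t)^gamma * log(p_t)."""
    return -modulating_factor(p_t, gamma) * math.log(p_t)


# Rows: p_t from badly misclassified to confidently correct.
# Columns: gamma = 0 reproduces plain cross entropy.
gammas = (0.0, 0.5, 1.0, 2.0, 5.0)
print("p_t   " + "  ".join(f"gamma={g:<4}" for g in gammas))
for p_t in (0.1, 0.5, 0.9, 0.99):
    row = "  ".join(f"{focal_loss(p_t, g):10.6f}" for g in gammas)
    print(f"{p_t:<5} {row}")
```

With γ = 2, an example at p_t = 0.9 has its loss scaled by (0.1)^2 = 0.01, a 100× reduction relative to CE, while a misclassified example at p_t = 0.1 keeps (0.9)^2 = 0.81 of its loss.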

In practice, an α-balanced variant of the focal loss is used:

$$\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$
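A minimal sketch of the α-balanced form FL(p_t) = -α_t (1 - p_t)^γ log(p_t) for a single binary prediction, in plain Python. The defaults α = 0.25 and γ = 2 are the values the paper reports as working best; the clamping constant `eps` and the function signature are assumptions of mine, not something the paper specifies.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Alpha-balanced focal loss for one binary prediction.

    p is the predicted probability of the foreground class (y == 1);
    any other y is treated as background.
    """
    eps = 1e-12  # assumption: clamp p to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Usage: an easy negative, a hard negative, and a hard positive.
for p, y in ((0.01, 0), (0.90, 0), (0.10, 1)):
    print(f"p={p:.2f} y={y} loss={focal_loss(p, y):.6f}")
```

A real detector would apply this per anchor and per class on sigmoid outputs and then normalize the sum; this sketch leaves that out to keep the formula itself in view.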


3.3. Class Imbalance and Model Initialization
Class imbalance makes the early stage of training unstable. To counter this, the paper introduces a prior π, typically set to 0.01, for the value of p estimated by the model for the rare class (foreground) at the start of training.
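One way to realize such a prior, and the one the paper describes, is to initialize the bias of the final classification layer to b = -log((1 - π)/π), so that a sigmoid output starts at roughly π when the incoming weights are near zero. The sketch below only checks that arithmetic; the helper names are hypothetical.

```python
import math


def prior_bias(pi: float) -> float:
    """Bias b such that sigmoid(b) == pi, i.e. b = -log((1 - pi) / pi)."""
    return -math.log((1.0 - pi) / pi)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


pi = 0.01
b = prior_bias(pi)
print(f"bias for pi={pi}: {b:.4f}")
print(f"initial foreground probability: {sigmoid(b):.4f}")

# Compare the initial loss of one background anchor under the two initializations.
ce_default = -math.log(1.0 - sigmoid(0.0))  # bias 0 means p = 0.5
ce_prior = -math.log(1.0 - sigmoid(b))      # bias b means p = pi
print(f"CE per background anchor: bias=0 -> {ce_default:.4f}, bias=b -> {ce_prior:.4f}")
```

With a zero bias every background location starts at p = 0.5 and contributes a sizeable loss, which is the source of the early instability; starting at p = π makes the frequent class cheap from the first iteration.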

3.4. Class Imbalance and Two-stage Detectors
How do two-stage detectors deal with class imbalance?
Through two mechanisms: (1) a two-stage cascade and (2) biased minibatch sampling, for example a 1:3 ratio of positive to negative examples.
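For contrast with the focal loss, biased minibatch sampling can be sketched as follows. This is an illustrative sampler of my own, not code from any particular detector: the batch size of 256 and the function name are assumptions, and only the 1:3 positive-to-negative ratio comes from the text above.

```python
import random


def sample_minibatch(positives, negatives, batch_size=256, pos_fraction=0.25, seed=None):
    """Draw a minibatch with at most pos_fraction positives (1:3 when 0.25).

    If there are too few positives, the remaining slots are filled with negatives.
    """
    rng = random.Random(seed)
    n_pos = min(len(positives), int(batch_size * pos_fraction))
    n_neg = min(len(negatives), batch_size - n_pos)
    return rng.sample(positives, n_pos), rng.sample(negatives, n_neg)


# Usage with hypothetical proposal ids: 100 positives, 10,000 negatives.
positives = list(range(100))
negatives = list(range(100, 10_100))
pos_batch, neg_batch = sample_minibatch(positives, negatives, seed=0)
print(len(pos_batch), len(neg_batch))
```

The sampler fixes the class ratio by discarding most negatives, whereas the focal loss keeps every example and lets the modulating factor decide how much each one counts.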

RetinaNet Detector
The paper designs a detector, RetinaNet, to verify the effectiveness of the proposed focal loss.


Feature Pyramid Network (FPN) backbone + a subnetwork for classifying anchor boxes + a subnetwork for anchor box regression
