Focal Loss for Dense Object Detection


Key Idea

Rather than addressing outliers, the focal loss is designed to address class imbalance by down-weighting inliers (easy examples) so that their contribution to the total loss stays small even when they are very numerous.

Key Question

  • The extreme foreground-background class imbalance encountered during training of dense detectors is identified as the central cause of one-stage detectors trailing the accuracy of two-stage detectors.
  • How the two detector families handle class imbalance:
    • R-CNN-like (two-stage) detectors address it through a two-stage cascade and sampling heuristics. The proposal stage (e.g., Selective Search [34], EdgeBoxes [37], DeepMask [23, 24], RPN [27]) rapidly narrows the candidate object locations down to a small number (e.g., 1-2k), filtering out most background samples. The second, classification stage then applies sampling heuristics, such as a fixed foreground-to-background ratio (1:3) or online hard example mining (OHEM) [30], to maintain a manageable balance between foreground and background.
    • A one-stage detector, in contrast, must process a much larger set of candidate object locations regularly sampled across an image, in practice often ∼100k locations densely covering spatial positions, scales, and aspect ratios. Similar sampling heuristics can be applied, but they are inefficient because training is still dominated by easily classified background examples. This inefficiency is a classic problem in object detection, typically addressed via techniques such as bootstrapping [32, 28] or hard example mining [36, 8, 30].

Method

  • Cross entropy (CE) loss for binary classification:
    CE(p, y) = −log(p) if y = 1, and −log(1 − p) otherwise,
    where y ∈ {±1} is the ground-truth class and p ∈ [0, 1] is the model's estimated probability for the class y = 1.
    For notational convenience, define pt = p if y = 1 and pt = 1 − p otherwise,
    so that CE(p, y) = CE(pt) = −log(pt).

  • Balanced Cross Entropy:
    CE(pt) = −αt log(pt),
    where αt = α for y = 1 and αt = 1 − α otherwise, with a weighting factor α ∈ [0, 1] for the rare class.

  • Focal Loss Definition:
    FL(pt) = −(1 − pt)^γ log(pt),
    i.e., a modulating factor (1 − pt)^γ added to the cross entropy, with a tunable focusing parameter γ ≥ 0 (see the sketch after this list).

    • 2 properties of FL
      • When an example is misclassified and pt is small, the modulating factor is near 1 and the loss is unaffected. As pt → 1, the factor goes to 0 and the loss for well-classified examples is down-weighted.
      • The focusing parameter γ smoothly adjusts the rate at which easy examples are down-weighted.
  • α-balanced variant of the focal loss (the form used in the paper's experiments):
    FL(pt) = −αt (1 − pt)^γ log(pt)

  • Class Imbalance and Model Initialization
    Introduce the concept of a ‘prior’ for the value of p estimated by the model for the rare class (foreground) at the start of training. We denote the prior by π and set it so that the model’s estimated p for examples of the rare class is low.

  • Class Imbalance and Two-stage Detectors

    • a two-stage cascade
      • reduces the nearly infinite set of possible object locations down to one or two thousand
      • selected proposals are not random, but are likely to correspond to true object locations
    • biased minibatch sampling
      • 1:3 ratio of positive to negative examples
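
Below is a minimal sketch of the α-balanced focal loss defined above, assuming PyTorch, per-anchor binary targets in {0, 1}, and raw logits. The function name focal_loss is illustrative; the defaults α = 0.25 and γ = 2 are the settings the paper reports working best.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Sketch of FL(pt) = -alpha_t * (1 - pt)^gamma * log(pt).

    logits  : raw (pre-sigmoid) scores, any shape
    targets : float tensor of the same shape, 1.0 = foreground, 0.0 = background
    """
    # Plain cross entropy, i.e. -log(pt), computed stably from logits.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")

    # pt = p for positives, 1 - p for negatives.
    p = torch.sigmoid(logits)
    pt = p * targets + (1 - p) * (1 - targets)

    # alpha_t = alpha for positives, 1 - alpha for negatives.
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)

    # The modulating factor (1 - pt)^gamma -> 0 as pt -> 1, so easy,
    # well-classified examples contribute almost nothing to the loss.
    return alpha_t * (1 - pt) ** gamma * ce


# Toy usage: an easy negative (large negative logit) is down-weighted far more
# aggressively by FL than by plain cross entropy.
logits = torch.tensor([2.0, -4.0, -0.5])
targets = torch.tensor([1.0, 0.0, 1.0])
per_example = focal_loss(logits, targets)   # elementwise losses
# The paper normalizes the summed loss by the number of anchors assigned to
# ground-truth boxes rather than by the total number of anchors.
```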

Architecture

[Figure: RetinaNet architecture. A ResNet-FPN backbone with two parallel subnetworks attached to each pyramid level, one for anchor classification and one for box regression.]
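
Below is a rough sketch, assuming PyTorch, of the classification subnet the paper attaches to each FPN level: four 3×3 convs with 256 channels and ReLU, then a 3×3 conv with K·A outputs (K classes, A anchors per location) read through a sigmoid. The class and argument names here are illustrative; the bias initialization anticipates the "Other Questions" section below.

```python
import math
import torch
from torch import nn

class ClsSubnet(nn.Module):
    """Illustrative RetinaNet-style classification head (names are not official)."""

    def __init__(self, in_channels=256, num_classes=80, num_anchors=9, prior=0.01):
        super().__init__()
        layers = []
        for _ in range(4):                              # 4 x (3x3 conv, 256 ch, ReLU)
            layers += [nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_channels = 256
        self.tower = nn.Sequential(*layers)
        # Final 3x3 conv predicts K*A per-location logits (one binary score per
        # class per anchor), later passed through a sigmoid.
        self.cls_logits = nn.Conv2d(256, num_classes * num_anchors,
                                    kernel_size=3, padding=1)
        # Prior initialization: bias b = -log((1 - pi) / pi) so that every anchor
        # starts with foreground confidence ~pi.
        nn.init.constant_(self.cls_logits.bias, -math.log((1 - prior) / prior))

    def forward(self, fpn_feature):
        return self.cls_logits(self.tower(fpn_feature))
```

In the paper the same head is shared across all pyramid levels, and a parallel box-regression subnet has the same structure but outputs 4·A linear values per location.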

Experiments

[Result tables and figures from the paper are not reproduced here.]

Contribution

  • The scaling factor (1 − pt)^γ automatically down-weights the contribution of easy examples during training and rapidly focuses the model on hard examples.
  • The proposed focal loss naturally handles the class imbalance faced by a one-stage detector and allows efficient training on all examples, without sampling and without easy negatives overwhelming the loss and computed gradients.
  • It focuses training on a sparse set of hard examples.

Other Questions

  • Binary classification models are by default initialized to output y = −1 or y = 1 with equal probability. Under such an initialization, in the presence of class imbalance, the loss due to the frequent class can dominate the total loss and cause instability in early training.
    • The fix is the ‘prior’ π introduced above: set it so that the model’s estimated p for examples of the rare class (foreground) is low at the start of training.
    • For the final conv layer of the classification subnet, the bias is initialized to b = −log((1 − π)/π), so that at the start of training every anchor is labeled as foreground with confidence of ∼π. The paper uses π = 0.01 in all experiments, although results are robust to the exact value. As explained in §3.4 of the paper, this initialization prevents the large number of background anchors from generating a large, destabilizing loss value in the first iteration of training; see the worked check below.
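
A quick worked check of that initialization in plain Python (variable names illustrative): with π = 0.01 the bias comes out to about −4.6, and passing it back through a sigmoid recovers a foreground confidence of exactly π.

```python
import math

pi = 0.01                                 # prior for the rare (foreground) class
b = -math.log((1 - pi) / pi)              # bias of the final conv of the cls subnet
print(round(b, 3))                        # -4.595

# sigmoid(b) = 1 / (1 + (1 - pi)/pi) = pi, so at the first iteration every
# anchor is scored foreground with confidence ~0.01 and the ~100k background
# anchors no longer produce a huge, destabilizing loss.
print(round(1 / (1 + math.exp(-b)), 3))   # 0.01
```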