LLC Study Notes 1

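Locality-constrained Linear Coding (LLC) is the coding method proposed in [1] for the feature-extraction stage of image classification.
These notes draw on (read: copy from) several blog posts:
Series on sparse representation of images: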
http://blog.csdn.net/jwh_bupt/article/details/9625469
Bag of Features (BoF) image retrieval algorithm:
http://blog.csdn.net/chlele0105/article/details/9633397#comments
Representing images with the spatial pyramid method:
http://www.mamicode.com/info-detail-903166.html

Several papers were also consulted; they are listed at the end.

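Abstract of [1]:
Traditional SPM built on bag-of-features (BoF) needs a nonlinear classifier to reach good classification performance. The paper proposes a simple and efficient coding scheme, Locality-constrained Linear Coding (LLC), to replace the VQ coding in traditional SPM. LLC uses a locality constraint to project each descriptor into its local-coordinate system, and the projected coordinates are integrated by max pooling (taking the maximum over each dimension) to produce the final representation. Compared with the sparse coding strategy, the objective function used by LLC has an analytical solution. The paper also proposes a fast approximated LLC method that first performs a K-nearest-neighbor search and then solves a constrained least-squares fitting problem, with computational complexity O(M + K^2), where M is the codebook size and K the number of neighbors. Even with a large codebook, LLC can still process multiple frames per second, and this efficiency makes it attractive for practical applications.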
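To make the fast approximation concrete, here is a minimal NumPy sketch of its two steps (K-nearest-neighbor search, then a constrained least-squares fit), followed by max pooling. The function name `llc_approx`, the regularizer `beta`, and the sum-to-one constraint on the weights are assumptions based on my reading of [1]; none of them is spelled out in these notes, so treat this as an illustration rather than the authors' implementation.

```python
import numpy as np

def llc_approx(x, codebook, k=5, beta=1e-4):
    """Approximate LLC code of one descriptor.

    x: (D,) descriptor; codebook: (M, D) array of bases.
    k and beta are illustrative defaults, not values from the notes.
    """
    M = codebook.shape[0]
    # Step 1: find the k nearest codebook entries (Euclidean distance).
    d2 = np.sum((codebook - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]
    # Step 2: least squares on those k bases, with weights summing to 1.
    z = codebook[idx] - x                 # shift so that x is the origin
    C = z @ z.T                           # k x k local covariance
    C += (beta * np.trace(C) + 1e-12) * np.eye(k)  # keep C well conditioned
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                          # enforce the sum-to-one constraint
    code = np.zeros(M)
    code[idx] = w                         # at most k nonzero entries
    return code

# Encode a few random descriptors and max-pool them into one vector.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 8))
descriptors = rng.normal(size=(10, 8))
codes = np.stack([llc_approx(x, codebook) for x in descriptors])
pooled = codes.max(axis=0)                # max pooling over each dimension
```

Each code has at most `k` nonzero entries, which is what keeps the per-descriptor solve small even when the codebook is large.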

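Current state-of-the-art classification systems consist mainly of two parts: bag-of-features (BoF) [2] and spatial pyramid matching (SPM) [3]. BoF represents an image as a histogram of its local features. It is robust to spatial translations of the features and delivers decent performance on whole-image classification tasks. However, BoF discards the spatial layout of the features, so it cannot capture shapes or locate an object. Many extensions of BoF have been proposed as a result, and SPM has achieved the best results among them. SPM partitions an image into increasingly finer spatial sub-regions, typically 2^l × 2^l sub-regions for l = 0, 1, 2, and computes a histogram of local features in each sub-region. The resulting spatial pyramid is a computationally efficient extension of the orderless BoF representation and has shown very promising performance on many image classification tasks.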
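The 2^l × 2^l partition is easier to see in code. The sketch below assumes each local feature has already been assigned a visual-word index and a pixel position; the function name `spatial_pyramid_histogram` and its parameters are made up for illustration. It also skips the per-level weighting that SPM [3] applies, so it only shows the partition-and-concatenate idea.

```python
import numpy as np

def spatial_pyramid_histogram(words, xy, image_size, num_words, levels=(0, 1, 2)):
    """Concatenate per-cell visual-word histograms over 2^l x 2^l grids.

    words: (N,) int array of codebook indices, one per local feature
    xy: (N, 2) array of feature positions (x, y), assumed inside the image
    image_size: (width, height) in pixels
    """
    width, height = image_size
    parts = []
    for l in levels:
        n = 2 ** l
        # Grid column/row of each feature; clip so the far edge stays inside.
        col = np.minimum((xy[:, 0] / width * n).astype(int), n - 1)
        row = np.minimum((xy[:, 1] / height * n).astype(int), n - 1)
        cell = row * n + col
        # One histogram per cell via a joint count over (cell, word).
        hist = np.bincount(cell * num_words + words, minlength=n * n * num_words)
        parts.append(hist.astype(float))
    feature = np.concatenate(parts)
    return feature / max(feature.sum(), 1.0)   # L1-normalise

rng = np.random.default_rng(0)
words = rng.integers(0, 10, size=200)
xy = rng.uniform(0, 100, size=(200, 2))
feature = spatial_pyramid_histogram(words, xy, (100, 100), num_words=10)
```

With levels l = 0, 1, 2 the pyramid has 1 + 4 + 16 = 21 cells, so the output length is 21 times the codebook size. The first `num_words` entries are the plain orderless BoF histogram.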

LLC improves on SPM; an earlier improvement along the same lines is ScSPM [4]. A figure from the paper helps explain:
[Figure: LLC coding process]
The figure above is the flowchart of BoF-based SPM.
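1. First, feature points are detected or densely sampled on the input image, and a descriptor (such as SIFT or color moment) is extracted from each feature point. This is the "Descriptor" layer.
2. Each descriptor is quantized with a codebook of M entries, which leads to the "Code" layer, where each descriptor is converted into a code in ℝ^M (the green circles). With hard vector quantization (VQ), each code has exactly one nonzero element; with soft-VQ, it has a small group of nonzero elements. This is the step that ScSPM and LLC each improve: both replace VQ with their own coding scheme.
3. In the "SPM" layer, the codes within each sub-region are pooled together into a histogram by averaging and normalizing. Finally, the histograms of the sub-regions are concatenated to produce the final representation. [5] proposes a scheme that improves the pooling stage of SPM.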
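The "Code" and "SPM" layers above can be sketched in a few lines of NumPy. The helper names (`hard_vq`, `soft_vq`, `average_pool`) are hypothetical, and the soft-VQ shown here (exponential weights truncated to the `k` nearest entries) is one assumed variant among several; the notes only say that soft-VQ yields a small group of nonzero elements.

```python
import numpy as np

def hard_vq(x, codebook):
    """One-hot code in R^M: 1 at the nearest codebook entry, 0 elsewhere."""
    d2 = np.sum((codebook - x) ** 2, axis=1)
    code = np.zeros(codebook.shape[0])
    code[np.argmin(d2)] = 1.0
    return code

def soft_vq(x, codebook, k=3, beta=1.0):
    """Soft assignment limited to the k nearest entries (assumed variant)."""
    d2 = np.sum((codebook - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]
    # Subtract the smallest distance before exp() for numerical stability.
    w = np.exp(-beta * (d2[idx] - d2[idx].min()))
    code = np.zeros(codebook.shape[0])
    code[idx] = w / w.sum()
    return code

def average_pool(codes):
    """Average the codes of one sub-region and L1-normalise the result."""
    h = codes.mean(axis=0)
    s = h.sum()
    return h / s if s > 0 else h

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))          # M = 16 entries
region_a = rng.normal(size=(30, 4))          # descriptors in one sub-region
region_b = rng.normal(size=(20, 4))          # descriptors in another
hist_a = average_pool(np.stack([hard_vq(x, codebook) for x in region_a]))
hist_b = average_pool(np.stack([hard_vq(x, codebook) for x in region_b]))
final = np.concatenate([hist_a, hist_b])     # concatenated representation
```

Swapping `hard_vq` for `soft_vq` changes only the "Code" layer; the pooling and concatenation stay the same, which is why ScSPM and LLC can slot in their own coding schemes at that step.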

References:
[1]: J. Wang et al., “Locality-constrained linear coding for image classification,” in Proc. IEEE CVPR, 2010, pp. 3360–3367.[LLC]

[2]: Sivic J, Zisserman A. Video Google: A text retrieval approach to object matching in videos[C]//Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on. IEEE, 2003: 1470-1477.[BOF]

[3]: Lazebnik S, Schmid C, Ponce J. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories[C]//Computer Vision and Pattern Recognition(CVPR), 2006 IEEE Computer Society Conference on. IEEE, 2006, 2: 2169-2178.[SPM]

[4]: Yang J, Yu K, Gong Y, et al. Linear spatial pyramid matching using sparse coding for image classification[C]//Computer Vision and Pattern Recognition(CVPR), 2009. IEEE Conference on. IEEE, 2009: 1794-1801.[ScSPM]

[5]: Harada T, Ushiku Y, Yamashita Y, et al. Discriminative spatial pyramid[C]//Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011: 1617-1624.

1 0