(Andrew Ng) The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization


I have recently set out to read all of the papers by Andrew Ng and Hinton.

The first paper by Andrew Ng:

The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization

Coates A, Ng A Y. The importance of encoding versus training with sparse coding and vector quantization[C]//Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011: 921-928.

Background: vector quantization

Vector quantization (VQ) is a method of data compression and encoding. For details, see the website

http://www.data-compression.com/vq.html

The idea of vector quantization is to replace all of the points in a region with a single representative point (k-means is one form of VQ). See the figure:

Here, every pair of numbers falling in a particular region are approximated by a red star associated with that region
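To make the "one point per region" idea concrete, here is a minimal sketch of VQ encoding as nearest-codeword lookup. The codebook values, the sample point, and the function names are made up for illustration; in practice the codebook would come from something like k-means.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def vq_encode(x, codebook):
    """Return the index of the codeword nearest to x (hard assignment)."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(x, codebook[k]))


def vq_decode(index, codebook):
    """Approximate the original point by its assigned codeword."""
    return codebook[index]


# Hypothetical 2-D codebook: each codeword plays the role of one "red star".
codebook = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
point = (3.5, 0.5)

index = vq_encode(point, codebook)            # only this index is stored
approximation = vq_decode(index, codebook)    # the point is replaced by its codeword
```

Storing only the index is what makes this a compression scheme: every point in the same region decodes to the same codeword.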

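Abstract

For a long time VQ was one of the main methods for feature encoding, but it was gradually superseded by sparse coding (SC). Both methods consist of two phases: a training phase (learning phase), which learns the dictionary (basis functions) used for encoding, and an encoding phase. This paper analyzes why SC outperforms VQ and examines the roles played by the training and encoding phases.

Conclusion: the success of sparse coding comes from its effective encoding phase. The sparse coding encoder can also achieve very good results when paired with other training procedures.

This suggests that when the training set is large, we can choose a fast, simple training algorithm and pair it with an effective encoding strategy.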
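As an illustration of what a "fast, simple training algorithm" could look like, the sketch below builds a dictionary by sampling training vectors at random and normalizing them to unit length. This is only one possible choice, assumed here for illustration; the toy data, the function names, and the `seed` parameter are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v] if norm > 0 else list(v)


def random_sample_dictionary(data, num_atoms, seed=0):
    """'Train' a dictionary by picking num_atoms data vectors at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [normalize(v) for v in rng.sample(data, num_atoms)]


# Hypothetical toy data standing in for a large set of training vectors.
data = [[float(i), float(i % 3) + 1.0] for i in range(20)]
D = random_sample_dictionary(data, num_atoms=4)
```

No optimization is involved, so the cost of "training" is just the cost of sampling. Any representational power then has to come from the encoder applied on top of `D`.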

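Introduction

In feature extraction pipelines, VQ usually plays a follow-up role: it is applied on top of already-extracted low-level features in order to abstract high-level features. This role can now be filled by SC, which achieves better results. The question is: is this because SC learns a dictionary that better represents the structure of the data, or because sparse codes are simply better than simple nonlinear features? Are there other training algorithms or encoding strategies that are simpler than sparse coding yet competitive with it?

D: the dictionary (basis functions) learned in the training phase

M: the mapping from an input x to a feature vector f, given D, used in the encoding phase

The training phase and the encoding phase do not have to match. For example, using hard assignment during training and soft assignment during encoding works better than using hard assignment for both.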
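The difference between the two assignment schemes can be sketched as follows. The soft variant here uses normalized Gaussian-kernel weights, which is one common choice and an assumption of this sketch, not necessarily the exact scheme used in the paper; the bandwidth `sigma`, the dictionary, and the input are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def hard_assign(x, dictionary):
    """One-hot code: 1 for the nearest atom, 0 elsewhere."""
    distances = [squared_distance(x, d) for d in dictionary]
    nearest = distances.index(min(distances))
    return [1.0 if k == nearest else 0.0 for k in range(len(dictionary))]


def soft_assign(x, dictionary, sigma=1.0):
    """Gaussian-kernel weights over all atoms, normalized to sum to 1.

    sigma is a hypothetical bandwidth chosen for illustration.
    """
    weights = [math.exp(-squared_distance(x, d) / (2.0 * sigma ** 2))
               for d in dictionary]
    total = sum(weights)
    return [w / total for w in weights]


# The same dictionary D can be paired with either mapping M.
D = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
x = (1.2, 0.2)

hard_code = hard_assign(x, D)   # keeps only the nearest atom
soft_code = soft_assign(x, D)   # keeps graded similarity to every atom
```

Both encoders take the same `D`, which is the point of separating D from M: the dictionary can be trained one way and used another way at encoding time.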

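Recent literature also indicates that the choice of basis functions is far less important than we might think. Jarrett et al. [7] showed that random weights can also give reasonably good classification results, although not as good as learned weights.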

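Discussion

1) When only a small amount of labeled data is available, sparse coding still performs well.

Summary of useful takeaways

1. Soft assignment works better than hard assignment; see references [1], [2].

2. The soft threshold function

Used as the nonlinear module in deep-architecture algorithms.

3. "Locality preserving" encodings

New encoding strategies [3-6].

4. Patch-based system

The original images are too large, so patches are used instead.

5. Separating positive and negative features

The positive and negative polarities are split into separate features.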
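Items 2 and 5 fit together naturally. The sketch below applies a soft threshold to the linear responses z_j = D_j · x and emits the positive and negative parts as separate features, so a dictionary with d atoms yields 2d features. This is my reading of the encoder, written as an illustration; the threshold `alpha`, the dictionary, and the input are hypothetical values.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def soft_threshold_encode(x, dictionary, alpha=0.25):
    """Return [max(0, z_j - alpha) ...] followed by [max(0, -z_j - alpha) ...].

    z_j is the response of atom j to x. alpha is a hypothetical threshold;
    in practice it would be tuned, for example by cross-validation.
    """
    responses = [dot(d, x) for d in dictionary]
    positive = [max(0.0, z - alpha) for z in responses]    # positive polarity
    negative = [max(0.0, -z - alpha) for z in responses]   # negative polarity
    return positive + negative


D = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
x = (0.75, -0.5)

features = soft_threshold_encode(x, D)
# responses are 0.75, -0.5, -0.75, so features == [0.5, 0.0, 0.0, 0.0, 0.25, 0.5]
```

Responses with magnitude below `alpha` are zeroed in both halves, which is what makes the resulting code sparse. Keeping the two polarities apart lets a downstream linear classifier weight them differently.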

References

[1] van Gemert, J. C., Geusebroek, J. M., Veenman, C. J., and Smeulders, A. W. M. Kernel codebooks for scene categorization. In European Conference on Computer Vision, 2008.

[2] Agarwal, A. and Triggs, B. Hyperfeatures: multilevel local coding for visual recognition. In European Conference on Computer Vision, 2006.

[3] Yu, K., Zhang, T., and Gong, Y. Nonlinear learning using local coordinate coding. In Advances in Neural Information Processing Systems, 2009.

[4] Yu, K. and Zhang, T. Improved local coordinate coding using local tangents. In International Conference on Machine Learning, 2010.

[5] Yang, J., Yu, K., Gong, Y., and Huang, T. S. Linear spatial pyramid matching using sparse coding for image classification. In Computer Vision and Pattern Recognition, 2009.

[6] Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. Locality-constrained linear coding for image classification. In Computer Vision and Pattern Recognition, 2010.

[7] Jarrett, K., Kavukcuoglu, K., Ranzato, M., and LeCun, Y. What is the best multi-stage architecture for object recognition? In International Conference on Computer Vision, 2009.

0 0