Regression


Ridge Regression: Basic Concepts

When performing feature selection, there are generally three approaches:

1. Subset selection
2. Shrinkage methods, also known as regularization. These mainly include ridge regression and lasso regression.
3. Dimensionality reduction

Ridge regression (Ridge Regression) adds a regularization term to the squared error.

Ridge Regression

The objective function is

J(w) = ||y - Xw||_2^2 + λ ||w||_2^2

By choosing the value of λ, a balance between variance and bias can be reached: as λ increases, the model's variance decreases while its bias increases.

Taking the derivative with respect to w gives:

∂J/∂w = -2 X^T (y - Xw) + 2 λ w

Setting this to zero and solving yields the value of w:

w = (X^T X + λ I)^{-1} X^T y
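The closed-form solution above can be sketched directly with NumPy; the function and variable names below are illustrative, not from the original post:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Solve the ridge normal equations (X^T X + lam*I) w = X^T y for w."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Toy data: y depends linearly on two features, plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=100)

w = ridge_fit(X, y, lam=0.1)
print(w)  # close to [2, -1], shrunk slightly toward zero by the penalty
```

Increasing `lam` shrinks the coefficients further toward zero, which is exactly the variance-for-bias trade described above.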

Purpose of regularization:

[Figure: regression regularization]

Effect of the parameter:

[Figure: regularization]

This method has the same order of complexity as Ordinary Least Squares: if X is a matrix of size (n, p), the cost is O(n p^2), assuming that n > p.

Lasso

It consists of a linear model trained with an L1 prior as regularizer. The objective function to minimize is:

min_w (1 / (2n)) ||Xw - y||_2^2 + α ||w||_1
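A minimal scikit-learn sketch of this objective; the data and `alpha` value are made up for illustration. The L1 penalty drives the coefficients of irrelevant features exactly to zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two of the ten features actually influence y.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1)
lasso.fit(X, y)
print(lasso.coef_)  # the eight irrelevant coefficients come out as exact zeros
```

This sparsity is what makes Lasso usable for feature selection, unlike ridge, which only shrinks coefficients without zeroing them.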


Elastic Net


  • A practical advantage of trading off between Lasso and Ridge is that it allows Elastic-Net to inherit some of Ridge's stability under rotation.
  • Elastic-net is useful when there are multiple features which are correlated with one another. Lasso is likely to pick one of these at random, while elastic-net is likely to pick both.

The objective function to minimize is:

min_w (1 / (2n)) ||Xw - y||_2^2 + α ρ ||w||_1 + (α (1 - ρ) / 2) ||w||_2^2
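The second bullet above can be demonstrated with a small sketch (the duplicated-column setup and `alpha`/`l1_ratio` values are illustrative assumptions): with two perfectly correlated features, Lasso tends to keep only one, while Elastic-Net spreads the weight over both.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
X = np.hstack([x, x])  # two identical (fully correlated) columns
y = 2.0 * x[:, 0] + 0.05 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(lasso.coef_)  # typically one coefficient carries all the weight
print(enet.coef_)   # weight is shared between the two columns
```

The L2 part of the Elastic-Net penalty makes the objective strictly convex across the correlated directions, which is why the symmetric solution is preferred.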

References:
http://blog.csdn.net/google19890102/article/details/27228279
http://www.mit.edu/~9.520/spring07/Classes/rlsslides.pdf