Ridge Regression - 岭回归
Source: Internet · Edited by: 程序博客网 · Date: 2024/04/24 23:37
Why — Purpose
When the predictors X exhibit severe multicollinearity (multicollinearity: the predictors are highly linearly correlated), X'X in the least-squares formula \hat{\beta} = (X'X)^{-1}X'Y becomes nearly singular, so the estimate is numerically unstable and its variance explodes.
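A minimal sketch of why this matters (synthetic data, not from the text): with two nearly identical predictors, a tiny perturbation of Y swings the individual least-squares coefficients wildly, even though their sum stays well determined.

```r
# Hypothetical illustration: x2 is almost a copy of x1, so X'X is
# nearly singular and the individual OLS coefficients are unstable.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)           # near-perfect collinearity
y  <- 1 + 2 * x1 + rnorm(n)              # true model uses only x1

b1 <- coef(lm(y ~ x1 + x2))
b2 <- coef(lm(y + rnorm(n, sd = 0.01) ~ x1 + x2))  # tiny change to Y

cor(x1, x2)          # essentially 1
cbind(b1, b2)        # individual slopes jump around between the two fits,
b1["x1"] + b1["x2"]  # but their sum stays close to the true total effect 2
```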
What — What is it, essentially?
A penalty term is added inside R (the X'X matrix), which greatly reduces the chance that R is (near-)singular.
At heart it is still the bias-variance tradeoff: this time we trade away unbiasedness in exchange for a smaller variance.
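In symbols (standard ridge notation, consistent with the derivation below), the penalty turns the least-squares problem into

```latex
\hat{\beta}(k) = \arg\min_{\beta}\; \|Y - X\beta\|^2 + k\|\beta\|^2
              = (X'X + kI)^{-1} X'Y, \qquad k > 0 .
```

Adding kI pushes every eigenvalue of X'X away from zero, so the inverse always exists; the price is that E\,\hat{\beta}(k) = (X'X + kI)^{-1}X'X\,\beta \neq \beta, i.e. the estimator is biased.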
How — How does it solve the problem?
First, a quick review.
Background
Why
Recall that we said earlier that ridge regression sacrifices some unbiasedness to make the variance smaller. Now let us work through the math. Let the ridge estimator be \hat{\beta}(k) = (X'X + kI)^{-1}X'Y, with hat matrix H(k) = X(X'X + kI)^{-1}X'.
Bias-variance tradeoff: SSE(k) = \|Y - X\hat{\beta}(k)\|^2, and with \mu = X\beta, \mu(k) = H(k)\mu:
E(SSE(k)) = \sigma^2 \,\mathrm{tr}\big((I - H(k))^2\big) + \|\mu - \mu(k)\|^2
(note H(k) is symmetric but not idempotent, so the trace term keeps the square).
The bias-variance tradeoff shows up in every component of the model as well. Let us first look at \hat{\beta}(k).
Risk: writing \beta(k) = E\,\hat{\beta}(k) and decomposing \hat{\beta}(k) - \beta = (\hat{\beta}(k) - \beta(k)) + (\beta(k) - \beta),
E\|\hat{\beta}(k) - \beta\|^2 = E\|\hat{\beta}(k) - \beta(k)\|^2 + \|\beta(k) - \beta\|^2 + 2E\big[(\hat{\beta}(k) - \beta(k))'(\beta(k) - \beta)\big]
Since \hat{\beta}(k) is the only random quantity here (both \beta and \beta(k) are fixed vectors), the cross term vanishes and the risk splits into a variance term plus a squared-bias term.
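Written out, the vanishing of the cross term is one line:

```latex
E\big[(\hat{\beta}(k) - \beta(k))'(\beta(k) - \beta)\big]
  = \big(E\,\hat{\beta}(k) - \beta(k)\big)'(\beta(k) - \beta) = 0,
\qquad \text{since } \beta(k) = E\,\hat{\beta}(k).
```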
We already computed the variance part earlier using the spectral decomposition X'X = P\Lambda P' (eigenvalues \lambda_i): E\|\hat{\beta}(k) - \beta(k)\|^2 = \sigma^2 \sum_i \lambda_i / (\lambda_i + k)^2.
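The spectral form of the variance can be checked numerically on any full-rank design (the synthetic X below is an assumption for illustration): the trace of the sandwich covariance matrix of \hat{\beta}(k) must equal the eigenvalue sum.

```r
# Check: tr( (X'X+kI)^{-1} X'X (X'X+kI)^{-1} ) computed directly from the
# sandwich matrix equals the spectral form sum( lambda / (lambda + k)^2 ).
set.seed(2)
X   <- matrix(rnorm(40 * 4), 40, 4)   # synthetic full-rank design
k   <- 2.5
XtX <- t(X) %*% X
W   <- solve(XtX + k * diag(4))

trace.direct   <- sum(diag(W %*% XtX %*% W))
lambda         <- eigen(XtX, symmetric = TRUE)$values
trace.spectral <- sum(lambda / (lambda + k)^2)

all.equal(trace.direct, trace.spectral)   # TRUE
```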
Next, the bias part: with \alpha = P'\beta, \|\beta(k) - \beta\|^2 = k^2 \sum_i \alpha_i^2 / (\lambda_i + k)^2.
Differentiating the variance part with respect to k gives -2\sigma^2 \sum_i \lambda_i/(\lambda_i + k)^3 < 0, so the variance is strictly decreasing in k; the derivative of the bias part, 2k \sum_i \lambda_i \alpha_i^2/(\lambda_i + k)^3, is positive for k > 0.
Therefore, as k increases, the variance decreases while the bias increases; for any k < \sigma^2/\max_i \alpha_i^2 the variance falls faster than the bias rises. Combining the variance and bias parts, we get that there always exists some k > 0 at which the total MSE is strictly smaller than at k = 0, i.e. ridge beats ordinary least squares in MSE.
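Putting the two derivatives together gives the classical Hoerl-Kennard argument:

```latex
\mathrm{MSE}(k) = \sigma^2 \sum_i \frac{\lambda_i}{(\lambda_i + k)^2}
                + k^2 \sum_i \frac{\alpha_i^2}{(\lambda_i + k)^2},
\qquad
\mathrm{MSE}'(k) = 2 \sum_i \frac{\lambda_i \,(k\,\alpha_i^2 - \sigma^2)}{(\lambda_i + k)^3}.
```

At k = 0 every summand of \mathrm{MSE}'(k) is negative, so \mathrm{MSE}'(0) = -2\sigma^2 \sum_i \lambda_i^{-2} < 0, and \mathrm{MSE}'(k) stays negative for all k < \sigma^2/\max_i \alpha_i^2.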
Similarly, we can derive a ridge analogue of the VIF: compared with the previous section, the denominator is unchanged and only the numerator changes slightly.
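For reference, the sandwich form behind the VIF.ridge computation in the R code below (standard ridge algebra) is

```latex
\mathrm{Cov}\big(\hat{\beta}(k)\big)
  = \sigma^2 (X'X + kI)^{-1} X'X \,(X'X + kI)^{-1},
\qquad
\mathrm{VIF}_j(k) = \Big[(X'X + kI)^{-1} X'X \,(X'X + kI)^{-1}\Big]_{jj},
```

which at k = 0 reduces to the diagonal of (X'X)^{-1}, the usual least-squares expression.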
R
X = as.matrix(crime.s.train[, -1])
D = t(X) %*% X
eigen(D)
# share of the largest eigenvalue -> degree of collinearity
eigen(D)[[1]][1] / sum(eigen(D)[[1]])

library('MASS')
# pick the ridge constant k (HKB / LW / GCV criteria)
select(lm.ridge(Crime ~ 0 + ., data = crime.s.train, lambda = seq(0, 10, .01)))
## Model with k = 2.48:
k.ridge = 2.48

library('ridge')
fit.ridge = linearRidge(Crime ~ 0 + ., data = crime.s.train,
                        lambda = k.ridge, scaling = 'none')
summary(fit.ridge)

# fitted values (predict.linearRidge takes newdata, not newx/s)
fitted.ridge = predict(fit.ridge, newdata = crime.s.train[, 2:13])
# residuals
resid.ridge = crime.s.train[, 1] - fitted.ridge

par(mfrow = c(1, 3))
plot(crime.s.train$Crime, fitted.ridge, xlab = "Crime rate",
     ylab = "fitted Crime rate", main = "ridge method: Y vs Fitted Y")
plot(resid.ridge, xlab = "", main = "ridge method: plot of resid")
qqnorm(resid.ridge); qqline(resid.ridge)
par(mfrow = c(1, 1))

# ridge VIFs: diagonal of (X'X + kI)^{-1} X'X (X'X + kI)^{-1}
D.ridge = t(X) %*% X + k.ridge * diag(12)
VIF.ridge = diag(solve(D.ridge) %*% t(X) %*% X %*% solve(D.ridge))
VIF.ridge
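The crime.s.train data set is not reproduced in the post, so here is a self-contained stand-in (a synthetic, hypothetical design, not the author's crime data) confirming that the VIF.ridge quantity computed above is smaller, coordinate by coordinate, than its least-squares counterpart at k = 0.

```r
# Synthetic collinear design: columns 1 and 2 share the common factor z,
# column 3 is independent of it.
set.seed(3)
n <- 100
z <- rnorm(n)
X <- scale(cbind(z + rnorm(n, sd = 0.1),
                 z + rnorm(n, sd = 0.1),
                 rnorm(n)))
XtX <- t(X) %*% X

# Same sandwich formula as VIF.ridge above; k = 0 gives the OLS version.
vif.at <- function(k) {
  W <- solve(XtX + k * diag(ncol(X)))
  diag(W %*% XtX %*% W)
}
vif.ols   <- vif.at(0)
vif.ridge <- vif.at(2)

vif.ols[1] / vif.ols[3]   # the collinear columns have inflated variance
all(vif.ridge < vif.ols)  # TRUE: ridge shrinks every diagonal entry
```

The elementwise inequality is not a fluke of this data: in spectral form each diagonal entry is \sum_i p_{ji}^2 \lambda_i/(\lambda_i + k)^2, and \lambda/(\lambda + k)^2 < 1/\lambda for every k > 0.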