[Machine Learning][Linear Regression]Feature Scaling
Introduction
When I used gradient descent to fit an h(x) close to x^2 + 2*x + 1, I ran into a problem: the learning rate alpha had to be tiny (around 0.000001), otherwise the parameters would not converge at all, and with such a small alpha training became unbearably slow. So I used feature scaling.
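The effect is easy to reproduce. Below is a minimal NumPy sketch (the data, step sizes, and iteration counts are illustrative, not from the original experiment): with raw features [1, x, x^2] for x in 1..100, even alpha = 1e-4 makes the parameters explode, while after dividing each column by its range, alpha = 0.5 converges quickly.

```python
import numpy as np

# Illustrative data: fit h(x) = x^2 + 2x + 1 with gradient descent
# on the features [1, x, x^2] for x = 1..100.
x = np.arange(1, 101, dtype=float)
X = np.column_stack([np.ones_like(x), x, x**2])
y = (x + 1.0) ** 2

def cost(X, y, theta):
    return np.mean((X @ theta - y) ** 2) / 2

def gradient_descent(X, y, alpha, iters):
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        theta -= (alpha / m) * (X.T @ (X @ theta - y))
    return theta

# Unscaled features: the x^2 column reaches 10000, so even
# alpha = 1e-4 makes the updates overshoot and blow up.
with np.errstate(over='ignore', invalid='ignore'):
    theta_bad = gradient_descent(X, y, 1e-4, 200)

# Range-scaled features: every column lies in [0, 1], and a large
# step such as alpha = 0.5 is stable.
span = X.max(axis=0) - X.min(axis=0)
span[0] = 1.0                      # keep the all-ones intercept column as-is
Xs = X / span
theta_ok = gradient_descent(Xs, y, 0.5, 2000)
```

Running this, `theta_bad` overflows to inf/nan, while the cost on the scaled features drops by orders of magnitude from its starting value.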
Concept
For example, there are two features.
The first feature ranges over [1, 100], and the second ranges over [1, 10000].
If we draw the contour map of the cost function, the contours form long, narrow ellipses (the left image in the original post), and gradient descent zig-zags slowly across them.
If instead we rescale each feature to roughly [-1, 1] (a somewhat smaller or larger range is fine too, e.g. [-3, 3] at most or [-1/3, 1/3] at least), the contours become nearly circular (the right image), and gradient descent takes far fewer steps.
So we find the max and min of each column, take the range (max - min), and divide every entry of that column by its range.
For example, suppose we have a data matrix X like this:
X =
1 90 8100
1 3 9
1 68 4624
1 43 1849
1 4 16
1 88 7744
1 76 5776
1 21 441
1 12 144
1 60 3600
1 5 25
1 35 1225
1 24 576
1 5 25
1 90 8100
1 62 3844
1 6 36
1 82 6724
1 77 5929
1 15 225
1 38 1444
1 48 2304
1 46 2116
1 92 8464
1 21 441
1 45 2025
Applying feature scaling (Octave syntax):
span = max(X,[],1) - min(X,[],1);
X = [ones(m,1), X(:,2) ./ span(2), X(:,3) ./ span(3)]
We get the scaled matrix X':
X =
1.0000e+000 9.0909e-001 8.1008e-001
1.0000e+000 3.0303e-002 9.0009e-004
1.0000e+000 6.8687e-001 4.6245e-001
1.0000e+000 4.3434e-001 1.8492e-001
1.0000e+000 4.0404e-002 1.6002e-003
1.0000e+000 8.8889e-001 7.7448e-001
1.0000e+000 7.6768e-001 5.7766e-001
1.0000e+000 2.1212e-001 4.4104e-002
1.0000e+000 1.2121e-001 1.4401e-002
1.0000e+000 6.0606e-001 3.6004e-001
1.0000e+000 5.0505e-002 2.5003e-003
1.0000e+000 3.5354e-001 1.2251e-001
1.0000e+000 2.4242e-001 5.7606e-002
1.0000e+000 5.0505e-002 2.5003e-003
1.0000e+000 9.0909e-001 8.1008e-001
1.0000e+000 6.2626e-001 3.8444e-001
1.0000e+000 6.0606e-002 3.6004e-003
1.0000e+000 8.2828e-001 6.7247e-001
1.0000e+000 7.7778e-001 5.9296e-001
1.0000e+000 1.5152e-001 2.2502e-002
1.0000e+000 3.8384e-001 1.4441e-001
1.0000e+000 4.8485e-001 2.3042e-001
1.0000e+000 4.6465e-001 2.1162e-001
1.0000e+000 9.2929e-001 8.4648e-001
1.0000e+000 2.1212e-001 4.4104e-002
1.0000e+000 4.5455e-001 2.0252e-001
1.0000e+000 1.1111e-001 1.2101e-002
1.0000e+000 1.9192e-001 3.6104e-002
1.0000e+000 4.5455e-001 2.0252e-001
1.0000e+000 9.5960e-001 9.0259e-001
1.0000e+000 4.4444e-001 1.9362e-001
1.0000e+000 9.3939e-001 8.6499e-001
1.0000e+000 7.9798e-001 6.2416e-001
1.0000e+000 8.7879e-001 7.5698e-001
1.0000e+000 1.0000e+000 9.8020e-001
1.0000e+000 2.1212e-001 4.4104e-002
1.0000e+000 7.1717e-001 5.0415e-001
1.0000e+000 9.7980e-001 9.4099e-001
1.0000e+000 6.4646e-001 4.0964e-001
1.0000e+000 7.4747e-001 5.4765e-001
1.0000e+000 8.7879e-001 7.5698e-001
Training on the scaled data now takes only thousands of gradient-descent steps, compared with millions before.
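The same range scaling can be written in NumPy. This sketch reproduces only the first few rows of the example matrix, so the ranges here (87 and 8091) differ from those of the full table above:

```python
import numpy as np

# First few rows of the example matrix: intercept, x, x^2.
X = np.array([[1.0, 90.0, 8100.0],
              [1.0,  3.0,    9.0],
              [1.0, 68.0, 4624.0],
              [1.0, 43.0, 1849.0]])

span = X.max(axis=0) - X.min(axis=0)   # per-column range (max - min)
Xs = X.copy()
Xs[:, 1:] = X[:, 1:] / span[1:]        # scale every column except the intercept
```

Skipping the intercept column matters: its range is zero, so dividing it by `span` would produce division by zero.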
Extension
Besides feature scaling, there is another way to speed up gradient descent.
Some features may range over [0, 5000], while others range over [-200, 200].
A range centered on zero is better: [-1, 1] is clearly preferable to [0, 2].
So we shift and scale every feature into a range of the form [-r, r] by subtracting the column mean before dividing by the range.
Guided by this idea, we get the following expression (mean normalization):
x_j := (x_j - mu_j) / (max_j - min_j)
This will also make training faster.
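A NumPy sketch of this mean normalization, on the same kind of feature matrix as before (only a few rows are reproduced; the intercept column is again left untouched):

```python
import numpy as np

# Mean normalization: subtract each column's mean, then divide by its
# range, so every feature is centered on zero with total width 1.
X = np.array([[1.0, 90.0, 8100.0],
              [1.0,  3.0,    9.0],
              [1.0, 68.0, 4624.0],
              [1.0, 43.0, 1849.0]])

mu = X[:, 1:].mean(axis=0)
span = X[:, 1:].max(axis=0) - X[:, 1:].min(axis=0)
Xn = X.copy()
Xn[:, 1:] = (X[:, 1:] - mu) / span    # each column now spans a width-1 interval around 0
```

After this transform every feature column has zero mean and a range of exactly 1, which is the centered shape the paragraph above argues for.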