Converting polynomial regression to linear regression


I. Introduction

The univariate degree-m polynomial regression equation:

    y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m + \varepsilon
The bivariate quadratic polynomial regression equation:

    y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2 + \varepsilon
Polynomial regression equations in several variables and higher degrees are considerably more complicated, and in practice a univariate degree-m polynomial regression is usually sufficient, so those cases are omitted here.


For the univariate degree-m polynomial regression equation, let:

    x_1 = x, \quad x_2 = x^2, \quad \ldots, \quad x_m = x^m
Then the univariate degree-m polynomial turns into an m-variable linear regression equation:

    y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + \varepsilon
The polynomial regression problem can therefore be solved with the methods of multiple linear regression. Note that in polynomial regression analysis, testing whether a regression coefficient is significant is, in essence, testing whether the corresponding power x^i of the independent variable has a significant effect on the dependent variable y.
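As a minimal sketch of this substitution (an addition for illustration, not code from the original), the columns x, x^2, ..., x^m can be stacked with numpy and handed to an ordinary linear least-squares solver:

import numpy as np

# samples from a known cubic, so the recovered coefficients are checkable
x = np.linspace(-1, 1, 30)
y = 0.1 - 0.02*x + 0.03*x**2 - 0.04*x**3

m = 3
# the substitution x_i = x^i: columns [1, x, x^2, x^3] form the design matrix
X = np.column_stack([x**i for i in range(0, m + 1)])

# ordinary multiple linear regression on the transformed features
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # approximately [0.1, -0.02, 0.03, -0.04]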


For the bivariate quadratic polynomial regression equation, let:

    z_1 = x_1, \quad z_2 = x_2, \quad z_3 = x_1^2, \quad z_4 = x_2^2, \quad z_5 = x_1 x_2
Then the bivariate quadratic polynomial function turns into a five-variable linear regression equation:

    y = a_0 + a_1 z_1 + a_2 z_2 + a_3 z_3 + a_4 z_4 + a_5 z_5 + \varepsilon
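A matching sketch for the bivariate case (again an illustrative addition, not part of the original): the five substituted features become the columns of the design matrix, and plain linear least squares recovers the quadratic's coefficients:

import numpy as np

rng = np.random.RandomState(0)
x1 = rng.rand(50)
x2 = rng.rand(50)
# noiseless samples from a known bivariate quadratic
y = 1.0 + 2.0*x1 - 1.0*x2 + 0.5*x1**2 + 0.3*x2**2 - 0.7*x1*x2

# the substitution z1 = x1, z2 = x2, z3 = x1^2, z4 = x2^2, z5 = x1*x2
Z = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1*x2])

coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(coef)   # approximately [1.0, 2.0, -1.0, 0.5, 0.3, -0.7]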

II. Least-squares solution of univariate degree-m polynomial regression

Suppose there are n samples (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n). Each sample satisfies the degree-m polynomial model:

    y_i = a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_m x_i^m + \varepsilon_i, \quad i = 1, 2, \ldots, n
Writing these n equations in matrix form:

    \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
    =
    \begin{pmatrix}
        1 & x_1 & x_1^2 & \cdots & x_1^m \\
        1 & x_2 & x_2^2 & \cdots & x_2^m \\
        \vdots & \vdots & \vdots & & \vdots \\
        1 & x_n & x_n^2 & \cdots & x_n^m
    \end{pmatrix}
    \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{pmatrix}
    +
    \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
In matrix notation:

    Y = XA + E

where Y is the n×1 vector of observations, X the n×(m+1) design matrix, A the (m+1)×1 coefficient vector, and E the n×1 error vector.
The derivation is omitted here (it follows the derivation of the least-squares solution for linear regression: take the partial derivative of the squared-error cost with respect to each parameter, set each partial derivative to 0, and solve the resulting system of simultaneous equations). The least-squares solution is:

    A = (X^T X)^{-1} X^T Y
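This closed-form solution can be checked directly with numpy. The sketch below is an addition for illustration (it assumes X^T X is invertible) and solves the normal equations (X^T X)A = X^T Y instead of forming the inverse explicitly:

import numpy as np

# samples from a known cubic, so the recovered coefficients are checkable
x = np.linspace(-1, 1, 30)
Y = 0.1 - 0.02*x + 0.03*x**2 - 0.04*x**3

m = 3
X = np.vander(x, m + 1, increasing=True)   # rows [1, x_i, x_i^2, x_i^3]

# A = (X^T X)^(-1) X^T Y, computed by solving the normal equations
A = np.linalg.solve(X.T @ X, X.T @ Y)
print(A)   # approximately [0.1, -0.02, 0.03, -0.04]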
III. A simple example in Python using the sklearn library

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt

# fixed seed for reproducibility
rng = np.random.RandomState(123)

# construct samples: given an x, generate y from a cubic polynomial plus noise
def genY(x):
    a0, a1, a2, a3, e = 0.1, -0.02, 0.03, -0.04, 0.05
    yr = a0 + a1*x + a2*(x**2) + a3*(x**3) + e
    y = yr + 0.03*rng.rand(1)
    return y

# plot setup
plt.figure()
plt.title('polynomial regression(sklearn)')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)

x_tup = np.linspace(-1, 1, 30)
y = [genY(a) for a in x_tup]
print(y)

# reshape to the (n_samples, n_features) layout sklearn expects
x = x_tup.reshape(-1, 1)
y = np.array(y).reshape(-1, 1)
plt.plot(x, y, 'k.')

# expand x into the polynomial features [1, x, x^2, x^3], then fit a linear model
qf = PolynomialFeatures(degree=3)
qModel = LinearRegression()
qModel.fit(qf.fit_transform(x), y)
print('----')
print(qf.get_params())

# predict on a denser grid (extrapolating out to x = 2) and draw the fitted curve
xp = np.linspace(-1, 2, 100)
yp = qModel.predict(qf.transform(xp.reshape(-1, 1)))
plt.plot(xp, yp, 'r-')
plt.show()

Note: the fit_transform method of the PolynomialFeatures class expands the independent variable(s), according to how many there are and the exponent parameter degree, into the independent variables of the equivalent linear regression, which LinearRegression then fits. Running the script plots the noisy samples as black dots and the fitted cubic curve as a red line (output figure omitted).
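To inspect the recovered coefficients themselves (a small addition, not in the original listing), the fitted model's intercept_ and coef_ attributes can be printed after qModel.fit(...):

# appended to the sklearn listing above
print(qModel.intercept_)   # estimate of a0 (shifted by the mean of the added noise)
print(qModel.coef_)        # first entry ~0 (the bias column from PolynomialFeatures);
                           # the remaining entries estimate a1, a2, a3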



IV. Besides solving for the parameter values directly with least squares, the parameters can also be trained by minimizing a loss function with gradient descent

#-*- coding:utf-8 -*-
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

rng = np.random.RandomState(123)

# construct samples: given an x, generate y from a cubic polynomial plus noise
def genY(x):
    a0, a1, a2, a3, e = 0.1, -0.02, 0.03, -0.04, 0.05
    yr = a0 + a1*x + a2*(x**2) + a3*(x**3) + e
    y = yr + 0.03*rng.rand(1)
    return y

plt.figure()
plt.title('polynomial regression(tensorflow)')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)

x = np.linspace(-1, 1, 30)
y = [genY(a) for a in x]
x = x.reshape(-1, 1)
y = np.array(y).reshape(-1, 1)
plt.plot(x, y, 'k.')

# placeholders feed one sample at a time; W holds the four polynomial coefficients
X = tf.placeholder('float')
Y = tf.placeholder('float')
W = tf.Variable([0.] * 4)
print(W)

# model: w[0] + w[1]*x + w[2]*x^2 + w[3]*x^3
def Model(x, w):
    terms = []
    for i in range(0, 4):
        term = tf.multiply(w[i], tf.pow(x, i))
        terms.append(term)
    rs = tf.add_n(terms)
    return rs

YModel = Model(X, W)
Cost = tf.reduce_sum(tf.square(Y - YModel))
LearnRate = 0.01
train_op = tf.train.GradientDescentOptimizer(LearnRate).minimize(Cost)

with tf.Session() as sess:
    Init = tf.global_variables_initializer()
    sess.run(Init)
    # 100 epochs of stochastic gradient descent, one sample per step
    for i in range(0, 100):
        for (_x, _y) in zip(x, y):
            sess.run(train_op, feed_dict={X: _x, Y: _y})
    print(sess.run(W))
    # evaluate the learned polynomial on a denser grid and draw it
    xp = np.linspace(-1, 2, 100)
    yp = 0
    for i in range(0, 4):
        yp += sess.run(W)[i] * np.power(xp, i)
    plt.plot(xp, yp, 'g-')
    plt.show()
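Note that tf.placeholder, tf.Session, and tf.train.GradientDescentOptimizer are TensorFlow 1.x APIs. If the environment runs TensorFlow 2.x (an assumption about the reader's setup, not something the original addresses), one way to keep the listing working is the v1 compatibility module:

# run the TF1-style graph code above under TensorFlow 2.x
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()   # placeholders and Session require graph mode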