A small worked example for learning the XGBoost algorithm

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 08:33
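# xgboost
import numpy as np
import xgboost as xgb

# data_train, data_watch, data_predict and the data_2hour_* objects are loaded elsewhere.
# .ix has been removed from pandas, so positional slicing uses .iloc instead.

# The prediction set can be taken straight from the pandas DataFrame
data_predict2 = data_predict.iloc[:, 2:]
# Training-set features and their labels (the label is the last column)
dtrain = xgb.DMatrix(data_train.iloc[:, :-1], label=data_train.iloc[:, -1], missing=np.nan)
# Watch set evaluated during training; same layout as the training set
dwatch = xgb.DMatrix(data_watch.iloc[:, :-1], label=data_watch.iloc[:, -1], missing=np.nan)
# Convert the prediction DataFrame to a DMatrix
dtest = xgb.DMatrix(data_predict2, missing=np.nan)
dtrain2 = xgb.DMatrix(data_2hour_stack, label=data_2hour_label_stack, missing=np.nan)
dtest2 = xgb.DMatrix(data_2hour_test_stack, missing=np.nan)
# Parameter settings
param = {'verbosity': 0,  # replaces the deprecated 'silent': 1
         'eta': 0.1,
         'nthread': 8,
         'objective': 'reg:squarederror',  # current name of the deprecated 'reg:linear'
         'eval_metric': 'rmse'
         }
circle = 1000  # number of boosting rounds
watchlist = [(dtrain, 'train'), (dwatch, 'watch')]
xgb_model = xgb.train(param, dtrain, num_boost_round=circle, evals=watchlist)
predict_xgb = xgb_model.predict(dtest)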
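
With 'eval_metric': 'rmse', training reports a root mean squared error for every entry in the watchlist. As a rough illustration of what that number measures, here is a minimal NumPy sketch; the helper name rmse and the toy values are made up for this example and are not part of XGBoost.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Square the per-sample errors, average them, then take the square root
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical labels and predictions, only for illustration
labels = [3.0, 5.0, 7.0]
preds = [2.0, 5.0, 9.0]
score = rmse(labels, preds)
print(score)
```

If the 'watch' value stops falling while the 'train' value keeps dropping, that is usually a sign of overfitting.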

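Fixing the case where the training set and the test set end up with different feature dimensions (a different largest feature index):
see "feature_names mismatch: an analysis of the XGBoost error"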
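
One common way to avoid this error is to force the test features into exactly the same columns, in the same order, as the training features before building the DMatrix. The following is a minimal pandas sketch of that idea, not necessarily the fix described in the article above; align_features and the toy frames are hypothetical names used only here.

```python
import numpy as np
import pandas as pd


def align_features(train_features: pd.DataFrame, test_features: pd.DataFrame) -> pd.DataFrame:
    """Return test_features with exactly the training columns, in training order.

    Columns missing from the test set are filled with NaN, and columns that
    exist only in the test set are dropped.
    """
    return test_features.reindex(columns=train_features.columns, fill_value=np.nan)


# Hypothetical toy data: the test frame lacks f1, has an extra column,
# and lists its columns in a different order
train_features = pd.DataFrame({"f0": [1.0, 2.0], "f1": [0.5, 0.1], "f2": [3.0, 4.0]})
test_features = pd.DataFrame({"f2": [5.0], "f0": [9.0], "extra": [1.0]})

aligned = align_features(train_features, test_features)
print(list(aligned.columns))
```

The NaN filler matches the missing=np.nan argument passed to xgb.DMatrix, so the added columns are treated as missing values rather than as zeros.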

Converting between Python and pandas data types:
http://blog.csdn.net/flyfrommath/article/details/69388675
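
A few conversions of this kind come up constantly when preparing data for XGBoost. The snippet below is a small sketch using standard pandas calls; the column names and values are invented for illustration and are not taken from the linked article.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "count" arrives as strings, "score" as floats
df = pd.DataFrame({"count": ["1", "2", "3"], "score": [0.5, 1.5, 2.5]})

# String column -> 64-bit integer column
df["count"] = df["count"].astype("int64")

# pandas column -> plain Python list of built-in ints
count_list = df["count"].tolist()

# pandas column -> NumPy array
score_array = df["score"].to_numpy()

# Single NumPy scalar -> built-in Python float
first_score = df["score"].iloc[0].item()

print(df.dtypes)
```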

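How to rename columns:
from pandas import DataFrame

# ypred1: predictions produced earlier (assumed to be a 1-D array)
ypred1 = DataFrame(ypred1)
ypred1_column = list(ypred1.columns)
# Rename the first column to 'label'
ypred1.rename(columns={ypred1_column[0]: 'label'}, inplace=True)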

Reference URLs:
http://blog.csdn.net/leichaoaizhaojie/article/details/52629549
http://blog.csdn.net/u011089523/article/details/72812019
http://blog.csdn.net/lujiandong1/article/details/52743396
