machine learning in coding(python): building a predictive model with xgboost
Source: Internet. Published: red hat linux 6.0. Editor: 程序博客网. Time: 2024/05/21 19:23
Continued from the previous post: http://blog.csdn.net/mmc2015/article/details/47304591
import numpy as np
import xgboost as xgb


def xgboost_pred(train, labels, test):
    # Note: this targets an older xgboost API. Newer releases replace
    # ntree_limit with iteration_range and silent with verbosity.
    params = {}
    params["objective"] = "reg:linear"
    params["eta"] = 0.005
    params["min_child_weight"] = 6
    params["subsample"] = 0.7
    params["colsample_bytree"] = 0.7
    params["scale_pos_weight"] = 1
    params["silent"] = 1
    params["max_depth"] = 9

    plst = list(params.items())

    # Hold out the first `offset` rows (4000 here) for early stopping.
    offset = 4000

    num_rounds = 10000
    xgtest = xgb.DMatrix(test)

    # Create the train and validation DMatrix objects.
    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    # Train using early stopping and predict.
    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist, early_stopping_rounds=120)
    preds1 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # Reverse train and labels so that a different block of `offset` rows
    # is used for early stopping. This adds very little to the score, but it
    # is an option if you are concerned about using all the data.
    # The second model is fit on log-transformed labels, so preds2 is on a log scale.
    train = train[::-1, :]
    labels = np.log(labels[::-1])

    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist, early_stopping_rounds=120)
    preds2 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # Combine predictions.
    # Since the metric only cares about relative rank, we don't need to average.
    preds = (preds1) * 1.4 + (preds2) * 8.6
    return preds
(code from kaggle)
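The way the script picks its validation rows can be illustrated on toy data. The sketch below is only an illustration: `holdout_split` is a hypothetical helper name, and a plain list of row ids stands in for the training matrix. It shows that reversing the rows before the second pass moves the validation window to the other end of the data.

```python
def holdout_split(rows, offset):
    """Return (fit_rows, val_rows); the first `offset` rows are held out."""
    return rows[offset:], rows[:offset]


rows = list(range(10))  # toy row ids standing in for the training matrix
offset = 3              # the script above uses 4000

# First pass: validate on the first `offset` rows.
fit1, val1 = holdout_split(rows, offset)

# Second pass: reverse the rows first, so the last `offset` rows are held out.
fit2, val2 = holdout_split(rows[::-1], offset)

print(val1)  # [0, 1, 2]
print(val2)  # [9, 8, 7]
```

The two validation sets are disjoint only when the data has at least `2 * offset` rows. With fewer rows the two windows overlap.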
I will write a detailed analysis of the code when I have time. Comments and criticism are welcome.
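One point in the code that can be checked in isolation is the comment that a rank-based metric makes averaging unnecessary. The sketch below uses random toy values rather than real model output, and `rank_order` is a hypothetical helper. It compares the ordering produced by the script's weighted sum (weights 1.4 and 8.6) with the ordering produced by the same sum divided by the total weight of 10.

```python
import random


def rank_order(values):
    """Indices of `values` sorted from smallest to largest."""
    return sorted(range(len(values)), key=lambda i: values[i])


random.seed(0)
preds1 = [random.random() for _ in range(200)]  # toy stand-in for model 1
preds2 = [random.random() for _ in range(200)]  # toy stand-in for model 2

# Weighted sum as in the script, and the matching weighted average.
weighted_sum = [1.4 * a + 8.6 * b for a, b in zip(preds1, preds2)]
weighted_avg = [s / 10.0 for s in weighted_sum]

# Dividing by a positive constant does not change the ordering.
same_order = rank_order(weighted_sum) == rank_order(weighted_avg)
print(same_order)
```

Only the ratio of the two weights affects the ranking, so 1.4 and 8.6 behave like 0.14 and 0.86 under a metric that depends on order alone.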