Python Machine Learning Algorithm Cheat Sheet
Source: Internet  Editor: 程序博客网  Date: 2024/04/23 15:46
Common Machine Learning Algorithms
The algorithms below are the most commonly used in machine learning; most data problems can be tackled with one of them:
Linear Regression
Logistic Regression
Decision Tree
Support Vector Machine (SVM)
Naive Bayes
K-Nearest Neighbors (KNN)
K-means
Random Forest
Dimensionality Reduction Algorithms
Gradient Boosting and AdaBoost
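Figure 1: A classification of the main methods in sklearn
Figure 2: Methods for dimensionality reduction and parameter search
Figure 3: Common data preprocessing methods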
1. Linear Regression
# Import library
# Import other necessary libraries such as pandas and numpy as needed
from sklearn import linear_model

# Load the train and test datasets
# Identify the feature and response variable(s); values must be numeric NumPy arrays
# The names on the right-hand side are placeholders for your own data
x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets
x_test = input_variables_values_test_datasets

# Create the linear regression object
linear = linear_model.LinearRegression()

# Train the model on the training set and check the score (R^2)
linear.fit(x_train, y_train)
linear.score(x_train, y_train)

# Equation coefficients and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Predict output
predicted = linear.predict(x_test)
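The snippet above relies on placeholder names for the data. As a minimal runnable sketch, the version below generates synthetic data instead; the data-generating formula (y = 3*x0 - 2*x1 + 5 plus noise) and the names X_demo and y_demo are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption for illustration): y = 3*x0 - 2*x1 + 5 + small noise
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Keep the last 50 rows as a test set
x_train, x_test = X_demo[:150], X_demo[150:]
y_train, y_test = y_demo[:150], y_demo[150:]

# Fit on the training rows only
linear = LinearRegression()
linear.fit(x_train, y_train)

# The fitted parameters should land close to 3, -2 and 5
print("Coefficients:", linear.coef_)
print("Intercept:", linear.intercept_)

# score() returns R^2; here it is computed on the held-out rows
print("R^2 on test data:", linear.score(x_test, y_test))
predicted = linear.predict(x_test)
```

Because the true coefficients are known here, comparing them with linear.coef_ and linear.intercept_ is a quick sanity check on the fit.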
2. Logistic Regression
# Import library
from sklearn.linear_model import LogisticRegression

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the logistic regression object
model = LogisticRegression()

# Train the model on the training set and check the score (mean accuracy)
model.fit(X, y)
model.score(X, y)

# Equation coefficients and intercept
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)

# Predict output
predicted = model.predict(x_test)
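Logistic regression yields class probabilities as well as hard labels. The sketch below fits the model on two synthetic, well-separated blobs and reads the probabilities with predict_proba; the blob centers and the names X_demo and y_demo are assumptions for illustration.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Two well-separated synthetic classes (assumption for illustration)
X_demo, y_demo = make_blobs(
    n_samples=200, centers=[[-3, -3], [3, 3]], cluster_std=1.0, random_state=0
)
X, x_test, y, y_test = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns one column per class; each row sums to 1
proba = model.predict_proba(x_test)
predicted = model.predict(x_test)

print("First three probability rows:\n", proba[:3])
print("Accuracy on test data:", model.score(x_test, y_test))
```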
3. Decision Tree
# Import library
# Import other necessary libraries such as pandas and numpy as needed
from sklearn import tree

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the tree object for classification
# criterion can be 'gini' or 'entropy' (information gain); the default is 'gini'
model = tree.DecisionTreeClassifier(criterion='gini')
# model = tree.DecisionTreeRegressor()  # for regression

# Train the model on the training set and check the score
model.fit(X, y)
model.score(X, y)

# Predict output
predicted = model.predict(x_test)
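To see the effect of the criterion parameter, the following sketch trains one tree per criterion on the iris dataset bundled with scikit-learn and compares their depth and test accuracy. The dataset choice and the split ratio are assumptions for illustration.

```python
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, x_test, y, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# Train one tree per split criterion and record its test accuracy
scores = {}
for criterion in ("gini", "entropy"):
    model = tree.DecisionTreeClassifier(criterion=criterion, random_state=0)
    model.fit(X, y)
    scores[criterion] = model.score(x_test, y_test)
    print(criterion, "depth:", model.get_depth(), "test accuracy:", scores[criterion])
```

On a small, clean dataset like this the two criteria usually behave similarly; differences tend to show up on larger or noisier data.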
4. Support Vector Machine (SVM)
# Import library
from sklearn import svm

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the SVM classification object
# SVC has many options; this is the simplest form for classification.
# See the scikit-learn documentation for more detail.
model = svm.SVC()

# Train the model on the training set and check the score
model.fit(X, y)
model.score(X, y)

# Predict output
predicted = model.predict(x_test)
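Among the options of SVC, the kernel is often the first one worth trying. The sketch below compares a linear and an RBF kernel on the synthetic two-moons data, which is not linearly separable; the dataset and the value C=1.0 are assumptions for illustration.

```python
from sklearn import svm
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Two interleaving half circles: not separable by a straight line
X_demo, y_demo = make_moons(n_samples=300, noise=0.1, random_state=0)
X, x_test, y, y_test = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

# Fit the same classifier with two different kernels
scores = {}
for kernel in ("linear", "rbf"):
    model = svm.SVC(kernel=kernel, C=1.0)
    model.fit(X, y)
    scores[kernel] = model.score(x_test, y_test)
    print(kernel, "test accuracy:", scores[kernel])
```

On curved class boundaries like these, a nonlinear kernel such as RBF would typically be expected to fit better than a linear one.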
5. Naive Bayes
# Import library
from sklearn.naive_bayes import GaussianNB

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the Gaussian Naive Bayes object
# Other variants exist for other feature distributions, such as MultinomialNB and BernoulliNB
model = GaussianNB()

# Train the model on the training set
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
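GaussianNB suits continuous features. For count features, such as word counts, MultinomialNB is the usual variant. The sketch below uses a tiny hand-made count matrix; the data and the interpretation of the columns as word counts are assumptions for illustration.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count matrix: rows are documents, columns are three words
X = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 4, 1],
    [0, 3, 2],
])
y = np.array([0, 0, 1, 1])  # class 0 favors word 0, class 1 favors word 1

# alpha=1.0 applies Laplace smoothing so unseen words do not get zero probability
model = MultinomialNB(alpha=1.0)
model.fit(X, y)

# New documents: one containing only word 0, one containing only word 1
x_test = np.array([[1, 0, 0], [0, 2, 0]])
predicted = model.predict(x_test)
print(predicted)  # [0 1]
```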
6. K-Nearest Neighbors (KNN)
# Import library
from sklearn.neighbors import KNeighborsClassifier

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the KNeighbors classifier object
# n_neighbors is set to 6 here; the default is 5
model = KNeighborsClassifier(n_neighbors=6)

# Train the model on the training set
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
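The choice of n_neighbors is usually made by comparing a few candidates. A minimal sketch, assuming the iris dataset and 5-fold cross-validation (both chosen for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Mean 5-fold cross-validated accuracy for several values of k
cv_scores = {}
for k in (1, 3, 6, 15):
    model = KNeighborsClassifier(n_neighbors=k)
    cv_scores[k] = cross_val_score(model, X, y, cv=5).mean()
    print("k =", k, "mean CV accuracy:", round(cv_scores[k], 3))

# Pick the k with the highest mean accuracy
best_k = max(cv_scores, key=cv_scores.get)
print("Best k among the candidates:", best_k)
```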
7. K-means
# Import library
from sklearn.cluster import KMeans

# Assumes you have X (attributes) for the training set
# and x_test (attributes) for the test set

# Create the KMeans clustering object
model = KMeans(n_clusters=3, random_state=0)

# Fit the model on the training set (unsupervised, so no target is passed)
model.fit(X)

# Predict output: the index of the nearest cluster for each test sample
predicted = model.predict(x_test)
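After fitting, a KMeans model exposes the learned cluster centers and the label assigned to each training sample. The sketch below shows this on three synthetic blobs; the blob centers and the explicit n_init=10 are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated synthetic groups (assumption for illustration)
X, _ = make_blobs(
    n_samples=300, centers=[[0, 0], [8, 8], [-8, 8]], cluster_std=1.0, random_state=0
)

# n_init=10 runs the algorithm from 10 random starts and keeps the best result
model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X)

print("Cluster centers:\n", model.cluster_centers_)
print("Inertia (within-cluster sum of squares):", model.inertia_)

# New points are assigned to the nearest learned center
x_test = [[0.0, 0.0], [0.5, 0.5], [8.0, 8.0]]
predicted = model.predict(x_test)
print("Assigned clusters:", predicted)
```

Cluster indices are arbitrary, so only whether two points share an index is meaningful, not the index value itself.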
8. Random Forest
# Import library
from sklearn.ensemble import RandomForestClassifier

# Assumes you have x (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the random forest object
model = RandomForestClassifier()

# Train the model on the training set
model.fit(x, y)

# Predict output
predicted = model.predict(x_test)
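A fitted random forest also reports how much each feature contributed to its splits through feature_importances_. A minimal sketch, assuming the iris dataset and n_estimators=100 for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
x, x_test, y, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(x, y)
predicted = model.predict(x_test)

# Impurity-based importances: one value per feature, summing to 1
importances = model.feature_importances_
for name, value in zip(iris.feature_names, importances):
    print(name, round(value, 3))

print("Accuracy on test data:", model.score(x_test, y_test))
```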
9. Dimensionality Reduction Algorithms
# Import library
from sklearn import decomposition

# Assumes you have training and test datasets named train and test

# Create the PCA object
# k is the number of components to keep (choose it for your data);
# if n_components is not set, min(n_samples, n_features) components are kept
pca = decomposition.PCA(n_components=k)

# For factor analysis:
# fa = decomposition.FactorAnalysis()

# Reduce the dimension of the training dataset using PCA
train_reduced = pca.fit_transform(train)

# Reduce the dimension of the test dataset with the same fitted projection
test_reduced = pca.transform(test)
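One common way to judge a choice of n_components is to look at explained_variance_ratio_ after fitting. The sketch below keeps two components of the four-feature iris data; the dataset and the value 2 are assumptions for illustration.

```python
from sklearn import decomposition
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
train, test = train_test_split(iris.data, test_size=0.3, random_state=0)

# Keep 2 of the 4 original dimensions
pca = decomposition.PCA(n_components=2)

# Learn the projection from the training rows, then reuse it for the test rows
train_reduced = pca.fit_transform(train)
test_reduced = pca.transform(test)

print("Reduced shapes:", train_reduced.shape, test_reduced.shape)
print("Explained variance ratio per component:", pca.explained_variance_ratio_)
print("Total variance retained:", pca.explained_variance_ratio_.sum())
```

Fitting on the training data only and calling transform on the test data keeps information from the test set out of the learned projection.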
10. Gradient Boosting and AdaBoost
# Import library
from sklearn.ensemble import GradientBoostingClassifier

# Assumes you have X (predictors) and y (target) for the training set
# and x_test (predictors) for the test set

# Create the gradient boosting classifier object
model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)

# Train the model on the training set
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
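The snippet above covers gradient boosting only. AdaBoost is available in the same module as AdaBoostClassifier and follows the same fit/predict pattern. The sketch below trains both on the breast cancer dataset bundled with scikit-learn; the dataset and the AdaBoost setting n_estimators=50 are assumptions for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X, x_test, y, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=0
)

# Both ensembles are built from shallow trees added one after another
models = {
    "GradientBoosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0
    ),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(x_test, y_test)
    print(name, "test accuracy:", round(scores[name], 3))
```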
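Note: where these examples check how well a model fits, they pass the training data as the argument. In practice you should evaluate on a test set, so do not be misled!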
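A minimal sketch of why this matters: a fully grown decision tree can memorize noisy training data and score perfectly on it, while its accuracy on held-out data is noticeably lower. The synthetic dataset and its 20% label noise (flip_y=0.2) are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data with deliberately noisy labels
X_all, y_all = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0,
    flip_y=0.2, random_state=0,
)
X, x_test, y, y_test = train_test_split(X_all, y_all, test_size=0.25, random_state=0)

# An unrestricted tree keeps splitting until every training sample is classified correctly
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

train_acc = model.score(X, y)            # optimistic: the model has seen this data
test_acc = model.score(x_test, y_test)   # realistic: data the model has not seen

print("Training accuracy:", train_acc)
print("Test accuracy:", test_acc)
```

The gap between the two numbers is the part of the training score that does not carry over to new data.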
Reference: http://blog.csdn.net/han_xiaoyang/article/details/51191386