Using Hyperopt

Source: Internet | Published by: 美国的消防员知乎 | Editor: 程序博客网 | Time: 2024/06/05 11:12

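A machine learning engineer has to choose a model and then tune that model's parameters to get the best fit. Even a suitable model will produce poor results if its parameters are set badly.
The common tuning methods are grid search and random search. Grid search scans the whole space, so it is slow. Random search is faster but can miss important points in the space, so its precision suffers. Bayesian optimization was introduced to address both problems.
hyperopt is a tool that tunes parameters with Bayesian optimization. For algorithms with many parameters, such as XGBoost, it can be used to find good parameter values.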

Installation

pip install hyperopt

Installing hyperopt also pulls in networkx. If running it fails with TypeError: 'generator' object is not subscriptable, you can uninstall networkx and install an older version instead:

pip uninstall networkx
pip install networkx==1.11

Key functions, variables, and concepts

fmin

from hyperopt import fmin, tpe, hp

best = fmin(
    fn=lambda x: x,
    space=hp.uniform('x', 0, 1),
    algo=tpe.suggest,
    max_evals=100)
print(best)

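In the example above, fmin searches space for the value that makes the return value of fn as small as possible. It uses the tpe.suggest algorithm (Tree of Parzen Estimators) and runs 100 evaluations. The final result looks like {'x': 0.000269455723739237}.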
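To make the contract of fmin concrete, here is a minimal stand-in that uses plain random search instead of TPE. It is only a sketch: random_search_fmin and its sample argument are hypothetical names, not part of hyperopt, and the real tpe.suggest chooses new points based on earlier results rather than drawing them independently.

```python
import random

def random_search_fmin(fn, sample, max_evals, seed=0):
    """Return the sampled point with the smallest fn value (random search, not TPE)."""
    rng = random.Random(seed)
    best_point, best_loss = None, float("inf")
    for _ in range(max_evals):
        point = sample(rng)        # draw one candidate from the search space
        loss = fn(point["x"])      # evaluate the objective on that candidate
        if loss < best_loss:       # keep the candidate with the lowest loss so far
            best_point, best_loss = point, loss
    return best_point

best = random_search_fmin(
    fn=lambda x: x,
    sample=lambda rng: {"x": rng.uniform(0, 1)},
    max_evals=100,
)
print(best)
```

The return value has the same shape as the result of fmin: a dict that maps each label to the best value found.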

space

The range of a variable and the probability distribution of its values are usually defined with one of the following:

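hp.choice(label, options), where options should be a Python list or tuple. To enumerate the values in [1, 100], use choice rather than quniform. Note that fmin reports the index of the chosen option in best, not the option itself.
hp.randint(label, upper) returns a random integer in [0, upper). It is typically used for a random seed. If the value affects the loss function, consider quniform instead.
hp.uniform(label, low, high), where low and high are the lower and upper bounds of the range. A uniform distribution between the two bounds.
hp.quniform(label, low, high, q), where low and high are the lower and upper bounds of the range. It returns round(uniform(low, high) / q) * q, so only multiples of q are produced. The result can be a float such as 1.0; if the parameter must be an integer, convert it explicitly with int().
hp.loguniform(label, low, high) returns exp(uniform(low, high)), a value in [e^low, e^high]. The logarithm of the value is uniformly distributed, so samples cluster toward the lower end of the range.
hp.qloguniform(label, low, high, q) returns round(exp(uniform(low, high)) / q) * q.
hp.normal(label, mu, sigma), where mu and sigma are the mean and standard deviation. A normal distribution; the returned value cannot be bounded.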
hp.qnormal(label, mu, sigma, q)
hp.lognormal(label, mu, sigma)
hp.qlognormal(label, mu, sigma, q)
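The quantized and logarithmic variants are easier to follow as formulas in code. The sketch below reimplements three of them with the standard library only, to show the arithmetic. The function names mirror the hp functions but these are stand-ins, not hyperopt calls, and the learning-rate bounds are made-up values.

```python
import math
import random

rng = random.Random(0)

def quniform(low, high, q):
    # round(uniform(low, high) / q) * q, returned as a float like hyperopt does
    return float(round(rng.uniform(low, high) / q) * q)

def loguniform(low, high):
    # exp(uniform(low, high)): the bounds are given in log space
    return math.exp(rng.uniform(low, high))

def qloguniform(low, high, q):
    # round(exp(uniform(low, high)) / q) * q
    return float(round(math.exp(rng.uniform(low, high)) / q) * q)

depth = quniform(1, 10, 1)
print(depth, int(depth))   # a float on an integer grid, then the explicit int() cast

lr = loguniform(math.log(1e-4), math.log(1e-1))
print(lr)                  # somewhere between 1e-4 and 1e-1
```

Because loguniform takes its bounds in log space, a range of 1e-4 to 1e-1 is written as log(1e-4) and log(1e-1).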

Example:

space = {
    'x': hp.uniform('x', 0, 1),
    'y': hp.normal('y', 0, 1),
    'name': hp.choice('name', ['alice', 'bob']),
}

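The set of distributions is fairly rich. One limitation to be aware of:

hp.choice itself does not accept weights. Weighted selection needs hp.pchoice(label, p_options) instead, where p_options is a list of (probability, option) pairs.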

Trials

Trials simply records which parameters were used in each evaluation and what was returned. To use it, fn returns a dict that contains a status in addition to the loss. The Trials object stores its data as BSON-compatible documents, which lets MongoDB be used for distributed runs.

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

fspace = {
    'x': hp.uniform('x', -5, 5)
}

def f(params):
    x = params['x']
    val = x**2
    return {'loss': val, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)

print('best:', best)
print('trials:')
for trial in trials.trials[:2]:
    print(trial)

Results returned with STATUS_OK have their loss counted, while results returned with STATUS_FAIL are ignored. The output looks like this:

best: {'x': 0.014420181637303776}
trials:
{'refresh_time': None, 'book_time': None, 'misc': {'tid': 0, 'idxs': {'x': [0]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [1.9646918559786162]}, 'workdir': None}, 'state': 2, 'tid': 0, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 3.8600140889486996}, 'owner': None, 'spec': None}
{'refresh_time': None, 'book_time': None, 'misc': {'tid': 1, 'idxs': {'x': [1]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [-3.9393509404526728]}, 'workdir': None}, 'state': 2, 'tid': 1, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 15.518485832045357}, 'owner': None, 'spec': None}
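One use of the status field is to report a parameter combination that cannot be evaluated, rather than letting the exception stop the search. The sketch below assumes the two constants have the string values 'ok' and 'fail', and defines them locally so that it runs without hyperopt installed. The objective itself is a made-up example.

```python
import math

# Assumption: these mirror the string values of hyperopt.STATUS_OK / STATUS_FAIL.
STATUS_OK = "ok"
STATUS_FAIL = "fail"

def objective(params):
    x = params["x"]
    try:
        val = math.log(x) ** 2    # math.log raises ValueError for x <= 0
    except ValueError:
        return {"status": STATUS_FAIL}   # no loss: this trial is to be ignored
    return {"loss": val, "status": STATUS_OK}

print(objective({"x": 1.0}))
print(objective({"x": -2.0}))
```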

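The recorded values can be used to plot a variable against the loss and see how well it fits, or to plot a variable against tid and see how the search positions converge (not convergence in the mathematical sense).
The Trials object exposes the following: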

trials.trials - a list of dictionaries representing everything about the search
trials.results - a list of dictionaries returned by ‘objective’ during the search
trials.losses() - a list of losses (float for each ‘ok’ trial)
trials.statuses() - a list of status strings
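Before plotting, the nested trial records have to be flattened into plain tuples. The sketch below works on two records trimmed from the sample output shown earlier, so it runs without hyperopt. The helper name to_points is hypothetical; with a real run, trials.trials would be passed in place of trial_records.

```python
# Two records trimmed to the fields used here, taken from the sample output.
trial_records = [
    {"tid": 0, "misc": {"vals": {"x": [1.9646918559786162]}},
     "result": {"status": "ok", "loss": 3.8600140889486996}},
    {"tid": 1, "misc": {"vals": {"x": [-3.9393509404526728]}},
     "result": {"status": "ok", "loss": 15.518485832045357}},
]

def to_points(records, name):
    """Collect (tid, value, loss) for every successful trial."""
    points = []
    for rec in records:
        if rec["result"]["status"] != "ok":
            continue   # failed trials carry no loss
        value = rec["misc"]["vals"][name][0]   # values are stored in one-element lists
        points.append((rec["tid"], value, rec["result"]["loss"]))
    return points

points = to_points(trial_records, "x")
for tid, x, loss in points:
    print(tid, x, loss)
```

The second and third fields give the variable-versus-loss plot, and the first and second give the tid-versus-variable plot.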

Usage notes

cross_val_score
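- It evaluates an estimator and by default returns an array with one score per fold of the K-fold split; mean() is normally taken over it.
- Check what the estimator's default scoring is, and whether a larger or a smaller value means a better fit. If you specify scoring yourself, be absolutely sure what its value means.
- If scoring is not specified, a classification estimator generally uses accuracy, where larger is better. The loss passed to hyperopt should therefore be the negated score, because hyperopt looks for the best fit by minimizing the loss.
Normalizing or scaling the features can itself be made a choice, to see whether it helps. If it does, best will show normalize as 1.
For example: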
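The sign convention in the notes above can be isolated in a small helper. This is only a sketch: to_loss and the per-fold numbers are made up for illustration. Scikit-learn's built-in error scorers such as neg_mean_squared_error are already negated so that larger is better, which means they fall into the first case as well.

```python
def to_loss(score, greater_is_better=True):
    """Turn a mean cross-validation score into a value a minimizer can use."""
    return -score if greater_is_better else score

fold_scores = [0.96, 0.94, 0.98]    # hypothetical per-fold accuracies
mean_score = sum(fold_scores) / len(fold_scores)

print(to_loss(mean_score))                      # accuracy: negate it
print(to_loss(0.25, greater_is_better=False))   # a raw error value: keep as is
```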

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import normalize, scale
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    params = dict(params)  # work on a copy so the caller's dict is not modified
    X_ = X
    # normalize and scale are not KNeighborsClassifier parameters,
    # so they are always popped before the estimator is built
    if params.pop('normalize', 0) == 1:
        X_ = normalize(X_)
    if params.pop('scale', 0) == 1:
        X_ = scale(X_)
    clf = KNeighborsClassifier(**params)
    return cross_val_score(clf, X_, y).mean()

space4knn = {
    'n_neighbors': hp.choice('n_neighbors', list(range(1, 50))),
    'scale': hp.choice('scale', [0, 1]),  # must be choice, not quniform
    'normalize': hp.choice('normalize', [0, 1])
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space4knn, algo=tpe.suggest, max_evals=100, trials=trials)
print(best)
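For parameters defined with hp.choice, the dict returned by fmin holds the index of the selected option. For n_neighbors, whose options start at 1, the reported number is therefore one less than the actual value. The sketch below decodes such a result by hand. The best dict here is a made-up example, not real output; hyperopt also provides space_eval(space, best) for the same purpose.

```python
n_neighbors_options = list(range(1, 50))   # same options as in space4knn
flag_options = [0, 1]

# Hypothetical result in the shape fmin returns for hp.choice parameters.
best = {"n_neighbors": 6, "normalize": 1, "scale": 0}

decoded = {
    "n_neighbors": n_neighbors_options[best["n_neighbors"]],   # index 6 -> value 7
    "normalize": flag_options[best["normalize"]],
    "scale": flag_options[best["scale"]],
}
print(decoded)
```

For the [0, 1] flags the index and the value coincide, which is why best can be read directly for normalize and scale.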

Several models can be compared in a single run, but this needs a fairly large max_evals and takes longer to run.

space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',  # BernoulliNB
        'alpha': hp.uniform('alpha', 0.0, 2.0)
    },
    {
        'type': 'svm',  # SVC
        'C': hp.uniform('C', 0, 10.0),
        'kernel': hp.choice('kernel', ['linear', 'rbf']),
        'gamma': hp.uniform('gamma', 0, 20.0)
    },
    {
        'type': 'randomforest',  # RandomForestClassifier
        'max_depth': hp.choice('max_depth', range(1,20)),
        'max_features': hp.choice('max_features', range(1,5)),
        'n_estimators': hp.choice('n_estimators', range(1,20)),
        'criterion': hp.choice('criterion', ["gini", "entropy"]),
        'scale': hp.choice('scale', [0, 1]),
        'normalize': hp.choice('normalize', [0, 1])
    },
    {
        'type': 'knn',  # KNeighborsClassifier
        'n_neighbors': hp.choice('knn_n_neighbors', range(1,50))
    }
])
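With a space like this, the objective receives one of the inner dicts and has to separate the model type and the preprocessing flags from the arguments meant for the estimator. One way to do that is sketched below. split_sample and PREPROCESSING_KEYS are hypothetical names, and the sample dict is a made-up draw.

```python
PREPROCESSING_KEYS = ("scale", "normalize")

def split_sample(sample):
    """Split one sampled dict into (model type, preprocessing flags, estimator kwargs)."""
    params = dict(sample)               # copy, so the caller's dict is untouched
    model_type = params.pop("type")
    # Branches that do not define a flag fall back to 0 (no preprocessing).
    flags = {key: params.pop(key, 0) for key in PREPROCESSING_KEYS}
    return model_type, flags, params

sample = {"type": "svm", "C": 1.5, "kernel": "rbf", "gamma": 0.2}  # hypothetical draw
print(split_sample(sample))
```

The objective can then pick the estimator class from the model type and pass the remaining dict as keyword arguments.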

Hyperas

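Hyperas is a wrapper that makes hyperopt easier to use. Its advantage is that values are assigned directly in the model definition through {{uniform(0, 1)}}, {{choice(['relu', 'sigmoid'])}} and the like, so there is no need for a large space definition or a complicated fn (one that has to delete and reassign parameters). For models such as neural networks this is more efficient.
Sample code: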

from hyperopt import STATUS_OK
from hyperas.distributions import uniform
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

def create_model(x_train, y_train, x_test, y_test):
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout({{uniform(0, 1)}}))  # dropout rate filled in by Hyperas
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout({{uniform(0, 1)}}))
    model.add(Dense(10))
    model.add(Activation('softmax'))

    # ... model fitting

    score = model.evaluate(x_test, y_test, verbose=0)
    accuracy = score[1]
    # accuracy is better when larger, so it is negated to form the loss
    return {'loss': -accuracy, 'status': STATUS_OK, 'model': model}

References

Parameter Tuning with Hyperopt by Kris Wright
Official FMin doc
