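The common approaches to hyperparameter tuning are grid search and random search. Grid search scans the entire space, so it is slow. Random search is faster, but it can miss important points in the space, so its precision suffers. Bayesian optimization was introduced to address both problems.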


pip install hyperopt

This also installs networkx. If running hyperopt then fails with TypeError: 'generator' object is not subscriptable, uninstall networkx and install an older version:

pip uninstall networkx
pip install networkx==1.11



from hyperopt import fmin, tpe, hp

best = fmin(
    fn=lambda x: x,
    space=hp.uniform('x', 0, 1),
    algo=tpe.suggest,
    max_evals=100)
print(best)

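Taking the function above as an example: fmin searches space for the point that minimizes the return value of fn. It uses the tpe.suggest algorithm (tree of Parzen estimators) and evaluates fn 100 times. The final result looks something like {'x': 0.000269455723739237}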
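To make the contract of fmin concrete, here is a minimal stand-in written with the standard library only. It uses plain random search, not TPE, and the name random_search_min and its parameters are hypothetical; it only mirrors the idea of "try max_evals points from the space, return the one with the smallest fn value".

```python
import random

def random_search_min(fn, low, high, max_evals, seed=0):
    """Sample max_evals points uniformly from [low, high] and keep the best.

    This is random search, used here only to illustrate the fmin contract.
    """
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    for _ in range(max_evals):
        x = rng.uniform(low, high)
        loss = fn(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    # Same shape as the dict returned in the example above: label -> value.
    return {"x": best_x}

best = random_search_min(lambda x: x, 0, 1, max_evals=100)
print(best)
```

The difference with tpe.suggest is in how the next point is chosen: TPE uses the losses seen so far to propose it, while this sketch ignores them.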



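  1. hp.choice(label, options) where options should be a python list or tuple. To enumerate the integers in [1, 100], use choice rather than quniform.
  2. hp.randint(label, upper) returns a random integer in [0, upper). It is typically used for random seeds. If the value affects the loss function in an ordered way (nearby integers giving similar losses), consider quniform instead.
  3. hp.uniform(label, low, high) where low and high are the lower and upper bounds on the range. A uniform distribution restricted to those bounds.
  4. hp.quniform(label, low, high, q) where low and high are the lower and upper bounds on the range. It returns round(uniform(low, high) / q) * q, so only multiples of q are produced (integers when q = 1). The value comes back as a float such as 1.0, so if the parameter must be an Integer, convert it explicitly with int().
  5. hp.loguniform(label, low, high) returns exp(uniform(low, high)), so the value lies in [e^low, e^high] and its logarithm is uniformly distributed. Values cluster toward the lower end of the range, with a density that falls off as 1/x.
  6. hp.qloguniform(label, low, high, q) returns round(exp(uniform(low, high)) / q) * q.
  7. hp.normal(label, mu, sigma) where mu and sigma are the mean and standard deviation σ, respectively. A normal distribution; the range of returned values cannot be bounded.
  8. hp.qnormal(label, mu, sigma, q)
  9. hp.lognormal(label, mu, sigma)
  10. hp.qlognormal(label, mu, sigma, q)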
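The quantized and log-scaled entries in the list are easier to read as arithmetic. The sketch below reproduces the formulas with the standard library only; the function names are chosen to match the list, but these are stand-ins for illustration, not hyperopt's own implementation.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def quniform(low, high, q):
    # round(uniform(low, high) / q) * q, returned as a float like 1.0
    return float(round(rng.uniform(low, high) / q) * q)

def loguniform(low, high):
    # exp(uniform(low, high)): the logarithm of the result is uniform
    return math.exp(rng.uniform(low, high))

def qloguniform(low, high, q):
    # round(exp(uniform(low, high)) / q) * q
    return float(round(math.exp(rng.uniform(low, high)) / q) * q)

samples_q = [quniform(1, 100, 1) for _ in range(1000)]
samples_log = [loguniform(math.log(1e-3), math.log(1.0)) for _ in range(1000)]

# A quantized value is a float; cast it before passing it as an integer parameter.
n_estimators = int(samples_q[0])

# Share of log-uniform samples in the lowest tenth of the range [0.001, 1].
share_low = sum(s < 0.1 for s in samples_log) / len(samples_log)
print(n_estimators, share_low)
```

With bounds log(0.001) and log(1), roughly two thirds of the samples are expected to fall below 0.1, which is what "clusters toward the lower end" means in practice.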


space = {
    'x': hp.uniform('x', 0, 1),
    'y': hp.normal('y', 0, 1),
    'name': hp.choice('name', ['alice', 'bob']),
}


  • hp.choice does not support weights on its options.



from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

fspace = {
    'x': hp.uniform('x', -5, 5)
}

def f(params):
    x = params['x']
    val = x**2
    return {'loss': val, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)

print('best:', best)
print('trials:')
for trial in trials.trials[:2]:
    print(trial)


best: {'x': 0.014420181637303776}
trials:
{'refresh_time': None, 'book_time': None, 'misc': {'tid': 0, 'idxs': {'x': [0]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [1.9646918559786162]}, 'workdir': None}, 'state': 2, 'tid': 0, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 3.8600140889486996}, 'owner': None, 'spec': None}
{'refresh_time': None, 'book_time': None, 'misc': {'tid': 1, 'idxs': {'x': [1]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [-3.9393509404526728]}, 'workdir': None}, 'state': 2, 'tid': 1, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 15.518485832045357}, 'owner': None, 'spec': None}


  • trials.trials - a list of dictionaries representing everything about the search
  • trials.results - a list of dictionaries returned by ‘objective’ during the search
  • trials.losses() - a list of losses (float for each ‘ok’ trial)
  • trials.statuses() - a list of status strings
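Since trials.trials is a list of plain dictionaries, it can be inspected with ordinary Python. The sketch below works on two documents copied from the printed output above and trimmed to the fields it reads; best_trial and sampled_point are hypothetical helper names, not part of hyperopt.

```python
# Two trial documents from the output above, reduced to the fields used here.
trial_docs = [
    {"tid": 0, "misc": {"vals": {"x": [1.9646918559786162]}},
     "result": {"status": "ok", "loss": 3.8600140889486996}},
    {"tid": 1, "misc": {"vals": {"x": [-3.9393509404526728]}},
     "result": {"status": "ok", "loss": 15.518485832045357}},
]

def best_trial(docs):
    """Return the document with the smallest loss among trials with status 'ok'."""
    ok = [d for d in docs if d["result"]["status"] == "ok"]
    return min(ok, key=lambda d: d["result"]["loss"])

def sampled_point(doc):
    """Flatten misc.vals, where each value is stored in a one-element list."""
    # Assumption: an empty list means the parameter was not sampled in this trial.
    return {name: vals[0] for name, vals in doc["misc"]["vals"].items() if vals}

winner = best_trial(trial_docs)
print(winner["tid"], sampled_point(winner), winner["result"]["loss"])
```

On a real run the same two helpers could be applied to trials.trials directly, assuming the documents keep the layout shown in the output.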


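  • cross_val_score
    • For the estimator being evaluated, it returns by default an array holding the score of each of the K folds; the usual practice is to take mean().
    • Find out what the estimator's default scoring is, and whether a larger or a smaller value means a better fit. If you specify scoring yourself, make sure you know exactly what that score means. Do not skip this step!
    • If scoring is not specified, a Classification estimator generally uses accuracy, where larger is better. The loss passed to hyperopt should therefore be the negative of this value, because hyperopt finds the best match by minimizing the loss.
  • Normalizing or scaling the features can itself be made a choice, to see whether it helps. If it does, best will show normalize as 1.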
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import normalize, scale
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    X_ = X[:]
    # 'normalize' and 'scale' are not KNeighborsClassifier arguments,
    # so remove them from params once they have been applied.
    if 'normalize' in params:
        if params['normalize'] == 1:
            X_ = normalize(X_)
        del params['normalize']
    if 'scale' in params:
        if params['scale'] == 1:
            X_ = scale(X_)
        del params['scale']
    clf = KNeighborsClassifier(**params)
    # Mean accuracy over the cross-validation folds (larger is better).
    return cross_val_score(clf, X_, y).mean()

space4knn = {
    'n_neighbors': hp.choice('n_neighbors', range(1, 50)),
    'scale': hp.choice('scale', [0, 1]),  # must be choice; do not use quniform
    'normalize': hp.choice('normalize', [0, 1])
}

def f(params):
    acc = hyperopt_train_test(params)
    # fmin minimizes, so negate the accuracy.
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space4knn, algo=tpe.suggest, max_evals=100, trials=trials)
print(best)
  • Several models can be compared in a single run, but this needs a fairly large max_evals and takes a long time to run.
space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',  # BernoulliNB
        'alpha': hp.uniform('alpha', 0.0, 2.0)
    },
    {
        'type': 'svm',  # SVC
        'C': hp.uniform('C', 0, 10.0),
        'kernel': hp.choice('kernel', ['linear', 'rbf']),
        'gamma': hp.uniform('gamma', 0, 20.0)
    },
    {
        'type': 'randomforest',  # RandomForestClassifier
        'max_depth': hp.choice('max_depth', range(1,20)),
        'max_features': hp.choice('max_features', range(1,5)),
        'n_estimators': hp.choice('n_estimators', range(1,20)),
        'criterion': hp.choice('criterion', ["gini", "entropy"]),
        'scale': hp.choice('scale', [0, 1]),
        'normalize': hp.choice('normalize', [0, 1])
    },
    {
        'type': 'knn',  # KNeighborsClassifier
        'n_neighbors': hp.choice('knn_n_neighbors', range(1,50))
    }
])
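With a space like the one above, the objective has to do three things for each sampled dict: read 'type', pull out the preprocessing flags that are not estimator arguments, and turn a larger-is-better score into a loss. The sketch below shows that bookkeeping with the standard library only. The names split_params, make_objective and fake_evaluate are hypothetical, and fake_evaluate stands in for a real call such as cross_val_score(...).mean().

```python
STATUS_OK = "ok"  # the status string that appears in the trial output earlier

PREPROCESSING_KEYS = ("scale", "normalize")

def split_params(params):
    """Split a sampled point into (model type, preprocessing flags, estimator kwargs)."""
    kwargs = dict(params)  # work on a copy so the sampled dict is not mutated
    model_type = kwargs.pop("type")
    flags = {key: kwargs.pop(key, 0) for key in PREPROCESSING_KEYS}
    return model_type, flags, kwargs

def make_objective(evaluate):
    """Wrap a larger-is-better evaluate(model_type, flags, kwargs) as a loss."""
    def objective(params):
        model_type, flags, kwargs = split_params(params)
        score = evaluate(model_type, flags, kwargs)
        # The optimizer minimizes, so a score like accuracy is negated.
        return {"loss": -score, "status": STATUS_OK}
    return objective

def fake_evaluate(model_type, flags, kwargs):
    # Placeholder scores, for illustration only.
    return 0.9 if model_type == "knn" else 0.8

objective = make_objective(fake_evaluate)
sample = {"type": "knn", "n_neighbors": 5}
result = objective(sample)
print(result)
```

Copying the dict instead of deleting keys in place keeps the sampled point intact, which avoids surprises if the same dict is inspected again later.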


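Hyperas is a wrapper that makes hyperopt easier to use. Its advantage is that values can be assigned directly in the model definition with expressions such as {{uniform(0, 1)}} or {{choice(['relu', 'sigmoid'])}}, so there is no need for a large space definition or a complicated fn (where fn has to delete and reassign parameters). For models such as neural networks this is more efficient to write.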

from hyperopt import STATUS_OK
from hyperas.distributions import uniform
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

def create_model(x_train, y_train, x_test, y_test):
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout({{uniform(0, 1)}}))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout({{uniform(0, 1)}}))
    model.add(Dense(10))
    model.add(Activation('softmax'))

    # ... model fitting

    score = model.evaluate(x_test, y_test, verbose=0)
    accuracy = score[1]
    # Negate the accuracy because the search minimizes the loss.
    return {'loss': -accuracy, 'status': STATUS_OK, 'model': model}
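The double-brace expressions are not valid Python on their own, so a wrapper has to process the source text before running it. The sketch below illustrates that idea only: it is not Hyperas's actual implementation, and extract_space and the generated names p0, p1 are hypothetical. It replaces each {{...}} with a lookup into a space dict and collects the distribution expressions.

```python
import re

# Matches {{ ... }} and captures the expression between the braces.
TEMPLATE = re.compile(r"\{\{(.+?)\}\}")

def extract_space(source):
    """Return (rewritten source, list of (name, distribution expression))."""
    found = []

    def repl(match):
        name = "p%d" % len(found)
        found.append((name, match.group(1).strip()))
        return "space[%r]" % name

    return TEMPLATE.sub(repl, source), found

source = (
    "model.add(Dropout({{uniform(0, 1)}}))\n"
    "model.add(Activation({{choice(['relu', 'sigmoid'])}}))"
)
rewritten, space_items = extract_space(source)
print(rewritten)
print(space_items)
```

Each collected expression then corresponds to one entry of a search space, which is why no separate space definition has to be written by hand.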


