Machine Learning_Example 1_SpeakingDetection

Source: Internet. Published by: 魔法王座炮弩升阶数据. Editor: 程序博客网. Time: 2024/05/21 11:30

[Assignment Requirements]

Write a program to detect whether a person in a video speaks or not.

Features:
- S: Depicts the motion of mouth;
- V: Depicts the degree of mouth opening.
Feature vector:
- X = [S_{i-1} S_i S_{i+1} V_{i-1} V_i V_{i+1}]
  The input X is therefore an N*6 matrix, where N is the number of feature vectors.
Label:
- +1: Speaking
- -1: Not-speaking
  The output is therefore an N*1 label vector predY, with predY(i) = -1 or +1.
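
As an illustration of how the N*6 matrix could be assembled from per-frame values, here is a minimal sketch. The function name `build_feature_vectors` and the toy numbers are hypothetical, and skipping the first and last frame is an assumption, since the assignment text does not say how the edges are handled.

```python
def build_feature_vectors(S, V):
    """Stack [S[i-1], S[i], S[i+1], V[i-1], V[i], V[i+1]] for each interior frame i.

    Assumption: the first and last frames are skipped because they lack a
    neighbour; the assignment text does not specify edge handling.
    """
    if len(S) != len(V):
        raise ValueError("S and V must have the same length")
    rows = []
    for i in range(1, len(S) - 1):
        rows.append([S[i - 1], S[i], S[i + 1], V[i - 1], V[i], V[i + 1]])
    return rows


# Toy per-frame values, made up for illustration
S = [0.1, 0.4, 0.3, 0.0]
V = [1.0, 2.0, 1.5, 0.5]
X = build_feature_vectors(S, V)
print(len(X), len(X[0]))  # 2 rows, 6 columns
```

With four frames, only the two interior frames produce a row, so N is the number of frames minus two under this edge-handling assumption.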

[Solutions]

SVM
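We use Scikit-learn (sklearn), a machine learning package for Python.
For SVMs, sklearn provides the svm module, which is built on libsvm. This task is a classification problem, so we use the SVC class (for regression problems, SVR is available).
The main steps are:
data preprocessing - loading the dataset - training the model (grid search over parameters) - testing the model
- Data preprocessing
  The original dataset training.data has the following format:
  
  Using FormatDatalibsvm.xls, and following the procedure described at
  http://blog.csdn.net/pangpang1239/article/details/7435842
  we convert the data into the format libsvm requires, which gives training.txt: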
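
For readers who have not seen it, the libsvm text format stores one sample per line as `<label> <index>:<value> ...` with 1-based feature indices. The sketch below only illustrates that target format; the helper names and the sample values are hypothetical, and it does not reproduce the spreadsheet-based conversion used here.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' (1-based feature indices)."""
    pairs = " ".join(f"{idx}:{value}" for idx, value in enumerate(features, start=1))
    return f"{label:+d} {pairs}"


def parse_libsvm_line(line):
    """Inverse of to_libsvm_line; assumes every feature index is present."""
    parts = line.split()
    label = int(parts[0])
    features = [float(item.split(":")[1]) for item in parts[1:]]
    return label, features


# Made-up sample: one speaking frame with six feature values
line = to_libsvm_line(1, [0.1, 0.4, 0.3, 1.0, 2.0, 1.5])
print(line)
print(parse_libsvm_line(line))
```

Real libsvm files may omit zero-valued features, which this simple parser does not handle; `load_svmlight_file` takes care of that case.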
- Loading the dataset
  Obtain the feature vectors xData and the label vector yData:
  xData, yData = datasets.load_svmlight_file("training.txt")
- We then hold out part of the data as a validation set:
```
training_data_x, test_data_x, training_data_y, test_data_y = train_test_split(xData, yData)
```
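  train_test_split() is a function commonly used for hold-out and cross-validation.
  What it does: randomly splits the samples into trainData and testData in a given proportion.
  Arguments: train_data, train_target, test_size, random_state (the first two names are descriptive; the arrays are passed positionally).
  train_data: the sample features to split
  train_target: the sample labels to split
  test_size: the fraction of samples assigned to the test split; it defaults to 0.25 when train_size is also unset, i.e. the validation set takes 25%
  random_state: the seed for the random number generator; if it is left unset (None), the split differs on every run, while any fixed integer, including 0, makes the split reproducible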
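
The effect of `random_state` can be checked with a small sketch. The data here is synthetic and stands in for xData and yData; the variable names are chosen for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 20 samples with 6 features, labels in {-1, +1}
rng = np.random.RandomState(0)
X = rng.rand(20, 6)
y = np.where(rng.rand(20) > 0.5, 1, -1)

# Same integer seed -> identical split on every call
a_train, a_test, _, _ = train_test_split(X, y, test_size=0.25, random_state=0)
b_train, b_test, _, _ = train_test_split(X, y, test_size=0.25, random_state=0)

print(a_train.shape, a_test.shape)      # 15 training rows, 5 test rows
print(np.array_equal(a_test, b_test))   # whether the two splits match
```

Fixing the seed is useful when the SVM and the NN are meant to be compared on the same held-out samples.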
- Training the model
  We search the parameter space with a grid search to find the best parameters.
```
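C = np.logspace(-1, 1, 5, base=2)
gamma = np.logspace(-2, 5, 5, base=2)
param_grid = dict(C=C, gamma=gamma)
clf = SVC(kernel='rbf')  # RBF-kernel SVC to tune
grid = GridSearchCV(estimator=clf, param_grid=param_grid, n_jobs=-1)
grid.fit(training_data_x, training_data_y)  # run the search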
```
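
To make the search space concrete, the following sketch runs the same 5*5 grid end to end on synthetic data. `make_classification` is only a stand-in for the real training set, and the sample count is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the real training data (6 features, 2 classes)
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

C = np.logspace(-1, 1, 5, base=2)       # 2**-1 ... 2**1
gamma = np.logspace(-2, 5, 5, base=2)   # 2**-2 ... 2**5
param_grid = dict(C=C, gamma=gamma)

grid = GridSearchCV(estimator=SVC(kernel='rbf'), param_grid=param_grid)
grid.fit(X, y)

print(len(grid.cv_results_['params']))  # number of (C, gamma) pairs tried
print(grid.best_params_)                # best pair by cross-validated score
```

Each pair is scored by cross-validation on the training split, so the held-out validation set is not touched during the search.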
  This gives grid.best_estimator_.C and grid.best_estimator_.gamma,
  that is, the final SVC model:
```
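# Build the final SVC from the best parameters found by the grid search
bestC = grid.best_estimator_.C
bestGamma = grid.best_estimator_.gamma
clf = SVC(C=bestC, gamma=bestGamma, kernel='rbf', verbose=True)
clf.fit(training_data_x, training_data_y)
# Resulting estimator, as printed by scikit-learn:
# SVC(C=bestC, cache_size=200, class_weight=None, coef0=0.0,
#     decision_function_shape=None, degree=3, gamma=bestGamma, kernel='rbf',
#     max_iter=-1, probability=False, random_state=None, shrinking=True,
#     tol=0.001, verbose=True)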
```
  Save the model:
```
joblib.dump(clf, 'svcmodel.pkl')
```
- Testing the model
  Define the interface function in speakingDetection.py:
```
import joblib

def speakingDetection(X):
    # Load the trained SVC model and predict a label (+1 / -1) for each row of X
    clf = joblib.load('svcmodel.pkl')
    predY = clf.predict(X)
    return predY
```
  Pass in test_data_x to get prediction:
```
prediction = speakingDetection(test_data_x)
```
  Generate the test report:
```
report = classification_report(test_data_y, prediction)
```
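
classification_report summarizes precision, recall, F1 and support per class. As a rough sketch of what two of those numbers mean, the pure-Python helper below computes accuracy plus precision and recall for the speaking class; the function name `binary_report` and the labels are made up for illustration.

```python
def binary_report(y_true, y_pred, positive=1):
    """Return accuracy plus precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels, for illustration only
y_true = [1, 1, -1, -1, 1, -1]
y_pred = [1, -1, -1, 1, 1, -1]
result = binary_report(y_true, y_pred)
print(result)
```

In this toy case there are two true positives, one false positive and one false negative, so accuracy, precision and recall all come out at two thirds.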
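- How to call the function
  speakingDetection.py in the SVM folder is the interface function, written in Python.
  See SK_TestSVM.py for how to call it; the key line is
  prediction = speakingDetection(test_data_x), which returns predY.
  The trained SVC model is svcmodel.pkl.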
NN
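We use Keras, a deep learning framework written in Python on top of Theano/TensorFlow. It is a highly modular neural network library.
It is (very) easy to get started with: it hides most of the internal computation, much like a black box, so calling the API is enough.
It is also extensible: custom functionality can be written with Theano or TensorFlow code.
The main steps are:
data preprocessing - loading the dataset - training the model - testing the model
Note that for binary classification with this network the labels must be 0 or 1 (sparse_categorical_crossentropy expects integer class indices), so the original -1/+1 labels have to be remapped.
Apart from that, data preprocessing and dataset loading are the same as in the SVM part and are omitted below.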
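
The label remapping mentioned above could look like the sketch below. The helper names are hypothetical, and this is not necessarily how the original code did it. Note also that load_svmlight_file returns a SciPy sparse matrix, so the features may need to be converted with `xData.toarray()` before being passed to Keras.

```python
import numpy as np

def to_class_index(y):
    """Map labels {-1, +1} to class indices {0, 1}."""
    return ((np.asarray(y) + 1) // 2).astype(int)

def to_signed_label(idx):
    """Map class indices {0, 1} back to labels {-1, +1}."""
    return np.asarray(idx) * 2 - 1

# load_svmlight_file returns float labels, so floats are used here
y = np.array([-1.0, 1.0, 1.0, -1.0])
idx = to_class_index(y)
print(idx.tolist())                   # class indices for training
print(to_signed_label(idx).tolist())  # back to the assignment's labels
```

The inverse mapping matters at test time, because the assignment asks for predY values of -1 or +1.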
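- Training the model
  Initialize a neural network with Sequential().
  Add layers with the add method; an input layer, hidden layers, and an output layer are needed.
  input_dim defines the input dimension, units defines the output dimension of a layer, and activation defines the activation function.
  In this task the optimizer is SGD and the loss is sparse_categorical_crossentropy, i.e. the multi-class log loss with sparse (integer) labels; epochs is the number of training epochs, and batch_size=128.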
```
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD

# Create a model
model = Sequential()
# Input
model.add(Dense(64, activation='relu', input_dim=6))
model.add(Dropout(0.5))
# Hidden
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
# Output
model.add(Dense(2, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='sparse_categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.fit(training_data_x, training_data_y, epochs=3000, batch_size=128)
```
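
To show what the sparse_categorical_crossentropy loss measures, here is a hand-computed sketch: for each sample it takes the negative log of the probability the softmax output assigns to the true class index, then averages. The probabilities and labels are made up, and the sketch ignores any numerical safeguards the library applies internally.

```python
import math

def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(p[label]) over samples; labels are integer class indices."""
    losses = [-math.log(p[label]) for label, p in zip(labels, probs)]
    return sum(losses) / len(losses)

# Made-up softmax outputs for three samples over two classes
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 0]
loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))
```

Because the labels are used directly as indices into the output vector, they must be 0 or 1 here, which is why -1 cannot be fed in unchanged.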
Save the model
```
model.save('nn_model.h5')
```
- Testing the model
  Define the interface function in speakingDetection.py:
```
from keras.models import load_model

def speakingDetection(X):
    # Load the trained network and return the predicted class index (0 or 1) per row
    model = load_model('nn_model.h5')
    predY = model.predict_classes(X, batch_size=128)
    return predY
```
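
For a softmax output layer with two units, predict_classes amounts to taking the argmax over the predicted probabilities, and later Keras releases dropped predict_classes in favour of doing this explicitly. The sketch below shows that step on made-up probabilities and also maps the result back to -1/+1 as the assignment requires; `probs_to_labels` is a hypothetical helper, not part of the original code.

```python
import numpy as np

def probs_to_labels(probs):
    """Pick the most probable class per row, then map {0, 1} to {-1, +1}."""
    class_idx = np.argmax(probs, axis=-1)
    return class_idx * 2 - 1

# Made-up softmax outputs, shaped like model.predict(X) for a 2-unit output layer
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])
predY = probs_to_labels(probs)
print(predY.tolist())   # [-1, 1, 1]
```

If the test labels were remapped to 0/1 for the report, the mapping back to -1/+1 is only needed at the interface boundary.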
  Pass in test_data_x to get prediction:
```
prediction = speakingDetection(test_data_x)
```
  Generate the test report:
```
report = classification_report(test_data_y, prediction)
```
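- How to call the function
  speakingDetection.py in the NN folder is the interface function, written in Python.
  See NN_Test.py for how to call it; the key line is
  prediction = speakingDetection(test_data_x), which returns predY.
  The trained NN model is nn_model.h5.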

[Results]

Accuracy on the randomly held-out 25% validation set: SVM: 74.92%; NN: 77.37%

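[Pitfalls]

At the very beginning I wanted to do this in Matlab, but calling the libsvm library kept raising errors and compiling the .c files failed as well, so I had to switch to Python. Only later did I learn that my Matlab version was too old; after upgrading to the 2015 release, all the problems went away.
After switching to Python, I first tried calling the libsvm library directly, but a parameter search over a mere 10*11 grid nearly brought the machine to a halt: one parameter pair took 20+ minutes. That made it very clear what people mean when they say SVMs are computationally expensive. The low-end machine was of course the main reason. I gave up on that route and chose sklearn, which turned out to be both fast and convenient (as for why calling libsvm directly was slower than sklearn, my guess is that sklearn applies some optimizations).
The neural network this time was a complete crash course: it took about half a day, and following online tutorials was enough to get a first version working. For a solid understanding, it would still be better to practice with TensorFlow directly.

Finally, the GitHub repository for this example: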

https://github.com/BigRabbit71/speakingDetection
