Python Data Mining: Automatic Identification of Electricity Theft and Leakage Users

Source: Internet; Edited by: 程序博客网; Time: 2024/04/29 15:30

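Overview

This post mainly summarizes the problems I ran into while working through Chapter 6 of the book 《Python数据分析与挖掘实战》 (Python Data Analysis and Mining in Practice). It is best to have read Chapter 6 of the book before reading this post.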

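Problem 1: After installing TensorFlow in an Anaconda environment, PyCharm does not recognize TensorFlow.

Here is what happened: I installed TensorFlow first and then installed Keras successfully, but PyCharm could not recognize TensorFlow.
I then set the Python interpreter path, as shown in the figure:
[Figure: Python interpreter path settings in PyCharm]
There were several candidate paths. Whether I selected the python.exe of the TensorFlow environment under envs or the one under the Anaconda root, something broke. With the TensorFlow interpreter, TensorFlow itself worked, but running the code failed to find packages installed under Anaconda, such as pandas and keras. With the Anaconda interpreter, the packages of the TensorFlow environment could not be found: ModuleNotFoundError: No module named 'tensorflow'. The two would not coexist. I later found the answer in this blog post, http://blog.csdn.net/zuochao_2013/article/details/72453416, and I am very grateful to its author for the help. There is one big pitfall: the 64-bit Anaconda version on the official site at the time of writing was 4.4.0. Do not install that version; install the 64-bit 4.2.0 version instead. I also recommend the linked post for installing TensorFlow.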
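A quick way to see which interpreter PyCharm is actually running, and which of the required packages it can see, is a small check like the one below. This is only a minimal sketch using the standard library; the helper name `find_missing` is made up for illustration, and the module list simply mirrors the packages mentioned above.

```python
import importlib.util
import sys


def find_missing(modules):
    """Return the top-level module names the current interpreter cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The path printed here shows which python.exe is really in use.
    print("Interpreter:", sys.executable)
    # Packages this post needs to be importable from the same interpreter.
    print("Missing:", find_missing(["tensorflow", "keras", "pandas"]))
```

If the list of missing modules changes when the interpreter setting is switched, the packages are installed in different environments, which matches the symptom described above.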

Problem 2: The code from the book does not run.

The code from the book is as follows:

    #-*- coding: utf-8 -*-
    import pandas as pd
    from random import shuffle

    datafile = '../data/model.xls'
    data = pd.read_excel(datafile)
    data = data.as_matrix()
    shuffle(data)

    p = 0.8 # proportion of data used for training
    train = data[:int(len(data)*p),:]
    test = data[int(len(data)*p):,:]

    # Build the LM neural network model
    from keras.models import Sequential # neural network initialization function
    from keras.layers.core import Dense, Activation # layer and activation functions

    netfile = '../tmp/net.model' # path where the built network is stored
    net = Sequential() # create the neural network
    net.add(Dense(units = 10,input_dim = 3)) # connect input layer (3 nodes) to hidden layer (10 nodes)
    net.add(Activation('relu')) # hidden layer uses the relu activation
    net.add(Dense(units = 1)) # connect hidden layer (10 nodes) to output layer (1 node)
    net.add(Activation('sigmoid')) # output layer uses the sigmoid activation
    net.compile(loss = 'binary_crossentropy', optimizer = 'adam', class_mode = "binary") # compile the model with the adam solver. This form is obsolete and should be changed to the following.
    # net.compile(loss='mse', optimizer='sgd', metrics=['accuracy'])
    net.fit(train[:,:3], train[:,3], epochs=1000, batch_size=1) # train the model for 1000 epochs
    net.save_weights(netfile) # save the model
    predict_result = net.predict_classes(train[:,:3]).reshape(len(train)) # reshape the predictions
    # Note: in keras, predict returns predicted probabilities and predict_classes returns
    # predicted classes; both return an n x 1 array rather than the usual 1 x n.

    from code.cm_plot import * # import the self-written confusion matrix plotting function
    cm_plot(train[:,3], predict_result).show() # show the confusion matrix plot

    from sklearn.metrics import roc_curve # ROC curve function
    import matplotlib.pyplot as plt

    predict_result = net.predict(test[:,:3]).reshape(len(test))
    fpr, tpr, thresholds = roc_curve(test[:,3], predict_result, pos_label=1)
    plt.plot(fpr, tpr, linewidth=2, label = 'ROC of LM') # draw the ROC curve
    plt.xlabel('False Positive Rate') # axis label
    plt.ylabel('True Positive Rate') # axis label
    plt.ylim(0,1.05) # axis range
    plt.xlim(0,1.05) # axis range
    plt.legend(loc=4) # legend
    plt.show() # show the plot

Running it raises the following error:

    Traceback (most recent call last):
      File "C:/Users/黄海峰/PycharmProjects/anaconda/code/6-3_lm_model.py", line 29, in <module>
        net.fit(train[:,:3], train[:,3], epochs=1000, batch_size=1) #训练模型，循环1000次
      File "D:\Program Files\Anaconda3\lib\site-packages\keras\models.py", line 863, in fit
        initial_epoch=initial_epoch)
      File "D:\Program Files\Anaconda3\lib\site-packages\keras\engine\training.py", line 1430, in fit
        initial_epoch=initial_epoch)
      File "D:\Program Files\Anaconda3\lib\site-packages\keras\engine\training.py", line 1079, in _fit_loop
        outs = f(ins_batch)
      File "D:\Program Files\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 2268, in __call__
        **self.session_kwargs)
    TypeError: run() got an unexpected keyword argument 'class_mode'

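This can be resolved by checking the Keras documentation at http://keras-cn.readthedocs.io/en/latest/other/objectives/. In Keras 2.0.2, the latest version at the time, the class_mode argument no longer exists; I replaced it with metrics=['accuracy'].
After that change, another error is raised: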
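The traceback fails inside `fit` rather than `compile`, and its last frame shows `**self.session_kwargs`. A plausible reading is that unknown keyword arguments given to `compile` were stored and only later forwarded to the backend session's `run()`, which rejects them. The toy classes below imitate that pattern in plain Python. `ToySession` and `ToyModel` are hypothetical names invented for this illustration and are not Keras code.

```python
class ToySession:
    """Stand-in for a backend session whose run() accepts no extra keywords."""

    def run(self, fetches, feed_dict=None):
        return fetches


class ToyModel:
    """Stand-in for a model that stores unknown compile() keywords unchecked."""

    def compile(self, loss, optimizer, **kwargs):
        self.loss = loss
        self.optimizer = optimizer
        self.session_kwargs = kwargs  # stored as-is, not validated here

    def fit(self, session):
        # The stored keywords are forwarded only now, so the error appears late.
        return session.run("train_step", **self.session_kwargs)


model = ToyModel()
# compile() itself succeeds even though class_mode is not a known argument.
model.compile(loss="binary_crossentropy", optimizer="adam", class_mode="binary")

error_message = None
try:
    model.fit(ToySession())
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Dropping `class_mode` from the `compile` call removes the stray keyword, which is why the error goes away once the call is rewritten.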

    You are passing a target array of shape (4231, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:

    from keras.utils.np_utils import to_categorical
    y_binary = to_categorical(y_int)

    Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.
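The message is about the shape of the labels: `categorical_crossentropy` wants one column per class, while the targets here are a single column of integer class labels. The sketch below shows the difference between the two layouts using NumPy only. The helper `one_hot` is a hypothetical stand-in written for this illustration; it is not Keras's `to_categorical` implementation, and the four labels are made-up example values.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into a (samples, classes) matrix of 0s and 1s."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[labels]


# Integer targets in a single column, like the (4231, 1) array in the message.
y_int = np.array([[0], [1], [1], [0]])
y_binary = one_hot(y_int, num_classes=2)

print(y_int.shape)     # one column of class labels
print(y_binary.shape)  # one column per class
```

For a network with a single sigmoid output, such as the one in the book's code, `binary_crossentropy` takes the 0/1 column directly, so converting the labels is only needed when a categorical loss with one output node per class is used.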

Finally, after going through the documentation and modifying the code, it ran.
[Figure]
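The final script is not reproduced in this post. As a rough sketch only, and not necessarily the code that ended up running, an adapted version of the key steps might look like the following. It assumes NumPy and a Keras 2.x style API; the function names `split_rows`, `build_net`, and `predict_labels` are invented for illustration. Two further assumptions go beyond the errors discussed above: the rows are shuffled with NumPy, because `random.shuffle` applied to a 2-D NumPy array can duplicate rows, and class labels are derived by thresholding `predict`, because `predict_classes` is not available in newer Keras releases.

```python
import numpy as np


def split_rows(data, train_ratio=0.8, seed=42):
    """Shuffle the rows of a 2-D array and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(data)  # permutes along the first axis, returns a copy
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def build_net(input_dim=3):
    """Build the 3-10-1 network from the book (assumes Keras is installed)."""
    from keras.models import Sequential
    from keras.layers import Dense

    net = Sequential()
    net.add(Dense(10, input_dim=input_dim, activation="relu"))
    net.add(Dense(1, activation="sigmoid"))
    # No class_mode here; metrics is the supported way to report accuracy.
    net.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return net


def predict_labels(net, features, threshold=0.5):
    """Convert predicted probabilities into 0/1 class labels."""
    probabilities = net.predict(features).reshape(-1)
    return (probabilities > threshold).astype(int)
```

With the book's data loaded as a NumPy array (for example `pd.read_excel(datafile).values`, since `as_matrix` was removed from pandas), `split_rows(data)` would stand in for the `shuffle` and slicing step, and `build_net()` for the model construction and `compile` call.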
