Andrew Ng deeplearning.ai Deep Learning Course, Programming Assignment — shallow network for dataset classification (1-3)
Source: Internet · Editor: 程序博客网 · Date: 2024/06/06 04:02
Planar data classification with one hidden layer
1. Commonly used Python libraries
numpy: the fundamental package for scientific computing with Python
sklearn: provides simple and efficient tools for data mining and data analysis
matplotlib: a library for plotting graphs in Python
import numpy as np
import sklearn
import sklearn.datasets        # needed for the extra datasets in section 5
import sklearn.linear_model    # needed for LogisticRegressionCV in section 3
import matplotlib.pyplot as plt
import time                    # used later to time different hidden-layer sizes
2. Dataset
Randomly generate a dataset. The function below generates a "flower"-shaped dataset and we then visualize it:
def load_planar_dataset():
    # Generate two interleaved "petal" classes
    np.random.seed(1)
    m = 400                              # number of samples
    N = int(m / 2)                       # samples per class (two classes)
    D = 2                                # dimensionality (2-D data)
    X = np.zeros((m, D))                 # (m, 2) matrix, one sample per row
    Y = np.zeros((m, 1), dtype="uint8")  # (m, 1) label matrix
    a = 4                                # maximum ray of the flower
    for j in range(2):
        ix = range(N * j, N * (j + 1))   # index range [N*j, N*(j+1)) for class j
        t = np.linspace(j * 3.12, (j + 1) * 3.12, N) + np.random.randn(N) * 0.2  # theta
        r = a * np.sin(4 * t) + np.random.randn(N) * 0.2                          # radius
        X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]  # np.c_ stacks the two coordinates into (N, 2)
        Y[ix] = j
    X = X.T  # (2, m)
    Y = Y.T  # (1, m)
    return X, Y
Visualize the data:
X, Y = load_planar_dataset()
plt.scatter(X[0, :], X[1, :], c=Y.ravel(), s=40, cmap=plt.cm.Spectral)  # ravel so c is 1-D
plt.show()
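The `np.c_` trick used in `load_planar_dataset` stacks two length-N vectors into an (N, 2) matrix, which is then transposed into the (n_x, m) layout used throughout. A minimal sketch of the shape bookkeeping (the small arrays here are just illustrative):

```python
import numpy as np

r = np.array([1.0, 2.0, 3.0])  # radii
t = np.array([0.0, 0.5, 1.0])  # angles

# np.c_ concatenates along the second axis, turning two (3,) vectors
# into a single (3, 2) matrix of (x, y) coordinates
coords = np.c_[r * np.sin(t), r * np.cos(t)]
print(coords.shape)    # (3, 2)
print(coords.T.shape)  # (2, 3) -- the column-per-sample layout used above
```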
3. Simple Logistic Regression
clf = sklearn.linear_model.LogisticRegressionCV()  # logistic-regression classifier
clf.fit(X.T, Y.T.ravel())  # fit on X.T (400, 2); ravel Y.T (400, 1) into the 1-D label vector sklearn expects
After training, we need to plot the decision boundary that separates the classes. With simple logistic regression the result on this dataset will not be very good. The plotting helper is as follows:
def plot_decision_boundary(model, X, y):
    # Set min and max values and give them some padding
    x_min, x_max = X[0, :].min() - 1, X[0, :].max() + 1
    y_min, y_max = X[1, :].min() - 1, X[1, :].max() + 1
    h = 0.01  # grid step
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid;
    # np.c_[xx.ravel(), yy.ravel()] lists every grid point as one row
    Z = model(np.c_[xx.ravel(), yy.ravel()])
    # Reshape back to the grid shape so Z covers the whole plane
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contour(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.ylabel('x2')
    plt.xlabel('x1')
    plt.scatter(X[0, :], X[1, :], c=y.ravel(), cmap=plt.cm.Spectral)
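The `meshgrid` → `ravel` → `np.c_` → `reshape` pipeline inside `plot_decision_boundary` is easier to see in isolation; a minimal sketch on a coarse toy grid:

```python
import numpy as np

# Build a coarse grid over [0, 1) x [0, 2) with step h = 0.5
h = 0.5
xx, yy = np.meshgrid(np.arange(0, 1, h), np.arange(0, 2, h))
print(xx.shape)    # (4, 2): rows follow the y-range, columns the x-range

# ravel() flattens each coordinate matrix; np.c_ pairs them so each
# row is one (x, y) grid point that can be fed to model.predict
grid = np.c_[xx.ravel(), yy.ravel()]
print(grid.shape)  # (8, 2): one row per grid point

# Predictions come back flat, so reshape restores the grid layout
Z = np.zeros(grid.shape[0]).reshape(xx.shape)
print(Z.shape)     # (4, 2)
```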
We can then run the following:
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
plt.title("Logistic Regression")
plt.show()
LR_predictions = clf.predict(X.T)  # note: predict on the training points here, not the whole grid
print("Accuracy of logistic regression:%d" % float(
    (np.dot(Y, LR_predictions) + np.dot(1 - Y, 1 - LR_predictions)) / float(Y.size) * 100) + '%')
Accuracy of logistic regression:47%
4. Neural Network model
(1) Determine the input- and output-layer sizes from the data, and fix the hidden-layer size (there is a single hidden layer here):
def layer_size(X, Y):
    n_x = X.shape[0]  # input-layer size
    n_h = 4           # hidden-layer size
    n_y = Y.shape[0]  # output-layer size
    return (n_x, n_h, n_y)
——————————————————————————————————————-
To check that the function above is correct, we use a helper that randomly generates X and Y matrices and inspect the returned sizes:
def layer_size_test_case():
    # Test whether layer_size(X, Y) is correct
    np.random.seed(1)  # fix the random seed
    x_assess = np.random.randn(5, 3)
    y_assess = np.random.randn(2, 3)
    return x_assess, y_assess

x_assess, y_assess = layer_size_test_case()
(n_x, n_h, n_y) = layer_size(x_assess, y_assess)
print("n_x " + str(n_x))
print("n_h " + str(n_h))
print("n_y " + str(n_y))

output:
n_x 5
n_h 4
n_y 2
———————————————————————————————————————————-
(2) Initialize the parameters
def initialize_parameters(n_x, n_h, n_y):
    np.random.seed(2)
    W1 = np.random.randn(n_h, n_x)
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h)
    b2 = np.zeros((n_y, 1))
    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))
    parameter = {"W1": W1,
                 "b1": b1,
                 "W2": W2,
                 "b2": b2}
    return parameter

parameter = initialize_parameters(n_x, n_h, n_y)
print("W1" + str(parameter["W1"]))
print("b1" + str(parameter["b1"]))
print("W2" + str(parameter["W2"]))
print("b2" + str(parameter["b2"]))
W1 [[-0.41675785 -0.05626683 -2.1361961   1.64027081 -1.79343559]
    [-0.84174737  0.50288142 -1.24528809 -1.05795222 -0.90900761]
    [ 0.55145404  2.29220801  0.04153939 -1.11792545  0.53905832]
    [-0.5961597  -0.0191305   1.17500122 -0.74787095  0.00902525]]
b1 [[ 0.]
    [ 0.]
    [ 0.]
    [ 0.]]
W2 [[-0.87810789 -0.15643417  0.25657045 -0.98877905]
    [-0.33882197 -0.23618403 -0.63765501 -1.18761229]]
b2 [[ 0.]
    [ 0.]]
(3) Forward propagation
def forward_propagation_test_case():
    # Test whether forward_propagation() is correct
    np.random.seed(1)
    x_assess = np.random.randn(2, 3)
    parameters = {'W1': np.array([[-0.00416758, -0.00056267],
                                  [-0.02136196,  0.01640271],
                                  [-0.01793436, -0.00841747],
                                  [ 0.00502881, -0.01245288]]),
                  'W2': np.array([[-0.01057952, -0.00909008,  0.00551454,  0.02292208]]),
                  'b1': np.array([[ 0.], [ 0.], [ 0.], [ 0.]]),
                  'b2': np.array([[ 0.]])}
    return x_assess, parameters
def forward_propagation(X, parameter):
    W1 = parameter["W1"]
    b1 = parameter["b1"]
    W2 = parameter["W2"]
    b2 = parameter["b2"]
    Z1 = np.dot(W1, X) + b1
    A1 = (np.exp(Z1) - np.exp(-Z1)) / (np.exp(Z1) + np.exp(-Z1))  # tanh activation
    assert (A1.shape == (W1.shape[0], X.shape[1]))  # use W1's shape rather than a global n_h
    Z2 = np.dot(W2, A1) + b2
    A2 = 1 / (1 + np.exp(-Z2))  # sigmoid activation
    assert (A2.shape == (1, X.shape[1]))
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache  # return A2 as well, since nn_model below unpacks both values

x_assess, parameters = forward_propagation_test_case()
A2, cache = forward_propagation(x_assess, parameters)
print(np.mean(cache['Z1']), np.mean(cache['A1']), np.mean(cache['Z2']), np.mean(cache['A2']))
output:
(-0.00049975577774199131, -0.00049696335323178595, 0.00043818745095914593, 0.50010954685243103)
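The A1 expression in `forward_propagation` is just the hyperbolic tangent written out explicitly; a quick sketch checking it against `np.tanh` (which is also numerically safer for large |z|):

```python
import numpy as np

z = np.linspace(-3, 3, 7)
manual = (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))
assert np.allclose(manual, np.tanh(z))  # identical to the built-in tanh

# The sigmoid used for A2, for comparison: output always in (0, 1)
sigmoid = 1 / (1 + np.exp(-z))
assert np.all((sigmoid > 0) & (sigmoid < 1))
print("tanh identity holds")
```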
(4) Next, compute the cost function:
def compute_cost(A2, Y_assess, parameters):
    m = Y_assess.shape[1]
    # Cross-entropy cost, averaged over the m examples
    cost = (-1.0 / m) * np.sum(np.multiply(np.log(A2), Y_assess)
                               + np.multiply(np.log(1 - A2), 1 - Y_assess))
    cost = np.squeeze(cost)
    assert (isinstance(cost, float))
    return cost
def compute_cost_test_case():
    # Test whether compute_cost is correct
    np.random.seed(1)
    Y_assess = np.random.randn(1, 3)
    parameters = {'W1': np.array([[-0.00416758, -0.00056267],
                                  [-0.02136196,  0.01640271],
                                  [-0.01793436, -0.00841747],
                                  [ 0.00502881, -0.01245288]]),
                  'W2': np.array([[-0.01057952, -0.00909008,  0.00551454,  0.02292208]]),
                  'b1': np.array([[ 0.], [ 0.], [ 0.], [ 0.]]),
                  'b2': np.array([[ 0.]])}
    a2 = np.array([[ 0.5002307 ,  0.49985831,  0.50023963]])
    return a2, Y_assess, parameters

A2, Y_assess, parameters = compute_cost_test_case()
cost = compute_cost(A2, Y_assess, parameters)
print("cost " + str(cost))
cost 0.692919893776
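A quick sanity check on the cross-entropy formula: for an untrained network whose outputs sit near 0.5, the cost should be close to ln 2 ≈ 0.693, which matches the value printed above. A minimal sketch:

```python
import numpy as np

A2 = np.array([[0.5, 0.5, 0.5]])  # predictions of an untrained network
Y = np.array([[1, 0, 1]])
m = Y.shape[1]
# Same cross-entropy formula as compute_cost above
cost = (-1.0 / m) * np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
print(round(cost, 4))  # 0.6931, i.e. ln(2)
```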
(5) Backward propagation
Backpropagation is the hardest part of deep learning; the gradient formulas are implemented below:
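The formulas (presumably lost with an original image) can be reconstructed from the code that follows; for a tanh hidden layer and sigmoid output, with ∘ denoting elementwise multiplication:

```latex
\begin{aligned}
dZ^{[2]} &= A^{[2]} - Y \\
dW^{[2]} &= \tfrac{1}{m}\, dZ^{[2]} A^{[1]\,T} \\
db^{[2]} &= \tfrac{1}{m} \sum_{i=1}^{m} dZ^{[2](i)} \\
dZ^{[1]} &= W^{[2]\,T} dZ^{[2]} \circ \bigl(1 - (A^{[1]})^{2}\bigr) \\
dW^{[1]} &= \tfrac{1}{m}\, dZ^{[1]} X^{T} \\
db^{[1]} &= \tfrac{1}{m} \sum_{i=1}^{m} dZ^{[1](i)}
\end{aligned}
```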
def backward_propagation(parameters, cache, X, Y):
    m = X.shape[1]
    W2 = parameters["W2"]
    A1 = cache["A1"]
    A2 = cache["A2"]
    dZ2 = A2 - Y
    dW2 = (1.0 / m) * np.dot(dZ2, A1.T)
    db2 = (1.0 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))  # tanh'(Z1) = 1 - A1^2
    dW1 = (1.0 / m) * np.dot(dZ1, X.T)
    db1 = (1.0 / m) * np.sum(dZ1, axis=1, keepdims=True)
    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}
    return grads
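One way to gain confidence in these gradients is to compare an analytic gradient against a finite-difference estimate on a tiny network. The following self-contained sketch (a hypothetical toy setup, not the assignment's code) checks the dW2 formula:

```python
import numpy as np

np.random.seed(0)
m = 5
X = np.random.randn(2, m)
Y = (np.random.rand(1, m) > 0.5).astype(float)
W1 = np.random.randn(3, 2) * 0.1
b1 = np.zeros((3, 1))
W2 = np.random.randn(1, 3) * 0.1
b2 = np.zeros((1, 1))

def cost_of(W2_):
    """Cross-entropy cost of the tiny network as a function of W2."""
    A1 = np.tanh(W1 @ X + b1)
    A2 = 1 / (1 + np.exp(-(W2_ @ A1 + b2)))
    return (-1.0 / m) * np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

# Analytic gradient: dW2 = (1/m) * dZ2 @ A1.T with dZ2 = A2 - Y
A1 = np.tanh(W1 @ X + b1)
A2 = 1 / (1 + np.exp(-(W2 @ A1 + b2)))
dW2 = (1.0 / m) * (A2 - Y) @ A1.T

# Central finite-difference estimate of the same gradient
eps = 1e-6
num = np.zeros_like(W2)
for i in range(W2.shape[1]):
    Wp, Wm = W2.copy(), W2.copy()
    Wp[0, i] += eps
    Wm[0, i] -= eps
    num[0, i] = (cost_of(Wp) - cost_of(Wm)) / (2 * eps)

print(np.allclose(dW2, num, atol=1e-6))  # True
```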
(6) Update the parameters
def update_parameters(parameters, grads, learning_rates=1.2):
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    b1 = parameters["b1"]
    b2 = parameters["b2"]
    dW1 = grads["dW1"]
    dW2 = grads["dW2"]
    db1 = grads["db1"]
    db2 = grads["db2"]
    # One step of gradient descent
    W1 = W1 - learning_rates * dW1
    b1 = b1 - learning_rates * db1
    W2 = W2 - learning_rates * dW2
    b2 = b2 - learning_rates * db2
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    return parameters
(7) Integrate the above parts into nn_model
def nn_model(X, Y, n_h, num_iterations=10000, learning_rates=1.2, print_cost=False):
    np.random.seed(3)
    n_x = layer_size(X, Y)[0]
    n_y = layer_size(X, Y)[2]
    parameters = initialize_parameters(n_x, n_h, n_y)  # initialize the parameters
    costs = []  # record the cost over the iterations
    for i in range(0, num_iterations):
        A2, cache = forward_propagation(X, parameters)  # forward propagation
        cost = compute_cost(A2, Y, parameters)
        costs.append(cost)
        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backward_propagation(parameters, cache, X, Y)
        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters = update_parameters(parameters, grads, learning_rates=learning_rates)
        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    return parameters, costs

def predict(parameters, X):
    # Predict labels: 1 wherever the sigmoid output exceeds 0.5
    A2, cache = forward_propagation(X, parameters)
    predictions = (A2 > 0.5)
    return predictions
Cost after iteration 0: 1.127380
Cost after iteration 1000: 0.288553
Cost after iteration 2000: 0.276386
Cost after iteration 3000: 0.268077
Cost after iteration 4000: 0.263069
Cost after iteration 5000: 0.259617
Cost after iteration 6000: 0.257070
Cost after iteration 7000: 0.255105
Cost after iteration 8000: 0.253534
Cost after iteration 9000: 0.252245
(8) Test different learning rates and numbers of hidden units
learning_rate = [0.01, 0.05, 0.1, 1.5, 2.0]  # learning rates to test
cost_dic = {}
parameter_dic = {}
for i in learning_rate:
    parameter_dic[str(i)], cost_dic[str(i)] = nn_model(X, Y, n_h=4, num_iterations=10000,
                                                       learning_rates=i, print_cost=False)
for i in learning_rate:
    plt.plot(np.squeeze(cost_dic[str(i)]), label=(str(i) + " learning rates"))
plt.xlabel('iteration')  # call the function; plt.xlabel = (...) would overwrite it
plt.ylabel('cost')
legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

plt.figure(figsize=(16, 32))  # enlarge the displayed figure
hidden_layer_sizes = [1, 2, 3, 4, 5, 20, 50]  # different hidden-layer sizes
precision = []
for i, n_h in enumerate(hidden_layer_sizes):
    plt.subplot(5, 4, i + 1)
    tic = time.time()
    parameters, costs = nn_model(X, Y, n_h, num_iterations=5000)
    plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
    toc = time.time()
    predictions = predict(parameters, X)
    time_consumption = toc - tic
    accuracy_per = float((np.dot(Y, predictions.T) + np.dot(1 - Y, 1 - predictions.T)) / float(Y.size) * 100)
    plt.title('Layer Size {size}, precision: {precision} %, time: {time} s'.format(
        size=n_h, precision=accuracy_per, time=float(int(time_consumption * 1000)) / 1000))
    print("Accuracy for {} hidden units: {} %".format(n_h, accuracy_per)
          + ", time consumption: " + str(float(int(time_consumption * 1000)) / 1000) + "s")
As can be seen, within the range tested, a larger learning rate converges faster.
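The caveat "within the range tested" matters: past a certain point a large learning rate overshoots and diverges. This can be illustrated on a hypothetical toy problem (minimizing f(x) = x², whose gradient is 2x), not the assignment's network:

```python
def descend(lr, steps=50, x0=1.0):
    """Run gradient descent on f(x) = x^2 and return the final |x|."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x^2 is 2x
    return abs(x)

# Each step multiplies x by (1 - 2*lr), so:
print(descend(0.05) > descend(0.4))  # True: lr = 0.4 converges much faster than 0.05
print(descend(1.5) > 1.0)            # True: lr = 1.5 overshoots and diverges
```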
As the number of hidden units increases, training time necessarily grows, but accuracy does not necessarily improve: too many units can overfit the training data, degrading performance on the test set.
5. Performance on other datasets
def load_extra_datasets():
    # Generate several differently shaped datasets
    N = 200
    noisy_circles = sklearn.datasets.make_circles(n_samples=N, factor=.5, noise=.3)
    noisy_moons = sklearn.datasets.make_moons(n_samples=N, noise=.2)
    blobs = sklearn.datasets.make_blobs(n_samples=N, random_state=5, n_features=2, centers=6)
    gaussian_quantiles = sklearn.datasets.make_gaussian_quantiles(mean=None, cov=0.5, n_samples=N,
                                                                  n_features=2, n_classes=2,
                                                                  shuffle=True, random_state=None)
    no_structure = np.random.rand(N, 2), np.random.rand(N, 2)
    return noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure

noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure = load_extra_datasets()
datasets = {"noisy_circles": noisy_circles,
            "noisy_moons": noisy_moons,
            "blobs": blobs,
            "gaussian_quantiles": gaussian_quantiles}  # the generated datasets

### START CODE HERE ### (choose your dataset)
dataset = "noisy_moons"
### END CODE HERE ###

X, Y = datasets[dataset]
X, Y = X.T, Y.reshape(1, Y.shape[0])
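The reshaping at the end (`X.T` and `Y.reshape(1, m)`) converts sklearn's row-per-sample output into the (n_x, m) / (1, m) layout the network expects; a minimal sketch with make_moons:

```python
import numpy as np
import sklearn.datasets

# make_moons returns X with one sample per row and a flat label vector
Xm, Ym = sklearn.datasets.make_moons(n_samples=200, noise=0.2, random_state=1)
print(Xm.shape, Ym.shape)  # (200, 2) (200,)

# Transpose / reshape into the column-per-sample convention used above
Xm, Ym = Xm.T, Ym.reshape(1, Ym.shape[0])
print(Xm.shape, Ym.shape)  # (2, 200) (1, 200)
```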
The distributions of the datasets generated above are shown below:
Choosing one of these distributions (here, noisy_moons) and running the algorithm above gives the following result: