Neural Networks and Deep Learning (8)

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 16:51

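0. Preface

It is time for the weekly assignment again. Starting from week 2, however, a programming assignment has been added, which means there will be two assignments per week from now on.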

1. Exercise 1

What does a neuron compute?
A. A neuron computes a function g that scales the input x linearly (Wx + b)

B. A neuron computes the mean of all features before applying the output to an activation function

C. A neuron computes a linear function (z = Wx + b) followed by an activation function

D. A neuron computes an activation function followed by a linear function (z = Wx + b)

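Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: As Andrew Ng explains in the lecture videos, a neuron computes a linear function and then immediately applies an activation function (sigmoid or ReLU).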
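To make the two steps concrete, here is a minimal sketch of a single neuron with a sigmoid activation. The function names `sigmoid` and `neuron` and all numeric values are illustrative assumptions, not taken from the course material.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, W, b, activation=sigmoid):
    """Linear step z = Wx + b, followed by the activation a = g(z)."""
    z = np.dot(W, x) + b
    return activation(z)

# Hypothetical example values: 3 input features, one neuron.
x = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
W = np.array([[0.5, -0.25, 0.0]])     # shape (1, 3)
b = 0.0

a = neuron(x, W, b)   # z = 0.5 - 0.5 + 0.0 = 0, so a = sigmoid(0)
print(a.shape, a)
```

Swapping in a ReLU would only change the `activation` argument, for example `lambda z: np.maximum(0, z)`.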

2. Exercise 2

Which of these is the “Logistic Loss”?

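A. $\mathcal{L}^{(i)}(\hat{y}^{(i)}, y^{(i)}) = -\left(y^{(i)}\log(\hat{y}^{(i)}) + (1 - y^{(i)})\log(1 - \hat{y}^{(i)})\right)$

B. $\mathcal{L}^{(i)}(\hat{y}^{(i)}, y^{(i)}) = \max(0, y^{(i)} - \hat{y}^{(i)})$

C. $\mathcal{L}^{(i)}(\hat{y}^{(i)}, y^{(i)}) = |y^{(i)} - \hat{y}^{(i)}|^2$

D. $\mathcal{L}^{(i)}(\hat{y}^{(i)}, y^{(i)}) = |y^{(i)} - \hat{y}^{(i)}|$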

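Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: The logistic loss is also known as the log loss, i.e. the cross-entropy loss.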
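As a rough illustration of the log loss (binary cross-entropy) named in the explanation, the sketch below evaluates it per example. The function name `logistic_loss` and the sample labels and predictions are made up for this example.

```python
import numpy as np

def logistic_loss(y_hat, y):
    """Per-example log loss: -(y*log(y_hat) + (1-y)*log(1-y_hat))."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Hypothetical labels and predicted probabilities.
y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.1, 0.5])

losses = logistic_loss(y_hat, y)
print(losses)   # confident correct predictions give a small loss
```

Note that this sketch assumes `y_hat` lies strictly between 0 and 1; a practical version would usually clip the predictions to avoid `log(0)`.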

3. Exercise 3

Suppose img is a (32,32,3) array, representing a 32x32 image with 3 color channels red, green and blue. How do you reshape this into a column vector?

A. x = img.reshape((32*32,3))

B. x = img.reshape((1,32*32,*3))

C. x = img.reshape((3,32*32))

D. x = img.reshape((32*32*3,1))

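Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: What we need is a column vector, that is, an array with a single column, so the choice follows naturally.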
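A quick way to check what a column vector looks like in NumPy is to reshape a placeholder array and print its shape. The all-zeros image here is only a stand-in for real pixel data.

```python
import numpy as np

img = np.zeros((32, 32, 3))           # placeholder 32x32 RGB image

# A column vector has one column: shape (number_of_values, 1).
x = img.reshape((32 * 32 * 3, 1))
print(x.shape)

# Equivalent form that lets NumPy infer the number of rows.
x_alt = img.reshape(-1, 1)
print(x_alt.shape)
```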

4. Exercise 4

Consider the two following random arrays “a” and “b”:

a = np.random.randn(2, 3) # a.shape = (2, 3)
b = np.random.randn(2, 1) # b.shape = (2, 1)
c = a + b

What will be the shape of “c”?

A. c.shape = (2, 3)

B. c.shape = (2, 1)

C. c.shape = (3, 2)

D. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

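Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: Recall the broadcasting rules we covered earlier:

  1. NumPy compares the dimensions of the two arrays one by one, starting from the last dimension and working backward. If the corresponding dimensions are equal, or one of them (or both) equals 1, the comparison continues until it reaches the first dimension. Otherwise a ValueError is raised (e.g. "operands could not be broadcast together with shapes ...").
  2. When either dimension is 1, the other dimension, the one that is not 1, is used as the dimension of the result.

The answer to this question follows from these rules.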
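The two rules can be checked directly in NumPy. The sketch below uses the shapes from the question and also asks NumPy for the broadcast shape without performing the addition.

```python
import numpy as np

a = np.random.randn(2, 3)   # a.shape = (2, 3)
b = np.random.randn(2, 1)   # b.shape = (2, 1)

# Trailing dimensions: 3 vs 1 (compatible, result 3).
# Leading dimensions:  2 vs 2 (equal, result 2).
c = a + b
print(c.shape)

# np.broadcast reports the shape the rules produce.
broadcast_shape = np.broadcast(a, b).shape
print(broadcast_shape)
```

In effect the single column of `b` is added to every column of `a`.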

5. Exercise 5

Consider the two following random arrays “a” and “b”:

a = np.random.randn(4, 3) # a.shape = (4, 3)
b = np.random.randn(3, 2) # b.shape = (3, 2)
c = a*b

What will be the shape of “c”?

A. c.shape = (3, 3)

B. c.shape = (4,2)

C. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

D. c.shape = (4, 3)

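Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: As noted in a small aside in part 7 of this series:

When * is applied to two matrices, it expects them to have the same shape and computes the element-wise product. If the shapes differ, broadcasting is triggered; here the shapes are not compatible, so an error is raised. numpy.dot, by contrast, follows the rules of matrix multiplication.

So the operation here is an element-wise product, which requires the two matrices to have the same (or broadcast-compatible) shape; otherwise an error is raised.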
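The difference between `*` and `np.dot` for these shapes can be seen by running both. This is a small sketch; the variable names `elementwise_error` and `d` are introduced only for illustration.

```python
import numpy as np

a = np.random.randn(4, 3)   # a.shape = (4, 3)
b = np.random.randn(3, 2)   # b.shape = (3, 2)

# Element-wise product: trailing dimensions 3 and 2 differ and
# neither is 1, so broadcasting fails.
try:
    c = a * b
    elementwise_error = None
except ValueError as err:
    elementwise_error = str(err)
print(elementwise_error)

# Matrix multiplication: inner dimensions (3 and 3) match.
d = np.dot(a, b)
print(d.shape)
```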

6. Exercise 6

Suppose you have $n_x$ input features per example. Recall that $X = [x^{(1)} \; x^{(2)} \; \dots \; x^{(m)}]$. What is the dimension of X?

A. $(m, 1)$

B. $(1, m)$

C. $(m, n_x)$

D. $(n_x, m)$

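Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: Following Andrew Ng's convention, every example is stored as a column vector, i.e. one example per column, so X ends up with m columns, which determines the choice.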
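The column-per-example convention can be sketched by stacking example vectors side by side. The sizes `n_x = 4` and `m = 5` are arbitrary values chosen for illustration.

```python
import numpy as np

n_x, m = 4, 5   # hypothetical: 4 features per example, 5 examples

# Each example x^(i) is a column vector of shape (n_x, 1).
examples = [np.random.randn(n_x, 1) for _ in range(m)]

# Stacking the columns side by side gives the matrix X.
X = np.hstack(examples)
print(X.shape)

# Column i of X is the i-th example.
print(np.array_equal(X[:, [2]], examples[2]))
```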

7. Exercise 7

Recall that “np.dot(a,b)” performs a matrix multiplication on a and b, whereas “a*b” performs an element-wise multiplication.

Consider the two following random arrays “a” and “b”:

a = np.random.randn(12288, 150) # a.shape = (12288, 150)
b = np.random.randn(150, 45) # b.shape = (150, 45)
c = np.dot(a,b)

What is the shape of c?

A. c.shape = (12288, 45)

B. c.shape = (12288, 150)

C. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

D. c.shape = (150,150)

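Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: The answer follows from question 6 and from the statement of this question.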
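For 2-D arrays, `np.dot` follows the usual matrix multiplication rule: an (n, k) array times a (k, p) array gives an (n, p) array. A short check with the shapes from the question:

```python
import numpy as np

a = np.random.randn(12288, 150)   # a.shape = (12288, 150)
b = np.random.randn(150, 45)      # b.shape = (150, 45)

# Inner dimensions (150 and 150) must match; the result takes
# the outer dimensions.
c = np.dot(a, b)
print(c.shape)

# For 2-D arrays the @ operator computes the same product.
c_at = a @ b
print(np.allclose(c, c_at))
```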

8. Exercise 8

Consider the following code snippet:

# a.shape = (3,4)
# b.shape = (4,1)
for i in range(3):
  for j in range(4):
    c[i][j] = a[i][j] + b[j]

How do you vectorize this?

A. c = a + b.T

B. c = a.T + b.T

C. c = a.T + b

D. c = a + b

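Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: From the code, c ends up with the same shape as a, so a must not be transposed. By the broadcasting rules, b then has to be transposed so that its shape (1, 4) lines up with the columns of a. This determines the choice.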
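One way to convince yourself is to run the explicit loop and the broadcast version side by side and compare. In this sketch `c_loop` and `c_vec` are illustrative names, and the loop uses `b[j, 0]` rather than `b[j]` so that a plain scalar is stored in each cell.

```python
import numpy as np

a = np.random.randn(3, 4)   # a.shape = (3, 4)
b = np.random.randn(4, 1)   # b.shape = (4, 1)

# Explicit double loop from the question.
c_loop = np.zeros((3, 4))
for i in range(3):
    for j in range(4):
        c_loop[i][j] = a[i][j] + b[j, 0]

# b.T has shape (1, 4), so it broadcasts across the 3 rows of a.
c_vec = a + b.T
print(c_vec.shape)
print(np.allclose(c_loop, c_vec))
```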

9. Exercise 9

Consider the following code:

a = np.random.randn(3, 3)
b = np.random.randn(3, 1)
c = a*b

What will be c? (If you’re not sure, feel free to run this in python to find out).

A. This will invoke broadcasting, so b is copied three times to become (3,3), and ∗ is an element-wise product so c.shape will be (3, 3)

B. This will invoke broadcasting, so b is copied three times to become (3, 3), and ∗ invokes a matrix multiplication operation of two 3x3 matrices so c.shape will be (3, 3)

C. This will multiply a 3x3 matrix a with a 3x1 vector, thus resulting in a 3x1 vector. That is, c.shape = (3,1).

D. It will lead to an error since you cannot use “*” to operate on these two matrices. You need to instead use np.dot(a,b)

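Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: From question 5 we know that the two matrices here are compatible, so broadcasting is triggered. And * denotes the element-wise product, which determines the choice.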
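As the question suggests, this is easy to run. The sketch below also builds an explicit copy of `b` with `np.tile` (a name and step added here only for comparison) to show what broadcasting does implicitly.

```python
import numpy as np

a = np.random.randn(3, 3)
b = np.random.randn(3, 1)

# Broadcasting stretches b across the columns, then * multiplies
# element by element.
c = a * b
print(c.shape)

# Explicitly copy b three times along the column axis.
b_tiled = np.tile(b, (1, 3))   # shape (3, 3)
print(np.array_equal(c, a * b_tiled))
```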

10. Exercise 10

Consider the following computation graph. (Once again I had to ask a classmate abroad to download it for me.)
[Figure: computation graph]

What is the output J?

A. J = (c - 1)*(b + a)

B. J = (a - 1) * (b + c)

C. J = a*b + b*c + a*c

D. J = (b - 1) * (c + a)

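Reference answer: the Coursera Honor Code does not allow publishing answers.

Explanation: This is a simple question; a few steps of simplification give the answer.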
