Neural Networks and Deep Learning (8)

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 16:51

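0. Preface

It is time for the weekly assignment again. Starting from week 2, though, a programming assignment has been added, so from now on there will be two assignments per week.

1. Exercise 1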

What does a neuron compute?
A. A neuron computes a function g that scales the input x linearly (Wx + b)

B. A neuron computes the mean of all features before applying the output to an activation function

C. A neuron computes a linear function (z = Wx + b) followed by an activation function

D. A neuron computes an activation function followed by a linear function (z = Wx + b)

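Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: As Andrew Ng explains in the lecture videos, a neuron computes a linear function and then immediately applies an activation function (sigmoid or ReLU).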
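To make the "linear function, then activation" order concrete, here is a minimal sketch of a single neuron in NumPy. The function names `sigmoid` and `neuron`, and the example values, are assumptions chosen for illustration; they are not taken from the course material.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, one common choice of activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, W, b, activation=sigmoid):
    """Compute the linear part z = Wx + b first, then apply the activation."""
    z = np.dot(W, x) + b
    return activation(z)

# Hypothetical example: 3 input features feeding one neuron.
x = np.array([[1.0], [2.0], [3.0]])  # column vector, shape (3, 1)
W = np.zeros((1, 3))                 # weights, shape (1, 3)
b = 0.0
a = neuron(x, W, b)
print(a.shape)
```

With all-zero weights the linear part is 0, so the sigmoid output sits at its midpoint; swapping in another activation only changes the `activation` argument.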

2. Exercise 2

Which of these is the “Logistic Loss”?

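A. $L^{(i)}(\hat{y}^{(i)}, y^{(i)}) = -\left(y^{(i)} \log(\hat{y}^{(i)}) + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)})\right)$

B. $L^{(i)}(\hat{y}^{(i)}, y^{(i)}) = \max(0, y^{(i)} - \hat{y}^{(i)})$

C. $L^{(i)}(\hat{y}^{(i)}, y^{(i)}) = |y^{(i)} - \hat{y}^{(i)}|^2$

D. $L^{(i)}(\hat{y}^{(i)}, y^{(i)}) = |y^{(i)} - \hat{y}^{(i)}|$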

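Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: The logistic loss is also called log loss, i.e. the cross-entropy loss.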
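As a rough illustration of the cross-entropy form, the sketch below evaluates the per-example loss on a few made-up predictions. The helper name `logistic_loss` and the sample values are hypothetical, and the code assumes every prediction lies strictly between 0 and 1 so the logarithms are defined.

```python
import numpy as np

def logistic_loss(y_hat, y):
    """Per-example logistic (cross-entropy) loss; assumes 0 < y_hat < 1."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Made-up labels and predictions, for illustration only.
y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.1, 0.5])
losses = logistic_loss(y_hat, y)
print(losses)
```

Confident, correct predictions (the first two entries) give a small loss, while the uncertain prediction of 0.5 is penalised more.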

3. Exercise 3

Suppose img is a (32,32,3) array, representing a 32x32 image with 3 color channels red, green and blue. How do you reshape this into a column vector?

A. x = img.reshape((32*32,3))

B. x = img.reshape((1,32*32,*3))

C. x = img.reshape((3,32*32))

D. x = img.reshape((32*32*3,1))

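Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: We need a column vector, that is, an array with a single column, so the choice follows naturally.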
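A quick way to check a reshape is to print the resulting shape. The sketch below uses a random array as a stand-in for the image; a column vector in NumPy is a 2-D array whose second dimension is 1.

```python
import numpy as np

# Random stand-in for a 32x32 RGB image.
img = np.random.randn(32, 32, 3)

# Flatten all 32*32*3 values into a single column.
x = img.reshape((32 * 32 * 3, 1))
print(x.shape)

# Letting NumPy infer the first dimension gives the same shape.
x_inferred = img.reshape(-1, 1)
print(x_inferred.shape)
```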

4. Exercise 4

Consider the two following random arrays “a” and “b”:

a = np.random.randn(2, 3) # a.shape = (2, 3)
b = np.random.randn(2, 1) # b.shape = (2, 1)
c = a + b

What will be the shape of “c”?

A. c.shape = (2, 3)

B. c.shape = (2, 1)

C. c.shape = (3, 2)

D. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

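Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: As with the broadcasting rule we covered earlier:

NumPy compares the dimensions of the two arrays one by one, starting from the last and working toward the first. The comparison continues as long as the corresponding dimensions are equal or at least one of them is 1, until the leading dimension is reached. Otherwise a ValueError is raised (e.g. "operands could not be broadcast together with shapes …").

When one of the two dimensions is 1, the other dimension is used as the size of the result along that axis.

That settles which option to choose for this question.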
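The rule above can be sketched by hand and compared against what NumPy actually does. The helper `broadcast_shape` is a hypothetical function written for this illustration (it is not a NumPy API), and it treats missing leading dimensions as 1.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the trailing-dimension rule by hand (illustrative helper)."""
    result = []
    # Walk both shapes from the last dimension toward the first.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        da = shape_a[-i] if i <= len(shape_a) else 1
        db = shape_b[-i] if i <= len(shape_b) else 1
        if da == db or da == 1 or db == 1:
            # When one side is 1, the other side's size is used.
            result.append(db if da == 1 else da)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(reversed(result))

a = np.random.randn(2, 3)
b = np.random.randn(2, 1)
c = a + b
print(broadcast_shape(a.shape, b.shape), c.shape)
```

Running the helper on a pair of shapes before doing the arithmetic is a cheap way to predict whether an operation will broadcast or raise.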

5. Exercise 5

Consider the two following random arrays “a” and “b”:

a = np.random.randn(4, 3) # a.shape = (4, 3)
b = np.random.randn(3, 2) # b.shape = (3, 2)
c = a*b

What will be the shape of “c”?

A. c.shape = (3, 3)

B. c.shape = (4,2)

C. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

D. c.shape = (4, 3)

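Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: As with a small point we covered in part 7:

When * joins two matrices, the two matrices are expected to have the same shape, and the result is their element-wise product. If the shapes differ, broadcasting is attempted; here the shapes are incompatible, so an error is raised. numpy.dot, by contrast, follows the rules of matrix multiplication.

So this question uses the element-wise product, which needs the two shapes to match (or at least be broadcast-compatible); otherwise an error is raised.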
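The difference between `*` and `np.dot` is easy to see by trying both on the same pair of arrays. This is only a small sketch; the flag name `elementwise_ok` is an assumption introduced for the example.

```python
import numpy as np

a = np.random.randn(4, 3)
b = np.random.randn(3, 2)

try:
    c = a * b  # element-wise: (4, 3) and (3, 2) are not broadcast-compatible
    elementwise_ok = True
except ValueError as err:
    elementwise_ok = False
    print("a * b failed:", err)

# Matrix product: the inner dimensions (3 and 3) match.
d = np.dot(a, b)
print(d.shape)
```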

6. Exercise 6

Suppose you have $n_x$ input features per example. Recall that $X = [x^{(1)} x^{(2)} \ldots x^{(m)}]$. What is the dimension of X?

A. $(m, 1)$

B. $(1, m)$

C. $(m, n_x)$

D. $(n_x, m)$

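Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: In Andrew Ng's convention, every example is stored as a column vector, i.e. one example per column, so there are m columns in the end, which determines the choice.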
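The one-example-per-column layout can be sketched by stacking a few column vectors side by side. The sizes `n_x = 5` and `m = 4` are arbitrary values picked for the illustration.

```python
import numpy as np

n_x, m = 5, 4  # hypothetical sizes: 5 features, 4 examples

# Each example is a column vector of shape (n_x, 1).
examples = [np.random.randn(n_x, 1) for _ in range(m)]

# Placing the columns side by side gives one column per example.
X = np.hstack(examples)
print(X.shape)
```

Column `i` of `X` is then exactly the `i`-th example, which is what makes vectorized expressions over all examples possible.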

7. Exercise 7

Recall that “np.dot(a,b)” performs a matrix multiplication on a and b, whereas “a*b” performs an element-wise multiplication.

Consider the two following random arrays “a” and “b”:

a = np.random.randn(12288, 150) # a.shape = (12288, 150)
b = np.random.randn(150, 45) # b.shape = (150, 45)
c = np.dot(a,b)

What is the shape of c?

A. c.shape = (12288, 45)

B. c.shape = (12288, 150)

C. The computation cannot happen because the sizes don’t match. It’s going to be “Error”!

D. c.shape = (150,150)

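Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: From Exercise 6 and the statement of this question, the answer is clear.

8. Exercise 8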

Consider the following code snippet:

# a.shape = (3,4)
# b.shape = (4,1)
for i in range(3):
  for j in range(4):
    c[i][j] = a[i][j] + b[j]

How do you vectorize this?

A. c = a + b.T

B. c = a.T + b.T

C. c = a.T + b

D. c = a + b

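Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: The code shows that c ends up with the same shape as a, so a does not need to be transposed; by the broadcasting mechanism, b has to be transposed. That determines the choice.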
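One way to gain confidence in a vectorized rewrite is to run the loop and the one-liner on the same data and compare the results. The sketch below does that; the names `c_loop` and `c_vec` are assumptions, and the loop reads `b[j, 0]` (the scalar inside the column) rather than `b[j]` so that a plain number is assigned to each element.

```python
import numpy as np

a = np.random.randn(3, 4)
b = np.random.randn(4, 1)

# Explicit double loop, mirroring the snippet in the question.
c_loop = np.zeros((3, 4))
for i in range(3):
    for j in range(4):
        c_loop[i][j] = a[i][j] + b[j, 0]

# Vectorized form: b.T has shape (1, 4) and is broadcast across the 3 rows.
c_vec = a + b.T
print(np.allclose(c_loop, c_vec))
```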

9. Exercise 9

Consider the following code:

a = np.random.randn(3, 3)
b = np.random.randn(3, 1)
c = a*b

What will be c? (If you’re not sure, feel free to run this in python to find out).

A. This will invoke broadcasting, so b is copied three times to become (3,3), and ∗ is an element-wise product so c.shape will be (3, 3)

B. This will invoke broadcasting, so b is copied three times to become (3, 3), and ∗ invokes a matrix multiplication operation of two 3x3 matrices so c.shape will be (3, 3)

C. This will multiply a 3x3 matrix a with a 3x1 vector, thus resulting in a 3x1 vector. That is, c.shape = (3,1).

D. It will lead to an error since you cannot use “*” to operate on these two matrices. You need to instead use np.dot(a,b)

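Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: As Exercise 5 shows, the two matrices here are compatible, so broadcasting is triggered. And * denotes the element-wise product, which determines the choice.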
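The "b is copied" picture of broadcasting can be checked by making the copies explicitly with `np.tile` and comparing. This is a sketch only; NumPy does not necessarily materialise such copies internally, and the name `b_copied` is introduced just for the example.

```python
import numpy as np

a = np.random.randn(3, 3)
b = np.random.randn(3, 1)

# Broadcasting: b is stretched along its second axis to match a.
c = a * b

# Explicit version: repeat the column 3 times to get shape (3, 3).
b_copied = np.tile(b, (1, 3))
c_explicit = a * b_copied

print(c.shape, np.allclose(c, c_explicit))

# For contrast, the matrix product of the same arrays has a different shape.
print(np.dot(a, b).shape)
```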

10. Exercise 10

Consider the following computation graph. (Once again I had to ask a classmate abroad to download it for me.)
[Figure: computation graph; image not available]

What is the output J?

A. J = (c - 1)*(b + a)

B. J = (a - 1) * (b + c)

C. J = a*b + b*c + a*c

D. J = (b - 1) * (c + a)

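Reference answer: the Coursera Honor Code does not allow answers to be published.

Explanation: This is a very simple question; a few steps of simplification give the answer.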
