Machine Learning

Source: Internet. Editor: 程序博客网. Time: 2024/05/19 21:03

Introduction

5 questions

1. (1 point)

A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. Suppose we feed a learning algorithm a lot of historical weather data, and have it learn to predict weather. In this setting, what is T?

The process of the algorithm examining a large amount of historical weather data.

None of these.

The weather prediction task.

Correct

The probability of it correctly predicting a future date's weather.

2. (1 point)

Suppose you are working on weather prediction, and use a learning algorithm to predict tomorrow's temperature (in degrees Centigrade/Fahrenheit). Would you treat this as a classification or a regression problem?

Regression

Correct

Classification

3. (1 point)

Suppose you are working on stock market prediction. Typically tens of millions of shares of Microsoft stock are traded (i.e., bought/sold) each day. You would like to predict the number of Microsoft shares that will be traded tomorrow. Would you treat this as a classification or a regression problem?

Regression

Correct

Classification

4. (1 point)

Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.

Take a collection of 1000 essays written on the US Economy, and find a way to automatically group these essays into a small number of groups of essays that are somehow "similar" or "related".

Given 50 articles written by male authors, and 50 articles written by female authors, learn to predict the gender of a new manuscript's author (when the identity of this author is unknown).

Correct

Examine a large collection of emails that are known to be spam email, to discover if there are sub-types of spam mail.

Given historical data of children's ages and heights, predict children's height as a function of their age.

Correct

5. (1 point)

Which of these is a reasonable definition of machine learning?

Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.

Correct

Machine learning is the field of allowing robots to act intelligently.

Machine learning learns from labeled data.

Machine learning is the science of programming computers.


Linear Regression with One Variable

5 questions

1. (1 point)

Consider the problem of predicting how well a student does in her second year of college/university, given how well she did in her first year.

Specifically, let $x$ be equal to the number of "A" grades (including A-, A, and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of $y$, which we define as the number of "A" grades they get in their second year (sophomore year).

Here each row is one training example. Recall that in linear regression, our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x$, and we use $m$ to denote the number of training examples.

For the training set given above (note that this training set may also be referenced in other questions in this quiz), what is the value of m? In the box below, please enter your answer (which should be a number between 0 and 10).

2. (1 point)

For this question, assume that we are using the training set from Q1. Recall our definition of the cost function was

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$.

What is $J(0, 1)$? In the box below, please enter your answer (simplify fractions to decimals when entering your answer, and use '.' as the decimal delimiter, e.g., 1.5).
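The cost function can be sketched in a few lines of Python. This is only an illustration: the function names `hypothesis` and `cost` are made up for this example, and the three-point training set is hypothetical, not the training set from Q1.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum of (h_theta(x_i) - y_i)^2."""
    m = len(xs)
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / (2 * m)


# Hypothetical training set (an assumption, NOT the quiz data).
xs = [1, 2, 3]
ys = [2, 2, 4]

# With theta0 = 0 and theta1 = 1 the predictions are 1, 2, 3,
# so the squared errors are 1, 0, 1 and J = 2 / (2 * 3).
j_value = cost(0, 1, xs, ys)
print(j_value)
```

For this toy data the result is 2/6, about 0.333. Note the division by 2m rather than m, which matches the definition above.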

3. (1 point)

Suppose we set $\theta_0 = 2$, $\theta_1 = 0.5$ in the linear regression hypothesis from Q1. What is $h_\theta(6)$?

4. (1 point)

Let $f$ be some function so that $f(\theta_0, \theta_1)$ outputs a number. For this problem, $f$ is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so $f$ may have local optima). Suppose we use gradient descent to try to minimize $f(\theta_0, \theta_1)$ as a function of $\theta_0$ and $\theta_1$. Which of the following statements are true? (Check all that apply.)

Even if the learning rate $\alpha$ is very large, every iteration of gradient descent will decrease the value of $f(\theta_0, \theta_1)$.

If the learning rate is too small, then gradient descent may take a very long time to converge.

If $\theta_0$ and $\theta_1$ are initialized so that $\theta_0 = \theta_1$, then by symmetry (because we do simultaneous updates to the two parameters), after one iteration of gradient descent, we will still have $\theta_0 = \theta_1$.

If $\theta_0$ and $\theta_1$ are initialized at a local minimum, then one iteration will not change their values.
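The behaviour described in these statements can be checked numerically. The sketch below assumes a specific smooth function, f(θ0, θ1) = θ0² + 2θ1², chosen only for illustration; the quiz's f is unknown, and the names `f`, `gradient`, and `gradient_step` are hypothetical.

```python
def f(theta0, theta1):
    """A hypothetical smooth function with its minimum at (0, 0)."""
    return theta0 ** 2 + 2 * theta1 ** 2


def gradient(theta0, theta1):
    """Partial derivatives of f with respect to theta0 and theta1."""
    return 2 * theta0, 4 * theta1


def gradient_step(theta0, theta1, alpha):
    """One simultaneous gradient descent update with learning rate alpha."""
    d0, d1 = gradient(theta0, theta1)
    return theta0 - alpha * d0, theta1 - alpha * d1


start = (1.0, 1.0)

# A very large learning rate overshoots, so f increases after the step.
large = gradient_step(*start, alpha=1.0)
print(f(*start), f(*large))

# A moderate learning rate decreases f, and theta0 == theta1 is not
# preserved because the two partial derivatives differ.
small = gradient_step(*start, alpha=0.1)
print(small, f(*small))

# At a minimum the gradient is zero, so the parameters do not move.
at_minimum = gradient_step(0.0, 0.0, alpha=0.1)
print(at_minimum)
```

For this particular f, a step with α = 1.0 from (1, 1) raises the value from 3 to 19, while a step started at the minimum leaves the parameters unchanged.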

5. (1 point)

Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some $\theta_0$, $\theta_1$ such that $J(\theta_0, \theta_1) = 0$.

Which of the statements below must then be true? (Check all that apply.)

Gradient descent is likely to get stuck at a local minimum and fail to find the global minimum.

For this to be true, we must have $\theta_0 = 0$ and $\theta_1 = 0$ so that $h_\theta(x) = 0$.

Our training set can be fit perfectly by a straight line, i.e., all of our training examples lie perfectly on some straight line.

For this to be true, we must have $y^{(i)} = 0$ for every value of $i = 1, 2, \ldots, m$.
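A quick numerical check may help with the statements above. It assumes the squared-error cost from the earlier question and a hypothetical training set whose points lie exactly on the line y = 1 + 2x; the helper name `cost` is made up for this sketch.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum of (theta0 + theta1*x - y)^2."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical data lying exactly on the line y = 1 + 2x (an assumption).
xs = [0, 1, 2, 3]
ys = [1 + 2 * x for x in xs]

# The line's own parameters give zero cost, although neither the
# parameters nor the y values are zero.
perfect_fit = cost(1, 2, xs, ys)
print(perfect_fit)

# The all-zero hypothesis does NOT give zero cost on this data.
zero_hypothesis = cost(0, 0, xs, ys)
print(zero_hypothesis)
```

Here J(1, 2) is 0 while J(0, 0) is 10.5, so a zero cost says only that every training example lies on the fitted line.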


Linear Algebra

1.

Let two matrices be

A=[4639],B=[2592]

What is A + B?

[6111211]

[2192]

[61167]

2.

Let x=5527

What is 2x?

5252172

[1010414]

[5252172]

3.

Let u be a 3-dimensional vector, where specifically

u=351

What is $u^T$?

153

351

[153]

4.

Let u and v be 3-dimensional vectors, where specifically

u=443

and

v=424

What is $u^T v$?

(Hint: $u^T$ is a 1x3 dimensional matrix, and $v$ can also be seen as a 3x1 matrix. The answer you want can be obtained by taking the matrix product of $u^T$ and $v$.) Do not add brackets to your answer.

  
-4
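The hint can be illustrated with a small matrix product in plain Python. The vectors below are hypothetical, not the quiz's u and v, and `matmul` is a helper written for this sketch rather than a library function.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x p), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical vectors (assumptions for illustration only).
u_transposed = [[1, -2, 3]]   # u^T as a 1x3 matrix
v = [[4], [0], [-1]]          # v as a 3x1 matrix

# A 1x3 matrix times a 3x1 matrix gives a 1x1 matrix:
# 1*4 + (-2)*0 + 3*(-1) = 1
product = matmul(u_transposed, v)
print(product[0][0])
```

The single entry of the 1x1 result is the number the question asks for, which is why the answer is entered without brackets.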
5.

Let A and B be 3x3 (square) matrices. Which of the following must necessarily hold true? Check all that apply.
