ML Wrong-Answer Collection

Source: Internet | Publisher: 天睿软件科技有限公司 | Editor: 程序博客网 | Date: 2024/04/28 19:30




Week 2


1. Suppose m=4 students have taken some class, and the class had a midterm exam and a final exam. You have collected a dataset of their scores on the two exams, which is as follows:

| midterm exam | (midterm exam)² | final exam |
|---|---|---|
| 89 | 7921 | 96 |
| 72 | 5184 | 74 |
| 94 | 8836 | 87 |
| 69 | 4761 | 78 |

You'd like to use polynomial regression to predict a student's final exam score from their midterm exam score. Concretely, suppose you want to fit a model of the form $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$, where $x_1$ is the midterm score and $x_2$ is (midterm score)². Further, you plan to use both feature scaling (dividing by the "max − min", or range, of a feature) and mean normalization.

What is the normalized feature $x_2^{(2)}$? (Hint: midterm = 72, final = 74 is training example 2.) Please round off your answer to two decimal places and enter it in the text box below.


【Explanation】Mean normalization: replace $x_i$ with $x_i - \mu_i$ so that features have approximately zero mean (do not apply this to $x_0 = 1$). Combined with range scaling:

$$ x_i := \dfrac{x_i - \mu_i}{\max - \min} $$


avg = (7921 + 5184 + 8836 + 4761) / 4 = 6675.5

answer = (5184 - 6675.5) / (8836 - 4761) = -1491.5 / 4075 ≈ -0.37
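The arithmetic above can be checked with a short Python sketch (the feature values come from the table; variable names are illustrative):

```python
# Mean normalization + range scaling for the (midterm)^2 feature x2,
# using the four training examples from the table above.
x2 = [7921, 5184, 8836, 4761]

avg = sum(x2) / len(x2)        # 6675.5
rng = max(x2) - min(x2)        # 8836 - 4761 = 4075

# Normalized x2 for training example 2 (midterm = 72, so x2 = 5184):
x2_norm = (x2[1] - avg) / rng
print(round(x2_norm, 2))       # -0.37
```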




2. Which of the following are reasons for using feature scaling?

- ✓ It speeds up gradient descent by making it require fewer iterations to get to a good solution.
- ✗ The cost function J(θ) for linear regression has no local optima.
- ✗ The magnitude of the feature values is insignificant in terms of computational cost.

【Explanation】Feature scaling speeds up gradient descent by avoiding the many extra iterations that are required when one or more features take on much larger values than the rest.
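A toy sketch (synthetic data and variable names of my own, not part of the quiz) of why scaling reduces the iteration count: when one feature has a much larger range than the others, the largest stable learning rate is tiny and gradient descent crawls, while on mean-normalized, range-scaled features a moderate rate converges in far fewer iterations.

```python
import numpy as np

def gd_iters(X, y, alpha, tol=1e-6, max_iter=50000):
    """Batch gradient descent on linear regression; return the number
    of iterations until the gradient norm drops below tol
    (max_iter if it never does)."""
    theta = np.zeros(X.shape[1])
    for i in range(1, max_iter + 1):
        grad = X.T @ (X @ theta - y) / len(y)
        if np.linalg.norm(grad) < tol:
            return i
        theta -= alpha * grad
    return max_iter

rng = np.random.default_rng(0)
x1 = rng.uniform(0, 1, 50)        # small-range feature
x2 = rng.uniform(0, 2000, 50)     # large-range feature
y = 4.0 * x1 + 0.01 * x2 + 2.0

X_raw = np.column_stack([np.ones(50), x1, x2])

# Mean normalization + range scaling of the two features (not the bias).
F = np.column_stack([x1, x2])
F = (F - F.mean(axis=0)) / (F.max(axis=0) - F.min(axis=0))
X_scaled = np.column_stack([np.ones(50), F])

# On raw features, stability forces a learning rate of roughly
# 1 / mean(x2**2); scaled features tolerate a moderate rate and
# need far fewer iterations.
print(gd_iters(X_scaled, y, alpha=1.0))
print(gd_iters(X_raw, y, alpha=1.0 / (x2 ** 2).mean()))
```

The contrast in the two printed counts is the quiz's correct option in miniature: same algorithm, same data, but the unscaled run needs orders of magnitude more iterations to reach the same tolerance.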

