Andrew Ng ML Coursera Quiz 1
Source: Internet. Editor: 程序博客网. Time: 2024/05/22 01:35
1.
Consider the problem of predicting how well a student does in their second year of college/university, given how well they did in their first year.
Specifically, let x be equal to the number of “A” grades (including A-, A and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of y, which we define as the number of “A” grades they get in their second year (sophomore year).
Questions 1 through 4 will use the following training set of a small sample of different students’ performances. Here each row is one training example. Recall that in linear regression, our hypothesis is hθ(x)=θ0+θ1x, and we use m to denote the number of training examples.
For the training set given above, what is the value of m? In the box below, please enter your answer (which should be a number between 0 and 10).
2.
Consider the following training set of m=4 training examples:
x y
1 0.5
2 1
4 2
0 0
Consider the linear regression model hθ(x)=θ0+θ1x. What are the values of θ0 and θ1 that you would expect to obtain upon running gradient descent on this model? (Linear regression will be able to fit this data perfectly.)
θ0=0.5,θ1=0.5
θ0=0,θ1=0.5
θ0=1,θ1=1
θ0=1,θ1=0.5
θ0=0.5,θ1=0
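The gradient descent run described in this question can be sketched as follows. This is a minimal illustration, not the course's reference code: the learning rate `alpha`, the iteration count, and the zero starting point are assumptions chosen so that the run settles on this small data set.

```python
# Training set from the question: (x, y) pairs.
data = [(1, 0.5), (2, 1.0), (4, 2.0), (0, 0.0)]


def gradient_descent(data, alpha=0.05, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x.

    alpha and iterations are assumed values, not taken from the quiz.
    """
    m = len(data)
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    for _ in range(iterations):
        # Prediction error on every training example.
        errors = [(theta0 + theta1 * x) - y for x, y in data]
        # Partial derivatives of the squared-error cost.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, (x, _) in zip(errors, data)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


theta0, theta1 = gradient_descent(data)
print(round(theta0, 4), round(theta1, 4))
```

Because the data lie exactly on a line, the fitted parameters can be checked by substituting each x from the table into θ0+θ1x and comparing the result with y.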
3.
Suppose we set θ0=−1,θ1=2. What is hθ(6)?
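Evaluating the hypothesis is a direct substitution into hθ(x)=θ0+θ1x. A small sketch, where the function name `h` is only an illustrative choice:

```python
def h(x, theta0, theta1):
    """Linear regression hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


value = h(6, theta0=-1, theta1=2)
print(value)  # -1 + 2 * 6 = 11
```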
4.
Let f be some function so that f(θ0,θ1) outputs a number. For this problem, f is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so f may have local optima). Suppose we use gradient descent to try to minimize f(θ0,θ1) as a function of θ0 and θ1. Which of the following statements are true? (Check all that apply.)
No matter how θ0 and θ1 are initialized, so long as α is sufficiently small, we can safely expect gradient descent to converge to the same solution.
Setting the learning rate α to be very small is not harmful, and can only speed up the convergence of gradient descent.
If θ0 and θ1 are initialized at the global minimum, then one iteration will not change their values.
If the first few iterations of gradient descent cause f(θ0,θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value.
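The statements above can be explored numerically. The sketch below uses a hypothetical non-convex function, f(θ0,θ1) = (θ0² − 1)² + θ1², chosen only because it has two separate minima at (−1, 0) and (1, 0); it is not the f from the quiz, and the learning rates and step counts are likewise assumptions.

```python
def f(t0, t1):
    """Hypothetical smooth function with minima at (-1, 0) and (1, 0)."""
    return (t0 ** 2 - 1) ** 2 + t1 ** 2


def grad(t0, t1):
    """Partial derivatives of f with respect to t0 and t1."""
    return 4 * t0 * (t0 ** 2 - 1), 2 * t1


def descend(t0, t1, alpha, steps):
    """Plain gradient descent; also records f after every step."""
    history = [f(t0, t1)]
    for _ in range(steps):
        g0, g1 = grad(t0, t1)
        t0, t1 = t0 - alpha * g0, t1 - alpha * g1
        history.append(f(t0, t1))
    return t0, t1, history


# Different starting points can end at different minima.
left = descend(-0.5, 0.5, alpha=0.05, steps=500)
right = descend(0.5, 0.5, alpha=0.05, steps=500)
print(left[:2], right[:2])

# Starting exactly at a minimum: the gradient is zero, so nothing moves.
at_min = descend(1.0, 0.0, alpha=0.05, steps=1)
print(at_min[:2])

# A very small learning rate makes progress slow.
slow = descend(0.5, 0.5, alpha=1e-4, steps=500)
print(slow[:2])

# A learning rate that is too large makes f increase.
diverging = descend(0.5, 0.5, alpha=1.0, steps=3)
print(diverging[2])
```

Comparing `left` with `right`, and `slow` with `right`, shows how the outcome depends on both the initialization and the learning rate for this particular function.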
5.
Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some θ0, θ1 such that J(θ0,θ1)=0. Which of the statements below must then be true? (Check all that apply.)
This is not possible: By the definition of J(θ0,θ1), it is not possible for there to exist θ0 and θ1 so that J(θ0,θ1)=0.
For this to be true, we must have θ0=0 and θ1=0 so that hθ(x)=0.
We can perfectly predict the value of y even for new examples that we have not yet seen. (e.g., we can perfectly predict prices of even new houses that we have not yet seen.)
For these values of θ0 and θ1 that satisfy J(θ0,θ1)=0, we have that hθ(x(i))=y(i) for every training example (x(i),y(i)).
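What J(θ0,θ1)=0 means can be checked on a concrete case. The sketch below assumes the usual squared-error cost from the course, J(θ0,θ1) = (1/2m) Σ (hθ(x(i)) − y(i))², and reuses the four-point training set from question 2 as an illustration. The "unseen" example is hypothetical and made up for this sketch.

```python
def cost(data, theta0, theta1):
    """Squared-error cost J = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(data)
    return sum(((theta0 + theta1 * x) - y) ** 2 for x, y in data) / (2 * m)


# Training set borrowed from question 2 (an assumption for illustration).
train = [(1, 0.5), (2, 1.0), (4, 2.0), (0, 0.0)]

# A nonzero parameter pair that fits every training example exactly.
theta0, theta1 = 0.0, 0.5
j_train = cost(train, theta0, theta1)
print(j_train)

# J is a sum of squares, so it is zero only if every residual is zero.
residuals = [(theta0 + theta1 * x) - y for x, y in train]
print(residuals)

# Hypothetical unseen example: a zero training cost says nothing about it.
new_x, new_y = 3, 2.0
new_error = (theta0 + theta1 * new_x) - new_y
print(new_error)
```

Since every term in the sum is non-negative, a zero cost forces each training residual to be zero, while the error on the made-up new example is unconstrained.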