[Andrew Ng] Machine Learning Course: Week 1 Notes

Source: Internet. Editor: 程序博客网. Date: 2024/05/16 06:35

Definition of Machine Learning

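Tom Mitchell (1998):
  A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E.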
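
To make E, T, and P concrete, here is a minimal sketch of a toy learner. Everything in it is an assumption made for illustration and does not come from the course: the task (deciding whether a number is at least 70), the hand-picked examples, and the helper names true_label, learn_threshold, and accuracy. On this data the accuracy rises as more examples are seen; that holds by construction here and is not guaranteed in general.

```python
# Task T: label a number 1 if it is "large", else 0.
# The true rule (x >= 70) is hidden from the learner; it only sees examples.
def true_label(x):
    return 1 if x >= 70 else 0

def learn_threshold(examples):
    """Use experience E (labelled pairs) to estimate the decision threshold
    as the midpoint between the largest 0-example and the smallest 1-example."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    return (max(zeros) + min(ones)) / 2

def accuracy(threshold, test_xs):
    """Performance measure P: fraction of test inputs classified correctly."""
    correct = sum((1 if x >= threshold else 0) == true_label(x) for x in test_xs)
    return correct / len(test_xs)

test_xs = list(range(100))
# Hypothetical experience, ordered so that later examples lie closer to the boundary.
experience = [(x, true_label(x)) for x in [0, 99, 30, 95, 50, 85, 66, 72]]

scores = []
for n in (2, 4, 6, 8):  # growing amounts of experience E
    threshold = learn_threshold(experience[:n])
    scores.append(accuracy(threshold, test_xs))

print(scores)
```

With 2, 4, 6, and 8 examples the learner scores 0.8, 0.93, 0.98, and 0.99, so in Mitchell's terms the program has learned: its performance on T, measured by P, improved with E.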

Supervised Learning

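Definition:
  In supervised learning, the algorithm is given a dataset in which the correct answer (label) is already provided for each training example, and it learns to predict that answer for new inputs.
Includes:
  1. Classification: predict a discrete-valued output
  2. Regression: predict a continuous-valued output
Two examples:
  House price prediction (regression), cancer diagnosis (classification)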
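
The two problem types can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the house and tumor numbers are made up, and the methods (a least-squares line for regression, a 1-nearest-neighbor rule for classification) were chosen for brevity and are not necessarily the ones the course uses for these examples. The names fit_line and classify_1nn are hypothetical helpers.

```python
# Regression: predict a continuous value (price) from house size.
sizes = [50.0, 80.0, 100.0, 120.0]      # hypothetical sizes
prices = [150.0, 240.0, 300.0, 360.0]   # hypothetical prices (the "right answers")

def fit_line(xs, ys):
    """Ordinary least squares for a line y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return w, b

w, b = fit_line(sizes, prices)
predicted_price = w * 90.0 + b  # continuous-valued output for an unseen house

# Classification: predict a discrete label (0 = benign, 1 = malignant)
# from tumor size, using the label of the closest training example.
tumors = [(1.0, 0), (1.5, 0), (2.0, 0), (3.5, 1), (4.0, 1), (5.0, 1)]  # hypothetical

def classify_1nn(x, examples):
    """Return the label of the training example whose size is nearest to x."""
    nearest = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

small_tumor_label = classify_1nn(1.8, tumors)
large_tumor_label = classify_1nn(4.4, tumors)

print(predicted_price, small_tumor_label, large_tumor_label)
```

In both halves the training data carries the correct answer for every example; the only difference is whether the predicted output is a number on a continuous scale or one of a fixed set of classes.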

Unsupervised Learning

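Definition:
  The training samples carry no label information: neither class information nor target values are given.
Typical example:
  The cocktail party problem (separating mixed audio sources)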
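
Separating audio sources takes more machinery than fits in a short note, so the sketch below swaps in a simpler unsupervised task: clustering unlabelled 1-D points with a bare-bones two-cluster k-means. The data, the function name kmeans_1d, and the choice to start the centers at the minimum and maximum point are all assumptions made for illustration.

```python
def kmeans_1d(points, iterations=10):
    """Two-cluster k-means on 1-D data. No labels are used anywhere."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearer center.
        clusters = [[], []]
        for p in points:
            idx = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 0.5, 5.0, 5.5, 4.5]  # hypothetical unlabelled measurements
centers, clusters = kmeans_1d(points)
print(centers, clusters)
```

The algorithm is never told which group a point belongs to; the two groups emerge from the structure of the data alone, which is what separates this setting from supervised learning.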

Quiz Question I Got Wrong

Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.

A. Given data on how 1000 medical patients respond to an experimental drug (such as effectiveness of the treatment, side effects, etc.), discover whether there are different categories or “types” of patients in terms of how they respond to the drug, and if so what these categories are.
B. Have a computer examine an audio clip of a piece of music, and classify whether or not there are vocals (i.e., a human voice singing) in that audio clip, or if it is a clip of only musical instruments (and no vocals).
C. Given genetic (DNA) data from a person, predict the odds of him/her developing diabetes over the next 10 years.
D. Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatments.

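Explanation: I selected A and B, but the correct answer is B and C. A comes with no label information, neither class information nor a target value, so A is unsupervised learning (a clustering problem, as is D). B and C each have a known target to predict: whether the clip contains vocals, and the odds of developing diabetes.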