[Machine Learning 02] Supervised learning and unsupervised learning

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 22:33

1.Housing price prediction


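In the figure, the horizontal axis is the house area and the vertical axis is the house price. Some area and price data have already been collected and are marked with red X's. What price should we predict for a house with an area of 750 square feet? We can fit a function to the set of points and then evaluate the fitted function at 750 square feet to obtain the predicted price. Because the price is a continuous value, this is a regression problem.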
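The fitting step can be sketched as follows. This is only a minimal illustration that assumes a straight-line model fitted by ordinary least squares; the data points, and the names `fit_line` and `predict`, are made up for the example and are not taken from the figure.

```python
# Hypothetical (area in square feet, price in thousands of dollars) data.
# These numbers are invented for illustration; they are not the figure's data.
areas = [500.0, 1000.0, 1500.0, 2000.0]
prices = [150.0, 250.0, 350.0, 450.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(area, slope, intercept):
    """Evaluate the fitted line at a given area."""
    return slope * area + intercept


slope, intercept = fit_line(areas, prices)
predicted_price = predict(750.0, slope, intercept)
print(f"Predicted price for 750 sq ft: {predicted_price:.1f} (thousand dollars)")
```

A straight line is only one possible choice of function; a curve such as a quadratic could be fitted to the same points and might give a different prediction.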

2.Cancer prediction

[Two figures omitted]

3.Supervised learning

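Supervised learning is characterized by a training set that is given in advance. From the existing training samples (known inputs together with their corresponding outputs) we train an optimal model; the model belongs to some set of functions, and "optimal" means best under some evaluation criterion. The model is then used to map new inputs to outputs. When the output is a discrete category, the task is classification, and the model gains the ability to classify previously unseen data; when the output is a continuous value, as in the housing price example, the task is regression.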
example:
You’re running a company, and you want to develop learning algorithms to address each of two problems.

Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.
Problem 2: You’d like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.
Should you treat these as classification or as regression problems?

  • Treat both as classification problems.
  • Treat problem 1 as a classification problem, problem 2 as a regression problem.
  • Treat problem 1 as a regression problem, problem 2 as a classification problem.
  • Treat both as regression problems.
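In the quiz above, Problem 1 asks for a number (how many items will sell), which makes it a regression problem, while Problem 2 asks for a yes/no decision per account, which makes it a classification problem. The classification case can be sketched with a 1-nearest-neighbor rule, chosen here only because it is short. The two features (failed logins per day, number of distinct login countries), the training data, and the name `classify` are hypothetical.

```python
import math

# Hypothetical labeled training set:
# (failed_logins_per_day, distinct_login_countries) -> label.
# All values are invented for illustration.
training = [
    ((0.0, 1.0), "ok"),
    ((1.0, 1.0), "ok"),
    ((2.0, 1.0), "ok"),
    ((9.0, 4.0), "compromised"),
    ((12.0, 3.0), "compromised"),
    ((15.0, 5.0), "compromised"),
]


def classify(features, examples):
    """Return the label of the training example closest to `features`."""
    nearest = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


# Classify two accounts that are not in the training set.
print(classify((0.5, 1.0), training))
print(classify((11.0, 4.0), training))
```

The labels in `training` are what make this supervised: the rule can only answer "ok" or "compromised" because someone supplied those answers for the training accounts.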

4.Unsupervised learning

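In unsupervised learning, the labels of the data are unknown: there are no training samples with known outputs. Compare this with everyday life. As children, adults teach us what a cat is and what a tree is, and we subconsciously remember the features of cats and trees; later we use the features stored in our minds (known training samples) to recognize which things are cats and which are trees. That is typical classification. Unsupervised learning is more like visiting an art exhibition: we do not know in advance which works are on display (no labeled training set), but after the visit, based on what we noticed along the way (self-directed learning), we naturally sort the works into landscape paintings and calligraphy (clustering).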

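Examples of clustering applications:

  1. For example, searching Baidu for the query 习近平访英 (Xi Jinping's visit to the UK)
    The news items marked with green boxes come from different news sites but share the same topic. Using a clustering algorithm, which is a form of unsupervised learning, the search engine groups links to pages on the same topic together and returns better search results to the user.
  2. Analysis of groups of people in social networks
  3. Market segmentation of users
  4. Analysis of very large-scale astronomical data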
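Grouping unlabeled items, as in the applications above, can be sketched with k-means (Lloyd's algorithm). This is one common clustering method, used here purely as an illustration; the source does not say which algorithm a search engine uses. The 2-D points, the starting centroids, and the names `nearest` and `kmeans` are assumptions made for the example.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        # A centroid with no members stays where it is.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# Starting centroids picked by hand for reproducibility (an assumption).
centroids, labels = kmeans(points, [points[0], points[3]])
print(labels)
print(centroids)
```

No labels are supplied anywhere in the input: the algorithm only receives the points and the number of groups, and the group indices in `labels` carry no meaning beyond "these points belong together".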

An example of distinguishing supervised learning from unsupervised learning:
Of the following examples, which would you address using an unsupervised learning algorithm? (Check all that apply.)

  • Given a database of customer data, automatically discover market segments and group customers into different market segments.(unsupervised)
  • Given email labeled as spam/not spam, learn a spam filter.(supervised)
  • Given a set of news articles found on the web, group them into sets of articles about the same story.(unsupervised)
  • Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.(supervised)