Data Mining: Practical Machine Learning Tools and Techniques, reading notes (6)

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 19:24

Among classification methods there is a family called covering algorithms (覆盖算法 in Chinese).

Suppose we want to recommend hard or soft contact lenses (hard or soft lens) to different groups of people. Given a set of labeled examples, we can derive rules such as:

if age = young and astigmatism = yes and tear production rate = normal then recommendation = hard


age = young 2/8
age = pre-presbyopic 1/8
age = presbyopic 1/8
spectacle prescription = myope 3/12
spectacle prescription = hypermetrope 1/12
astigmatism = no 0/12
astigmatism = yes 4/12
tear production rate = reduced 0/12
tear production rate = normal 4/12

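To explain the list above: each line gives p/t, where t is the number of instances that match the test and p is how many of those were recommended hard lenses. For example, age = young matches 8 instances, 2 of which use hard lenses. The question we want to answer is: under what conditions is the recommendation certain to be hard?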
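As a small illustration of how these counts can be used, the sketch below picks the test with the highest p/t ratio from the list above. The counts are copied from the list; the names `counts` and `best_test`, and the tie-breaking choice (larger p first, then the earlier entry), are assumptions made for this example rather than something the notes specify.

```python
from fractions import Fraction

# (p, t) pairs copied from the list above: p = instances with
# recommendation = hard, t = instances matching the test.
counts = {
    "age = young": (2, 8),
    "age = pre-presbyopic": (1, 8),
    "age = presbyopic": (1, 8),
    "spectacle prescription = myope": (3, 12),
    "spectacle prescription = hypermetrope": (1, 12),
    "astigmatism = no": (0, 12),
    "astigmatism = yes": (4, 12),
    "tear production rate = reduced": (0, 12),
    "tear production rate = normal": (4, 12),
}


def best_test(counts):
    """Return the test with the highest p/t; ties go to the larger p,
    then to whichever entry appears first (an arbitrary choice)."""
    return max(counts, key=lambda test: (Fraction(*counts[test]), counts[test][0]))


print(best_test(counts))  # astigmatism = yes
```

Both astigmatism = yes and tear production rate = normal reach 4/12, so the choice between them is arbitrary; here the earlier entry wins.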

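The algorithm resembles decision tree learning: it tries each attribute value in turn and keeps adding conditions until it finds a path whose outcome is pure, i.e. p/t = 1/1 (100%). There is still a difference: a covering rule always ends in a definite result, even if that means splitting the data very finely (provided the data contain no contradictory instances). A decision tree chooses its splits by entropy, and a leaf may still contain several classes, in which case the majority class is taken as the prediction.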
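The procedure described above can be sketched roughly as follows. This is a minimal sketch, not the book's or any library's implementation. It assumes that the 24 rows are the standard contact lens data set (transcribed here from memory, encoded as one class letter per attribute combination), and that ties are broken by larger p and then by attribute order. The names `grow_rule` and `cover` are hypothetical.

```python
from itertools import product

ATTRS = ["age", "spectacle prescription", "astigmatism", "tear production rate"]
VALUES = [
    ["young", "pre-presbyopic", "presbyopic"],
    ["myope", "hypermetrope"],
    ["no", "yes"],
    ["reduced", "normal"],
]
# Assumed data: one class letter per attribute combination, in
# itertools.product order (n = none, s = soft, h = hard).
LABELS = "nsnhnsnh" + "nsnhnsnn" + "nnnhnsnn"
NAMES = {"n": "none", "s": "soft", "h": "hard"}
DATA = [
    (dict(zip(ATTRS, combo)), NAMES[letter])
    for combo, letter in zip(product(*VALUES), LABELS)
]


def grow_rule(data, target):
    """Greedily add attribute = value tests until every covered row
    has class `target` (or no attributes are left)."""
    rule, covered = {}, data
    while any(label != target for _, label in covered) and len(rule) < len(ATTRS):
        best, best_key = None, (-1.0, -1)
        for attr, values in zip(ATTRS, VALUES):
            if attr in rule:
                continue
            for value in values:
                subset = [row for row in covered if row[0][attr] == value]
                if not subset:
                    continue
                p = sum(1 for _, label in subset if label == target)
                key = (p / len(subset), p)  # accuracy first, then coverage
                if key > best_key:
                    best, best_key = (attr, value), key
        rule[best[0]] = best[1]
        covered = [row for row in covered if row[0][best[0]] == best[1]]
    return rule, covered


def cover(data, target):
    """Learn rules for `target` until all of its rows are covered."""
    rules, remaining = [], list(data)
    while any(label == target for _, label in remaining):
        rule, covered = grow_rule(remaining, target)
        rules.append(rule)
        remaining = [row for row in remaining if row not in covered]
    return rules


for rule in cover(DATA, "hard"):
    tests = " and ".join(f"{attr} = {value}" for attr, value in rule.items())
    print(f"if {tests} then recommendation = hard")
```

Under these assumptions, the second rule found is the one quoted at the start of these notes (age = young and astigmatism = yes and tear production rate = normal), and the first-step counts reproduce the p/t list above.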

The implementation details are skipped for now.

0 0

data mining - 实用机器学习与技术 读书笔记（六）

data mining - 实用机器学习与技术读书笔记（六）