Machine Learning Algorithms, Lecture 3: Types of Learning (Study Notes)
Learning with Different Output Space
Binary Classification
Applications of binary classification:
credit approve/disapprove
email spam/non-spam
patient sick/not sick
ad profitable/not profitable (will the ad make money?)
answer correct/incorrect (KDDCup 2010)
Multiclass Classification: Coin Recognition Problem
Applications of multiclass classification:
written digits => 0, 1 ... 9
pictures => apple, orange, strawberry
emails => spam, primary, social, promotion, update (Google Gmail)
Regression: Patient Recovery Prediction Problem
binary classification: patient features => sick or not
multiclass classification: patient features => which type of cancer
regression: patient features => how many days before recovery
Applications of regression:
company data => stock price (predict tomorrow's stock market)
climate data => temperature
Structured Learning: Sequence Tagging Problem (a very large multiclass problem)
multiclass classification: word => word class
structured learning:
sentence => structure(class of each word)
huge multiclass classification problem (structure = hyperclass) without 'explicit' class definition
Applications:
protein data => protein folding
speech data => speech parse tree
summary
binary classification: y={-1, +1}
multiclass classification: y={1, 2, 3 ... k}
regression: y=R
structured learning: y=structures
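The four output spaces in this summary can be sketched as four toy hypotheses that differ only in the type of value they return. This is a minimal illustration, not anything from the lecture: the function names, thresholds, coefficients, and the tiny lexicon are all made-up assumptions.

```python
from typing import List

def approve_credit(income: float, debt: float) -> int:
    """Binary classification: y in {-1, +1}."""
    return 1 if income > debt else -1  # hypothetical rule

def classify_coin(mass: float) -> int:
    """Multiclass classification: y in {1, 2, ..., K}, here K = 4."""
    thresholds = [3.0, 5.0, 6.0]  # made-up mass cut-offs
    return 1 + sum(mass > t for t in thresholds)

def days_to_recovery(severity: float) -> float:
    """Regression: y is a real number."""
    return 2.0 + 3.5 * severity  # made-up linear rule

def tag_sentence(words: List[str]) -> List[str]:
    """Structured learning: y is a whole structure (one tag per word)."""
    lexicon = {"i": "pronoun", "love": "verb", "ml": "noun"}  # toy lexicon
    return [lexicon.get(w.lower(), "unknown") for w in words]
```

For example, `classify_coin(5.5)` falls above two of the three cut-offs and so returns class 3, while `tag_sentence(["I", "love", "ML"])` returns one tag per word rather than a single label.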
Learning with Different Data Label
supervised learning
you are told what each coin is
unsupervised learning
you are not told what each coin is
unsupervised multiclass classification => clustering
Applications of clustering:
articles => topics
consumer profiles => consumer groups
unsupervised: Learning without yn
clustering: {xn} => cluster(x)
density estimation: {xn} => density(x)
outlier detection: {xn} => unusual(x)
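Two of the unsupervised tasks listed above can be sketched using only the inputs {xn}, with no yn anywhere. The sketch below assumes 1-D data (for example coin masses), uses k-means for clustering and a simple standard-deviation rule for outlier detection; both techniques and all names are illustrative choices, not taken from the lecture. Density estimation is left out to keep the example short.

```python
from statistics import mean, pstdev

def kmeans_1d(xs, centers, iters=10):
    """Clustering: move each center to the mean of the points nearest to it."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the old center when no point was assigned to it.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def is_unusual(x, xs, z=3.0):
    """Outlier detection: flag x if it lies more than z standard deviations from the mean of xs."""
    mu, sigma = mean(xs), pstdev(xs)
    return sigma > 0 and abs(x - mu) > z * sigma
```

With masses such as `[1.0, 1.1, 0.9, 5.0, 5.2, 4.8]` and initial centers `[0.0, 6.0]`, the two centers settle near 1.0 and 5.0, which corresponds to discovering two coin groups without being told the denominations.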
semi-supervised: Coin Recognition with Some yn
Applications of semi-supervised learning:
face images with a few labeled => face identifier(Facebook)
medicine data with a few labeled => medicine effect predictor
Characteristic: labels are expensive to obtain
Reinforcement Learning
a very difficult but natural way of learning
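the learner is rewarded or punished for some other output (the one it actually produced)
Applications of reinforcement learning:
(customer, ad choice, ad click earning) => ad system (the customers train the ad system)
(cards, strategy, winning amount) => blackjack agent (card game)
training relies not on being told the desired output, but on how good or bad some other output was
--- this usually happens sequentially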
summary
supervised: all yn
unsupervised: no yn
semi-supervised: some yn
reinforcement: implicit yn by goodness(ỹn)
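The ad-system example can be sketched as a learner that never sees the "correct" ad, only the earning from the ad it chose. The code below uses an epsilon-greedy strategy, which is a standard technique swapped in here for illustration and not something the notes prescribe; `train_ad_system`, `reward_fn`, and all parameter values are hypothetical.

```python
import random

def train_ad_system(ads, reward_fn, rounds=500, epsilon=0.1, seed=0):
    """Learn the average earning of each ad from feedback on chosen ads only."""
    rng = random.Random(seed)
    total = {ad: 0.0 for ad in ads}
    count = {ad: 0 for ad in ads}

    def value(ad):
        # Average earning observed so far (0.0 before the ad was ever shown).
        return total[ad] / count[ad] if count[ad] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(ads)      # explore: try a random ad
        else:
            choice = max(ads, key=value)  # exploit: show the best ad so far
        reward = reward_fn(choice)        # goodness of the chosen output only
        total[choice] += reward
        count[choice] += 1
    return {ad: value(ad) for ad in ads}
```

Calling it with a reward function that pays only for one particular ad shows the sequential nature of the protocol: the estimates improve round by round, and the better ad ends up with the higher estimated value.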
Learning with Different Protocol
batch learning
batch supervised multiclass classification: learn from all known data
train on the data one whole batch at a time
Applications:
batch of (email, spam?) => spam filter
batch of (patient, cancer) => cancer classifier
batch of patient data => group of patients
batch learning: a very common protocol
online learning
--- hypothesis 'improves' through receiving data instances sequentially
active learning:
--- improve hypothesis with fewer labels (hopefully) by asking questions strategically
used where labels are expensive to obtain
batch: 'duck feeding' (cramming-style teaching)
online: 'passive sequential' (a teacher teaching, one item at a time in order)
both of the above are passive
active: 'question asking' (sequentially), where the machine asks the questions: query the yn of the chosen xn
Mini Summary
batch: all known data
online: sequential (passive) data
active: strategically observed data
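The three protocols can be compared on one toy problem. The sketch below assumes 1-D inputs whose label is -1 below some unknown threshold and +1 at or above it, with no noise; this setting, the function names, and the `ask_label` callback are all assumptions made for illustration.

```python
def batch_learn(data):
    """Batch: all (x, y) pairs are available at once."""
    largest_neg = max(x for x, y in data if y == -1)
    smallest_pos = min(x for x, y in data if y == 1)
    return (largest_neg + smallest_pos) / 2  # threshold estimate

def online_update(bracket, x, y):
    """Online: refine the (lo, hi) bracket around the threshold with one new example."""
    lo, hi = bracket
    return (max(lo, x), hi) if y == -1 else (lo, min(hi, x))

def active_learn(xs, ask_label):
    """Active: choose which xn to query; binary search needs about log2(N) labels."""
    xs = sorted(xs)
    lo, hi = 0, len(xs)  # index of the first +1 point lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if ask_label(xs[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo  # index of the first point labeled +1 (len(xs) if there is none)
```

On 100 sorted points, `batch_learn` and repeated `online_update` calls consume all 100 labels, while `active_learn` asks for at most 7, which is the sense in which strategic questions can reduce labeling cost.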
Learning with Different Input Space
Credit Approval Problem Revisited
concrete features: each dimension of x ∈ R^d represents 'sophisticated physical meaning'
--- described with domain knowledge (expert knowledge)
Applications:
(size, mass) for coin classification
customer info for credit approval
patient info for cancer diagnosis
often including 'human intelligence' on the learning task
Raw Features: Digit Recognition Problem
digit recognition problem: features => meaning of digit
a typical supervised multiclass classification problem
by concrete features: x = (symmetry, density)
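by raw features: 16 by 16 gray image x = (0, 0, 0.9, 0.6 ...), a 256-dimensional vector in R^256
raw features are more abstract than concrete ones, and the more abstract the features, the harder the problem is for the machine
raw features: often need human or machines to convert to concrete ones
deep learning: extracts more concrete features from large amounts of data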
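Converting raw features into concrete ones can be sketched for the digit example. The notes name (symmetry, density) as concrete features but do not define them, so the definitions below are assumptions: density is taken as the average pixel intensity, and symmetry as the negated average absolute difference between the image and its left-right mirror.

```python
def density(img):
    """Average intensity over all pixels (assumed definition)."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def symmetry(img):
    """Negated mean absolute difference to the mirrored image; 0 means perfectly symmetric."""
    diffs = [abs(a - b) for row in img for a, b in zip(row, reversed(row))]
    return -sum(diffs) / len(diffs)

def concrete_features(raw, width=16):
    """Turn a flat raw vector (256 values for 16 by 16) into (symmetry, density)."""
    rows = [raw[i:i + width] for i in range(0, len(raw), width)]
    return symmetry(rows), density(rows)
```

A 256-dimensional input is reduced to 2 numbers that a human chose because they are expected to separate digits, which is the 'human intelligence' that concrete features carry.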
Abstract Features: Rating Prediction Problem
rating prediction problem (KDDCup 2011)
given previous (userid, itemid, rating) tuples, predict the rating that some userid would give to itemid?
a regression problem with y ∈ R as the rating and x ∈ N × N as (userid, itemid)
'no physical meaning'; thus even more difficult for ML
Other applications:
student ID in online tutoring system (KDDCup 2010)
advertisement ID in online ad system
humans specify some of the features, and the machine learns the rest
abstract: again need 'feature conversion/extraction/construction'
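One simple form of feature construction for abstract IDs is to replace each ID with a statistic computed from past ratings. The sketch below maps (userid, itemid) to (average rating given by that user, average rating received by that item), falling back to the overall average for unseen IDs. This is an illustrative choice only, not the method used in KDDCup 2011, and `build_features` is a hypothetical helper.

```python
from collections import defaultdict

def build_features(ratings):
    """ratings: non-empty list of (userid, itemid, rating) tuples."""
    by_user, by_item = defaultdict(list), defaultdict(list)
    for user, item, rating in ratings:
        by_user[user].append(rating)
        by_item[item].append(rating)
    overall = sum(r for _, _, r in ratings) / len(ratings)

    def average(values):
        # Unseen IDs have no ratings, so fall back to the overall average.
        return sum(values) / len(values) if values else overall

    def features(user, item):
        return (average(by_user.get(user, [])), average(by_item.get(item, [])))

    return features
```

The IDs themselves carry no physical meaning, but the two constructed numbers do, so a regression model can be trained on them like on any concrete feature.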
Mini Summary
concrete: sophisticated (and related) physical meaning
raw: simple physical meaning
abstract: no (or little) physical meaning