Machine Learning Algorithms, Lecture 3: Types of Learning (Study Notes)

Source: Internet. Editor: 程序博客网. Time: 2024/05/20 01:34

Learning with Different Output Space

Binary Classification


credit approve/disapprove

email spam/non-spam

patient sick/not sick

ad profitable/not profitable (will the ad make money?)

answer correct/incorrect (KDDCup 2010)

Multiclass Classification: Coin Recognition Problem


written digits => 0, 1, ..., 9

pictures => apple, orange, strawberry

emails => spam, primary, social, promotion, update (Google Gmail)

Regression: Patient Recovery Prediction Problem

binary classification: patient features => sick or not

multiclass classification: patient features => which type of cancer

regression: patient features => how many days before recovery


company data => stock price (predicting how the stock market will do tomorrow)

climate data => temperature

Structured Learning: Sequence Tagging Problem (a very large multiclass problem)

multiclass classification: word => word class

structured learning:

sentence => structure(class of each word)

huge multiclass classification problem (structure = hyperclass) without 'explicit' class definition


protein data => protein folding 

speech data => speech parse tree


binary classification: y = {-1, +1}

multiclass classification: y = {1, 2, ..., K}

regression: y = R

structured learning: y = structures
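
As a rough illustration of the four output spaces listed above, the sketch below guesses the problem type from a handful of example labels. The function name `guess_output_space` and its rules are assumptions made for these notes, not something from the lecture; real label sets can be ambiguous (for example, integer-valued regression targets).

```python
def guess_output_space(labels):
    """Heuristically name the learning problem from example labels y."""
    # Structured output: each label is itself a sequence (e.g. one tag per word).
    if all(isinstance(y, (list, tuple)) for y in labels):
        return "structured learning"
    # Binary classification: y in {-1, +1}.
    if set(labels) <= {-1, +1}:
        return "binary classification"
    # Multiclass classification: a finite set of integer class indices.
    if all(isinstance(y, int) for y in labels):
        return "multiclass classification"
    # Otherwise treat y as a real number.
    return "regression"


print(guess_output_space([-1, +1, +1]))             # sick or not
print(guess_output_space([1, 2, 3, 10]))            # which type of cancer
print(guess_output_space([36.6, 12.5]))             # days before recovery
print(guess_output_space([("N", "V"), ("N", "V", "N")]))  # word classes per sentence
```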

Learning with Different Data Label

supervised learning


unsupervised learning


unsupervised multiclass classification => clustering


articles => topics

consumer profiles => consumer groups

unsupervised: Learning without yn

clustering: {xn} => cluster(x)

density estimation: {xn} => density(x)

outlier detection: {xn} => unusual(x)
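
To make "{xn} => cluster(x)" concrete, here is a minimal sketch of clustering without any yn. It uses k-means, which is one common choice and is an assumption here (the notes do not name an algorithm); the function name `kmeans` and the toy points are also illustrative.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means on 2-D points, starting from the given initial centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach every x_n to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            dists = [(x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2 for c in centers]
            clusters[dists.index(min(dists))].append(x)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
print(clusters)
```

No label is ever consulted: the grouping comes only from distances between the xn.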

semi-supervised: Coin Recognition with Some yn


face images with a few labeled => face identifier (Facebook)

medicine data with a few labeled => medicine effect predictor


Reinforcement Learning

a very difficult but natural way of learning



(customer, ad choice, ad click earning) => ad system (the customers train the ad system)

(cards, strategy, winning amount) => blackjack agent (game playing)




supervised: all yn

unsupervised: no yn

semi-supervised: some yn

reinforcement: implicit yn, revealed only through goodness(ỹn) of the output ỹn that was actually chosen
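
The ad-system example of reinforcement learning can be sketched as follows. This is only an illustration under stated assumptions: the epsilon-greedy strategy, the function name `run_ad_system`, the click rates, and the simulated customers are all made up for these notes. The point is that the learner never sees the "correct" ad, only the earning of the ad it chose.

```python
import random


def run_ad_system(click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy ad chooser; returns estimated earning and show count per ad."""
    rng = random.Random(seed)
    shown = [0] * len(click_rates)      # how often each ad was chosen
    value = [0.0] * len(click_rates)    # running average earning per ad
    for _ in range(rounds):
        if rng.random() < epsilon:
            ad = rng.randrange(len(click_rates))   # explore a random ad
        else:
            ad = value.index(max(value))           # exploit the best estimate
        # The environment reveals only the goodness of the chosen ad.
        earning = 1.0 if rng.random() < click_rates[ad] else 0.0
        shown[ad] += 1
        value[ad] += (earning - value[ad]) / shown[ad]
    return value, shown


value, shown = run_ad_system([0.1, 0.3, 0.7])
print("estimated earnings:", value)
print("times shown:", shown)
```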

Learning with Different Protocol

batch learning

batch supervised multiclass classification: learn from all known data



batch of (email, spam?) => spam filter

batch of (patient, cancer) => cancer classifier

batch of patient data => group of patients

batch learning: a very common protocol

online learning

--- hypothesis 'improves' through receiving data instances sequentially

active learning:

--- improve hypothesis with fewer labels (hopefully) by asking questions strategically


batch: 'duck feeding' (cramming-style teaching)

online: 'passive sequential' (a teacher teaching the material item by item, in order)


active: 'question asking' (sequentially), the machine asks the questions --- query the yn of the chosen xn
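
A small sketch of the online protocol: the hypothesis is updated each time one (xn, yn) pair arrives, instead of after seeing the whole batch. The perceptron-style update is an illustrative choice, and the name `online_perceptron` and the toy stream are assumptions for these notes.

```python
def online_perceptron(stream):
    """Process (x, y) pairs one at a time; y is -1 or +1."""
    w = None
    mistakes = 0
    for x, y in stream:
        x = (1.0,) + tuple(x)             # prepend a constant bias feature
        if w is None:
            w = [0.0] * len(x)
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score <= 0:                # current hypothesis is wrong here
            w = [wi + y * xi for wi, xi in zip(w, x)]
            mistakes += 1
    return w, mistakes


stream = [((2, 0), +1), ((3, 1), +1), ((0, 2), -1), ((1, 3), -1)]
w, mistakes = online_perceptron(stream)
print(w, mistakes)
```

On this toy stream the weights change only on the instances the current hypothesis gets wrong, which is the sense in which the hypothesis 'improves' sequentially.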

Mini Summary

batch: all known data

online: sequential (passive) data

active: strategically-observed data
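
For the active protocol, the sketch below shows how choosing which xn to query can save labels. It assumes a very simple setting that is not from the lecture: sorted 1-D points whose noise-free labels switch once from -1 to +1, so a binary search over the pool finds the boundary. The names `active_threshold` and `ask_label` are hypothetical.

```python
def active_threshold(xs, ask_label):
    """Locate the -1/+1 boundary in sorted 1-D points by querying few labels."""
    lo, hi = 0, len(xs)         # index of the first +1 point lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if ask_label(xs[mid]) == +1:
            hi = mid            # boundary is at mid or to its left
        else:
            lo = mid + 1        # boundary is strictly to the right
    return lo, queries


xs = list(range(16))
oracle = lambda x: +1 if x >= 11 else -1   # stands in for a human labeler
boundary, queries = active_threshold(xs, oracle)
print(boundary, queries)
```

Here 16 unlabeled points need only 4 label queries, whereas a batch learner would be handed all 16 labels up front.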

Learning with Different Input Space

Credit Approval Problem Revisited

concrete features: each dimension of x ∈ R^d represents a 'sophisticated physical meaning'

(descriptions that come with domain knowledge, i.e. expert knowledge)


(size, mass) for coin classification

customer info for credit approval

patient info for cancer diagnosis

often including 'human intelligence' on the learning task

Raw Features: Digit Recognition Problem

digit recognition problem: features => meaning of digit

a typical supervised multiclass classification problem

by concrete features: x = (symmetry, density)

by raw features: 16 by 16 gray image x = (0, 0, 0.9, 0.6, ...), a 256-dimensional vector in R^256


raw features: often need human or machines to convert to concrete ones

deep learning: extracts more concrete features from a large amount of raw data
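
The conversion from raw pixels to the concrete features x = (symmetry, density) might look like the sketch below. The exact definitions are assumptions for illustration: density is taken as the mean pixel intensity, and symmetry as the negative mean absolute difference between the image and its left-right mirror (0 means perfectly symmetric). The notes do not specify either formula.

```python
def density(image):
    """Average intensity over all pixels of a gray image (list of rows)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def symmetry(image):
    """Negative mean absolute difference between the image and its mirror."""
    diffs = [abs(a - b) for row in image for a, b in zip(row, reversed(row))]
    return -sum(diffs) / len(diffs)


def concrete_features(image):
    """Map a raw image to the 2-D feature vector (symmetry, density)."""
    return (symmetry(image), density(image))


symmetric = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
lopsided = [[1, 0, 0, 0]]
print(concrete_features(symmetric))
print(concrete_features(lopsided))
```

A 16 by 16 image would go through the same functions, shrinking 256 raw dimensions to 2 hand-designed ones.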

Abstract Features: Rating Prediction Problem

rating prediction problem (KDDCup 2011)

given previous (userid, itemid, rating) tuples, predict the rating that some userid would give to itemid?

a regression problem with y ∈ R as the rating and x ∈ N × N as (userid, itemid)

'no physical meaning'; thus even more difficult for ML


student ID in online tutoring system (KDDCup 2010)

advertisement ID in online ad system


abstract: again need 'feature conversion/extraction/construction'
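
One simple form of 'feature construction' for abstract IDs is to replace each ID by statistics gathered from past ratings. The sketch below is an assumption-laden illustration, not the lecture's method: it maps (userid, itemid) to (mean rating given by that user, mean rating received by that item), falling back to the global mean for unseen IDs. The name `build_feature_map` and the toy tuples are made up.

```python
from collections import defaultdict


def build_feature_map(ratings):
    """Return a function (userid, itemid) -> (user mean, item mean)."""
    by_user, by_item = defaultdict(list), defaultdict(list)
    for userid, itemid, rating in ratings:
        by_user[userid].append(rating)
        by_item[itemid].append(rating)
    global_mean = sum(r for _, _, r in ratings) / len(ratings)

    def mean_or_global(values):
        # Unseen IDs have no history, so use the global mean instead.
        return sum(values) / len(values) if values else global_mean

    def features(userid, itemid):
        # .get avoids creating empty entries for unseen IDs.
        return (mean_or_global(by_user.get(userid)),
                mean_or_global(by_item.get(itemid)))

    return features


ratings = [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0)]
features = build_feature_map(ratings)
print(features(1, 10))
print(features(3, 10))   # user 3 was never seen
```

The resulting pair has a physical meaning (how generous the user is, how well liked the item is), so a regression model can work with it where the bare IDs carry none.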

Mini Summary

concrete: sophisticated (and related) physical meaning

raw: simple physical meaning

abstract: no (or little) physical meaning
