Caltech machine learning, video 6 notes: theory of generalization

start Caltech machine learning, video 6


theory of generalization


8:21 2014-09-24
we're going to bound the growth function


by a polynomial


8:37 2014-09-24
B(N, k) // I give you N points, and k is the break point
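In other words (a restatement, consistent with the definition later in these notes): B(N, k) is the maximum number of dichotomies you can put on N points such that no subset of k points is shattered.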


8:40 2014-09-24
you can still make an upper-bound statement


8:43 2014-09-24
recursive bound on B(N, k)
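The key inequality from the lecture (group the dichotomies by their pattern on the first N-1 points, splitting on whether that pattern appears once or twice):

$$B(N, k) \le B(N-1, k) + B(N-1, k-1)$$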


8:53 2014-09-24
the upper bound on the growth function
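Solving the recursion (with boundary cases B(N, 1) = 1 and B(1, k) = 2 for k > 1) gives

$$B(N, k) \le \sum_{i=0}^{k-1} \binom{N}{i}$$

a polynomial in N of degree k-1. Since mH(N) <= B(N, k) whenever k is a break point for H, the growth function is polynomial.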


9:17 2014-09-24
for a given hypothesis set, the break point k


is a fixed number


9:37 2014-09-24
NN == Neural Network


9:38 2014-09-24
say the break point for the neural network is 17 (just an example number)


9:38 2014-09-24
2D perceptron
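Worked example: the 2D perceptron has break point k = 4 (from the previous lecture), so

$$m_{\mathcal{H}}(N) \le \binom{N}{0} + \binom{N}{1} + \binom{N}{2} + \binom{N}{3}$$

a cubic in N.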


9:44 2014-09-24
outline:


* Proof that mH(N) is polynomial


* Proof that mH(N) can replace M


9:46 2014-09-24
How does the growth function mH(N) relate to overlap?


9:50 2014-09-24
space of data sets


9:52 2014-09-24
for the 1st hypothesis, you get this bad region


9:54 2014-09-24
union bound


9:54 2014-09-24
union bound => VC bound
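In symbols: for a finite hypothesis set of size M, Hoeffding plus the union bound gives

$$P\left[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\right] \le 2M\, e^{-2\epsilon^2 N}$$

The VC bound replaces M with the growth function, because overlapping bad regions get counted once per dichotomy rather than once per hypothesis.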


9:55 2014-09-24
now they're overlapping, the total area,


which is the bad area, is a small fraction 


of the whole thing, and I can learn.


9:56 2014-09-24
how is the growth function going to characterize


the overlaps?


9:57 2014-09-24
what the growth function tells you is that,


9:58 2014-09-24
if you take a dichotomy, it's not the full hypothesis,


but the hypothesis restricted to a finite set of points


9:59 2014-09-24
what to do about Eout?


10:01 2014-09-24
instead of picking one sample, I'm going to 


pick two samples independently.


they're coming from the same distribution.


10:02 2014-09-24
does Ein(h) track Ein'(h)?


each of them tracks Eout(h).
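This is the symmetrization step of the proof. Roughly (the exact constants vary with the presentation):

$$P\left[\sup_{h \in \mathcal{H}} |E_{in}(h) - E_{out}(h)| > \epsilon\right] \le 2\, P\left[\sup_{h \in \mathcal{H}} |E_{in}(h) - E'_{in}(h)| > \frac{\epsilon}{2}\right]$$

where Ein' is the error on the second ("ghost") sample.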


10:03 2014-09-24
so the mathematical ramification is:


if you characterize using two samples,


then I'm completely in the realm of dichotomies.


10:07 2014-09-24
because now I'm not appealing to Eout(h) any more.


I'm only appealing to what happens in the sample


10:08 2014-09-24
It's a bigger sample; I now have 2N points instead


of N points.


10:08 2014-09-24
now the characterization is full.


I'm ready to go.


10:08 2014-09-24
these are the only two components you


need to worry about as you read the proof.


10:09 2014-09-24
Not quite mH(N), but rather mH(2N), because I have two


samples now.


10:10 2014-09-24
now we have a polynomial, a bigger polynomial,


but it can still do the job we want.
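Why it still works: if mH(N) is bounded by a polynomial of degree k-1, then mH(2N) is bounded by the same polynomial evaluated at 2N, still degree k-1 in N, and any polynomial is eventually crushed by the negative exponential in N.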


10:13 2014-09-24
but the basic message is this:


here is a statement that holds true for any hypothesis


set that has a break point.


10:15 2014-09-24
you will eventually learn. Ein(h) tracks Eout(h) correctly.


10:15 2014-09-24 
The Vapnik-Chervonenkis Inequality
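As stated in the lecture:

$$P\left[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\right] \le 4\, m_{\mathcal{H}}(2N)\, e^{-\frac{1}{8}\epsilon^2 N}$$

Compared to the Hoeffding/union-bound version: M becomes the growth function, N becomes 2N inside it, and the constants move from 2 and 2 to 4 and 1/8.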


10:16 2014-09-24
if you have a break point, learning is guaranteed.


10:39 2014-09-24
for this hypothesis set over the input space, what


is the break point?


10:41 2014-09-24
How many resources do you need for learning?


10:43 2014-09-24
training data, real data


10:49 2014-09-24
* Ein(h) to track Eout(h)


* try to minimize Ein(h)


10:50 2014-09-24
VC inequality


10:50 2014-09-24
B(N, k): maximum number of dichotomies on N points,


with break point k


10:56 2014-09-24
what is the maximum number of dichotomies you can


get with break point k but no other constraints?


B(N, k)  // use this to bound mH(N) (the growth function)
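A minimal sketch (my own code, not from the lecture; the function names are mine) that computes the recursive upper bound on B(N, k) and checks it against the binomial sum:

from math import comb
from functools import lru_cache

@lru_cache(maxsize=None)
def B_upper(N: int, k: int) -> int:
    # Recursive upper bound on B(N, k), the max number of dichotomies
    # on N points with break point k. Boundary cases from the lecture:
    # B(N, 1) = 1 (shattering even one point is forbidden) and
    # B(1, k) = 2 for k > 1 (one point, both labels allowed).
    if k == 1:
        return 1
    if N == 1:
        return 2
    return B_upper(N - 1, k) + B_upper(N - 1, k - 1)

def binomial_bound(N: int, k: int) -> int:
    # Closed form: sum_{i=0}^{k-1} C(N, i), a polynomial of degree k-1 in N.
    return sum(comb(N, i) for i in range(k))

# sanity check: the recursion with these boundary values reproduces
# the binomial sum exactly (Pascal's identity)
for N in range(1, 12):
    for k in range(1, 6):
        assert B_upper(N, k) == binomial_bound(N, k)

print(B_upper(10, 4))  # 176 = C(10,0) + C(10,1) + C(10,2) + C(10,3)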