Caltech machine learning, video 6 notes: theory of generalization

start Caltech machine learning, video 6


theory of generalization


8:21 2014-09-24
we're going to bound the growth function


by a polynomial


8:37 2014-09-24
B(N, k) // I give you N points, and k is the break point
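In other words (a restatement, consistent with the definition later in these notes): B(N, k) is the maximum number of dichotomies you can put on N points such that no subset of k points is shattered.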


8:40 2014-09-24
you can still make an upper-bound statement


8:43 2014-09-24
recursive bound on B(N, k)
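The key inequality from the lecture (group the dichotomies by their pattern on the first N-1 points, splitting on whether that pattern appears once or twice):

$$B(N, k) \le B(N-1, k) + B(N-1, k-1)$$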


8:53 2014-09-24
the upper bound on the growth function
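Solving the recursion (with boundary cases B(N, 1) = 1 and B(1, k) = 2 for k > 1) gives

$$B(N, k) \le \sum_{i=0}^{k-1} \binom{N}{i}$$

a polynomial in N of degree k-1. Since mH(N) <= B(N, k) whenever k is a break point for H, the growth function is polynomial.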


9:17 2014-09-24
for a given hypothesis set, the break point k


is a fixed number


9:37 2014-09-24
NN == Neural Network


9:38 2014-09-24
say the break point for the neural network is 17 (just an example number)


9:38 2014-09-24
2D perceptron
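Worked example: the 2D perceptron has break point k = 4 (from the previous lecture), so

$$m_{\mathcal{H}}(N) \le \binom{N}{0} + \binom{N}{1} + \binom{N}{2} + \binom{N}{3}$$

a cubic in N.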


9:44 2014-09-24
outline:


* Proof that mH(N) is polynomial


* Proof that mH(N) can replace M


9:46 2014-09-24
How does the growth function mH(N) relate to overlap?


9:50 2014-09-24
space of data sets


9:52 2014-09-24
for the 1st hypothesis, you get this bad region


9:54 2014-09-24
union bound


9:54 2014-09-24
union bound => VC bound
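In symbols: for a finite hypothesis set of size M, Hoeffding plus the union bound gives

$$P\left[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\right] \le 2M\, e^{-2\epsilon^2 N}$$

The VC bound replaces M with the growth function, because overlapping bad regions get counted once per dichotomy rather than once per hypothesis.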


9:55 2014-09-24
now they're overlapping, the total area,


which is the bad area, is a small fraction 


of the whole thing, and I can learn.


9:56 2014-09-24
how is the growth function going to characterize


the overlaps?


9:57 2014-09-24
what the growth function tells you is that,


9:58 2014-09-24
if you take a dichotomy, it's not the full hypothesis,


but the hypothesis restricted to a finite set of points


9:59 2014-09-24
what to do about Eout?


10:01 2014-09-24
instead of picking one sample, I'm going to 


pick two samples independently.


they're coming from the same distribution.


10:02 2014-09-24
does Ein(h) track Ein'(h)?


each of them tracks Eout(h).
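This is the symmetrization step of the proof. Roughly (the exact constants vary with the presentation):

$$P\left[\sup_{h \in \mathcal{H}} |E_{in}(h) - E_{out}(h)| > \epsilon\right] \le 2\, P\left[\sup_{h \in \mathcal{H}} |E_{in}(h) - E'_{in}(h)| > \frac{\epsilon}{2}\right]$$

where Ein' is the error on the second ("ghost") sample.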


10:03 2014-09-24
so the mathematical ramification is:


if you characterize using two samples,


then I'm completely in the realm of dichotomies.


10:07 2014-09-24
because now I'm not appealing to Eout(h) any more.


I'm only appealing to what happens in the sample


10:08 2014-09-24
It's a bigger sample; I now have 2N points instead


of N points.


10:08 2014-09-24
now the characterization is full.


I'm ready to go.


10:08 2014-09-24
these are the only two components you


need to worry about as you read the proof.


10:09 2014-09-24
Not quite mH(N), but rather mH(2N), because I have two


samples now.


10:10 2014-09-24
now we have a polynomial, a bigger polynomial,


but it can still do the job we want.
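Why it still works: if mH(N) is bounded by a polynomial of degree k-1, then mH(2N) is bounded by the same polynomial evaluated at 2N, still degree k-1 in N, and any polynomial is eventually crushed by the negative exponential in N.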


10:13 2014-09-24
but the basic message is this:


here is a statement that holds true for any hypothesis


set that has a break point.


10:15 2014-09-24
you will eventually learn. Ein(h) tracks Eout(h) correctly.


10:15 2014-09-24 
The Vapnik-Chervonenkis Inequality
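As stated in the lecture:

$$P\left[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\right] \le 4\, m_{\mathcal{H}}(2N)\, e^{-\frac{1}{8}\epsilon^2 N}$$

Compared to the Hoeffding/union-bound version: M becomes the growth function, N becomes 2N inside it, and the constants move from 2 and 2 to 4 and 1/8.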


10:16 2014-09-24
if you have a break point, learning is guaranteed.


10:39 2014-09-24
for this hypothesis set over the input space, what


is the break point?


10:41 2014-09-24
How many resources do you need for learning?


10:43 2014-09-24
training data, real data


10:49 2014-09-24
* Ein(h) to track Eout(h)


* try to minimize Ein(h)


10:50 2014-09-24
VC inequality


10:50 2014-09-24
B(N, k): maximum number of dichotomies on N points,


with break point k


10:56 2014-09-24
what is the maximum number of dichotomies you can


get with break point k but no other constraints?


B(N, k)  // use this to bound mH(N) (the growth function)
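A minimal sketch (my own code, not from the lecture; the function names are mine) that computes the recursive upper bound on B(N, k) and checks it against the binomial sum:

from math import comb
from functools import lru_cache

@lru_cache(maxsize=None)
def B_upper(N: int, k: int) -> int:
    # Recursive upper bound on B(N, k), the max number of dichotomies
    # on N points with break point k. Boundary cases from the lecture:
    # B(N, 1) = 1 (shattering even one point is forbidden) and
    # B(1, k) = 2 for k > 1 (one point, both labels allowed).
    if k == 1:
        return 1
    if N == 1:
        return 2
    return B_upper(N - 1, k) + B_upper(N - 1, k - 1)

def binomial_bound(N: int, k: int) -> int:
    # Closed form: sum_{i=0}^{k-1} C(N, i), a polynomial of degree k-1 in N.
    return sum(comb(N, i) for i in range(k))

# sanity check: the recursion with these boundary values reproduces
# the binomial sum exactly (Pascal's identity)
for N in range(1, 12):
    for k in range(1, 6):
        assert B_upper(N, k) == binomial_bound(N, k)

print(B_upper(10, 4))  # 176 = C(10,0) + C(10,1) + C(10,2) + C(10,3)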