Caltech machine learning, video 5 notes: training vs. testing


start CalTech machine learning, video 5

training vs testing

7:14 2014-09-23
training => testing

7:44 2014-09-23
key notion: break point

7:45 2014-09-23
final examination: training => testing

7:47 2014-09-23
this guarantee is not a guarantee at all:

M is too big!

7:55 2014-09-23
they could be independent which means they

could be proportionally overlapping

7:59 2014-09-23
where did the M come from?
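
A short derivation may help here. The M comes from the union bound applied to the "bad events" of the individual hypotheses. The symbols below (g for the final hypothesis, h_1..h_M for the members of the hypothesis set, epsilon for the tolerance) are the usual ones from the course and are assumed here, since the notes do not write them out:

```latex
\begin{align*}
\mathbb{P}\big[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\big]
  &\le \mathbb{P}\Big[\,\bigcup_{m=1}^{M} \big\{ |E_{in}(h_m) - E_{out}(h_m)| > \epsilon \big\}\Big] \\
  &\le \sum_{m=1}^{M} \mathbb{P}\big[\,|E_{in}(h_m) - E_{out}(h_m)| > \epsilon\,\big] \\
  &\le 2M e^{-2\epsilon^{2} N}
\end{align*}
```

The second step is only tight when the bad events do not overlap, which is why heavy overlap makes M a pessimistic count.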

7:59 2014-09-23
the overlap is so significant

8:00 2014-09-23
can we improve on M?

bad events are very overlapping

8:06 2014-09-23
Ein // in-sample error

8:12 2014-09-23
What are we going to replace it with?

8:13 2014-09-23
What can we replace M with?

8:15 2014-09-23
input space

8:15 2014-09-23
the count should reflect the strength

of the hypothesis set.

8:18 2014-09-23
dichotomies // dichotomy

8:19 2014-09-23
Dichotomies: mini-hypotheses

8:23 2014-09-23
dichotomies are also hypotheses, but the domain

is not the full input space, just a few points

8:24 2014-09-23
why dichotomies?

#hypotheses |H| can be infinite,

#dichotomies can be FINITE

8:28 2014-09-23
the growth function:

the growth function counts the most dichotomies on any N points

8:30 2014-09-23
I give you a budget of N points, you choose where to

put the points

8:31 2014-09-23
mH(N) // growth function

counts the maximum number of dichotomies on any N points

8:36 2014-09-23
perceptron dichotomy
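
A rough sketch for counting the dichotomies of the 2D perceptron on a small point set. It assumes the points are in general position (no three collinear, at least two points) and relies on the geometric fact that any separating line can be slid and rotated until it almost touches two of the points. The function name is made up for illustration.

```python
from itertools import combinations, product

def perceptron_dichotomies(points):
    """Dichotomies a 2D perceptron can produce on `points` (list of (x, y)).

    Assumption: no three points are collinear and len(points) >= 2.
    """
    found = set()
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        # Side of the line through points i and j for every point
        # (the entries for i and j themselves are overwritten below).
        base = []
        for (x, y) in points:
            cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
            base.append(1 if cross > 0 else -1)
        # A tiny shift or rotation of the line gives the two touched
        # points any of the four label combinations.
        for li, lj in product((1, -1), repeat=2):
            labels = list(base)
            labels[i], labels[j] = li, lj
            found.add(tuple(labels))
            found.add(tuple(-l for l in labels))  # flipped orientation
    return found

triangle = [(0, 0), (1, 0), (0, 1)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(len(perceptron_dichotomies(triangle)))  # all 8 labelings
print(len(perceptron_dichotomies(square)))    # 14 of the 16 labelings
```

On the four corners of a square the two missing labelings are the "XOR" patterns, where each diagonal pair shares a label.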

8:38 2014-09-23
positive rays
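
A small sketch to check the count for positive rays, h(x) = +1 if x > a, else -1. It assumes distinct real points; the function names are made up for illustration. Only the gap that the threshold a falls into matters, so there are N + 1 dichotomies.

```python
def positive_ray_dichotomies(points):
    """All labelings of distinct 1D `points` by h(x) = +1 if x > a else -1."""
    xs = sorted(points)
    # One representative threshold per gap: below all points,
    # between each adjacent pair, and above all points.
    thresholds = (
        [xs[0] - 1.0]
        + [(left + right) / 2 for left, right in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )
    return {tuple(1 if x > a else -1 for x in points) for a in thresholds}

def growth_positive_rays(n):
    """mH(N) for positive rays, counted by brute force on N distinct points."""
    return len(positive_ray_dichotomies([float(i) for i in range(n)]))

for n in range(1, 6):
    print(n, growth_positive_rays(n))  # N + 1 each time
```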

8:46 2014-09-23
positive intervals
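
A small sketch to check the count for positive intervals, h(x) = +1 inside an interval and -1 outside. It assumes distinct real points; the function name is made up for illustration. The two interval ends fall into 2 of the N + 1 gaps, plus one extra dichotomy (all -1) when both ends land in the same gap, so the count is C(N+1, 2) + 1.

```python
from itertools import combinations
from math import comb

def positive_interval_dichotomies(points):
    """All labelings of distinct 1D `points` by h(x) = +1 if lo < x < hi else -1."""
    xs = sorted(points)
    # One representative position per gap between (and around) the points.
    cuts = (
        [xs[0] - 1.0]
        + [(left + right) / 2 for left, right in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )
    # Both ends in the same gap: the interval contains no point.
    found = {tuple(-1 for _ in points)}
    # Ends in two different gaps: at least one point is labeled +1.
    for lo, hi in combinations(cuts, 2):
        found.add(tuple(1 if lo < x < hi else -1 for x in points))
    return found

for n in range(1, 6):
    count = len(positive_interval_dichotomies([float(i) for i in range(n)]))
    print(n, count, comb(n + 1, 2) + 1)  # the two counts agree
```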

9:01 2014-09-23
can we shatter this set?

9:13 2014-09-23
What we're trying to do is to replace M.

replace M by mH(N)

9:15 2014-09-23
once you declare that the hypothesis set

has a polynomial growth function, we can

declare that learning is feasible using that

hypothesis set.

9:18 2014-09-23
growth function is polynomial => good // learning is feasible

9:19 2014-09-23
with probability assurance

9:19 2014-09-23
key point: break point

9:20 2014-09-23
break point of H: // break point of a hypothesis set

Definition:

If no data set of size k can be shattered by H,

then k is a break point for H.
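
A brute-force sketch of the definition for the two 1D examples (positive rays and positive intervals). It assumes distinct points on a line, where the number of dichotomies does not depend on where the points sit, so testing one set of size k is enough. For hypothesis sets in higher dimensions one would have to consider every placement of the k points. All function names are made up for illustration.

```python
from itertools import combinations

def gap_cuts(points):
    """One representative position per gap between (and around) the points."""
    xs = sorted(points)
    return [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]

def ray_dichotomies(points):
    # Positive rays: +1 to the right of the threshold a.
    return {tuple(1 if x > a else -1 for x in points) for a in gap_cuts(points)}

def interval_dichotomies(points):
    # Positive intervals: +1 strictly between lo and hi.
    found = {tuple(-1 for _ in points)}
    for lo, hi in combinations(gap_cuts(points), 2):
        found.add(tuple(1 if lo < x < hi else -1 for x in points))
    return found

def smallest_break_point(dichotomy_fn, max_k=8):
    """Smallest k whose k points are not shattered (fewer than 2^k dichotomies)."""
    for k in range(1, max_k + 1):
        points = [float(i) for i in range(k)]
        if len(dichotomy_fn(points)) < 2 ** k:
            return k
    return None  # no break point found up to max_k

print(smallest_break_point(ray_dichotomies))       # 2
print(smallest_break_point(interval_dichotomies))  # 3
```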

9:22 2014-09-23
just view "break point" as a measure of the capability of H (the hypothesis set)

9:22 2014-09-23
"data set" can be "shattered" by "hypothesis set"

9:23 2014-09-23
main results:

no break point => mH(N) = 2^N

any break point => mH(N) is polynomial in N
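
A numeric sketch of why polynomial vs. 2^N matters. It plugs mH(N) directly in place of M in the Hoeffding-style bound 2 * M * exp(-2 * eps^2 * N). That is only the informal substitution these notes describe: the rigorous statement has different constants. The value eps = 0.1 and the function names are assumptions for illustration. Logs are used because 2^N overflows a float for large N.

```python
from math import comb, log

def log_bound(growth, n, eps):
    """log of 2 * growth * exp(-2 * eps^2 * n); negative means the bound is < 1."""
    return log(2) + log(growth) - 2 * eps ** 2 * n

def growth_rays(n):        # break point 2: polynomial of degree 1
    return n + 1

def growth_intervals(n):   # break point 3: polynomial of degree 2
    return comb(n + 1, 2) + 1

def growth_no_break(n):    # no break point: every data set is shattered
    return 2 ** n

eps = 0.1
for n in (100, 1000, 10000):
    print(
        n,
        round(log_bound(growth_rays(n), n, eps), 1),
        round(log_bound(growth_intervals(n), n, eps), 1),
        round(log_bound(growth_no_break(n), n, eps), 1),
    )
```

With a polynomial growth function the negative exponential eventually wins and the log of the bound goes to minus infinity. With 2^N it keeps growing, so the bound says nothing.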

9:31 2014-09-23
this is a remarkable result

0 0