【PRML Reading Notes】Introduction (some concepts)

Source: Internet · Editor: 程序博客网 · Date: 2024/05/18 15:05

The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories.


The ability to categorize correctly new examples that differ from those used for training is known as generalization.


In practical applications, the variability of the input vectors will be such that the training data can comprise only a tiny fraction of all possible input vectors, and so generalization is a central goal in pattern recognition.
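The point about generalization can be illustrated with the polynomial curve-fitting example that runs through PRML's introduction. Below is a minimal numpy sketch; the noise level, sample sizes, and polynomial degrees are illustrative assumptions, not values taken from the book's figures:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of sin(2*pi*x), as in PRML's curve-fitting example.
x_train = rng.uniform(0, 1, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.3, 10)
x_test = rng.uniform(0, 1, 100)
t_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.3, 100)

def rms_error(degree):
    # Fit a degree-M polynomial to the training set only,
    # then measure RMS error on both training and held-out test data.
    coeffs = np.polyfit(x_train, t_train, degree)
    train_err = np.sqrt(np.mean((np.polyval(coeffs, x_train) - t_train) ** 2))
    test_err = np.sqrt(np.mean((np.polyval(coeffs, x_test) - t_test) ** 2))
    return train_err, test_err

for m in (1, 3, 9):
    train_err, test_err = rms_error(m)
    print(f"degree {m}: train RMS {train_err:.3f}, test RMS {test_err:.3f}")
```

The degree-9 polynomial can interpolate the 10 training points almost exactly, yet its error on the unseen test points stays large: low training error alone says nothing about generalization.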


For most practical applications, the original input variables are typically preprocessed to transform them into some new space of variables where, it is hoped, the pattern recognition problem will be easier to solve.
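One common preprocessing step of this kind is standardising each input component to zero mean and unit variance, so that components with very different scales become comparable. A minimal sketch, where the two components and their scales are made-up illustrative values:

```python
import numpy as np

# Toy input vectors whose components have very different scales
# (illustrative values, e.g. a length in metres and a mass in grams).
X = np.array([[1.70, 65000.0],
              [1.62, 58000.0],
              [1.81, 80000.0],
              [1.75, 72000.0]])

# Standardise each component to zero mean and unit variance.
# The mean/std computed on the training data must be reused at test time.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_new = (X - mean) / std

print(X_new.mean(axis=0))
print(X_new.std(axis=0))
```

The transformed data live in a new variable space in which no single component dominates simply because of its units.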


Pattern recognition problems fall into three categories:

  • Supervised learning
    Definition: applications in which the training data comprises examples of the input vectors along with their corresponding target vectors are known as supervised learning problems.
    Assign each input to one of a finite number of discrete categories (the categories are known before the test data is seen): classification
    Predict one or more continuous variables: regression

  • Unsupervised learning
    Definition: the training data consists of a set of input vectors x without any corresponding target values.
    Discover groups of similar examples within the data: clustering
    Determine the distribution of data within the input space: density estimation
    Project the data from a high-dimensional space down to two or three dimensions: visualization

  • Reinforcement learning
    Definition: the technique of reinforcement learning (Sutton and Barto, 1998) is concerned with the problem of finding suitable actions to take in a given situation in order to maximize a reward (as in the game of Go).
    Reinforcement learning problems provide no examples of optimal outputs; these must be discovered through a process of trial and error, in contrast to supervised learning. Typically there is a sequence of states and actions in which the learning algorithm interacts with its environment. In many cases the current action affects not only the immediate reward but also the rewards at all subsequent time steps. For example, using suitable reinforcement learning techniques, a neural network can learn to play the game of backgammon to a high standard (Tesauro, 1994). Here the network must take as input a large set of board positions together with the result of a dice throw, and produce a move as its output, which is achieved by having the network play a million games against itself. A major challenge is that a backgammon game can involve a great many moves, yet the reward, in the form of victory, is given only at the end of the game. The reward must then be attributed appropriately to all the moves that led to it, some of which were good and others less so; this is an example of the credit assignment problem. A general feature of reinforcement learning is the trade-off between exploration, in which the system tries out new kinds of actions, and exploitation, in which it uses actions known to yield a high reward. Focusing too strongly on either exploration or exploitation gives poor results.
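The exploration/exploitation trade-off above can be sketched with a tiny epsilon-greedy agent on a hypothetical two-armed bandit. The payoff probabilities and epsilon values are made up for illustration and do not come from the book:

```python
import random

def run_bandit(epsilon, pulls=10000, seed=0):
    """Epsilon-greedy play on a hypothetical two-armed bandit:
    with probability epsilon explore a random arm, otherwise
    exploit the arm with the best reward estimate so far."""
    rng = random.Random(seed)
    pay = [0.3, 0.7]        # true reward probabilities, unknown to the agent
    counts = [0, 0]
    values = [0.0, 0.0]     # running mean reward per arm
    total = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(2)                       # explore
        else:
            arm = 0 if values[0] >= values[1] else 1     # exploit
        reward = 1.0 if rng.random() < pay[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return total / pulls

for eps in (0.0, 0.1, 1.0):
    print(f"epsilon={eps}: mean reward {run_bandit(eps):.3f}")
```

Pure exploitation (epsilon = 0) can lock onto the inferior arm forever, and pure exploration (epsilon = 1) ignores everything it has learned; a small epsilon balances the two and earns close to the best arm's payoff.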

