A Brief Summary of LeCun et al.'s "Gradient-Based Learning Applied to Document Recognition"


Paper Info: Gradient-Based Learning Applied to Document Recognition

YANN LECUN, MEMBER, IEEE, LÉON BOTTOU, YOSHUA BENGIO, AND PATRICK HAFFNER

I. Introduction

II. CNN for isolated character recognition

Features of Traditional Pattern Recognition:

1. hand-designed feature extractor

2. trainable classifier

Problem: images are too large for a fully connected classifier; the topology of the input (spatial or temporal correlations) is ignored

Solution:

Using Convolutional Networks

Features: 1) local receptive fields 2) shared weights 3) spatial or temporal subsampling (once a feature has been detected, its exact location becomes less important) -> LeNet-5
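The three features above can be illustrated with a small pure-Python sketch. It is not LeNet-5 itself: the function names `convolve_valid` and `subsample` are made up for illustration, the kernel is slid without flipping (cross-correlation), and the subsampling is a plain 2x2 average, whereas the subsampling layers in the paper also apply a trainable coefficient, a bias, and a squashing function.

```python
def convolve_valid(image, kernel):
    """Slide ONE shared kernel over the image (valid positions only).

    Each output unit only sees a kh x kw patch (local receptive field),
    and every output unit reuses the same kernel (shared weights).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def subsample(feature_map, size=2):
    """Average non-overlapping size x size blocks (spatial subsampling)."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            sum(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size)) / (size * size)
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy usage: a 6x6 image and a 3x3 kernel give a 4x4 feature map,
# which 2x2 subsampling reduces to 2x2.
image = [[1] * 6 for _ in range(6)]
kernel = [[1] * 3 for _ in range(3)]
feature_map = convolve_valid(image, kernel)
pooled = subsample(feature_map)
```

The kernel here has 9 weights no matter how large the image is, which is the point of weight sharing; a fully connected layer over the same image would need one weight per pixel per unit.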

III. Results and comparison with other methods

IV. Multimodule systems and graph transformer networks (GTNs)

V. Multiple object recognition: heuristic over-segmentation (HOS; the first method for character string recognition)

Isolated characters TO strings of characters

optimizing a global criterion

A now classical method for segmentation and recognition: HOS

Good candidate locations for cuts can be found by locating minima in the vertical projection profile, or minima of the distance between the upper and lower contours of the word.
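A minimal sketch of the first heuristic (minima of the vertical projection profile), assuming a binary image stored as a list of rows with 1 for ink. The function names and the tie-breaking rule for flat valleys are illustrative choices, not taken from the paper.

```python
def vertical_projection(image):
    """Count ink pixels in each column of a binary image (1 = ink)."""
    return [sum(column) for column in zip(*image)]


def candidate_cuts(profile):
    """Return interior column indices that are local minima of the profile.

    A flat valley is reported once, at its left edge (strictly lower than
    the left neighbour, not higher than the right neighbour).
    """
    return [
        x
        for x in range(1, len(profile) - 1)
        if profile[x] < profile[x - 1] and profile[x] <= profile[x + 1]
    ]


# Toy usage: two blobs of ink separated by an empty column.
image = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
profile = vertical_projection(image)  # [3, 2, 0, 2, 3]
cuts = candidate_cuts(profile)        # [2]
```

In HOS the cuts are deliberately over-generated; choosing which ones to keep is left to the later graph-based stages rather than decided here.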

Structure of the Process

Question: What is an interpretation graph?

Definitions in the paper:

The goal of the recognition transformer is to generate a graph, called the interpretation graph or recognition graph that contains all the possible interpretations for all the possible segmentations of the input.

The interpretation graph has almost the same structure as the segmentation graph, except that each arc is replaced by a set of arcs from and to the same node.
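The definition can be made concrete with a small sketch. The arc layout (tuples of source node, destination node, payload), the function name `build_interpretation_graph`, and the lookup-table recognizer `toy_recognize` are all assumptions for illustration; in the paper the recognizer is a convolutional network that scores each candidate segment.

```python
def build_interpretation_graph(segmentation_arcs, recognize):
    """Replace every segmentation arc by one arc per candidate class.

    segmentation_arcs: list of (src, dst, segment)
    recognize: function mapping a segment to a list of (label, penalty)
    Returns a list of (src, dst, label, penalty) arcs.
    """
    interpretation_arcs = []
    for src, dst, segment in segmentation_arcs:
        for label, penalty in recognize(segment):
            # Same end nodes as the original arc, one arc per interpretation.
            interpretation_arcs.append((src, dst, label, penalty))
    return interpretation_arcs


# Hypothetical recognizer: a fixed table standing in for a trained network.
def toy_recognize(segment):
    table = {
        "left":  [("3", 0.1), ("8", 1.2)],
        "right": [("4", 0.3), ("9", 0.9)],
        "both":  [("0", 2.0), ("6", 2.5)],
    }
    return table[segment]


# Two cut points give nodes 0, 1, 2; the arc 0 -> 2 treats both pieces as
# a single character.
segmentation_arcs = [(0, 1, "left"), (1, 2, "right"), (0, 2, "both")]
interpretation_arcs = build_interpretation_graph(segmentation_arcs, toy_recognize)
```

Each path from the first node to the last in the result is one interpretation of one segmentation, so "34", "39", "84", "89", "0", and "6" are all represented in this toy graph.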

VI. Global training for graph transformer networks

Question: What does "global training" mean? Training the whole process end to end?

1. Viterbi training 2. Discriminative Viterbi training 3. Forward training 4. Discriminative forward training 5. Remarks
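The Viterbi and forward variants differ in how a graph is scored: the Viterbi penalty is the penalty of the single lowest-penalty path, while the forward penalty combines all paths with a "logadd", that is -log(sum of exp(-path penalty)). The sketch below computes both on a small graph. It assumes integer node ids numbered so that every arc goes from a lower id to a higher one; the function names are illustrative, and the training loops and the discriminative variants are not shown.

```python
import math


def neg_logadd(a, b):
    """Stable computation of -log(exp(-a) + exp(-b))."""
    m = min(a, b)
    return m - math.log(math.exp(m - a) + math.exp(m - b))


def path_penalties(arcs, start, end):
    """Return (viterbi_penalty, forward_penalty) from start to end.

    arcs: list of (src, dst, penalty) with src < dst, so sorting by src
    visits each node only after all of its incoming arcs were processed.
    """
    viterbi = {start: 0.0}
    forward = {start: 0.0}
    for src, dst, penalty in sorted(arcs, key=lambda arc: arc[0]):
        if src not in viterbi:
            continue  # src is unreachable from start
        v = viterbi[src] + penalty
        f = forward[src] + penalty
        # Viterbi keeps only the best incoming path.
        viterbi[dst] = min(viterbi[dst], v) if dst in viterbi else v
        # Forward accumulates every incoming path.
        forward[dst] = neg_logadd(forward[dst], f) if dst in forward else f
    return viterbi[end], forward[end]


# Toy usage: two parallel arcs from node 0 to node 1, then one arc to node 2.
arcs = [(0, 1, 1.0), (0, 1, 2.0), (1, 2, 0.5)]
viterbi_penalty, forward_penalty = path_penalties(arcs, start=0, end=2)
```

The forward penalty is never larger than the Viterbi penalty, because the best path is one of the terms in the sum; the two coincide when only one path exists.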

VII. Multiple object recognition: Space displacement neural network