OCR Project Roundup

Source: Internet | Published by: linux线程的优先级控制 | Editor: 程序博客网 | Date: 2024/05/29 08:13

Overview

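1 What algorithms does OCR text recognition use? | Zhihu
2 A survey of deep learning text recognition papers | CSDN; the papers it covers are all fairly old
3 Text detection and recognition resources | CSDN; the papers it covers are all recent; five stars
4 Awesome Scene Text Recognition, an awesome list; five stars
5 OCR; this blogger's posts are consistently high quality; five stars
6 YunOS scene text recognition | Alibaba Cloud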

Papers

Reading text in the wild, VGG group
1 Reading Text in the Wild with Convolutional Neural Networks, VGG group, IJCV 2016; reading notes | CSDN
2 Synthetic Data for Text Localisation in Natural Images, VGG group, CVPR 2016; reading notes | CSDN, code
3 Deep Features for Text Spotting, VGG group, ECCV 2014
4 Detecting Text in Natural Image with Connectionist Text Proposal Network, code, ECCV 2016

CVPR 2017 papers

Awesome Typography: Statistics-Based Text Effects Transfer; text effect generation, with very impressive results
EAST: An Efficient and Accurate Scene Text Detector; fast and accurate scene text detection
Detecting Oriented Text in Natural Images by Linking Segments
Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection
Unambiguous Text Localization and Retrieval for Cluttered Scenes; text localization and retrieval

Datasets

MSRA Text Detection 500 Database (MSRA-TD500)
The Street View Text Dataset
The Street View House Numbers (SVHN) Dataset
NEOCR: Natural Environment OCR Dataset
KAIST Scene Text Database
ICDAR 2003 Robust Reading Competitions
ICDAR 2005 Robust Reading Competitions
ICDAR 2011 Robust Reading Competition
ICDAR 2013 Robust Reading Competition
COCO-Text: Dataset for Text Detection and Recognition

GitHub code

1 tesseract, 12k stars, C/C++ API
2 tesseract.js, 12k stars, pure JavaScript, OCR supporting 62 languages
3 paperless, 3.6k stars, focused on document OCR
4 pyocr, 606 stars, a Python wrapper for Tesseract and Cuneiform
5 doc2text, 1k stars, depends on OpenCV and Tesseract
6 pdftabextract, 668 stars, extracts tables from PDFs and converts them to Excel
7 tesserocr, a Python interface to the tesseract-ocr API
8 SSD_scene_text_detection, applies SSD to scene text detection

Reproduction targets:
1 paper: Reading Text in the Wild with Convolutional Neural Networks
Reading notes: "Paper reading: Reading Text in the Wild with Convolutional Neural Networks"
Partial code: code | MATLAB

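The main idea of the paper is to first use region proposals to generate a sufficiently large set of candidate regions, then resize these candidate boxes to a fixed size, and finally use a CNN to classify each box as a single word, out of a vocabulary of more than 90k words. Training on synthetically generated images that contain text ensures there are enough samples for every word.
The approach is clear, and so are its limitations: words outside the vocabulary, such as some compound words, cannot be recognized; in addition, each candidate box must contain the word in full.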
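
The three stages described above (candidate boxes, resize to a fixed size, closed-vocabulary word classification) can be sketched roughly as follows. This is only an illustrative outline, not the paper's implementation: the names `crop_and_resize`, `classify_word`, `read_text`, and `toy_score_fn` are hypothetical, the 32x100 patch size is an assumption, the candidate boxes are taken as given rather than produced by a real region proposal method, and a toy scoring function stands in for the CNN.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def crop_and_resize(image: Image, box: Box, out_h: int, out_w: int) -> Image:
    """Crop `box` from `image` and resize it to out_h x out_w (nearest neighbour)."""
    x, y, w, h = box
    return [
        [image[y + (r * h) // out_h][x + (c * w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def classify_word(
    patch: Image,
    vocabulary: Sequence[str],
    score_fn: Callable[[Image], Sequence[float]],
) -> Tuple[str, float]:
    """Return the highest-scoring vocabulary word for one fixed-size patch.

    `score_fn` is a placeholder for the CNN: it must return one score per
    vocabulary entry. The result is always a word from `vocabulary`, which
    is exactly the closed-vocabulary limitation noted above.
    """
    scores = score_fn(patch)
    best = max(range(len(vocabulary)), key=lambda i: scores[i])
    return vocabulary[best], scores[best]


def read_text(
    image: Image,
    boxes: Sequence[Box],
    vocabulary: Sequence[str],
    score_fn: Callable[[Image], Sequence[float]],
    size: Tuple[int, int] = (32, 100),  # assumed fixed input size (height, width)
) -> List[Tuple[Box, str, float]]:
    """Run resize + classification on every candidate box."""
    results = []
    for box in boxes:
        patch = crop_and_resize(image, box, size[0], size[1])
        word, score = classify_word(patch, vocabulary, score_fn)
        results.append((box, word, score))
    return results


def toy_score_fn(patch: Image) -> List[float]:
    """Toy stand-in for the CNN: scores a two-word vocabulary by mean brightness."""
    mean = sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    return [mean, 1.0 - mean]


# Toy usage: a 4x8 image whose left half is bright and right half is dark.
image = [[1.0] * 4 + [0.0] * 4 for _ in range(4)]
boxes = [(0, 0, 4, 4), (4, 0, 4, 4)]
results = read_text(image, boxes, ["bright", "dark"], toy_score_fn)
```

Because `classify_word` can only return an entry of `vocabulary`, an unseen word is forced onto the nearest known class, and a box that crops only part of a word gives the classifier an input it was never trained on.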

2 paper: EAST: An Efficient and Accurate Scene Text Detector
The latest work from Megvii (旷视).
