Setting Up OCR on a Mac

Source: Internet  Editor: 程序博客网  Time: 2024/06/16 07:29

This post records my first attempt at using OCR to read an ID card on a Mac.

Installing the dependencies

Test code


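Three pieces need to be installed: tesseract (the OCR engine), pytesseract (its Python wrapper), and PIL. The process is straightforward. Note that pytesseract calls the tesseract command-line program, which is what the brew install tesseract step below provides: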

    pip install tesseract
    Collecting tesseract
      Downloading https://pypi.doubanio.com/packages/8d/b7/c4fae9af5842f69d9c45bf1195a94aec090628535c102894552a7a7dbe6c/tesseract-0.1.3.tar.gz (45.6MB)
        100% |████████████████████████████████| 45.6MB 1.9MB/s
    Building wheels for collected packages: tesseract
      Running setup.py bdist_wheel for tesseract ... done
      Stored in directory: /Users/liaohuanghe/Library/Caches/pip/wheels/2c/b1/fd/f55ae4de4d210b44d642e02268cfc8e5d6f5c146389a21deb0
    Successfully built tesseract
    Installing collected packages: tesseract
    Successfully installed tesseract-0.1.3
    liaohuanghedeMacBook-Air:ocr liaohuanghe$ brew install tesseract
    ==> Installing dependencies for tesseract: libpng, jpeg, libtiff, leptonica
    ==> Installing tesseract dependency: libpng
    ==> Downloading https://homebrew.bintray.com/bottles/libpng-1.6.34.sierra.bottle
    ######################################################################## 100.0%
    ==> Pouring libpng-1.6.34.sierra.bottle.tar.gz
    🍺  /usr/local/Cellar/libpng/1.6.34: 26 files, 1.2MB
    ==> Installing tesseract dependency: jpeg
    ==> Downloading https://homebrew.bintray.com/bottles/jpeg-9b.sierra.bottle.tar.g
    ######################################################################## 100.0%
    ==> Pouring jpeg-9b.sierra.bottle.tar.gz
    🍺  /usr/local/Cellar/jpeg/9b: 20 files, 724KB
    ==> Installing tesseract dependency: libtiff
    ==> Downloading https://homebrew.bintray.com/bottles/libtiff-4.0.8_4.sierra.bott
    ######################################################################## 100.0%
    ==> Pouring libtiff-4.0.8_4.sierra.bottle.tar.gz
    🍺  /usr/local/Cellar/libtiff/4.0.8_4: 245 files, 3.4MB
    ==> Installing tesseract dependency: leptonica
    ==> Downloading https://homebrew.bintray.com/bottles/leptonica-1.74.4_1.sierra.b
    ######################################################################## 100.0%
    ==> Pouring leptonica-1.74.4_1.sierra.bottle.tar.gz
    🍺  /usr/local/Cellar/leptonica/1.74.4_1: 52 files, 5.7MB
    ==> Installing tesseract
    ==> Downloading https://homebrew.bintray.com/bottles/tesseract-3.05.01.sierra.bo
    ######################################################################## 100.0%
    ==> Pouring tesseract-3.05.01.sierra.bottle.tar.gz
    🍺  /usr/local/Cellar/tesseract/3.05.01: 79 files, 38.7MB

Next, install pytesseract with pip:

    pip install pytesseract

Then install PIL. Note that the maintained package is now called Pillow, so the command to run is:

    pip install pillow
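
Before writing any OCR code, it can help to confirm that all three pieces are actually visible to Python. The following is a minimal sketch and not part of the original setup; the helper name `check_ocr_setup` is made up for illustration, and it assumes the Homebrew-installed `tesseract` binary is expected on `PATH`, which is where pytesseract looks for it by default.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report whether the OCR engine and the two Python packages can be found."""
    return {
        # pytesseract shells out to the `tesseract` executable (installed by brew here).
        "tesseract binary": shutil.which("tesseract") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # Pillow is still imported under its historical name, PIL.
        "PIL (Pillow)": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_ocr_setup().items():
        print("{:<18} {}".format(name, "found" if found else "MISSING"))
```

If any entry is reported as missing, rerunning the corresponding install command above is the first thing to try.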

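Then write the test code:

    from PIL import Image  # the class is Image, with a capital I
    import pytesseract

    # Open the ID card image and print whatever text Tesseract recognizes
    print(pytesseract.image_to_string(Image.open('image/obama2.jpeg')))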

Output:

u'\ufb01a\ufb01EE\n$180 % Rx\u2018s\u2018l\ufb01\ufb02\ufb01\ufb01m\ufb01\nmg 196118548\n\n11 *3 $\xa7W$E\xa2Ibi9if\ufb01\xe9\n\n\ufb02lt\ufb01leooE \u2018\n5\n\n\ufb02\ufb01\ufb01m\ufb01is 123456196108041236'
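
Most of the noise in this output comes from Chinese text being read with the default model. If only the ID number matters, one option is to restrict Tesseract to the characters an ID number can contain. The sketch below is an illustration rather than what was run for this post: the helper names are hypothetical, it relies on pytesseract's `config` parameter and Tesseract's `tessedit_char_whitelist` variable, and whether the whitelist is honored depends on the Tesseract version and engine in use.

```python
def digits_only_config():
    """Build a Tesseract config string limited to ID-number characters."""
    # `-c name=value` sets a Tesseract variable; tessedit_char_whitelist limits
    # the characters the engine may emit. X is kept because an ID number can end in X.
    return "-c tessedit_char_whitelist=0123456789X"


def read_digits(image_path):
    """Run OCR on image_path with the whitelist applied and return the text."""
    # Imported lazily so digits_only_config() works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(
        Image.open(image_path), config=digits_only_config()
    )


if __name__ == "__main__":
    # Same image path as in the test code above.
    print(read_digits("image/obama2.jpeg"))
```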

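Note that the 123456196108041236 at the end of the output is the ID number. The Chinese characters need additional processing, which will be covered in a separate post.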
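
Pulling the ID number out of the noisy string can be sketched with a regular expression. This is an addition, not something from the original experiment: the function names are hypothetical, and the checksum step assumes the standard 18-digit resident ID format (17 weighted digits plus a mod-11 check character), used here only as a sanity check on the OCR result.

```python
import re

# 17 digits followed by a digit or X, not embedded in a longer run of digits.
ID_PATTERN = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")

# Per-position weights and check characters of the 18-digit ID format.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def find_id_numbers(text):
    """Return every 18-character ID-number candidate found in text."""
    return ID_PATTERN.findall(text)


def has_valid_checksum(id_number):
    """Check the last character against the weighted sum of the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    return CHECK_CHARS[total % 11] == id_number[17].upper()


if __name__ == "__main__":
    # Tail of the OCR output shown above.
    ocr_text = u"mg 196118548\n\n\ufb02\ufb01\ufb01m\ufb01is 123456196108041236"
    for candidate in find_id_numbers(ocr_text):
        print(candidate, has_valid_checksum(candidate))
```

A candidate that fails the checksum is a hint that one of the digits was misread, which is a cheap way to flag images that need another pass.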

The ID card of 奥观海 used for the test:
[Image: ID card]


References:
http://blog.csdn.net/ywjatjd/article/details/53354217
http://blog.csdn.net/yimingsilence/article/details/52015159
http://blog.csdn.net/u013421629/article/details/72677964
