[VC++ / Office OCR engine] Recognizing text in images

Source: Internet. Editor: 程序博客网. Date: 2024/06/05 19:07

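This took some time to debug. With miLANG_ENGLISH, letters and digits were recognized correctly, but as soon as miLANG_CHINESE_SIMPLIFIED was used the program crashed. The cause turned out to be the Office installation. The fix is to download Office 2003 and do a complete install. If a typical install is already in place, go instead to Add or Remove Programs > Office 2003 > Change > Office Tools > Microsoft Office Document Imaging, right-click it and choose "Run from My Computer".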

Environment: VC++ 6.0
The key code is as follows:

    BOOL CCCMD_OCRView::OCRImageFile(CString Name) // OCR
    {
        IDocument *pDoc = new IDocument;
        // CreateDispatch returns FALSE if MODI is not installed or not registered
        if (!pDoc->CreateDispatch("MODI.Document"))
        {
            delete pDoc;
            return FALSE;
        }
        pDoc->Create(Name);
        //pDoc->OCR(miLANG_ENGLISH, 0, 0);
        pDoc->OCR(miLANG_CHINESE_SIMPLIFIED, 0, 0);
        IImages images = pDoc->GetImages();
        long num = images.GetCount();
        CString text;
        for (long i = 0; i < num; i++)
        {
            IImage image = images.GetItem(i);
            ILayout layout = image.GetLayout();
            // Accumulate, so that a multi-page file shows every page and not only the last
            text += layout.GetText();
        }
        SetDlgItemText(IDC_EDIT1, text);
        pDoc->Close(0);
        pDoc->ReleaseDispatch();
        delete pDoc;
        return (num > 0) ? TRUE : FALSE;
    }

Recognition result:
[image: screenshot of the recognition result]
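Because the failure described earlier only appeared for one language, a defensive option is to try the preferred language first and fall back to another one. The following is a minimal sketch in plain C++ that does not call MODI itself. `firstWorkingLanguage` and the `tryOcr` callback are hypothetical names. The callback is assumed to wrap the `pDoc->OCR(lang, 0, 0)` call and return false when that call fails, for example by catching whatever exception the generated wrapper raises.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Returns the first language ID for which tryOcr reports success,
// or -1 if every candidate fails. tryOcr is a hypothetical callback that
// is assumed to run OCR for the given language ID and report the outcome.
long firstWorkingLanguage(const std::vector<long>& langs,
                          const std::function<bool(long)>& tryOcr) {
    assert(static_cast<bool>(tryOcr));  // an empty callback is a caller error
    for (long lang : langs) {
        if (tryOcr(lang)) {
            return lang;
        }
    }
    return -1;
}
```

In the MFC code, the candidate list would hold miLANG_CHINESE_SIMPLIFIED followed by miLANG_ENGLISH, so that an incomplete Chinese installation degrades to English recognition instead of ending the program.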

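The code above retrieves the text in the image. To also get the position of each character, use the corresponding classes of the OCR library (IWords, IWord, IMiRects, IMiRect). The key code is as follows: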

    BOOL CCCMD_OCRView::OCRImageFile(CString Name) // OCR
    {
        clock_t start = clock();  // needs <time.h>
        IDocument *pDoc = new IDocument;
        // CreateDispatch returns FALSE if MODI is not installed or not registered
        if (!pDoc->CreateDispatch("MODI.Document"))
        {
            delete pDoc;
            return FALSE;
        }
        pDoc->Create(Name);
        //pDoc->OCR(miLANG_ENGLISH, 0, 0);
        pDoc->OCR(miLANG_CHINESE_SIMPLIFIED, 0, 0);
        IImages images = pDoc->GetImages();
        long num = images.GetCount();
        CString output = "";
        for (long i = 0; i < num; i++)
        {
            IImage image = images.GetItem(i);
            ILayout layout = image.GetLayout();
            long width = image.GetPixelWidth();
            long height = image.GetPixelHeight();
            CString wh;
            wh.Format("%ld %ld", width, height);
            IWords words = layout.GetWords();
            long wordCount = words.GetCount();
            for (long j = 0; j < wordCount; j++)
            {
                IWord word = words.GetItem(j);
                // The recognized text
                output += word.GetText();
                // Bounding rectangle(s), with the origin at the top-left corner
                IMiRects mirects = word.GetRects();
                for (long k = 0; k < mirects.GetCount(); k++)
                {
                    IMiRect mirect = mirects.GetItem(k);
                    long top = mirect.GetTop();
                    long left = mirect.GetLeft();
                    long right = mirect.GetRight();
                    long bottom = mirect.GetBottom();
                    CString pos;
                    pos.Format(" %ld %ld %ld %ld \r\n", top, bottom, left, right);
                    output += pos;
                }
            }
            output += wh;  // image width and height are written on the last line
        }
        clock_t end = clock();
        long elapsed = end - start;
        CString times;
        times.Format(" %ld ", 1000 * elapsed / CLOCKS_PER_SEC);  // milliseconds
        output += times;  // the running time is written on the last line
        SetDlgItemText(IDC_EDIT1, output);  // show the result
        pDoc->Close(0);
        pDoc->ReleaseDispatch();
        delete pDoc;
        return (num > 0) ? TRUE : FALSE;
    }

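Part of the output is shown below (the full result is not listed). Each character is followed by four numbers. In this sample the first and third values advance along the text line (horizontal extent) while the second and fourth stay fixed (vertical extent), which looks like left, top, right, bottom rather than the top, bottom, left, right order passed to Format above. Check the argument order against your own output before relying on it. The final pair, 737 506, is the image width and height.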

    在 362 102 373 113
    您 374 102 385 113
    电 387 102 397 113
    脑 398 102 409 113
    上 410 102 421 113
    看 422 102 433 113
    到 434 102 445 113
    的 446 102 457 113
    商 458 102 469 113
    品 471 102 481 113
    图 483 102 493 113
    片 494 102 505 113
    可 506 102 517 113
    能 518 102 529 113
    会 530 102 541 113
    与 542 102 553 113
    您 554 102 565 113
    收 566 102 577 113
    到 578 102 589 113
    的 590 102 601 113
    商 602 102 613 113
    品 615 102 625 113
    存 626 102 637 113
    在 638 102 649 113
    颜 650 102 661 113
    色 662 102 673 113
    差 674 102 685 113
    异 158 77 169 88
    ， 173 78 176 82
    实 182 77 193 88
    际 194 77 205 88
    颜 206 77 217 88
    色 218 77 229 88
    以 231 77 241 88
    您 242 77 253 88
    收 254 77 265 88
    到 266 77 277 88
    的 278 77 289 88
    商 290 77 301 88
    品 303 77 313 88
    为 314 77 325 88
    准 326 77 337 88
    737 506
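To do anything further with this output (highlighting characters, merging them into line boxes), it helps to turn the text back into structured records. The following is a minimal sketch in standard C++, independent of MODI. `WordBox`, `parseWordBoxes` and `unionOf` are hypothetical names. It assumes one word per line and that the four numbers are left, top, right, bottom, which is a reading of the sample and not something the library guarantees.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One recognized word (a single character for Chinese text) and its rectangle.
// ASSUMPTION: the numbers are left, top, right, bottom, as the sample suggests.
struct WordBox {
    std::string text;
    long left = 0, top = 0, right = 0, bottom = 0;
};

// Parses lines of the form "<text> <n1> <n2> <n3> <n4>".
// Lines that do not match (such as the trailing width/height/time line)
// are skipped.
std::vector<WordBox> parseWordBoxes(const std::string& output) {
    std::vector<WordBox> boxes;
    std::istringstream lines(output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        WordBox b;
        if (fields >> b.text >> b.left >> b.top >> b.right >> b.bottom) {
            boxes.push_back(b);
        }
    }
    return boxes;
}

// Smallest rectangle covering every box; the text is concatenated in input order.
WordBox unionOf(const std::vector<WordBox>& boxes) {
    assert(!boxes.empty());
    WordBox u = boxes.front();
    for (std::size_t i = 1; i < boxes.size(); ++i) {
        u.text += boxes[i].text;
        u.left = std::min(u.left, boxes[i].left);
        u.top = std::min(u.top, boxes[i].top);
        u.right = std::max(u.right, boxes[i].right);
        u.bottom = std::max(u.bottom, boxes[i].bottom);
    }
    return u;
}
```

Calling `unionOf(parseWordBoxes(text))` on the characters of one text line gives a single box for that line. If your output really is in top, bottom, left, right order, only the extraction order inside `parseWordBoxes` needs to change.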

Original image:
[image: the original picture used for recognition]

0 0