I've been learning about OCR over the past couple of days, and it's been pretty fun. There are still plenty of algorithms I haven't figured out yet and will need to study slowly. This post records the pitfalls I ran into while putting together an OCR app, so I can look back at it the next time I hit one... A demo for this post is linked at the end; if you don't have the patience for the tutorial, you can download the demo and dig into it directly...

Alright, let's get started!

  • Create a new Single View Application project and name it TestOCR

  • Add a Podfile with the following contents, then run pod install in Terminal to pull the third-party OCR library into the project. An introduction to this library is in the resources section at the bottom of the article (for a CocoaPods tutorial, please Google it yourself...)

platform :ios, "7.0"

pod 'TesseractOCRiOS'
  • After pulling in TesseractOCRiOS, you still need to download the language file used for recognition yourself; the download address is here! (note: you may need a proxy to reach it). We download an English language file named tesseract-ocr-3.02.eng.tar.gz. After extracting the archive, put the tessdata folder inside it into the project directory (the same directory as AppDelegate.h).

  • Open the project with the newly generated TestOCR.xcworkspace and add the tessdata folder you just copied into the project. Note: when adding it, it must be added as a folder reference, as shown below:

  • Your project structure should now look like the figure below; the folder marked 1 should be blue
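    It can save some head-scratching to verify at runtime that the language file really ended up in the bundle. Below is a small optional check (my own addition, not part of the original demo; drop it into viewDidLoad, for example) that simply looks up eng.traineddata inside the tessdata folder reference:

// Optional sanity check (not required by TesseractOCRiOS): confirm that
// eng.traineddata is inside the "tessdata" folder reference bundled with the app.
NSString *trainedDataPath = [[NSBundle mainBundle] pathForResource:@"eng"
                                                            ofType:@"traineddata"
                                                       inDirectory:@"tessdata"];
if (trainedDataPath == nil) {
    NSLog(@"tessdata/eng.traineddata is missing from the bundle - make sure the folder was added as a (blue) folder reference.");
}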

  • That's it for configuring TesseractOCRiOS. Next, let's build a simple UI. Switch to Main.storyboard, select the scene for View Controller, and in the File inspector uncheck Use Size Classes (it is checked by default)

  • Add three subviews to the View Controller's view: a UIButton, a UIImageView, and a UITextView, as shown below

  • In ViewController.m, add an IBOutlet named mImageView for the UIImageView, an IBOutlet named mTextView for the UITextView, and an IBAction named selectImageButtonClicked for the UIButton. After that, the code in ViewController.m should look like the figure below
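    Since these connection names are used again later, here is a rough sketch of what ViewController.m might look like at this point (the property attributes are my assumption; the outlet and action names follow the ones above):

@interface ViewController ()

@property (weak, nonatomic) IBOutlet UIImageView *mImageView;
@property (weak, nonatomic) IBOutlet UITextView *mTextView;

@end

@implementation ViewController

- (IBAction)selectImageButtonClicked:(id)sender {
    // implemented later in this post
}

@end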

  • The basic layout is finally done, so let's get to the actual code. First, import TesseractOCR/TesseractOCR.h and TesseractOCR/UIImage+G8FixOrientation.h in ViewController.m, then declare the relevant protocols on ViewController. Code:

#import <TesseractOCR/TesseractOCR.h>
#import <TesseractOCR/UIImage+G8FixOrientation.h>

@interface ViewController ()
<UINavigationControllerDelegate, UIImagePickerControllerDelegate, UIActionSheetDelegate, G8TesseractDelegate>
  • Add the following implementation to the - (IBAction)selectImageButtonClicked:(id)sender method, using a UIActionSheet to present the photo-source choices, and implement the UIActionSheet delegate method
#pragma mark - Event
- (IBAction)selectImageButtonClicked:(id)sender {
    UIActionSheet *actionSheet = [[UIActionSheet alloc] initWithTitle:nil
                                                             delegate:self
                                                    cancelButtonTitle:@"取消"
                                               destructiveButtonTitle:nil
                                                    otherButtonTitles:@"拍照", @"从相册选取", nil];
    [actionSheet showInView:self.view];
}

#pragma mark - UIActionSheetDelegate
- (void)actionSheet:(UIActionSheet *)actionSheet clickedButtonAtIndex:(NSInteger)buttonIndex {
    // Ignore the cancel button; otherwise button 0 is "拍照" (take a photo)
    // and button 1 is "从相册选取" (choose from the photo library).
    if (buttonIndex == actionSheet.cancelButtonIndex) {
        return;
    }
    if (buttonIndex == 0) {
        [self shootPicture];
    } else {
        [self selectExistingPicture];
    }
}

// Take a photo with the camera
- (void)shootPicture {
    [self getPictureFromSource:UIImagePickerControllerSourceTypeCamera];
}

// Pick an existing photo from the photo library
- (void)selectExistingPicture {
    [self getPictureFromSource:UIImagePickerControllerSourceTypePhotoLibrary];
}

// Create and configure the image picker; the sourceType parameter decides
// whether the camera or the photo library is presented.
- (void)getPictureFromSource:(UIImagePickerControllerSourceType)sourceType {
    NSArray *mediaTypes = [UIImagePickerController availableMediaTypesForSourceType:sourceType];

    if ([UIImagePickerController isSourceTypeAvailable:sourceType] && [mediaTypes count] > 0) {
        UIImagePickerController *picker = [[UIImagePickerController alloc] init];
        picker.mediaTypes = mediaTypes;
        picker.delegate = self;
        picker.allowsEditing = NO;
        picker.sourceType = sourceType;
        [self presentViewController:picker animated:YES completion:nil];
    } else {
        UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"错误"
                                                        message:@"设备不支持!"
                                                       delegate:nil
                                              cancelButtonTitle:@"确定"
                                              otherButtonTitles:nil];
        [alert show];
    }
}

#pragma mark - UIImagePickerControllerDelegate
- (void)imagePickerController:(UIImagePickerController *)picker didFinishPickingMediaWithInfo:(NSDictionary *)info {
    UIImage *image = [info objectForKey:UIImagePickerControllerOriginalImage];
    self.mImageView.image = image;
    [self recognition:image];
    [picker dismissViewControllerAnimated:YES completion:nil];
}

- (void)imagePickerControllerDidCancel:(UIImagePickerController *)picker {
    [picker dismissViewControllerAnimated:YES completion:nil];
}
  • The big block of code above is for obtaining the photo. Once the UIImagePickerControllerDelegate callback - (void)imagePickerController:(UIImagePickerController *)picker didFinishPickingMediaWithInfo:(NSDictionary *)info hands back the photo, we process and recognize it. Only the most basic processing is done here; the code is as follows:
- (void)recognition:(UIImage *)image {
    dispatch_async(dispatch_get_global_queue(0, 0), ^(void) {
        // Languages are used for recognition (e.g. eng, ita, etc.). Tesseract engine
        // will search for the .traineddata language file in the tessdata directory.
        // For example, specifying "eng+ita" will search for "eng.traineddata" and
        // "ita.traineddata". Cube engine will search for "eng.cube.*" files.
        // See https://code.google.com/p/tesseract-ocr/downloads/list.

        // Create your G8Tesseract object using the initWithLanguage method:
        G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];

        // Optionally: You could specify the engine to recognize with.
        // G8OCREngineModeTesseractOnly by default. It provides more features and is
        // faster than the Cube engine. See G8Constants.h for more information.
        tesseract.engineMode = G8OCREngineModeTesseractOnly;

        // Set up the delegate to receive Tesseract's callbacks.
        // self should respond to TesseractDelegate and implement a
        // "- (BOOL)shouldCancelImageRecognitionForTesseract:(G8Tesseract *)tesseract"
        // method to receive a callback to decide whether or not to interrupt
        // Tesseract before it finishes a recognition.
        tesseract.delegate = self;

        // Optional: Limit the character set Tesseract should try to recognize from
        // tesseract.charWhitelist = @"0123456789";
        tesseract.charWhitelist = @"@.(){}/\\!*&#0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

        // This is a wrapper for the common Tesseract variable kG8ParamTesseditCharWhitelist:
        // [tesseract setVariableValue:@"0123456789" forKey:kG8ParamTesseditCharWhitelist];
        // See G8TesseractParameters.h for a complete list of Tesseract variables

        // Optional: Limit the character set Tesseract should not try to recognize from
        // tesseract.charBlacklist = @"OoZzBbSs";

        // Specify the image Tesseract should recognize on
        tesseract.image = [[image fixOrientation] g8_blackAndWhite];

        // Optional: Limit the area of the image Tesseract should recognize on to a rectangle
        // tesseract.rect = CGRectMake(20, 20, 100, 100);

        // Optional: Limit recognition time to a few seconds
        tesseract.maximumRecognitionTime = 2.0;

        // Start the recognition
        [tesseract recognize];

        // Retrieve the recognized text
        NSLog(@"%@", [tesseract recognizedText]);

        dispatch_async(dispatch_get_main_queue(), ^(void) {
            self.mTextView.text = [tesseract recognizedText];
        });

        // You could retrieve more information about the recognized text with these methods:
        NSArray *characterBoxes = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelSymbol];
        NSArray *paragraphs = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelParagraph];

        NSLog(@"characterBoxes = %@ \n paragraphs = %@ \n", characterBoxes, paragraphs);
    });
}
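The comments above mention shouldCancelImageRecognitionForTesseract:. Since ViewController already declares G8TesseractDelegate, a minimal implementation might look like the sketch below (returning NO simply lets every recognition run to completion; the progress callback is optional and only logs here):

#pragma mark - G8TesseractDelegate
// Never interrupt recognition in this simple demo.
- (BOOL)shouldCancelImageRecognitionForTesseract:(G8Tesseract *)tesseract {
    return NO;
}

// Optional progress callback; just log the current progress value.
- (void)progressImageRecognitionForTesseract:(G8Tesseract *)tesseract {
    NSLog(@"recognition progress: %lu", (unsigned long)tesseract.progress);
}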
  • The English comments in the code explain the use of Tesseract-OCR-iOS in quite a bit of detail; you can also refer to Using Tesseract OCR iOS

  • That covers the most basic usage. The next post will look at image processing: preprocessing an image before running OCR on it improves both recognition accuracy and speed. Common preprocessing steps include (a simple example is sketched after this list):

    • Binarization
    • Grayscale conversion
    • Deskewing (tilt correction)
    • Image cropping/segmentation

    You can also Google these topics and work through them yourself.
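    As a small taste of the preprocessing topic, here is a minimal grayscale-conversion helper sketched with Core Graphics (my own example, not part of TesseractOCRiOS); a more serious pipeline would add binarization and deskewing on top of this:

// Render the image into a device-gray bitmap context, which discards the
// color information before handing the image to Tesseract.
- (UIImage *)grayscaleImageFromImage:(UIImage *)image {
    CGRect rect = CGRectMake(0, 0, image.size.width, image.size.height);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceGray();
    CGContextRef context = CGBitmapContextCreate(NULL,
                                                 (size_t)rect.size.width,
                                                 (size_t)rect.size.height,
                                                 8,   // bits per component
                                                 0,   // let Core Graphics pick bytes per row
                                                 colorSpace,
                                                 (CGBitmapInfo)kCGImageAlphaNone);
    CGColorSpaceRelease(colorSpace);
    if (context == NULL) {
        return image; // fall back to the original image if the context fails
    }
    CGContextDrawImage(context, rect, image.CGImage);
    CGImageRef grayImageRef = CGBitmapContextCreateImage(context);
    CGContextRelease(context);
    UIImage *grayImage = [UIImage imageWithCGImage:grayImageRef];
    CGImageRelease(grayImageRef);
    return grayImage;
}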

  • Finally, here is the demo for this post: download link

Resources on OCR